Feasibility of handheld mid-infrared spectroscopy to predict particle size distribution: influence of soil field condition and utilisation of existing spectral libraries
Leslie J. Janik A C , José M. Soriano-Disla A B and Sean T. Forrester A
A CSIRO Environmental Contaminant Mitigation and Technologies Program, CSIRO Land and Water, Waite Campus, Waite Road, Urrbrae, 5064, South Australia, Australia.
B Present address: Technology Centre for Energy and Environment (CETENMA), Polígono Industrial Cabezo Beaza, C/ Sofía 6–13, 30353, Cartagena, Spain.
C Corresponding author. Email: les.janik@csiro.au
Soil Research 58(6) 528-539 https://doi.org/10.1071/SR20097
Submitted: 6 April 2020 Accepted: 9 June 2020 Published: 29 July 2020
Journal Compilation © CSIRO 2020 Open Access CC BY NC ND
Abstract
Partial least-squares regression (PLSR), using spectra from a handheld mid-infrared instrument (the ExoScan), was tested for the prediction of particle size distribution. Soils were sampled from agricultural sites in the Eyre Peninsula under field conditions and with varying degrees of soil preparation. Issues relevant to field sampling were identified, such as sample heterogeneity, micro-aggregate size and moisture content. The PLSR models for particle size distribution were derived with the varying degrees of preparation. Cross-validation of clay content in the as-received in situ soils resulted in low accuracy: coefficient of determination (R2) = 0.55 and root mean square error (RMSE) = 7%. This was improved by manual mixing, drying, sieving to < 2 mm and fine grinding, resulting in R2 values of 0.64, 0.75 and 0.81, and RMSE of 6%, 5% and 4% respectively; less improvement resulted for sand, with corresponding R2 values of 0.82, 0.88, 0.91 and 0.89, and RMSE of 10%, 8%, 6% and 7%. Predictions for silt remained poor. Where only archival benchtop calibration models were available, predictions of clay contents for spectra scanned with the handheld ExoScan spectrometer resulted in high error because of spectral intensity mismatch between benchtop and handheld spectra (R2 = 0.72, RMSE = 24.2% and bias = 21%). Pre-processing the benchtop spectra by piecewise direct standardisation resulted in more successful predictions (R2 = 0.73, RMSE = 6.7% and bias = –1.5%), confirming the advantage of piecewise direct standardisation for prediction from archival spectral libraries.
Additional keywords: DRIFT, partial least-squares regression, particle size analysis, piecewise direct standardisation.
Introduction
Particle size distribution (PSD) is an important parameter for many soil physical and chemical properties including soil texture, hydraulic properties and soil reactivity (Fooladmand 2008; Hu et al. 2011). However, routine laboratory analysis of PSD can be prohibitively expensive for large numbers of soil samples. Infrared spectroscopy offers an alternative, less expensive method, particularly if it can be utilised in the field. Successful use of portable near-infrared (NIR) spectrometers for in-field PSD analysis has been reported (Viscarra Rossel et al. 2009; Knadel et al. 2013), showing only slight differences between predictions of clay content from laboratory- or field-scanned spectra in the visible (vis)-NIR region, with the latter slightly more accurate. One method that Viscarra Rossel et al. (2009) successfully used, to extend the range of their spectral library in order to enable better prediction of moist field soils, was to spike their calibrations with several field spectra.
Comparison of performance between benchtop and handheld vis-NIR instruments by Bricklemyer and Brown (2010) showed that the benchtop instrument was considerably better for predictions of clay content. This was thought to be partly due to the inability of the field-based instruments to capture all the subtle spectral variability of soils in that study. Other studies confirmed previous reports on the suitability of partial least-squares regression (PLSR) and mid-infrared (MIR) spectroscopy for PSD analysis using benchtop instrumentation (Reeves et al. 2001; Bowman and Hutka 2002; McKenzie et al. 2002; Janik et al. 2007, 2009, 2016a, 2016b; Viscarra Rossel and Webster 2012; Soriano-Disla et al. 2014).
In principle, there is no reason why portable MIR instrumentation, with PLSR analysis, should not be able to predict PSD data on soils with comparable accuracy to that of benchtop MIR spectrometers (Soriano-Disla et al. 2017). However, field use of MIR instrumentation has been hampered, until recently, by the lack of lightweight, wide-spectral-range and energy-efficient MIR spectrometers, as well as by the inherent sensitivity of MIR to soil moisture under field conditions and by the small spot size of the incident MIR beam on the soil sample relative to soil heterogeneity. Consequently, there have been few reported studies on the use of handheld MIR spectroscopy for PSD analysis (Reeves et al. 2010; Ji et al. 2016; Hutengs et al. 2018).
Availability of lightweight and energy-efficient field-portable MIR spectrometers has now opened up the potential to predict PSD in the field. However, requirements first need to be met for handheld MIR instrumentation to be suitable for field use (Poggio et al. 2017; Zhang et al. 2017). First, the handheld instrument should be lightweight and have sufficient stable battery power to last the duration of sampling. Previous portable MIR spectrometers have been relatively bulky and required large power supplies but current technology has now overcome this.
Second, a handheld spectrometer should perform with similar accuracy to benchtop spectrometers. Comparisons exist for the MIR between handheld and benchtop spectrometers, initially by Soriano-Disla et al. (2017) followed by Hutengs et al. (2018), who showed as good as or better performance of handheld instrumentation for predicting clay content.
Third, the handheld instrument should be able to be used in the field under natural environmental conditions (Zhang et al. 2017), and covering a broad range of soil types, sample texture and particle sizes (Janik et al. 2016b). Some of the environmental issues, including sample heterogeneity and moisture content, have been addressed for NIR (Barthès et al. 2006; Brunet et al. 2007). For MIR scanning in the field, there are serious problems in representing sample heterogeneity (Soriano-Disla et al. 2018), due partly to the small sampling area of the beam relative to the size of aggregates in the sample (inter-particulate heterogeneity) and to poor access of the interior of sample aggregates for the MIR beam (intra-particulate heterogeneity). Access to the interior of the sample particles by the infrared beam is thought to be greater in the NIR due to the increased penetration depth of the radiation.
Finally, there is the issue involving the variable and often high soil moisture encountered in the field (Soriano-Disla et al. 2014; Poggio et al. 2017). Soil water can act as an analytical diluent and, if excessive, can act as a reflective surface film thus reducing the MIR absorbance of the underlying soil material (Janik et al. 2016b; Soriano-Disla et al. 2018). Apart from recent articles (e.g. Hutengs et al. 2019), these issues have been rarely addressed in the field where samples may have highly variable water contents dependent on the current environmental conditions, or are available only as highly heterogeneous intact samples making the accuracy of in-field scanning questionable.
While not restricted only to handheld MIR operation, a significant impediment to the widespread uptake of handheld MIR instrumentation for PSD and other analytes is the paucity of analytical calibration models available for carrying out predictions on handheld instruments. There are, of course, many instances of the development of MIR calibration models, some using very large numbers of samples based on laboratory benchtop spectra, but there is no guarantee that these are suitable for use in current handheld instruments. One possibility is to re-scan all samples used in the benchtop calibration models, but this is not always practical because of time constraints and unavailability of archival samples. With regard to the use of pre-existing benchtop calibration models, there are almost always problems due to differences in spectral point spacing and range, and the possibility of differences in spectral response in various spectral regions between spectrometer types. Any of these differences between benchtop and handheld instruments would render the calibrations performed on the benchtop instrument inaccurate for the handheld system.
It would be a considerable benefit to be able to use existing archival calibration models for use with handheld MIR spectrometers (Peng et al. 2014). Calibration transfer software may be required, for example, so that a particular calibration developed on a ‘master’ instrument can be used on a ‘slave’ instrument, or when calibrations need to be transferred from an older obsolete instrument. In particular, the development of such calibration transfer procedures can be highly advantageous where large and valuable archival spectra and data are available and are to be used with more recent portable or benchtop instrumentation.
In most situations, two different instruments can vary in their spectral response in the same spectral range. Attempts to predict analytes from spectra produced on a different instrument to that used to build the calibration may thus not be as accurate as expected. In the case where there is a close similarity between the master (e.g. archive instrument) and slave spectra, the data point number, interval and intensity matching of slave to master instrument by interpolation can be applied. Spectral intensity mismatch between instruments can be successfully corrected by spectral standardisation processes, provided that the differences are not excessive. Such corrections may involve a point-spacing spectrum matching process from slave to master instrument (e.g. by interpolation) followed by more refined spectral modification regression methods, e.g. piecewise direct standardisation (PDS).
It has been reported (Wang et al. 1991; Wang and Kowalski 1992; Ji et al. 2015; Viscarra Rossel et al. 2017) that direct standardisation, and in particular PDS, can be used to adjust the spectra of one instrument (the ‘slave’) to match those of a reference instrument (the ‘master’). Although this may be true for similar instruments and for samples with similar moisture contents and sieve size, it is not known whether PDS can be successfully applied to different types of instruments, such as the benchtop and handheld spectrometers in the present case.
The aims of this study were three-fold: to assess the effect of in-field issues such as sample heterogeneity and variable water content on MIR spectra, to compare spectra scanned with a laboratory based benchtop and in-field capable handheld spectrometers, and to examine the possibility of utilising archival library spectra. All three of these objectives have the overall aim of testing the feasibility and potential use of handheld devices in the field.
Materials and methods
Soils
Three sets of soils were scanned with a benchtop MIR spectrometer – 674 mostly Chromosols, Dermosols, Vertosols, Chromosols and Ferrosols from New South Wales (NSW); 135 samples, mostly Calcarosols, Chromosols, Dermosols, Sodosols and Vertosols from throughout South Australia, Western Australia and Victoria (sample set ACU); and 54 Tenosols, Chromosols and Calcarosols (topsoils under different land uses sampled in a more recent winter campaign) from the Eyre Peninsula (EP) in southern South Australia – representing the in-field samples under environmental conditions. Site locations are depicted in Fig. 1 and a summary of the soil analytical data and map coordinates are presented in Supplementary Tables S1 and S2.
The EP landscape is essentially flat, under winter cropping of wheat, barley, oilseed and pulses, plus wool and livestock production. The area is an arid to semi-arid zone with annual rainfall 238 mm (mostly in winter June–August) and winter–summer temperatures of 10.4–23.5°C. Ancient metamorphic rock forms the EP basement. The site locations are shown in the inset map of the southern EP in Fig. 1. The soils were analysed by laboratory methods described by Janik et al. (2016b) and data are presented in the Supplementary Table S2, along with site locations and soil descriptions.
To investigate the effect of field conditions on PSD prediction accuracy, the EP soils were sampled in winter, thus presenting a large range of water contents. It was therefore hoped to assess the potential of handheld MIR technology for predicting PSD under actual field conditions. After initial infrared scanning of the intact soils, the samples were homogenised in the field by hand-mixing (EPh) and scanned again. The soils were refrigerated at 4°C in sealed plastic bags to reduce sample degradation and loss of moisture. Soil subsamples were oven-dried at 40°C for 12 h, sieved to <2 mm and then fine-ground to <0.1 mm with a vibrating steel ring-mill (LabTechnics LM1-P, Analytical Equipment Company, South Australia) equipped with a 45-mm diameter, 440-g steel puck, for 60 s. Insufficient sample remained for the ACU set for fine grinding. For only the EP soils, unprocessed samples of the stored intact soil samples (field samples EPi), were retained and sub-sampled in triplicate before drying and sieving to <2 mm. In order to reduce analytical costs, only 30 of the 54 EP samples were selected for laboratory PSD analysis according to their MIR spectra using the Kennard–Stone algorithm (Kennard and Stone 1969) on the spectral information described below. This sample reduction was justified because the relevant spectral information for the 54 samples was captured in the 30 samples by use of the Kennard–Stone algorithm. However, due to the small size of the 30 EP sample set for modelling, the subsequent models using the EP samples were used for comparative purposes only, and any supporting evidence found in their spectra under different preparation was used to highlight the impact of field conditions.
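The Kennard–Stone selection described above can be sketched as follows. This is a minimal illustration only: it assumes Euclidean distance on the raw spectral vectors (the distance measure and any prior data compression used in the study are not stated), and the function name `kennard_stone` is hypothetical.

```python
import math


def kennard_stone(spectra, n_select):
    """Return indices of n_select spectra chosen by the Kennard-Stone rule.

    spectra: list of equal-length lists of floats (one list per sample).
    Assumption: Euclidean distance between raw spectra.
    """
    n = len(spectra)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of spectra")
    dist = [[math.dist(a, b) for b in spectra] for a in spectra]
    # Start with the two mutually most distant samples.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}
    # Repeatedly add the sample whose nearest selected neighbour is farthest.
    while len(selected) < n_select:
        nxt = max(sorted(remaining),
                  key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected
```

With 54 spectra and `n_select=30`, the returned indices would identify the subset sent for laboratory analysis; the selection spreads samples across the spectral space rather than sampling at random.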
Carbonate data were not available for the NSW set but could be predicted well by a previously developed in-house MIR-PLSR model. In most cases, MIR-predicted carbonate contents in the NSW soil set were negligible, so these soils required no correction for carbonate. In this case, the sum of PSD fractions could be simply normalised to give a sum of mass fraction of 100% without the need to account for carbonate content.
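The normalisation mentioned above amounts to rescaling the three fractions so that they sum to 100%. A minimal sketch, assuming negligible carbonate and a hypothetical function name `normalise_psd`:

```python
def normalise_psd(clay, silt, sand):
    """Rescale clay, silt and sand (in %) so that they sum to 100%."""
    total = clay + silt + sand
    if total <= 0:
        raise ValueError("fractions must sum to a positive value")
    return tuple(100.0 * fraction / total for fraction in (clay, silt, sand))
```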
Infrared spectra
Approximately 70-mg subsamples were scanned in duplicate for 60 s with a benchtop Spectrum One (PE) spectrometer (Perkin Elmer Inc., USA), equipped with an AutoDiff diffuse reflectance (DRIFT) accessory in the frequency range 7800–450 cm–1, a resolution of 8 cm–1 and point spacing of 2 cm–1. The ACU and EP samples were also scanned, in duplicate, with a handheld ExoScan spectrometer (A2 Technologies, USA, now rebadged as the Agilent 4100) for 15 s in the frequency range 6001–649 cm–1, resolution of 8 cm–1 and an average point spacing of 1.86 cm–1. After benchtop infrared scanning of the NSW soils by the NSW Office of Environment and Heritage laboratories, those soils had been archived and were not available for further scanning with the ExoScan. Only the MIR spectral region of 4000–700 cm–1 was used for PLSR modelling due to high spectral noise in the ExoScan spectra above 4000 cm–1 and below 700 cm–1. Spectra from the NSW and ACU soils were combined into a set of 809 and referred to as the ‘library’ set. These soils were previously described by Janik et al. (2016b).
Silicon carbide (SiC) reference discs (Perkin Elmer Inc.) were used as background references for both spectrometers; a fine-grain SiC disc (bright) for the benchtop and a coarse-grain SiC disc (darker) for the handheld spectrometer. The darker SiC reference was required for the reference and sample scans because of the fixed maximum detector gain of 255 units in the ExoScan required for the relatively dark soil samples (the benchtop instrument had an auto gain capability). The spectra were expressed in pseudo absorbance (A) units, calculated from the reflectance spectra of the sample (Rs), where A = log10(R0/Rs) and R0 is background reflectivity. Spectral assignments for major soil components were made with reference to Van der Marel and Beutelspacher (1976) and Nguyen et al. (1991).
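The pseudo-absorbance conversion A = log10(R0/Rs) can be sketched as below. The function name `pseudo_absorbance` is hypothetical, and the sketch assumes that the background and sample reflectance values are positive and recorded on the same wavenumber grid.

```python
import math


def pseudo_absorbance(r_background, r_sample):
    """Convert reflectance to pseudo-absorbance, A = log10(R0 / Rs), per data point."""
    if len(r_background) != len(r_sample):
        raise ValueError("background and sample must have the same length")
    return [math.log10(r0 / rs) for r0, rs in zip(r_background, r_sample)]
```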
In order to compare similarity between pairs of spectra, Pearson’s correlation coefficient (r) can be used, where 1 is a perfect match and 0 indicates no match. However, in the case where only relatively small variations between spectra occur (as in this study), r is relatively insensitive, with values generally only ranging within 0.90–1.00. In order to increase the apparent sensitivity, we propose using a modification of the Pearson metric, previously used (unpublished to our knowledge) for comparing differences between vis-NIR spectra of grape material. Replicate sample heterogeneity, due to soil water, composition or aggregate size, was compared through an empirical spectral repeatability function (Sr):
$$S_r = \frac{1}{1 - r}$$
where r is calculated between scans of replicate samples and Sr increases with decreasing heterogeneity (i.e. reduced replication variance). Although there is no specific defined threshold, Sr takes values of, for example, 10, 20 and 100 for r = 0.90, 0.95 and 0.99 respectively.
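A minimal sketch of the repeatability calculation, assuming the function takes the form Sr = 1/(1 – r), which reproduces the example values quoted above (10, 20 and 100 for r = 0.90, 0.95 and 0.99). The function names are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length spectra."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def spectral_repeatability(r):
    """Assumed form Sr = 1 / (1 - r); unbounded as r approaches 1."""
    if r >= 1.0:
        return math.inf
    return 1.0 / (1.0 - r)
```

For a pair of replicate scans, `spectral_repeatability(pearson_r(scan_1, scan_2))` gives the Sr value; larger values indicate closer agreement between replicates.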
Data analysis
Spectra were imported into Unscrambler® X 10.3 software (Camo, Norway) and baseline corrected with a linear baseline offset. The PDS was carried out within the Unscrambler application (Wang and Kowalski 1992) in the range 4000–700 cm–1. The ACU ExoScan spectra were first allocated as the ‘master’ to derive a standardisation model for converting the ACU PE spectra (as the ‘slave’) into ExoScan format. This same standardisation model was then used to standardise the set of library PE spectra into ExoScan format. This ExoScan standardised library set of spectra was finally used for building PSD calibration models for predicting from the EP ExoScan-scanned spectra.
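The idea behind PDS can be sketched as follows. This is not the Unscrambler implementation used in the study: it is a simplified illustration in which each master channel is regressed on a small window of neighbouring slave channels, using ridge-regularised least squares in place of the PCR or PLS step commonly used within each window. The names `fit_pds`, `apply_pds`, `half_width` and `ridge` are hypothetical, and both sets of spectra are assumed to share the same data point grid.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_pds(master, slave, half_width=1, ridge=1e-8):
    """Fit one windowed regression per channel from paired transfer spectra.

    master, slave: lists of spectra (same samples, same grid).
    Returns a list of (window start, intercept, coefficients) per channel.
    """
    n, p = len(master), len(master[0])
    model = []
    for j in range(p):
        lo, hi = max(0, j - half_width), min(p, j + half_width + 1)
        w = hi - lo
        x_mean = [sum(s[k] for s in slave) / n for k in range(lo, hi)]
        y_mean = sum(m[j] for m in master) / n
        xc = [[s[k] - x_mean[k - lo] for k in range(lo, hi)] for s in slave]
        yc = [m[j] - y_mean for m in master]
        # Normal equations with a small ridge term for numerical stability.
        xtx = [[sum(r[a] * r[b] for r in xc) + (ridge if a == b else 0.0)
                for b in range(w)] for a in range(w)]
        xty = [sum(r[a] * y for r, y in zip(xc, yc)) for a in range(w)]
        coef = solve_linear(xtx, xty)
        intercept = y_mean - sum(c * xm for c, xm in zip(coef, x_mean))
        model.append((lo, intercept, coef))
    return model


def apply_pds(model, spectrum):
    """Convert one slave-format spectrum into master format."""
    return [b0 + sum(c * spectrum[lo + i] for i, c in enumerate(coef))
            for lo, b0, coef in model]
```

In the workflow described above, the paired ACU spectra would serve as the transfer set for `fit_pds` (ExoScan as master, PE as slave), and `apply_pds` would then be applied to every PE library spectrum.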
Data point interpolation to account for data point mismatch between the ExoScan and PE was achieved using the R-script ‘spc.loess.R’ (program package HyperSpec version 0.99–20180627 in CRAN). The ExoScan spectra were converted to PE data point spacing by interpolation from the native 1803 data points at a point spacing of 1.86 cm–1 and range of 649.2–4001.1 cm–1 to that of the benchtop 1676 points, with a point spacing of 2 cm–1 and range of 4000–650 cm–1. Following the interpolation, various clay calibration models were derived from the benchtop PE library spectra and then used to predict clay content from the EP ExoScan spectra.
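The grid-matching step can be illustrated with a simple linear interpolation. This is a stand-in for, not a reproduction of, the loess-based R script used in the study; the function name `interp_linear` is hypothetical, and the handheld grid below is built from the nominal start value and average spacing quoted above, so it only approximates the instrument's true grid.

```python
from bisect import bisect_right


def interp_linear(x_new, x, y):
    """Linearly interpolate y(x) at each value in x_new.

    x must be strictly increasing and contain at least two points;
    every x_new value must lie within [x[0], x[-1]].
    """
    out = []
    for xn in x_new:
        if not x[0] <= xn <= x[-1]:
            raise ValueError("target point outside the source range")
        i = min(bisect_right(x, xn), len(x) - 1)
        x0, x1 = x[i - 1], x[i]
        t = (xn - x0) / (x1 - x0)
        out.append(y[i - 1] + t * (y[i] - y[i - 1]))
    return out


# Nominal grids (cm^-1), in increasing order.
handheld_grid = [649.2 + 1.86 * i for i in range(1803)]
benchtop_grid = [650.0 + 2.0 * i for i in range(1676)]
```

Calling `interp_linear(benchtop_grid, handheld_grid, spectrum)` returns the handheld spectrum resampled to the 1676 benchtop data points.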
The PLSR was performed using the Unscrambler PLSR application using leave-X-out cross-validation to derive the optimum calibration models (Geladi and Kowalski 1986). For the EP and ACU samples X was 1 (LOOCV), and was 20 for the NSW samples. Before carrying out PLSR, two-thirds of the soils were randomly allocated to the respective NSW and ACU calibration sample sets for internal cross-validation training and to derive the PLSR calibration models. The remaining one-third were allocated to a validation set for testing the calibration models. The EP samples were treated differently, in that there were only 30 samples available for analysis, so that only LOOCV was tested. Also, because of the small EP sample set size, no samples were omitted as outliers, even though removal of any identified outliers might have significantly improved the models.
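The leave-X-out segmentation can be sketched as below. How the Unscrambler software assigns samples to segments is not described here, so the sketch assumes contiguous segments in sample order; the function name `leave_x_out_folds` is hypothetical.

```python
def leave_x_out_folds(n_samples, x):
    """Yield (train_indices, test_indices) pairs, leaving out x samples at a time.

    Assumption: contiguous segments in sample order; the last segment
    may contain fewer than x samples.
    """
    if x < 1:
        raise ValueError("x must be at least 1")
    indices = list(range(n_samples))
    for start in range(0, n_samples, x):
        test = indices[start:start + x]
        train = indices[:start] + indices[start + x:]
        yield train, test
```

With `x=1` this reduces to LOOCV, giving 30 segments for the 30 EP samples.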
The regression statistics for infrared predictions were reported in terms of r, coefficient of determination (R2 – calculated as the square of r according to the Excel RSQ function), non-bias corrected root mean square error (RMSE, with RMSE of cross-validation (RMSECV) and RMSE of prediction (RMSEP)) and the ratio of s.d. of the reference values to the RMSEP (RPD) (Williams 1987). Some authors have used RPD to help indicate the quality of PLSR for prediction; values <1.5 are considered poor, 1.5–1.9 suggest indicator quality, 2.0–2.9 suggest good quality and ≥3.0 are of analytical quality (Janik et al. 2009); however, this is a generalised classification for comparative purposes only. The RPD depends on the s.d. of the dataset so it should not be used alone to assess the performance of predictions, but in combination with other statistics such as calculated errors and R2. PLSR can return some negative predicted values, but these are often within the range of the calculated RMSE and can in some cases be interpreted as zero. We leave it to the reader to make such decisions depending on the nature of their own applications.
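The statistics listed above can be computed as in the following sketch. Several conventions are assumptions rather than details taken from the study: bias is taken as the mean of predicted minus observed values, the bias-corrected standard error uses an n – 1 denominator, and the s.d. of the reference values is the sample s.d. The function name `prediction_stats` is hypothetical.

```python
import math


def prediction_stats(observed, predicted):
    """Return r, R2, RMSE, bias, bias-corrected standard error and RPD."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n
    # Non-bias-corrected root mean square error.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Bias-corrected standard error (assumed n - 1 denominator).
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    r = s_op / math.sqrt(s_oo * s_pp)
    sd_ref = math.sqrt(s_oo / (n - 1))
    rpd = sd_ref / rmse if rmse > 0 else math.inf
    return {"r": r, "R2": r * r, "RMSE": rmse, "bias": bias,
            "SE_bias_corrected": sep, "RPD": rpd}
```

Because R2 is the square of r, it is unaffected by a constant offset between predicted and observed values, whereas the RMSE is not; a prediction set can therefore combine a high R2 with a large RMSE when the bias is large.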
Results
Field samples
Qualitative spectral analysis
As discussed above, the impact of variable and often high moisture contents in the EP soils from in-field studies highlighted issues in deriving accurate quantitative infrared models. The DRIFT spectra should provide a meaningful qualitative depiction of soil composition including moisture content (Janik et al. 1998). The ExoScan MIR spectra of three soils (EP #07, #15 and #17) after each treatment (intact as-received and field-moist, hand-homogenised, dried and <2-mm sieved, and fine-ground) are shown in Fig. 2. These three samples were selected by the Kennard–Stone method as representing the major range of spectral variability in the EP sample set spectra. The spectra showed that the soils varied in composition across smectite, carbonate and quartz, and they were used to demonstrate the effects of soil water, sieving and grinding.
The ExoScan spectra of the intact, as-received field-moist EP soils #07 and #15 showed strong water absorption features in the NIR at 5250 cm–1 (~1900 nm) and near 3450–3350 cm–1 in the MIR, both of which can partly mask the –OH stretching vibrations due to clay minerals and organic matter. Most of the spectral detail within 4000–1300 cm–1 in the wet soils was significantly reduced, particularly for the clay Al–OH bands near 3700–3600 cm–1 and for carbonate near 2960–2900, 2520 and 1800 cm–1. Drying the samples, as shown by the spectra for the <2-mm dried and sieved samples, improved the spectral detail, with the kaolinite, carbonate and quartz peaks now more clearly resolved.
A significant improvement in spectral resolution resulted from a reduction in aggregate size by fine grinding. This was in agreement with Janik et al. (2016b) for the prediction of particle size and by Soriano-Disla et al. (2018) for the prediction of cyanide concentrations using portable MIR DRIFT spectra. Spectra of the dried and <0.1 mm fine-ground (fine) samples, in comparison to those dried and <2-mm sieved, showed a major improvement in spectral detail of the silicate peaks in the 1500–1000 cm–1 frequency regions. The values of Sr for replicate scans for the EP soil spectra, following the various sample pre-treatments, are presented in Table 1.
There was considerable variation for the moist, intact samples, with r values in the range of 0.765–0.996. The Sr function values were much more sensitive indicators of similarity than r and ranged within 4–251. Replicate variation was attributed almost entirely to compositional and soil water heterogeneity of the intact samples. The highest heterogeneity was for the intact soils, with an average Sr value of 67. Slightly better were the hand-homogenised field-moist samples, with an average Sr of 94. Drying and sieving to <2 mm further improved the replication, resulting in an average Sr of 144, but maximum replication accuracy was achieved by drying and fine grinding, with an average Sr of 293. This final treatment apparently minimised the effects of variable water contents and inter- and intra-particle heterogeneity, thus providing the optimum conditions for accurate PLSR prediction performance.
Effect of sample preparation on PSD predictions
Results of the PLSR analysis of the EP samples, using the ExoScan spectra (Table 2), generally confirmed improvements in PLSR accuracy with decreasing moisture and increasing sample homogenisation (from intact to fine ground). This was consistent with the improvements in spectral detail observed in Fig. 2. Clay cross-validation accuracy for untreated intact soil, using LOOCV, was marginal (R2 = 0.55, RMSECV = 7%), and was barely improved by manual homogenisation (R2 = 0.64, RMSECV = 6%). A slightly improved accuracy for clay resulted from drying and <2-mm sieving (R2 = 0.75, RMSECV = 5%), and even further by drying and fine grinding (R2 = 0.81, RMSECV = 4%).
For silt, the intact and hand-homogenised field-moist EP soils gave poor LOOCV prediction accuracies (R2 = 0.66 and RMSECV = 2%, and R2 = 0.57 and RMSECV = 3% respectively). Sieving to <2 mm after drying improved the calibration accuracy (R2 = 0.68, RMSECV = 2%), with a further slight improvement for fine grinding (R2 = 0.71, RMSECV = 2%). For sand, the R2 values improved progressively from the intact soils through hand-homogenising to drying and sieving (R2 = 0.82 and RMSECV = 10%, R2 = 0.88 and RMSECV = 8%, and R2 = 0.91 and RMSECV = 6% respectively). However, no further improvement resulted from fine grinding (R2 = 0.89 and RMSECV = 7%).
Spectral standardisation
As discussed above, spectral standardisation can be used to process pre-existing archival benchtop-scanned spectra so that they can become useful for building calibrations of soil properties from handheld-scanned spectra. Following spectral standardisation, PLSR calibrations can be built for clay content from the original and PDS-processed PE-scanned library spectra, and these calibrations can then be used to predict clay content for the unprocessed PE- and ExoScan-scanned EP spectra. Although only clay content was examined here, the other particle size fractions could be treated in a similar fashion.
Comparison between benchtop and handheld instrument EP spectra
There was a close visual similarity between <2-mm sieved EP sample spectra produced by the benchtop and handheld devices. However, there were minor distortions, or variations in spectral intensities for the same frequencies, especially at the high and low ends of the frequency range (Supplementary Fig. S1). The ExoScan spectra were less intense than the PE spectra at high frequencies and more intense at low frequencies. Such changes in reflectivity were thought to be due to the different background discs. The ExoScan and PE spectra matched much more closely when the same dark background was used in both instruments. The dark SiC reference disc allowed the instrument gain to be maximised for scanning relatively dark soils.
The average ExoScan and PE spectra for the <2-mm EP samples (Fig. 3a) and ExoScan versus PE ratio plot (Fig. 3b) showed that most ratios at frequencies in the 1300–500 cm–1 region were >1. That is, the ExoScan spectra were more intense in this frequency region. Further examples of these differences in intensity across the MIR spectral range, due to the two instruments, are illustrated in Supplementary Fig. S2 for a selection of five samples (e.g. #15, #20, #22, #26 and #28) scanned with the PE and ExoScan spectrometers. Again, the ratios of intensities between the two types of spectra showed values between 1.0 and 1.3 in the 1300–500 cm–1 range.
The relative spectral variabilities between samples for the benchtop and handheld instruments are further illustrated in the principal components analysis (PCA) score maps (Fig. 4). These variations are based solely on the spectral data without reference to any analytical data. The PCA scores of the ExoScan EP spectra in Fig. 4a projected onto the NSW plus ACU scores, scanned with the PE spectrometer, showed a shift of the EP scores from near the centre to the lower right quadrant. This separation between the library spectra and the EP ExoScan spectra resulted in a less-than-ideal coverage of the EP samples with respect to the NSW and ACU samples, partly explaining the high non-bias-corrected prediction error.
Comparison between PDS-processed benchtop and raw ExoScan EP spectra
The spectra in Fig. 3c, further illustrated for samples #15, #20, #22, #26 and #28 in Supplementary Fig. S3, showed a close match between the PDS-processed <2-mm EP benchtop spectra and raw <2-mm ExoScan spectra. The average ExoScan versus PDS-modified PE ratio plot in Fig. 3d showed less variation than in Fig. 3b, with all ratios <1 but following a more linear trend. This observation was further supported by the PCA scores plot (Fig. 4b), which illustrated the improved coverage of the EP ExoScan spectra by the PDS-processed library spectra. This improvement in the scores suggested that prediction of clay in the handheld EP spectra from the PDS-modified library benchtop spectra should be better than that from the unmodified library spectra.
Effect of spectral standardisation on prediction accuracy
In order to set a benchmark for the prediction ability of calibrations built from the library spectra, a calibration derived from the raw PE-scanned library set was used to predict the PE-scanned EP samples, which was expected to provide the highest prediction accuracy (Fig. 5a). The prediction error for benchtop-scanned EP samples using the raw library calibration (RMSEP = 8.2%) was similar to that of cross-validation of clay for the entire library set (RMSECV = 8.5%). Cross-validation of the benchtop-scanned EP samples resulted in an RMSEP = 4.4% with sample #11 omitted as an outlier (see regression plots in Supplementary Fig. S4a and b).
Predictions from raw PE library spectra
The calibration from the PE-scanned library, used to predict clay content for the EP samples also scanned with the PE instrument, resulted in an accuracy of R2 = 0.76, RMSEP = 8.2% and RPD = 1.2 (see statistics presented in Table 3). This was slightly better than that from cross-validation accuracy for the PE-scanned samples, even after removal of outlier #11 (R2 = 0.73, RMSECV = 4.7%, and RPD = 2.1) (regression plots shown in Supplementary Fig. S4a and b). Cross-validation of the ExoScan-scanned EP spectra resulted in similar accuracy (R2 = 0.74, RMSECV = 5.2%, and RPD = 1.9). There was a further improvement after omitting sample #17 from the ExoScan-scanned EP spectra as an outlier (R2 = 0.81, RMSEP = 4.4% and RPD = 2.2) (regression plots shown in Supplementary Fig. S4c and d).
Predictions of clay contents for the ExoScan-scanned EP spectra, using the raw PE-scanned library calibration (Fig. 5b), showed considerable scatter in the 10–30% range and a high slope (1.6) and intercept (11.8). Regression statistics are presented in Table 3. Regression accuracy, even after omitting EP sample #17 as an outlier for the handheld EP spectral set, was poor: R2 was acceptable (0.72) but the non-bias-corrected RMSEP was very high (23.6%). Interestingly, the bias-corrected standard error was only 5.3% compared to 4.9% for the prediction of the PE-scanned EP samples, suggesting that a simple bias correction may lead to quite a reasonable regression. While it may seem reasonable to suggest correcting for this bias in any predicted value for clay, given the relatively good R2 value and considering the much lower bias-corrected standard error of only 6.5%, such a correction may be unreliable in that it might not be applicable to other datasets.
Predictions from PDS-processed benchtop library calibration
Prediction of clay content for the ExoScan-scanned samples from the PDS-modified spectra resulted in an R2 = 0.73, RMSEP = 6.7% and bias-corrected standard error = 5.2% (Table 3 and Fig. 5c). This was a major improvement in RMSEP over that of the unmodified library spectra, although the R2 values were almost the same. This was a very encouraging result, as it demonstrated that PDS pre-processing of the library spectra was effective at matching the spectral characteristics between benchtop and handheld spectrometers. The regression plot (Fig. 5c) for the PDS-processed spectra now showed reduced scatter, reduced offset and an intercept closer to zero.
Discussion
The results from the present study confirmed that portable MIR instrumentation, with PLSR analysis, was able to predict PSD data on soils with comparable accuracy to that of benchtop MIR spectrometers. There are, however, several issues that need to be addressed to make this possible in the field, including sample moisture, heterogeneity and aggregate size. These issues have previously been alluded to by several workers (Reeves et al. 2010; Forrester et al. 2015; Janik et al. 2016a, 2016b; Ji et al. 2016; Hutengs et al. 2018, 2019). Crucial to the assessment of handheld FTIR instrumentation for prediction of field soil samples is to compare the accuracy of calibrations derived from archival or other studies with accuracies from the current sample sets. This is not a straightforward task. According to Soriano-Disla et al. (2014), reported regression results can be affected by the presence of carbonate minerals, sample heterogeneity, varying analytical methods, sample particle size and moisture. Reference to prediction accuracies from a review of previous studies (Soriano-Disla et al. 2014) showed that median R2 values were 0.80, 0.83 and 0.63 for clay, sand and silt respectively. Our cross-validation predictions compare well, with corresponding R2 values of 0.55–0.81, 0.82–0.91 and 0.66–0.71.
In terms of the effect of variable water content on infrared spectra, although most of the NIR spectral region recorded by the PE instrument was relatively unaffected by moisture (apart from the strong response at 5250 cm–1), severe distortion of peaks in the 3700–2700 cm–1 spectral region was observed (Fig. 2). This may have been partly due to the masking of mineral and organic matter peaks by variable water peaks, thus degrading the quantitative performance of multivariate models, e.g. clay size distribution and organic matter. This distortion and the strong reductions in spectral peak intensities in the MIR render prediction models, especially for clay content, inaccurate compared to those for laboratory-dried soils (Janik et al. 2016a).
In some cases, the moisture and heterogeneity effects in field soils can be partly addressed by ensuring that the variations in sample heterogeneity and moisture of the field-moist samples are covered in the calibration samples. In other cases, this is not enough and relatively poor models are still obtained. For example, in this present study, poor cross-validation accuracies for clay content were obtained for the field-moist and highly heterogeneous intact and hand-homogenised soils (R2 = 0.55 and 0.64 respectively), compared to the dried <2-mm and fine-ground soils (R2 = 0.75 and 0.81 respectively).
Calibrations for clay and sand content could be marginally improved by manually homogenising the soils in the field, but better improvement can be achieved by drying the samples. For example, Reeves et al. (2010) showed that even sun-drying soil samples before in-field scanning improved regression accuracy and robustness. Fruzangohar et al. (2017) demonstrated that drying of a series of intact soil cores increased the average PLSR calibration R2, using a handheld ExoScan spectrometer for a range of soil properties, to 0.83–0.90 for in-field moist samples. Further confirmation of improved FTIR-MIR regression performance by drying soil samples was reported by Hutengs et al. (2019) in a field study using a similar handheld instrument. They showed improvement in cross-validation accuracy for soil organic carbon, with R2 increasing from 0.63 for in situ field-moist soils to 0.79 after drying, and a further improvement to R2 = 0.86 for dried and ground samples. A combination of sample drying, sieving and grinding was the most effective for optimum predictions.
While calibration models could be derived directly from the EP field sample spectra, the availability of a sufficient number of field samples with reference data from the EP soil set was a problem. Expanding the calibration set by using archival spectra and data from the large PE dataset, with samples of similar sieve size and moisture content, appeared to offer a solution. Unfortunately, the spectral variations observed in this present study between the ExoScan and PE spectra impacted adversely on the accuracy of the archival-derived calibrations for prediction with the handheld instrument.
The variation between benchtop and handheld spectra was thought to be largely due to the use of different reference background discs: a bright, fine-grained SiC disc as background for the PE instrument, and a darker, large-grained SiC reference disc as background for the ExoScan spectrometer. The difficulty here was that the PE NSW spectra were archived spectra and could not easily be re-scanned with the same background as the ExoScan because the samples were no longer available. The reason for using the dark reference disc in the ExoScan was that this particular version of the instrument uses the same detector gain for both the reference and the relatively dark soil samples.
Unmodified spectra scanned on a benchtop spectrometer could be used with some success for predictions from handheld spectrometer <2-mm sieved sample spectra, albeit with high non-bias-corrected errors. The use of pre-processing methods, such as PDS, reduced prediction errors to values close to those of the benchmark spectra. PDS has been shown to dramatically improve calibration transfer between vis-NIR instruments (Xue-Ying et al. 2018), although reports on the use of PDS for MIR spectra have not been identified by the present authors. It was felt, however, that PDS would be an advantage for MIR spectra as well as for vis-NIR.
This study showed that successful use of <2-mm dry sieved archival soils can be made after spectral standardisation. While this may hold for handheld spectra from soils with similar moisture content, in this case air-dry, it is not feasible to attempt to use air-dry archival soil data to predict for typical in situ moist samples in spring and autumn with the handheld spectrometer. Experience has shown, however, that soil samples can be either easily dried in the field or sampling carried out under drier conditions in summer and autumn. The effect on spectra by sieving and fine grinding appears to be less important, although sample heterogeneity may require averaging several replicate scans. Direct scanning of dry soils in the field may thus be possible, but further studies are required to test the applicability of spectral standardisation.
Conclusions
The overall objective of this article was to test the feasibility and potential use of a MIR handheld device in the field. The results suggest that handheld FTIR-MIR spectrometers are useful for field use, potentially capable of significant savings in analytical time and cost. However, variations in moisture content and sample heterogeneity, at the dimensional scales typical of FTIR-MIR spectroscopy, should still be taken into account. Our results suggested that reducing soil moisture was an important step in optimising predictions. In terms of the performance of FTIR-MIR benchtop versus portable instruments for the prediction of PSD, accuracies are comparable. This study has also demonstrated the possibility of successfully utilising archival library spectra following the use of spectral standardisation methods such as PDS. Thus, the modification of spectra into a format compatible with the handheld spectrometer using spectral standardisation opens up the feasibility of utilising valuable archival data for field scanning with handheld MIR devices.
Conflicts of interest
The authors declare no conflicts of interest.
Acknowledgements
The authors wish to acknowledge financial and material support from the CSIRO Technology Accelerator Fund Project (1136.6 – EOP 76507, 1163) and sponsorship by Ziltek Pty Ltd. The authors also wish to acknowledge the support of Dr Andrew Rawson from the NSW Office of Environment and Heritage for providing the NSW soils spectra and data, and the CSIRO Land and Water Flagship, Urrbrae, Analytical Chemistry Unit.
References
Barthès BG, Brunet D, Ferrer H, Chotte JL, Feller C (2006) Determination of total carbon and nitrogen content in a range of tropical soils using near infrared spectroscopy: Influence of replication and sample grinding and drying. Journal of Near Infrared Spectroscopy 14, 341–348.
Bowman G, Hutka J (2002) Particle size analysis. In ‘Soil physical measurement and interpretation for land evaluation’. (Eds N McKenzie, K Coughlan, H Cresswell) pp. 224–239. (CSIRO Publishing: Melbourne, Vic.)
Bricklemyer RS, Brown DJ (2010) On-the-go VisNIR: Potential and limitations for mapping soil clay and organic carbon. Computers and Electronics in Agriculture 70, 209–216.
Brunet D, Barthès BG, Chotte JL, Feller C (2007) Determination of carbon and nitrogen contents in Alfisols, Oxisols and Ultisols from Africa and Brazil using NIRS analysis: effects of sample grinding and set heterogeneity. Geoderma 139, 106–117.
Fooladmand HR (2008) Estimating cation exchange capacity using soil textural data and soil organic matter content: a case study for the south of Iran. Archives of Agronomy and Soil Science 54, 381–386.
Forrester ST, Janik LJ, Soriano-Disla JM, Mason S, Burkitt L, Moody P, Gourley CJP, McLaughlin MJ (2015) Use of handheld mid-infrared spectroscopy and partial least-squares regression for the prediction of the phosphorus buffering index in Australian soils. Soil Research 53, 67–80.
Fruzangohar M, Janik L, McLaughlin M (2017) Direct comparison between selected field infrared instruments for the prediction of soil properties in grain cropping soils. Final report, GRDC project CSO00045: Soil infrared capability, GRDC.
Geladi P, Kowalski BR (1986) Partial least-squares regression: a tutorial. Analytica Chimica Acta 185, 1–17.
Hu HC, Tian FQ, Hu HP (2011) Soil particle size distribution and its relationship with soil water and salt under mulched drip irrigation in Xinjiang of China. Science China. Technological Sciences 54, 1568–1574.
Hutengs C, Ludwig B, Jung A, Eisele A, Vohland M (2018) Comparison of portable and bench-top spectrometers for mid-infrared diffuse reflectance measurements of soils. Sensors 18, 993.
Hutengs C, Seidel M, Oertel F, Ludwig B, Vohland M (2019) In situ and laboratory soil spectroscopy with portable visible-to-near-infrared and mid-infrared instruments for the assessment of organic carbon in soils. Geoderma 355, 113900.
Janik LJ, Merry RH, Skjemstad JO (1998) Can mid infrared diffuse reflectance analysis replace soil extractions? Australian Journal of Experimental Agriculture 38, 681–696.
Janik LJ, Merry RH, Forrester ST, Lanyon DM, Rawson A (2007) Rapid prediction of soil water retention using mid infrared spectroscopy. Soil Science Society of America Journal 71, 507–514.
Janik LJ, Forrester ST, Rawson A (2009) The prediction of soil chemical and physical properties from mid-infrared spectroscopy and combined partial least-squares regression and neural networks (PLS-NN) analysis. Chemometrics and Intelligent Laboratory Systems 97, 179–188.
Janik LJ, Soriano-Disla JM, Forrester ST, McLaughlin MJ (2016a) Moisture effects on diffuse reflection infrared spectra of contrasting minerals and soils: a mechanistic interpretation. Vibrational Spectroscopy 86, 244–252.
Janik LJ, Soriano-Disla JM, Forrester ST, McLaughlin MJ (2016b) Effects of soil composition and preparation on the prediction of particle size distribution using mid-infrared spectroscopy and partial least-squares regression. Soil Research 54, 889–904.
Ji W, Viscarra Rossel RA, Shi Z (2015) Improved estimates of organic carbon using proximally sensed vis–NIR spectra corrected by piecewise direct standardization. European Journal of Soil Science 66, 670–678.
Ji W, Adamchuk VI, Biswas A, Dhawale NM, Sudarsan B, Zhang Y, Viscarra Rossel RA, Shi Z (2016) Assessment of soil properties in situ using a prototype portable MIR spectrometer in two agricultural fields. Biosystems Engineering 152, 14–27.
Kennard RW, Stone LA (1969) Computer aided design of experiments. Technometrics 11, 137–148.
Knadel M, Stenberg B, Deng F, Thomsen A, Greve MH (2013) Comparing predictive abilities of three visible-near infrared spectrophotometers for soil organic carbon and clay determination. Journal of Near Infrared Spectroscopy 21, 67–80.
McKenzie N, Coughlan K, Cresswell H (Eds) (2002) In ‘Soil physical measurement and interpretation for land evaluation.’ pp 224–239. (CSIRO Publishing: Melbourne)
Nguyen TT, Janik LJ, Raupach M (1991) Diffuse reflectance infrared Fourier transform (DRIFT) spectroscopy in soil studies. Australian Journal of Soil Research 29, 49–67.
Peng Y, Knadel M, Gislum R, Schelde K, Thomsen A, Greve MH (2014) Quantification of SOC and clay content using visible near-infrared reflectance–mid-infrared reflectance spectroscopy with jack-knifing partial least squares regression. Soil Science 179, 325–332.
Poggio M, Brown DJ, Bricklemyer RS (2017) Comparison of Vis–NIR on in situ, intact core and dried, sieved soil to estimate clay content at field to regional scales. European Journal of Soil Science 68, 434–448.
Reeves JB, McCarty GW, Reeves VB (2001) Mid-infrared diffuse reflectance spectroscopy for the quantitative analysis of agricultural soils. Journal of Agricultural and Food Chemistry 49, 766–772.
Reeves JB, McCarty GW, Hively WD (2010) Mid-versus near-infrared spectroscopy for on-site analysis of soil. In ‘Proximal soil sensing’. (Eds RA Viscarra-Rossel, AB McBratney, B Minasny) pp. 133–142. (Springer Science+Business Media: New York)
Soriano-Disla JM, Janik LJ, Viscarra Rossel RA, McDonald LM, McLaughlin MJ (2014) The performance of visible, near and mid–infrared spectroscopy for prediction of soil physical, chemical and biological properties. Applied Spectroscopy Reviews 49, 139–186.
Soriano-Disla JM, Janik LJ, Allen DJ, McLaughlin MJ (2017) Evaluation of the performance of portable visible-infrared instruments for the prediction of soil properties. Biosystems Engineering 161, 24–36.
Soriano-Disla JM, Janik LJ, McLaughlin MJ (2018) Assessment of cyanide contamination in soils with a handheld mid-infrared spectrometer. Talanta 178, 400–409.
Van der Marel HW, Beutelspacher H (Eds) (1976) Clay and related minerals. In ‘Atlas of infrared spectroscopy of clay minerals and their admixtures’. (Elsevier Scientific: Amsterdam)
Viscarra Rossel RA, Webster R (2012) Predicting soil properties from the Australian soil visible-near infrared spectroscopic database. European Journal of Soil Science 63, 848–860.
Viscarra Rossel RA, Cattle SR, Ortega A, Fouad Y (2009) In situ measurements of soil colour, mineral composition and clay content by vis–NIR spectroscopy. Geoderma 150, 253–266.
Viscarra Rossel RA, Lobsey CR, Sharman C, Flick P, McLachlan G (2017) Novel proximal sensing for monitoring soil organic C stocks and condition. Environmental Science & Technology 51, 5630–5641.
Wang Y, Kowalski BR (1992) Calibration transfer and measurement stability of near-infrared spectrometers. Applied Spectroscopy 46, 764–771.
Wang Y, Veltkamp DJ, Kowalski BR (1991) Multivariate instrument standardization. Analytical Chemistry 63, 2750–2756.
Williams PC (1987) Variables affecting near-infrared reflectance spectroscopy. In ‘Near-infrared technology in the agricultural and food industries’. (Eds PC Williams, KH Norris) pp. 143–167. (American Association of Cereal Chemists Inc.: St Paul, MN, USA)
Xue-Ying L, Yan L, Mei-Rong L, Yan Z, Ping-Ping F (2018) Calibration transfer of soil total carbon and total nitrogen between two different types of soils based on visible-near-infrared reflectance spectroscopy. Hindawi Journal of Spectroscopy 2018, 1–10.
Zhang Y, Biswas A, Ji W, Adamchuk VI (2017) Depth-specific prediction of soil properties in situ using vis-NIR spectroscopy. Soil Science Society of America Journal 81, 993–1004.