4.4 R code and case study
The main preprocessing steps using the eemR package are illustrated with a subset of three EEMs from Massicotte and Frenette (2011). Briefly, these EEMs (see Fig. 1 for an example) were sampled in the St. Lawrence River, one of the largest rivers in North America. Fluorescence matrices of DOM were measured on a Cary Eclipse spectrofluorometer (Varian, Mississauga, Ontario, Canada) over excitation wavelengths between 220 and 450 nm (5-nm increment) and emission wavelengths between 230 and 600 nm (2-nm increment). All functions from the package start with the prefix 'eem_' (absorbance, also listed below, is an example data set).
library(eemR)
ls("package:eemR")
## [1] "absorbance" "eem_bind"
## [3] "eem_biological_index" "eem_coble_peaks"
## [5] "eem_cut" "eem_export_matlab"
## [7] "eem_extract" "eem_fluorescence_index"
## [9] "eem_humification_index" "eem_inner_filter_effect"
## [11] "eem_names" "eem_names<-"
## [13] "eem_raman_normalisation" "eem_read"
## [15] "eem_remove_blank" "eem_remove_scattering"
## [17] "eem_set_wavelengths"
4.4.1 Data importation and plotting
Importation of EEMs into R is done using the eem_read() function. Because the format of fluorescence files depends on the spectrofluorometer used, eemR automatically determines which manufacturer the files come from and loads them accordingly.
file <- system.file("extdata/cary/scans_day_1", package = "eemR")
eems <- eem_read(file)
The generic summary() function displays useful information such as (1) the wavelength ranges used in both emission and excitation modes, (2) the manufacturer from which the file was read and (3) the state of the EEM, which indicates which corrections have been applied.
summary(eems)
## sample ex_min ex_max em_min em_max is_blank_corrected
## 1 nano 220 450 230 600 FALSE
## 2 sample1 220 450 230 600 FALSE
## 3 sample2 220 450 230 600 FALSE
## 4 sample3 220 450 230 600 FALSE
## is_scatter_corrected is_ife_corrected is_raman_normalized manufacturer
## 1 FALSE FALSE FALSE Cary Eclipse
## 2 FALSE FALSE FALSE Cary Eclipse
## 3 FALSE FALSE FALSE Cary Eclipse
## 4 FALSE FALSE FALSE Cary Eclipse
A surface plot of EEMs is made using the plot(x, which = 1) function, where which is the index of the EEM to be plotted (see Fig. 3).
plot(eems, which = 3)
A simple shiny app can be launched to browse EEMs interactively.
plot(eems, interactive = TRUE)
4.4.2 Blank subtraction
Subtraction of a water blank from the measured samples may help to reduce scattering (Murphy et al. 2013; Stedmon and Bro 2008). In eemR, this is done using the eem_remove_blank(eem, blank) function, where eem is a list of EEMs and blank is a water blank.
file <- system.file("extdata/cary/scans_day_1", "nano.csv", package = "eemR")
blank <- eem_read(file)
eems <- eem_remove_blank(eems, blank)
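Conceptually, blank subtraction is an element-wise difference between two matrices measured on the same wavelength grid. The following base R sketch illustrates the idea on toy matrices; the object names (sample_eem, blank_eem, corrected) and the values are hypothetical and are not part of eemR.

```r
# Toy 3 x 2 matrices: rows are emission wavelengths, columns are excitation
# wavelengths (an orientation chosen arbitrarily for this sketch).
sample_eem <- matrix(c(5, 6, 7, 8, 9, 10), nrow = 3)
blank_eem  <- matrix(c(1, 1, 2, 2, 3, 3), nrow = 3)

# Both matrices must share the same wavelength grid.
stopifnot(identical(dim(sample_eem), dim(blank_eem)))

# Blank subtraction is an element-wise difference.
corrected <- sample_eem - blank_eem
print(corrected)
```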
4.4.3 Raman and Rayleigh scattering removal
Scattering removal (Equation (??) and Equation (??)) is performed using the eem_remove_scattering(eem, type, order, width) function, where eem is a list of EEMs, type is the scattering type (raman or rayleigh), order is the order of the scattering (1 or 2) and width is the width in nanometers of the slit windows to be removed. In the following example, only first-order Raman and Rayleigh scattering are removed using a bandwidth of 10 nm (Fig. 3).
library(magrittr) # provides the %>% pipe if it is not already attached
eems <- eem_remove_scattering(eems, "rayleigh", 1, 10) %>%
  eem_remove_scattering("raman", 1, 10)
plot(eems, which = 3)
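To make the idea of a scattering band concrete, the sketch below masks first-order Rayleigh scattering in a toy matrix using base R only. It assumes that the band covers emission wavelengths within plus or minus width of order × excitation; the exact band definition used by eemR may differ, and mask_rayleigh() is a hypothetical helper, not an eemR function.

```r
ex <- seq(250, 300, by = 10)  # excitation wavelengths (nm)
em <- seq(240, 320, by = 5)   # emission wavelengths (nm)

# Toy EEM filled with ones: rows are emission, columns are excitation.
x <- matrix(1, nrow = length(em), ncol = length(ex))

# Hypothetical helper: set to NA every cell lying within `width` nm of the
# Rayleigh line em = order * ex.
mask_rayleigh <- function(x, ex, em, order = 1, width = 10) {
  # distance[i, j] = |em[i] - order * ex[j]|
  distance <- abs(outer(em, order * ex, "-"))
  x[distance <= width] <- NA
  x
}

masked <- mask_rayleigh(x, ex, em, order = 1, width = 10)

# Number of cells removed for each excitation wavelength.
colSums(is.na(masked))
```

Masked cells are set to NA rather than zero so that they can later be ignored or interpolated, depending on the analysis.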
4.4.4 Inner-filter effect correction
IFE correction requires absorbance data (Equation (??)). For each EEM, an absorbance spectrum must be supplied. The easiest way to provide absorbance is to use a data frame with column names matching the EEM names. In the following data frame, the first column contains the wavelengths at which absorbance was measured, whereas the remaining columns are the absorbance spectra for sample1, sample2 and sample3.
data("absorbance")
head(absorbance)
## wavelength sample1 sample2 sample3
## 1 190 0.89674 1.02927 1.19405
## 2 191 0.84894 0.96381 1.13721
## 3 192 0.77267 0.85339 1.04520
## 4 193 0.70967 0.75627 0.96782
## 5 194 0.65459 0.67145 0.90092
## 6 195 0.61371 0.60745 0.85054
Note that EEM names can be obtained using the eem_names() function.
eem_names(eems)
## [1] "nano" "sample1" "sample2" "sample3"
IFE correction is performed using the eem_inner_filter_effect(eem, absorbance, pathlength) function, where eem is a list of EEMs, absorbance is a data frame containing absorbance spectra and pathlength is the absorbance cuvette pathlength expressed in cm (Fig. 4B). For each EEM contained in eem, the ranges spanned by the IFE correction factors and total absorbance \(A_{\text{total}}\) (Equation (??)) are displayed to the user. This can serve as a diagnostic tool to determine whether mathematical correction was an appropriate way to handle IFE.
eems <- eem_inner_filter_effect(eem = eems,
                                absorbance = absorbance,
                                pathlength = 1)
## Warning: Absorbance spectrum for nano was not found. Returning uncorrected
## EEM.
## sample1
## Range of IFE correction factors: 1.0112 1.5546
## Range of total absorbance (Atotal) : 0.0096 0.3832
##
## sample2
## Range of IFE correction factors: 1.0061 1.3124
## Range of total absorbance (Atotal) : 0.0053 0.2362
##
## sample3
## Range of IFE correction factors: 1.016 2.3713
## Range of total absorbance (Atotal) : 0.0138 0.75
plot(eems, which = 3)
Fig. 4 presents intermediate results obtained for the correction of sample3. Note the nonlinearity of the correction, with a stronger effect at lower wavelengths (bottom-left corner in panel C). The corrected EEM is presented in Fig. 4D, which is the result of dividing the matrix in 4A by the matrix in 4C.
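The ranges printed by eem_inner_filter_effect() in the output above appear consistent with a correction factor of 10^(0.5 × A_total) for a 1 cm pathlength, where A_total is the sum of the absorbance at the excitation and emission wavelengths. The base R sketch below illustrates that calculation on a toy absorbance spectrum. The spectrum, the object names and the use of approx() for interpolation are assumptions made for illustration and do not reproduce the eemR internals; multiplying by this factor is equivalent to dividing by its reciprocal, 10^(-0.5 × A_total).

```r
ex <- seq(220, 450, by = 5)  # excitation wavelengths (nm)
em <- seq(230, 600, by = 2)  # emission wavelengths (nm)

# Toy absorbance spectrum (1 cm cuvette) decaying exponentially with wavelength.
absorbance_toy <- data.frame(wavelength = 190:900)
absorbance_toy$sample <- 0.5 * exp(-0.015 * (absorbance_toy$wavelength - 190))

# Interpolate absorbance at the excitation and emission wavelengths.
a_ex <- approx(absorbance_toy$wavelength, absorbance_toy$sample, xout = ex)$y
a_em <- approx(absorbance_toy$wavelength, absorbance_toy$sample, xout = em)$y

# Total absorbance for every emission (rows) / excitation (columns) pair.
a_total <- outer(a_em, a_ex, "+")

# Assumed correction factor for a 1 cm pathlength.
ife_factor <- 10^(0.5 * a_total)

# Apply the correction to a toy EEM of ones.
observed <- matrix(1, nrow = length(em), ncol = length(ex))
corrected <- observed * ife_factor

range(ife_factor)

# Check against the output above: the largest A_total reported for sample3
# (0.75) gives a factor close to the reported maximum of 2.3713.
10^(0.5 * 0.75)
```

In this toy example the largest factor is found at the lowest excitation and emission wavelengths, which matches the pattern described for panel C.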
4.4.5 Raman normalization
The last step of the correction process consists of calibrating fluorescence intensities using the Raman scatter peak of water (Lawaetz and Stedmon 2009). This is performed using the eem_raman_normalisation(eem, blank) function, where eem is a list of EEMs and blank is a water blank measured on the same day. Here, the same water blank is used for the three EEMs. Note that the value of the Raman area (\(A_{\text{rp}}\), Equation (??)) is printed.
eems <- eem_raman_normalisation(eems, blank)
## Raman area: 9.540904
## Raman area: 9.540904
## Raman area: 9.540904
plot(eems, which = 3)
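As an illustration of the principle, the sketch below computes a Raman area from a toy water-blank emission scan and uses it to scale a toy EEM. It assumes the commonly used integration window of 371 to 428 nm in emission at an excitation of 350 nm, no baseline subtraction and trapezoidal integration; these choices, the synthetic scan and the helper trapz() are assumptions for illustration, and eemR may proceed differently.

```r
# Toy blank emission scan at ex = 350 nm: a flat baseline plus a Gaussian
# Raman peak centred near 397 nm.
em <- seq(360, 440, by = 1)
blank_350 <- 0.1 + 2 * exp(-((em - 397)^2) / (2 * 5^2))

# Hypothetical helper: trapezoidal integration of y over x.
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# Area of the Raman peak over the assumed 371-428 nm window.
in_peak <- em >= 371 & em <= 428
raman_area <- trapz(em[in_peak], blank_350[in_peak])

# Normalisation divides every fluorescence intensity by the Raman area,
# which expresses the EEM in Raman units.
sample_eem <- matrix(c(10, 20, 30, 40), nrow = 2)
normalised <- sample_eem / raman_area

raman_area
```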
At this stage, all corrections have been performed and EEMs are ready to be exported into MATLAB for PARAFAC analysis. The state of the EEMs can be verified using the summary() function.
summary(eems)
## sample ex_min ex_max em_min em_max is_blank_corrected
## 1 nano 220 450 230 600 FALSE
## 2 sample1 220 450 230 600 TRUE
## 3 sample2 220 450 230 600 TRUE
## 4 sample3 220 450 230 600 TRUE
## is_scatter_corrected is_ife_corrected is_raman_normalized manufacturer
## 1 TRUE FALSE FALSE Cary Eclipse
## 2 TRUE TRUE TRUE Cary Eclipse
## 3 TRUE TRUE TRUE Cary Eclipse
## 4 TRUE TRUE TRUE Cary Eclipse
4.4.6 Exporting to MATLAB
The drEEM MATLAB toolbox (Murphy et al. 2013) used to perform PARAFAC analysis requires data in a specific format (structure). The eem_export_matlab(file, ...) function can be used to export corrected EEMs into a PARAFAC-ready format. The first argument, file, is the mat file to which the structure is exported, and the second argument, ..., is one or more eem objects.
eem_export_matlab("myfile.mat", eems)
Once exported, the generated mat file can be imported into MATLAB: load('myfile.mat');
4.4.7 Metric extraction
Coble’s peaks can be extracted using the eem_coble_peaks(eem) function. Note that for peaks A, M and C, the maximum fluorescence intensity within the emission range of the peak is returned.
file <- system.file("extdata/cary/scans_day_1", package = "eemR")
eems <- eem_read(file)
eem_coble_peaks(eems, verbose = FALSE)
## sample b t a m c
## 1 nano 0.8745673 0.1401188 0.140175 0.09653326 0.1255788
## 2 sample1 1.5452981 1.0603312 3.731836 2.42409567 1.8149415
## 3 sample2 1.2629968 0.6647042 1.583489 1.02359302 0.7709074
## 4 sample3 1.4740862 1.3162812 8.416034 6.06335506 6.3179129
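The logic of peak extraction can be sketched in base R as follows: peaks B and T are read at a single excitation/emission pair, whereas peaks A, M and C take the maximum over an emission range. The wavelength positions used here (B: ex 275/em 310; T: ex 275/em 340; A: ex 260/em 380-460; M: ex 312/em 380-420; C: ex 350/em 420-480) are commonly cited values and are an assumption of this sketch. The toy EEM and the helper peak_at() are hypothetical, and all required wavelengths are placed on the grid so that no interpolation is needed.

```r
ex <- c(260, 275, 312, 350)    # excitation wavelengths (nm)
em <- seq(300, 500, by = 10)   # emission wavelengths (nm)

# Toy EEM (rows: emission, columns: excitation) with a maximum at em = 420 nm.
x <- outer(em, ex, function(em, ex) exp(-((em - 420) / 60)^2) * ex / 100)

# Hypothetical helper: maximum intensity at one excitation wavelength over an
# emission range. A single emission value is treated as a one-point range.
peak_at <- function(x, ex, em, ex_target, em_range) {
  rows <- em >= min(em_range) & em <= max(em_range)
  max(x[rows, ex == ex_target])
}

peaks <- c(
  b = peak_at(x, ex, em, 275, 310),
  t = peak_at(x, ex, em, 275, 340),
  a = peak_at(x, ex, em, 260, c(380, 460)),
  m = peak_at(x, ex, em, 312, c(380, 420)),
  c = peak_at(x, ex, em, 350, c(420, 480))
)
peaks
```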
Fluorescence (FI), humification (HIX) and biological (BIX) indices can be extracted as follows.
eem_fluorescence_index(eems, verbose = FALSE)
## sample fi
## 1 nano -0.5932057
## 2 sample1 1.2647823
## 3 sample2 1.4553330
## 4 sample3 1.3294132
eem_humification_index(eems, verbose = FALSE)
## sample hix
## 1 nano 0.5568136
## 2 sample1 6.3795618
## 3 sample2 4.2548483
## 4 sample3 13.0246234
eem_biological_index(eems, verbose = FALSE)
## sample bix
## 1 nano 2.6812045
## 2 sample1 0.7062640
## 3 sample2 0.8535423
## 4 sample3 0.4867927
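For readers unfamiliar with these indices, the base R sketch below computes them on a toy EEM using definitions that are common in the literature: FI as the ratio of emission at 450 nm to 500 nm at an excitation of 370 nm, BIX as the ratio of emission at 380 nm to 430 nm at an excitation of 310 nm, and HIX as the sum of emission over 435-480 nm divided by the sum over 300-345 nm at an excitation of 254 nm. These definitions are assumptions of the sketch, the helpers value_at() and band_sum() are hypothetical, and the eemR implementation may differ in its details.

```r
ex <- c(254, 310, 370)  # excitation wavelengths (nm)
em <- 300:500           # emission wavelengths (nm)

# Toy EEM (rows: emission, columns: excitation) peaking at em = 450 nm.
x <- outer(em, ex, function(em, ex) exp(-((em - 450) / 80)^2) * ex / 100)

# Hypothetical helpers operating on the toy matrix defined above.
value_at <- function(ex_target, em_target) x[em == em_target, ex == ex_target]
band_sum <- function(ex_target, em_range) {
  sum(x[em >= em_range[1] & em <= em_range[2], ex == ex_target])
}

fi  <- value_at(370, 450) / value_at(370, 500)
bix <- value_at(310, 380) / value_at(310, 430)
hix <- band_sum(254, c(435, 480)) / band_sum(254, c(300, 345))

c(fi = fi, bix = bix, hix = hix)
```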
It should be noted that different excitation and emission wavelengths are often used to measure EEMs. Hence, the measured wavelengths may not match the wavelengths used to calculate specific metrics. In these circumstances, EEMs are interpolated using the pracma package (Borchers 2015). A message warning the user is displayed if data interpolation is performed. This behavior can be controlled using the verbose = TRUE/FALSE parameter.
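The kind of interpolation involved can be illustrated with a small bilinear example. The sketch below uses base R approx() as a stand-in so that it has no package dependency; it does not reproduce the pracma-based code used by eemR, and interp_eem() is a hypothetical helper.

```r
ex <- c(250, 260)        # measured excitation wavelengths (nm)
em <- c(300, 310, 320)   # measured emission wavelengths (nm)

# Toy EEM (rows: emission, columns: excitation) that is linear in both
# wavelengths, so bilinear interpolation recovers the exact value.
x <- outer(em, ex, function(em, ex) 2 * em + 3 * ex)

# Hypothetical helper: bilinear interpolation at one ex/em pair.
interp_eem <- function(x, ex, em, ex_target, em_target) {
  # Interpolate along emission for each measured excitation wavelength...
  along_em <- apply(x, 2, function(column) approx(em, column, xout = em_target)$y)
  # ...then along excitation.
  approx(ex, along_em, xout = ex_target)$y
}

# Intensity at ex = 254 nm and em = 305 nm, neither of which was measured.
estimate <- interp_eem(x, ex, em, ex_target = 254, em_target = 305)
estimate
```

For this linear toy surface the estimate equals 2 × 305 + 3 × 254 = 1372; on real EEMs the result is only an approximation whose quality depends on the wavelength increments.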
References
Massicotte, Philippe, and Jean-Jacques Frenette. 2011. “Spatial connectivity in a large river system: resolving the sources and fate of dissolved organic matter.” Ecological Applications 21 (7): 2600–2617. doi:10.1890/10-1475.1.
Murphy, Kathleen R., Colin A. Stedmon, Daniel Graeber, and Rasmus Bro. 2013. “Fluorescence spectroscopy and multi-way techniques. PARAFAC.” Analytical Methods 5 (23): 6557. doi:10.1039/c3ay41160e.
Stedmon, Colin A., and Rasmus Bro. 2008. “Characterizing dissolved organic matter fluorescence with parallel factor analysis: a tutorial.” Limnology and Oceanography: Methods 6 (11): 572–79. doi:10.4319/lom.2008.6.572.
Lawaetz, A. J., and C. A. Stedmon. 2009. “Fluorescence Intensity Calibration Using the Raman Scatter Peak of Water.” Applied Spectroscopy 63 (8): 936–40. doi:10.1366/000370209788964548.
Borchers, Hans Werner. 2015. “pracma: Practical Numerical Math Functions.” https://cran.r-project.org/package=pracma.