Statistical Analysis of Biofuels Using Gas Chromatography
Edward J. Soares, Ph.D.
College of the Holy Cross
Gas chromatography (GC) is a technique used in analytical chemistry to separate and analyze compounds that can be vaporized without decomposition. In particular, GC can separate the components of a mixture and determine their relative amounts. When applied to biodiesel fuels, GC allows us to identify the fatty acid methyl esters present in a sample, whose presence depends on the underlying feedstock type.
Typically, a set of replicate chromatograms is measured for each sample, and several biofuel classes are compared. Each chromatogram quantifies molecular abundance at each of several retention times, which in turn correspond to particular carbon chains. Variation in peak height relative to retention time serves to differentiate biofuel classes. However, inherent in GC is measurement error in the form of variation in peak location (known as drift), which must be removed prior to chemometric analysis. This is usually accomplished by aligning each chromatogram to that of a reference sample.
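As a rough illustration of what aligning to a reference can mean, the sketch below corrects a single rigid drift by finding the integer shift that maximizes the inner product between a chromatogram and the reference. This is an assumption made for illustration only: the function names `best_shift` and `align` are hypothetical, the signals are synthetic Gaussian peaks, and practical alignment methods are typically more flexible than one global shift.

```python
import numpy as np

def best_shift(reference, signal, max_shift):
    """Return the integer shift (in samples) of `signal` that best matches `reference`.

    Candidate shifts in [-max_shift, max_shift] are scored by the inner
    product between the reference and the circularly shifted signal.
    """
    shifts = list(range(-max_shift, max_shift + 1))
    scores = [np.dot(reference, np.roll(signal, s)) for s in shifts]
    return shifts[int(np.argmax(scores))]

def align(reference, signal, max_shift):
    """Shift `signal` so that it lines up with `reference`.

    np.roll wraps samples around the ends, which is harmless here only
    because the synthetic baseline is essentially zero at both ends.
    """
    return np.roll(signal, best_shift(reference, signal, max_shift))

# Synthetic example: one Gaussian peak, drifted 5 samples later than the reference.
t = np.arange(200)
reference = np.exp(-0.5 * ((t - 80) / 3.0) ** 2)
drifted = np.exp(-0.5 * ((t - 85) / 3.0) ** 2)

shift = best_shift(reference, drifted, max_shift=10)
aligned = align(reference, drifted, max_shift=10)
```

Here the recovered shift moves the drifted peak back onto the reference peak; the search window `max_shift` plays the role of a tunable alignment parameter.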
In the first part of this talk, I will discuss a method of optimizing chromatogram alignment to maximize class separability using the Hotelling trace criterion (HTC). The HTC can be thought of as the multivariate and multi-class extension of the square of the two-sample t-statistic. Large values of the HTC correspond to better separation of the groups of principal component scores for each biofuel class. In the second part of this talk, I will apply the same alignment methodology to the task of building the optimal regression model to predict the heat of combustion of the set of biofuels. In particular, I will focus on principal components regression and partial least squares analysis, and use Pearson’s correlation coefficient to assess model quality.
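To make the HTC concrete, the sketch below computes one common formulation, the trace of the inverse within-class scatter matrix times the between-class scatter matrix, on a small set of made-up two-dimensional scores. The exact normalization used in the talk is not stated here, so the scaling, the function name `hotelling_trace`, and the toy data are all assumptions for illustration.

```python
import numpy as np

def hotelling_trace(scores, labels):
    """Compute tr(S_w^{-1} S_b) for class-labeled score vectors.

    S_w is the pooled within-class scatter matrix and S_b is the
    between-class scatter matrix (class means around the grand mean,
    weighted by class size). Normalization conventions vary; this is
    one common choice, assumed here for illustration.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    grand_mean = scores.mean(axis=0)
    dim = scores.shape[1]
    s_within = np.zeros((dim, dim))
    s_between = np.zeros((dim, dim))
    for c in np.unique(labels):
        x = scores[labels == c]
        mean_c = x.mean(axis=0)
        centered = x - mean_c
        s_within += centered.T @ centered
        diff = (mean_c - grand_mean)[:, None]
        s_between += len(x) * (diff @ diff.T)
    # Solve S_w Z = S_b rather than forming the inverse explicitly.
    return float(np.trace(np.linalg.solve(s_within, s_between)))

# Toy scores: two classes with identical spread, separated along the (1, 1) direction.
pattern = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
near = np.vstack([pattern, pattern + 1.0])  # class means 1 unit apart per axis
far = np.vstack([pattern, pattern + 4.0])   # class means 4 units apart per axis

htc_near = hotelling_trace(near, labels)
htc_far = hotelling_trace(far, labels)
```

With the within-class spread held fixed, the more widely separated classes yield the larger value, which is the sense in which a large HTC indicates better class separability.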
Dr. Edward Soares is an Associate Professor in the Department of Mathematics and Computer Science at College of the Holy Cross. He obtained a B.A. in mathematics from Providence College in 1983 and a Ph.D. in applied mathematics from the University of Arizona in 1994. His research interests are in the areas of medical image processing, statistical analysis of biomedical data, gas chromatography and chemometrics, and texture analysis of images derived from scanning electron microscopy.