|
Objective. To obtain a position utilizing my skills in
multivariate data analysis and data mining, not necessarily limited to
the chemometrics, chemoinformatics, or bioinformatics
fields. Particularly interested in algorithm development and software
prototyping.
Skills. Analytical chemist knowledgeable in
chemometric/statistical software such as Pirouette, SCAN, and Minitab,
though I prefer writing my own programs in MATLAB. Also familiar with the
molecular simulation packages Spartan, QUANTA, and MOE. Experienced
with Windows and Linux operating systems, HTML, and LaTeX; some
experience with SVL, C++, Perl, and JavaScript. Able to acquire new
skills quickly.
Experience
Research Assistant, 2000-2003
Clarkson University, Potsdam, NY
Developed a software
package for multivariate data analysis based on projection methods
(e.g., principal component analysis, partial least squares, Kohonen
neural networks) and genetic algorithms in MATLAB. The idea underlying
this system is that all pattern recognition methods will work well
when the problem is simple. By identifying the appropriate features, a
difficult problem can be reduced to a simple one. This research
included the following topics:
Supervised Learning Approach to Pattern
Recognition: Principal components analysis (PCA) reduces the
dimensionality of multivariate data by finding a new set of axes that
tracks independent sources of variation. The integration of PCA with
the genetic algorithm (GA) allows for the identification of features
whose information is primarily about class differences. Other
projection methods that take advantage of particular data structures,
such as partial least squares (PLS) and Kohonen self-organizing maps
(SOM), have been substituted for PCA. The SOM enables successful
handling of outliers and nonlinear relationships within the data.
Spectroscopic and chromatographic chemical data, as well as
biological and physical data sets, have been successfully
analyzed. Examples from spectroscopy include the classification of
hard, soft, and tropical woods, classification of recyclable
plastics, and quality control of pharmaceutical tablets by near-IR
spectroscopy. Problems in chemical communication and fuel spill
identification show the applicability of this methodology to
chromatographic data. Examples of biological data include DNA
microarray/gene expression data sets.
Unsupervised Learning Approach: A GA has been designed
to maximize clustering of the data. This tool
serves as a data microscope, elucidating interesting structure and
relationships within the data, and helping to identify confounding
variables. This methodology aids the analysis of QSAR data, and
enables the analysis of exceedingly large and complex data sets from
molecular dynamics (MD) simulations without relying upon previous
knowledge. This method can potentially identify atoms involved in
substrate binding, and has been validated using sperm whale myoglobin
simulation data.
Transverse Learning Approach: A combination of
supervised and unsupervised learning approaches improves
classification and prediction of multivariate data sets, especially
those that contain missing or uncertain class labels. This method
searches for features that contain information about class
differences and maximizes clustering in the projection space,
allowing the information present in unlabeled data to be used to
guide classification and prevent overfitting. Transverse learning
also allows for the identification of sub-classes in the data.
The transverse learning GA has aided the analysis of DNA microarray
data. It also plays an important role in a larger methodology for
QSAR analysis. Past QSAR analyses have relied on fragment-based
descriptors that are satisfactory for homologous molecules but
less effective when applied to data sets with a great deal of
structural variation. Difficulties in modeling the underlying
chemical process, which can be quite complex, also contribute to the
mixed success of past QSAR efforts. This new methodology, involving
an enhanced version of Breneman's Transferable Atom Equivalent (TAE)
descriptors that contain pertinent shape and electronic information,
uses the transverse learning GA to help determine the underlying
chemical processes involved. This methodology has been applied in
musk odor-structure relationship studies.
Ordinal Pattern Recognition: Some pattern
recognition problems involve the differentiation of classes that are
related to each other, e.g., good, better, best. A fitness function
for the pattern recognition GA was developed that captures information
about the existence of these types of relationships in the data. This
ordinal fitness function serves as a bridge between pattern
recognition and calibration.
Multivariate Curve Resolution: In research
independent of the GA, a method originally designed for deconvolution of
co-eluting species in chromatography has been adapted to image
analysis. A Varimax extended rotation (VER) followed by alternating
least squares (ALS) estimates the concentration and spectral profiles
of each component. The efficacy of this approach has been demonstrated
through the analysis of Raman image data of water-in-oil
emulsions.
Teaching Assistant, 2000-2003
Clarkson University, Potsdam, NY
Taught general level chemistry, Spectroscopy, and Instrumental
laboratories.
Received the Outstanding Teaching Award for Graduate Students for the
2002-2003 school year. Managed the revision and updating of the
Instrumental laboratory manual.
Education
2000-2003: Clarkson University, Potsdam, NY
Ph.D., Analytical Chemistry, with a minor in Informatics. Dissertation: "Genetic
Algorithms for Data Mining and Multivariate Data Analysis."
Advisor: Barry K. Lavine.
GPA: 3.9/4.0
1996-1999: Clarkson University, Potsdam, NY
B.S., Chemistry, with Distinction, ACS Accredited. GPA: 3.6/4.0, 3.7/4.0 in major.
Received the George L. Jones, Jr., Award for
Excellence in Chemistry, and the CRC Science Achievement Award.
Three-time Presidential Scholar.
Activities
Exchange Student at Monash University, Melbourne, VIC, Australia.
Member of the Clarkson University Hearing Committee on Discipline &
Disorders.
Graduate Advisor and Webmaster for the Racquetball Club.
Attended the Intercollegiate Racquetball National Tournaments, 2000-2003.
Publications
- B. K. Lavine, D. Brzozowski, A. J. Moores, C. E. Davidson, and
H. T. Mayfield, "Genetic Algorithm for Fuel Spill Identification,"
Anal. Chim. Acta, 2001, 437(2), 233-246.
- B. K. Lavine, C. E. Davidson, A. J. Moores, and P. R. Griffiths,
"Raman Spectroscopy and Genetic Algorithms for the Classification
of Wood Types," Appl. Spectrosc., 2001, 55(8), 960-966.
- B. K. Lavine, C. E. Davidson, and A. J. Moores, "Innovative
Genetic Algorithms for Chemoinformatics," Chemom. Intell. Lab. Syst., 2002, 60(1), 161-171.
- B. K. Lavine, C. E. Davidson, and A. J. Moores, "Genetic
Algorithms for Spectral Pattern Recognition," Vib. Spectrosc., 2002, 28(1), 83-95.
- B. K. Lavine, C. E. Davidson, R. K. Vander Meer, S. Lahav,
V. Soroker, and A. Hefetz, "Genetic Algorithms for Deciphering the
Complex Chemosensory Code of Social Insects," Chemom. Intell. Lab. Syst., 2003, 66(1), 51-62.
- B. K. Lavine, C. E. Davidson, C. Breneman, and W. Katt,
"Electronic Van der Waals Surface Property Descriptors and Genetic
Algorithms for Developing Structure-Activity Correlations in
Olfactory Databases," J. Chem. Inf. Comput. Sci., 2003,
43, 1890-1905.
Accepted for Publication
- B. K. Lavine, C. E. Davidson, C. Breneman, and W. Katt,
"Genetic Algorithms for Clustering and Classification of
Olfactory Stimulants," in Chemoinformatics: Methods and Protocols,
J. Bajorath (Ed.), Humana Press, IN PRESS.
- B. K. Lavine and C. E. Davidson, "Classification and Pattern
Recognition," in Practical Handbook of Chemometrics, 2nd Edition,
Paul Gemperline (Ed.), Marcel Dekker Press, IN PRESS.
- B. K. Lavine, C. E. Davidson, J. P. Ritter, D. Westover, and T. Hancewicz, "Varimax
Extended Rotation Applied to Multivariate Spectroscopic Image
Analysis," Microchem. J., IN PRESS.
Submitted for Publication
- B. K. Lavine, C. E. Davidson, S. Hawkins, and T. M. Hancewicz,
"Denoising of Ballistometer Data Using Fourier Filtering and
Pattern Recognition Techniques," International Journal of
Cosmetic Science, submitted.
- B. K. Lavine, C. E. Davidson, C. Breneman, and W. Katt,
"Development of Structure-Activity Olfactory Correlations using
Electronic Van der Waals Surface Property Descriptors and Genetic
Algorithms," in Chemometrics and Chemoinformatics, B. K. Lavine
(Ed.), ACS Symposium Series, submitted.
- B. K. Lavine, J. Workman, and C. E. Davidson, "Chemometrics:
Past, Present, and Future," in Chemometrics and Chemoinformatics,
B. K. Lavine (Ed.), ACS Symposium Series, submitted.
- B. K. Lavine, C. E. Davidson, and W. T. Rayens, "Machine Learning Based
Pattern Recognition Applied to Microarray Data," Combinatorial
Chemistry & High Throughput Screening, submitted.
In Preparation
- B. K. Lavine and C. E. Davidson, "Learning from Expression
Data," Bioinformatics, in preparation.
- B. K. Lavine, C. E. Davidson, and W. T. Rayens, "Genetic
Algorithms for Data Mining--Profiting from the Past," J. Chemom., in preparation.
- B. K. Lavine and C. E. Davidson, "Genetic Algorithms That
Emulate Human Pattern Recognition Through Machine Learning for
Database Mining and Knowledge Discovery," J. Chem. Inf. Comput.
Sci., in preparation.
- B. K. Lavine, C. E. Davidson, R. K. Vander Meer, D. Carlson,
S. Lahav, V. Soroker, and A. Hefetz, "Gas Chromatography/Pattern
Recognition Techniques Applied to Taxonomy and Chemical
Communication," Microchem. J., in preparation.
Presentations
Author and Presenter
- "Innovative Genetic Algorithms for Pattern Recognition of
Chemical Data," at the 220th National Meeting of the American
Chemical Society, Washington, DC, 2000, and at the Clarkson
University Department of Chemistry Seminar, 2000.
- "Genetic Algorithms," at Clarkson University, part of the
Freshman Seminar Series, 2001.
- "Gene Expression and DNA Microarray Technology," Clarkson
University Department of Chemistry Seminar, 2003.
- "Mining Microarray Data," Clarkson University Department of
Chemistry Seminar, 2003.
- "Genetic Algorithms for Data Mining and Multivariate Data
Analysis," Dissertation Defense, Clarkson University, Dec. 2003.
Co-Author
- "Genetic Algorithms for Pattern Recognition and
Multivariate Calibration," Federation of Analytical Chemistry and
Spectroscopy Societies (FACSS) Conference, Detroit, MI, October 11, 2001.
- "New Methods in Multivariate Spectrometric Image Analysis
I," FACSS, Detroit, MI, October 12, 2001.
- "Genetic Algorithms for Database Mining," Center for Process
Analytical Chemistry (CPAC) Meeting, Seattle, WA, November 6, 2001.
- "Genetic Algorithms for Developing Structure-Activity
Correlations in Large Olfactory Databases," National ACS Meeting,
Boston, MA, August 22, 2002.
- "Varimax Extended Rotation (VER) Applied to Multivariate
Spectroscopic Image Analysis," Chemometrics in Analytical Chemistry
(CAC) Conference, Seattle, WA, September 23, 2002.
- "Genetic Algorithms for Pattern Recognition and Multivariate
Calibration," CAC Conference, Seattle, WA, September 24, 2002.
- "Supervised Learning from Gene Expression Data," Advanced
Topics in Microarray Analysis, NIH Workshop, January 22,
2003 (Poster Presentation).
- "Genetic Algorithms that Emulate Human Pattern Recognition
Through Machine Learning," National ACS Meeting, New Orleans, LA,
March 26, 2003.
References
References are available on file, upon request. Please contact the
Clarkson University Career Center:
Career Center
8 Clarkson Avenue
Potsdam, NY 13699-5620
day: 315-268-6477
fax: 315-268-7616
career@clarkson.edu
For more details, please contact the following people
directly. They are familiar with my work and character.
Associate Professor Dr. Barry K. Lavine, Analytical
Chemistry, Clarkson University. [Advisor]
Research Associate Professor Dr. Dan V. Goia, Center for Advanced
Materials Processing, Clarkson University.
|