To develop strategies for pre-processing and statistical analysis in clinical spectroscopy
Leads
Goals
- Identify and down-select, via a network forum, appropriate methodologies for the pre-processing and statistical analysis of data originating from a range of samples and techniques
- Provide a resource that contains educational materials, tools and example datasets to highlight issues and appropriate strategies for data analysis in clinical spectroscopy
- Develop and adopt minimum reporting requirements for the data treatment workflow used in the analysis of clinical spectroscopic data
- Provide an agreed spectroscopic peak assignment resource that can be edited in light of future studies
Reasons and added value
In both infrared and Raman studies, pre-processing steps such as baseline correction, background subtraction, vector normalisation, EMSC, RMieS-EMSC and first or second derivatives are sometimes used with very little justification other than that they seem to work best for that particular sample. Similarly, a very wide range of multivariate data analysis methods is used (PCA, PLS, LDA, SVMs, ANNs, etc.), again often with little justification. In some cases authors appear to be using these methods as a black box, with no real understanding of what the analysis is doing.
Many of the network members are spectroscopists or clinicians but not experts in chemometrics. This is a major problem in the field. The key added value here is that current and new network members will have access to current thinking on pre-processing and chemometrics via the website and educational tools.
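As an illustration of the kind of educational material intended, the following minimal sketch shows what three of the steps named above actually do to the data: vector normalisation, a first derivative, and PCA. It is not a recommended workflow. The function names, the synthetic single-band spectra and all parameter values are assumptions made for this example only, and the derivative uses a plain finite difference rather than any particular smoothing method.

```python
import numpy as np


def vector_normalise(spectra):
    """Scale each spectrum (one per row) to unit Euclidean length."""
    norms = np.linalg.norm(spectra, axis=1, keepdims=True)
    if np.any(norms == 0):
        raise ValueError("cannot normalise an all-zero spectrum")
    return spectra / norms


def first_derivative(spectra, wavenumbers):
    """Finite-difference first derivative along the wavenumber axis.

    A constant offset in a spectrum has zero slope, so it vanishes here.
    """
    return np.gradient(spectra, wavenumbers, axis=1)


def pca_scores(spectra, n_components=2):
    """PCA via SVD of the mean-centred data.

    Returns the scores and the fraction of variance explained by
    each retained component.
    """
    centred = spectra - spectra.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]


# Synthetic data (hypothetical): 10 spectra, one Gaussian band near
# 1650 cm^-1 with varying intensity, a constant offset and some noise.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(1000.0, 1800.0, 200)
band = np.exp(-0.5 * ((wavenumbers - 1650.0) / 20.0) ** 2)
intensities = rng.uniform(0.5, 2.0, size=(10, 1))
spectra = intensities * band + 0.1 + 0.01 * rng.normal(size=(10, 200))

normalised = vector_normalise(spectra)
derivative = first_derivative(spectra, wavenumbers)
scores, explained = pca_scores(normalised, n_components=2)
```

Writing each step as a small, named function makes the order of operations explicit, which is also what a minimum reporting requirement for the data treatment workflow would need to capture.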