Blind trials of computer-assisted structure elucidation software
1 Advanced Chemistry Development, Toronto Department, 110 Yonge Street, 14th floor, Toronto, Ontario, M5C 1T4, Canada
2 Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
3 Royal Society of Chemistry, 904 Tamaras Circle, Wake Forest, NC, 27587, USA
Journal of Cheminformatics 2012, 4:5 doi:10.1186/1758-2946-4-5Published: 9 February 2012
One of the largest challenges in chemistry today remains that of efficiently mining through vast amounts of data in order to elucidate the chemical structure for an unknown compound. The elucidated candidate compound must be fully consistent with the data and any other competing candidates efficiently eliminated without doubt by using additional data if necessary. It has become increasingly necessary to incorporate an in silico structure generation and verification tool to facilitate this elucidation process. An effective structure elucidation software technology aims to mimic the skills of a human in interpreting the complex nature of spectral data while producing a solution within a reasonable amount of time. This type of software is known as computer-assisted structure elucidation or CASE software. A systematic trial of the ACD/Structure Elucidator CASE software was conducted over an extended period of time by analysing a set of single and double-blind trials submitted by a global audience of scientists. The purpose of the blind trials was to reduce subjective bias. Double-blind trials comprised of data where the candidate compound was unknown to both the submitting scientist and the analyst. The level of expertise of the submitting scientist ranged from novice to expert structure elucidation specialists with experience in pharmaceutical, industrial, government and academic environments.
Beginning in 2003, and for the following nine years, the algorithms and software technology contained within ACD/Structure Elucidator have been tested against 112 data sets; many of these were unique challenges. Of these challenges 9% were double-blind trials. The results of eighteen of the single-blind trials were investigated in detail and included problems of a diverse nature with many of the specific challenges associated with algorithmic structure elucidation such as deficiency in protons, structure symmetry, a large number of heteroatoms and poor quality spectral data.
When applied to a complex set of blind trials, ACD/Structure Elucidator was shown to be a very useful tool in advancing the computer's contribution to elucidating a candidate structure from a set of spectral data (NMR and MS) for an unknown. The synergistic interaction between humans and computers can be highly beneficial in terms of less biased approaches to elucidation as well as dramatic improvements in speed and throughput. In those cases where multiple candidate structures exist, ACD/Structure Elucidator is equipped to validate the correct structure and eliminate inconsistent candidates. Full elucidation can generally be performed in less than two hours; this includes the average spectral data processing time and data input.