A unified approach to the applicability domain problem of QSAR models

Horvath, Dragos; Marcou, Gilles; Varnek, Alexandre

doi:10.1186/1758-2946-2-S1-O6

Volume 2 Supplement 1

5th German Conference on Cheminformatics: 23. CIC-Workshop

Oral presentation
Open access
Published: 04 May 2010

Journal of Cheminformatics volume 2, Article number: O6 (2010)

The present work proposes a unified conceptual framework to describe and quantify the Applicability Domains (AD) of Quantitative Structure-Activity Relationships (QSARs). AD models are conceived as meta-models designed to associate an untrustworthiness score with any molecule M subject to property prediction by a QSAR model μ. Untrustworthiness scores, or "AD metrics", express the relationship between M (represented by its descriptors in chemical space) and the zones of that space populated by the training molecules on which model μ is based. Scores integrating some of the classical AD criteria (similarity-based, box-based) were considered in addition to newly introduced terms, such as the dissimilarity to outlier-free training sets and the correlation breakdown count.
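To make the notion of an AD metric concrete, here is a minimal sketch of the two classical criteria named above, similarity-based and box-based. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice of Euclidean distance, the value of `k`, and the toy descriptor matrix are all hypothetical. In both cases a higher value means a less trustworthy prediction.

```python
import numpy as np

def knn_distance_score(X_train, x, k=3):
    """Similarity-based score (assumed form): mean Euclidean distance
    from molecule x to its k nearest training molecules."""
    X_train = np.asarray(X_train, dtype=float)
    x = np.asarray(x, dtype=float)
    d = np.linalg.norm(X_train - x, axis=1)
    k = min(k, len(d))
    return float(np.sort(d)[:k].mean())

def box_violation_score(X_train, x):
    """Box-based score (assumed form): number of descriptors of x that
    fall outside the min/max range spanned by the training set."""
    X_train = np.asarray(X_train, dtype=float)
    x = np.asarray(x, dtype=float)
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    return int(np.sum((x < lo) | (x > hi)))

# Toy 2-descriptor training set (hypothetical data).
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
inside = np.array([0.5, 0.5])   # lies within the training box
outside = np.array([3.0, 0.5])  # first descriptor exceeds the training range

print(knn_distance_score(X_train, inside), box_violation_score(X_train, inside))
print(knn_distance_score(X_train, outside), box_violation_score(X_train, outside))
```

The newer terms mentioned in the abstract (dissimilarity to outlier-free training sets, correlation breakdown count) are not defined in this text and are therefore not sketched here.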

A loose correlation is expected between this untrustworthiness and the error affecting the predicted property. While high untrustworthiness does not preclude correct predictions, inaccurate predictions at low untrustworthiness must be avoided. This kind of relationship is characteristic of the Neighborhood Behavior (NB) problem: dissimilar molecule pairs may or may not display similar properties, but similar molecule pairs with different properties are explicitly "forbidden". Therefore, statistical tools developed to tackle the NB problem were applied, and led to a unified AD metric benchmarking scheme.
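The asymmetry described above (only "trusted but wrong" predictions are penalized) can be illustrated with a very simple statistic. The sketch below is not the NB-derived benchmarking criterion used by the authors, which this abstract does not specify; it is a hypothetical stand-in, and the names `false_trust_rate`, `score_cut` and `error_tol` are assumptions introduced for illustration.

```python
import numpy as np

def false_trust_rate(scores, abs_errors, score_cut, error_tol):
    """Fraction of 'trusted' predictions (score <= score_cut) whose
    absolute error exceeds error_tol. Predictions with a high score are
    ignored: being wrong OR right there is not penalized."""
    scores = np.asarray(scores, dtype=float)
    abs_errors = np.asarray(abs_errors, dtype=float)
    trusted = scores <= score_cut
    if not trusted.any():
        return 0.0  # nothing is trusted, so nothing is falsely trusted
    return float(np.mean(abs_errors[trusted] > error_tol))

# Hypothetical untrustworthiness scores and absolute prediction errors.
scores = [0.1, 0.2, 0.3, 0.8, 0.9]
abs_errors = [0.1, 0.2, 1.5, 0.1, 2.0]

rate = false_trust_rate(scores, abs_errors, score_cut=0.5, error_tol=1.0)
print(rate)  # one of the three trusted predictions has a large error
```

Under this toy criterion, a good AD metric is one for which the rate stays low as `score_cut` is raised; the accurate prediction at score 0.8 costs nothing, whereas the inaccurate one at score 0.3 does.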

A first use of untrustworthiness scores is the prioritization of predictions, without the need to specify a hard AD border. Moreover, if a significant set of external compounds is available, the formalism allows optimal AD borderlines to be fitted. Finally, consensus AD definitions were built by means of a nonparametric mixing scheme of two AD metrics of comparable quality, and were shown to outperform their respective parents.
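Prioritization and consensus building can be sketched as follows. The abstract does not say which nonparametric mixing scheme was used, so rank averaging is assumed here purely for illustration; the helper names `ranks`, `consensus_score` and `prioritize`, and the example scores, are hypothetical.

```python
import numpy as np

def ranks(values):
    """0-based ranks of the values (ties broken by position, which is
    an illustrative simplification)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(len(values))
    return r

def consensus_score(score_a, score_b):
    """Assumed mixing scheme: mean of the ranks under two AD metrics.
    Using ranks makes the result independent of each metric's scale."""
    return (ranks(score_a) + ranks(score_b)) / 2.0

def prioritize(scores):
    """Indices of predictions, most trustworthy (lowest score) first.
    No hard AD border is needed for this ordering."""
    return np.argsort(np.asarray(scores, dtype=float), kind="stable")

# Two hypothetical AD metrics on different scales, for four molecules.
score_a = [0.2, 0.9, 0.5, 0.1]
score_b = [3, 7, 1, 2]

mixed = consensus_score(score_a, score_b)
order = prioritize(mixed)
print(mixed, order)
```

Working on ranks rather than raw values is one way to combine metrics of different units without introducing tunable weights; other nonparametric schemes would fit the description equally well.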

Author information

Authors and Affiliations

Laboratoire d'InfoChimie, Université de Strasbourg - CNRS, Institut de Chimie, 4 rue Blaise Pascal, 67000, Strasbourg, France
Dragos Horvath, Gilles Marcou & Alexandre Varnek

Rights and permissions

Open Access: This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Horvath, D., Marcou, G. & Varnek, A. A unified approach to the applicability domain problem of QSAR models. J Cheminform 2 (Suppl 1), O6 (2010). https://doi.org/10.1186/1758-2946-2-S1-O6
