Multiple conformational states in retrospective virtual screening – homology models vs. crystal structures: beta-2 adrenergic receptor case study

Mordalski, Stefan; Witek, Jagna; Smusz, Sabina; Rataj, Krzysztof; Bojarski, Andrzej J

doi:10.1186/s13321-015-0062-x

Research article
Open access
Published: 09 April 2015

Journal of Cheminformatics volume 7, Article number: 13 (2015)

Abstract

Background

Distinguishing active from inactive compounds is one of the crucial problems of molecular docking, especially in the context of virtual screening experiments. The randomization of poses and the natural flexibility of the protein make this discrimination even harder. Some of the recent approaches to post-docking analysis use an ensemble of receptor models to mimic this naturally occurring conformational diversity. However, the optimal number of receptor conformations is yet to be determined.

In this study, we compare the results of a retrospective screening of beta-2 adrenergic receptor ligands performed on both the ensemble of receptor conformations extracted from ten available crystal structures and an equal number of homology models. Additional analysis was also performed for homology models with up to 20 receptor conformations considered.

Results

The docking results were encoded into Structural Interaction Fingerprints and were automatically analyzed by a support vector machine. In this virtual screening application, homology models proved superior to crystal structures. Additionally, increasing the number of receptor conformational states improved the discrimination between active and inactive compounds.

Conclusions

For virtual screening purposes, the use of homology models was found to be most beneficial, even in the presence of crystallographic data regarding the conformational space of the receptor. The results also showed that increasing the number of receptors considered improves the effectiveness of identifying active compounds by machine learning methods.

Background

G protein-coupled receptors (GPCRs) constitute a large superfamily of signaling proteins that share a common topology of 7 transmembrane (7TM) helices and transduce signals across the cell membrane. Because GPCRs are responsible for most of a cell’s communication with its environment, their malfunctions are associated with various disease states, mainly those related to the central nervous system (CNS). For this reason, GPCRs are a very important target base for drugs [1-3].

The beta-2 adrenergic receptor (B2AR), the subject of this case study, is representative of the class A GPCRs and is involved in mediating the relaxation of smooth muscle, glycogenolysis and gluconeogenesis in the liver, and regulation of the metabolism of cells in skeletal muscle. B2AR is also responsible for increased cardiac output, facilitation of the release of neurotransmitters, and regulation of various other physiological processes [4-7]. B2AR is also one of the most studied 7TM structures; it was first crystallized in 2007 [8], and as of November 2014, 16 crystal structures with a variety of structurally and functionally unique ligands are available via the Protein Data Bank (PDB), making this receptor a strong base for in silico structural studies.

Our previous study applying Machine Learning (ML) to post-docking analysis used Structural Interaction Fingerprint (SIFt) profiles built on three different crystal conformations of receptors [9,10]. It showed the applicability of this approach to the evaluation of ligand-protein complexes for Virtual Screening (VS). In addition to the applicability of crystal structures in VS, the present study investigates how the number of conformations used in per-ligand interaction profiles, for both crystal structures and homology models, influences retrospective screening performance. The VS setup consisted of four groups of compounds: actives, true inactives, DUD (Directory of Useful Decoys) decoys, and a random ZINC subset. Three sets of experiments were prepared, discriminating actives from true inactives, from DUD decoys, and from ZINC compounds, respectively.

The support vector machine (SVM) was the classification algorithm chosen and the VS performance was measured with the Matthews Correlation Coefficient (MCC).

Results and discussion

Crystal structures vs. homology models

Because showing the results for every crystal template used for homology model construction would affect the clarity of the presentation, the comparison of homology models and crystal structures is shown only for the templates providing the best (M2R) and the worst (D3R) results in terms of the discrimination between actives and true inactives (Figure 1); the outcomes for the remaining templates are available in the Additional files section (Additional file 1: Figure S1). The use of a large number of templates for homology modeling follows the protocols used in previously published data and ensures maximum VS performance [11]. Due to the limited number of available crystal structures, the maximum number of receptor conformations in this comparison is restricted to 10 (starting from 3).

The results of retrospective VS (Figure 1) show that homology model-based screening significantly outperforms experiments conducted on the collection of crystal structures, with an MCC improvement of 0.4 for the best set of conformations. All types of classification experiments (actives/true inactives, actives/DUDs, and actives/ZINC compounds) confirm this trend. The MCC spreads between templates were of little significance: the difference between the best and the worst performing template ranged from 0.1 for the actives/true inactives experiments to less than 0.05 for the other two VS scenarios.

For homology models, MCC values obtained for actives/true inactives discrimination were the lowest (~0.5 – 0.55). However, for the actives/DUDs and actives/ZINC compounds classifications, MCC values exceeded 0.8, with a slight preference towards the actives/ZINC experiments.

On the other hand, studies performed for crystal structures resulted in MCC of 0.2 for actives/true inactives (this best result was obtained for the SIFt profile composed of 8 receptor conformations), 0.47 for actives/DUDs (6 conformations) and 0.55 for actives/ZINC (9 conformations).

The obtained results show that the conformational flexibility provided by homology models allows for better accommodation of diverse ligands and therefore better screening performance in this interaction-centric type of experiments. Because crystal structures are limited in terms of chemical space of co-crystalized ligands, they are not yet able to provide a sufficient conformational landscape for efficient identification of active compounds.

Influence of the number of considered conformations on screening performance for homology models

Due to the substantial amount of data, a detailed analysis was conducted only for the best performing set of SVM parameters, in terms of MCC value, and only for the best and the worst template (M2R and D3R, respectively); full datasets are included in the Additional files (Additional file 2: Figure S2). The MCC values for different numbers of considered conformations are presented in Figure 2.

In addition, differential graphs were prepared to illustrate the MCC change after including each subsequent model (models were added one by one, from 3 up to 20, ending with a 20-model profile) (Figure 3, Additional file 3: Figure S3). In each case, the addition of the model with the highest area under the ROC curve (AUROC) at the model evaluation stage is highlighted: by a red frame in Figure 2 and by a brighter colour in Figure 3.

The general outcome emerging from the results (Figure 2; Figure 3) aligns with the results obtained for the comparison of crystal structures with homology models. An increased number of conformations included in a SIFt profile leads to an improvement of VS performance; in some isolated cases, however, the contribution of a subsequent model may be negative. The number of models providing the highest MCC was 20 in the majority of cases (as shown in Table 1), and the worst performing number of conformations was 3 for all but two sets of models. The improvement of MCC was not linear; however, its values were noticeably lower when a low number of conformations was considered.

Table 1 The optimal and the worst number of models included in the SIFt profiles in terms of classification effectiveness

MCC fluctuations occurring in the actives/true inactives classification stage of the experiment were the highest out of all three scenarios. This scenario also had several situations where additional considered conformations lowered the screening performance. On the other hand, filtering actives against DUD and random ZINC decoys led to a clear dependency between MCC and the number of models: the higher the number of models included in the profile, the higher the MCC values.

The impact of the receptors bearing the highest AUROC values during the model selection step (conformations 4 and 19 for the D3R and M2R templates, respectively) indicates that the performance of individual homology models has little influence on the obtained results and, in some cases (conformation 19 based on M2R in screening against the ZINC subset, Figure 2c), can even lower VS performance.

Although the MCC changes induced by including new conformations to the ligand profiles seem to be negligible, the cumulative effect for VS experiments leads to a significant improvement of screening performance by up to 20% (Additional file 4: Figure S4). Interestingly, the absolute values of MCC difference oscillated at approximately 0.1, regardless of the scheme of the experiment.

Influence of the number of considered conformations on screening performance for crystal structures

The results show that the experiments using multiple crystal conformations are significantly more prone to fluctuations in screening performance (Figure 4). The amplitude of these fluctuations ranges from a 0.3 improvement to a 0.4 decrease in MCC. This variation is connected with the specificity of the individual ligands co-crystallized with the protein. The drop in MCC observed after adding the last two conformations (3P0G [12] and 3PDS [13]) is a consequence of these crystals being agonist-bound. The conformation of an activated GPCR results in a limited ligand-accessible volume, which significantly reduces the quality of docking results. Following this lead, the influence of the activation state of the crystal template on classification efficiency was also examined for beta-2 homology models constructed on activated and deactivated M2R templates. The direction of changes when adding subsequent SIFts to the profile was preserved, but the results were slightly (~2-3%) better for models constructed on the deactivated template (Additional file 5: Figure S5).

Conclusions

In this study we compared the performance of the collection of crystal structures with corresponding sets of homology models in retrospective VS experiments designed to consider multiple conformations of a target receptor. The results demonstrated that the bundle of homology models significantly outperformed the crystalline-based approach in terms of MCC, which agrees with results from previous reports [14]. The main reason behind this difference in screening effectiveness is the limited conformational space of the crystal structures, which is a consequence of adaptation to the co-crystalized ligands, thus biasing the conformation of the complex. Shallow conformational landscapes of the crystal structures of the receptors are also caused by low structural diversity of the crystalline ligands, limiting the possible spatial orientations of residues.

The second component of this research investigated the effect of increasing the number of considered conformations. The conclusion emerging from all schemes of experiments (screening active compounds against true inactives, DUD decoys, and ZINC decoys) is that high coverage of the conformational space of the receptor models leads to more effective screening. A probable reason behind this observation is that including more conformations in a docking protocol averages out the fluctuations of docking poses and provides a more coherent binding mode for a given ligand, therefore enabling a clearer discrimination between active and inactive compounds. Extending the population of conformations would most likely increase the MCC up to the limit defined by the number of compounds that were not docked into any receptor model, yet the increasing computational cost of such tests may render the results not worth the effort. Although there is no actual upper bound on the number of conformations to include, the results shown here indicate that three models/crystals leave considerable room for improving VS performance.

Methods

To maintain the coherence of the ligand data, the compounds of known activity (divided into ‘actives’ and ‘true inactives’ sets) were extracted using a strict protocol. All structures with verified activity towards the B2AR were selected from the ChEMBL database [15]. Only compounds whose activity was quantified as K_i or IC₅₀ (with the assumption that K_i = IC₅₀/2) and that were tested on human cloned, rat cloned, or native receptors were taken into account. A compound was considered active when its K_i value was lower than 100 nM and inactive when it was higher than 1000 nM. The compounds were clustered with Canvas [16], with the number of clusters set to approximately 30% of the total number of compounds in a particular group. Cluster centroids were used for the primary evaluation of homology models. In addition, two sets of decoy compounds were prepared: one generated following the DUD methodology [17] and one drawn as a random subset of the ZINC database [18]. Both decoy collections contained 2000 compounds; the DUD decoys were randomly subsampled to reach this count.
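The activity thresholds described above can be sketched as a small labelling routine. This is only an illustration under stated assumptions: the function names and the `"Ki"`/`"IC50"` measure strings are hypothetical, and compounds falling between the two thresholds are assumed to be left unlabelled, since the text assigns them to neither set.

```python
from typing import Optional

ACTIVE_BELOW_NM = 100.0     # K_i below this value: active
INACTIVE_ABOVE_NM = 1000.0  # K_i above this value: true inactive


def to_ki_nm(value_nm: float, measure: str) -> float:
    """Convert a reported activity to K_i in nM, using K_i = IC50 / 2."""
    if measure == "Ki":
        return value_nm
    if measure == "IC50":
        return value_nm / 2.0
    raise ValueError(f"unsupported activity measure: {measure}")


def label_compound(value_nm: float, measure: str) -> Optional[str]:
    """Return 'active', 'inactive', or None for the intermediate range."""
    ki = to_ki_nm(value_nm, measure)
    if ki < ACTIVE_BELOW_NM:
        return "active"
    if ki > INACTIVE_ABOVE_NM:
        return "inactive"
    return None  # assumption: 100-1000 nM compounds are not used


if __name__ == "__main__":
    print(label_compound(150.0, "IC50"))  # IC50 of 150 nM maps to K_i of 75 nM
```

Leaving a gap between the two thresholds keeps borderline compounds out of both classes, which is a common way to reduce label noise in retrospective screening sets.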

Homology models of the B2AR were constructed on nine crystal templates: serotonin receptors 5-HT1B and 5-HT2B, adenosine receptor A2A, adrenergic receptor beta-1, chemokine receptor CXCR4, dopamine receptor D3, histamine receptor H1, and muscarinic receptors M2 and M3 (Table 2). The sequence alignment was performed manually and only for the transmembrane helices. Loops were not modelled. For each template, 20 models were generated with Modeller 9v13 software [19] and were evaluated by AUROC in the docking of cluster centroids from the actives and true inactives sets (Table 3). Additional homology models were also prepared in the same manner for the activated M2R structure (4MQS).

Table 2 Crystal structures used as templates for homology modeling of beta-2 adrenergic receptor

Table 3 Number of compounds used for the preevaluation of homology models

The three-dimensional structures of the compounds, along with protonation states and atom types, were assigned with LigPrep software [20]. For some compounds, several protonation states were generated, which increased the initial number of instances. The docking was performed with GLIDE 5.0, with the number of output poses limited to one, for both the homology models (Table 4) and a collection of B2AR crystal structures (Table 5).

Table 4 Compound counts for retrospective screening scenarios

Table 5 Crystal structures of beta-2 adrenergic receptor used in the study

After this initial model evaluation, all compounds from a particular group of molecules (actives, true inactives, DUDs, and ZINC) were docked into the constructed homology models and crystal structures. Ligand-receptor complexes obtained from the docking procedure were represented by Structural Interaction Fingerprints [9], binary strings that describe the interactions of a ligand with each of the amino acids of the protein; the string is divided into nine-bit chunks, each referring to a particular amino acid residue. The interaction types taken into account are: the presence of any interaction, an interaction with the main chain, an interaction with a side chain, a polar interaction, a hydrophobic interaction, a hydrogen bond acceptor, a hydrogen bond donor, an aromatic interaction, and a charged interaction.
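The nine-bit-per-residue layout can be illustrated with a short sketch. It assumes that the bit order within each chunk follows the order in which the interaction types are listed above, and that residues are indexed from zero along the string; both conventions, as well as the names used here, are illustrative rather than taken from the original SIFt implementation.

```python
from typing import Dict, Set

# Assumed bit order within each nine-bit chunk (order of the list in the text).
INTERACTION_TYPES = (
    "any", "backbone", "sidechain", "polar", "hydrophobic",
    "hbond_acceptor", "hbond_donor", "aromatic", "charged",
)
BITS_PER_RESIDUE = len(INTERACTION_TYPES)  # 9


def encode_sift(contacts: Dict[int, Set[str]], n_residues: int) -> str:
    """Build a binary SIFt string from per-residue sets of interaction types."""
    bits = ["0"] * (n_residues * BITS_PER_RESIDUE)
    for residue_index, kinds in contacts.items():
        for kind in kinds:
            offset = residue_index * BITS_PER_RESIDUE + INTERACTION_TYPES.index(kind)
            bits[offset] = "1"
    return "".join(bits)


def residue_chunk(sift: str, residue_index: int) -> str:
    """Return the nine-bit chunk that describes one residue."""
    start = residue_index * BITS_PER_RESIDUE
    return sift[start:start + BITS_PER_RESIDUE]


if __name__ == "__main__":
    # Hypothetical ligand contacting only the second of three residues.
    sift = encode_sift({1: {"any", "sidechain", "aromatic"}}, n_residues=3)
    print(residue_chunk(sift, 1))
```

Because every residue occupies a fixed-width chunk, fingerprints computed for different ligands docked to the same receptor are directly comparable position by position.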

For each compound that had at least one pose in a population of receptor conformations, the SIFt profile was calculated. On each position in the string, the values were averaged over all models/crystal structures for the given conformational landscape considered (per ligand SIFt profile). The number of receptor conformations used in the experiments ranged from 3 to 20 (10 for the crystal structures).
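The position-wise averaging that produces a per-ligand profile can be sketched as follows. The function name is hypothetical, and the sketch averages only over the fingerprints it is given; how conformations without a docked pose were handled is not stated in the text, so that choice is left to the caller.

```python
from typing import List, Sequence


def sift_profile(sifts: Sequence[str]) -> List[float]:
    """Average binary SIFt strings position by position.

    Each input string is the fingerprint of the same ligand docked to one
    receptor conformation; the output holds, for every bit position, the
    fraction of conformations in which that interaction was present.
    """
    if not sifts:
        raise ValueError("at least one fingerprint is required")
    length = len(sifts[0])
    if any(len(s) != length for s in sifts):
        raise ValueError("all fingerprints must have the same length")
    count = len(sifts)
    return [sum(s[i] == "1" for s in sifts) / count for i in range(length)]


if __name__ == "__main__":
    # Four hypothetical (shortened) fingerprints of one ligand.
    print(sift_profile(["1100", "1000", "1010", "1000"]))
```

The resulting vector of real values in the range 0 to 1 has the same length as a single fingerprint, so the feature dimension seen by the classifier does not grow with the number of conformations.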

The per-ligand SIFt profiles served as input for machine learning experiments conducted with the WEKA package [21]. The task of the ML algorithm was to distinguish active from inactive or decoy compounds. The support vector machine algorithm [22], with a linear kernel, was used as the classification method. This model was developed by Vapnik [22]; its core concept is to seek the hyperplane that separates binary-labeled data with the maximum possible margin. This can be written as the following optimization problem:

$$ \underset{w,\, b}{\mathrm{minimize}}\; \frac{1}{2}\parallel w{\parallel}^2 + C \sum_{i=1}^{N} \xi_i $$

$$ \mathrm{subject\ to}\quad y_i\left(\left\langle w, x_i\right\rangle - b\right) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1, \dots, N $$

with w being the normal vector to the hyperplane, ξ_i being the slack variables that allow individual examples to violate the margin, and y_i being the class to which the particular example is assigned (in the case of binary-labeled data, y_i ∈ {−1, +1}). C is the parameter that controls the tradeoff between correct classification and a large margin.

However, in real applications, the data are not usually linearly separable, and the application of the kernel trick is required. In our paper, the linear kernel ($ K\left(x_i, x_j\right) = \left\langle x_i, x_j\right\rangle = \sum_{l=1}^{d} x_{il} x_{jl} $ for d-dimensional feature space) was applied.

The original optimization problem is transformed to the dual form with the use of Lagrange multipliers α_i:

$$ \underset{\alpha}{\mathrm{maximize}}\; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j K\left(x_i, x_j\right) $$

$$ \mathrm{subject\ to}\quad 0 \le \alpha_i \le C,\quad i = 1, \dots, N $$

$$ \sum_{i=1}^{N} \alpha_i y_i = 0 $$

α_i represents the weight assigned to a particular example x_i from the training data, and in the dual form, C constitutes the upper bound of the α_i values.
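As a brief worked step connecting the two forms (a standard result for support vector machines, stated here for the linear kernel used in this study), the normal vector of the primal problem can be recovered from the dual solution, and the resulting classifier takes the form:

$$ w = \sum_{i=1}^{N} \alpha_i y_i x_i $$

$$ f(x) = \operatorname{sign}\left(\left\langle w, x\right\rangle - b\right) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i y_i K\left(x_i, x\right) - b\right) $$

Only the training examples with α_i > 0 (the support vectors) contribute to these sums.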

The optimization of C values was performed (the following C values were checked: 0.01; 0.1; 1; 10; 100; 1 000; 10 000). The experiments were carried out in a 10-fold cross-validation mode.
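The evaluation protocol can be outlined with a short sketch that lists the examined C values and splits the data into ten folds. It is an assumption-laden illustration: the study used WEKA for cross-validation, whereas the hypothetical helper below simply assigns examples to folds in a deterministic, class-balanced round-robin fashion, without the shuffling a real experiment would normally apply.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

# C values examined in the study.
C_GRID = [0.01, 0.1, 1, 10, 100, 1000, 10000]


def stratified_folds(labels: Sequence[int], k: int = 10) -> List[List[int]]:
    """Assign example indices to k folds, keeping class ratios similar."""
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the members of each class out to the folds in turn.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


if __name__ == "__main__":
    # Hypothetical labels: 20 actives (+1) and 30 inactives (-1).
    labels = [1] * 20 + [-1] * 30
    folds = stratified_folds(labels)
    for c_value in C_GRID:
        for test_fold in folds:
            test_set = set(test_fold)
            train = [i for i in range(len(labels)) if i not in test_set]
            # A classifier would be trained on `train` with this C value and
            # scored on `test_fold`; that step is omitted in this sketch.
    print([len(fold) for fold in folds])
```

Each example appears in exactly one test fold, so every compound contributes once to the confusion matrix from which the performance measure is computed.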

The effectiveness of the machine learning methods was measured with the MCC, a balanced measure suitable for this kind of experiment, expressed by the following formula:

$$ MCC=\frac{TP\cdot TN-FP\cdot FN}{\sqrt{\left(TP+FP\right)\cdot \left(TP+FN\right)\cdot \left(TN+FP\right)\cdot \left(TN+FN\right)}} $$

where:

TP – number of true positives,
TN – number of true negatives,
FP – number of false positives,
FN – number of false negatives.
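The formula above translates directly into code. The sketch below is a minimal illustration; returning 0.0 when the denominator vanishes is a common convention rather than something specified in the text, and the confusion-matrix counts in the example are made up for demonstration.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention for degenerate confusion matrices
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts: 90 TP, 80 TN, 20 FP, 10 FN.
    print(round(mcc(tp=90, tn=80, fp=20, fn=10), 3))  # 0.704
```

MCC ranges from −1 (total disagreement) through 0 (no better than chance) to +1 (perfect classification), and unlike plain accuracy it remains informative when the classes are imbalanced, as is the case when a few hundred actives are screened against 2000 decoys.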

Abbreviations

GPCRs: G protein-coupled receptors
7TM: 7 transmembrane
CNS: Central nervous system
B2AR: Beta-2 adrenergic receptor
PDB: Protein Data Bank
SIFt: Structural interaction fingerprint
ML: Machine learning
VS: Virtual screening
DUD: Directory of Useful Decoys
SVM: Support vector machine
MCC: Matthews correlation coefficient

References

1. Klabunde T, Hessler G. Drug design strategies for targeting G-protein-coupled receptors. ChemBioChem. 2002;3(10):928–44.
2. Lagerström MC, Schiöth HB. Structural diversity of G protein-coupled receptors and significance for drug discovery. Nat Rev Drug Discov. 2008;7(4):339–57.
3. Lundstrom K. An overview on GPCRs and drug discovery: structure-based drug design and structural biology on GPCRs. Methods Mol Biol. 2009;552:51–66.
4. Liggett SB. Molecular and genetic basis of beta2-adrenergic receptor function. J Allergy Clin Immunol. 1999;104(2 Pt 2):S42–6.
5. McGraw DW, Liggett SB. Molecular mechanisms of beta2-adrenergic receptor function and regulation. Proc Am Thorac Soc. 2005;2:292–6. discussion 311–312.
6. Kolb P, Rosenbaum DM, Irwin JJ, Fung JJ, Kobilka BK, Shoichet BK. Structure-based discovery of beta2-adrenergic receptor ligands. Proc Natl Acad Sci USA. 2009;106:6843–8.
7. Strosberg AD. Structure, function, and regulation of adrenergic receptors. Protein Sci. 1993;2(8):1198–209.
8. Cherezov V, Rosenbaum DM, Hanson MA, Rasmussen SGF, Thian FS, Kobilka TS, et al. High-resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor. Science. 2007;318(5854):1258–65.
9. Deng Z, Chuaqui C, Singh J. Structural interaction fingerprint (SIFt): a novel method for analyzing three-dimensional protein-ligand binding interactions. J Med Chem. 2004;47(2):337–44.
10. Witek J, Smusz S, Rataj K, Mordalski S, Bojarski AJ. An application of machine learning methods to structural interaction fingerprints - a case study of kinase inhibitors. Bioorg Med Chem Lett. 2014;24(2):580–5.
11. Rataj K, Witek J, Mordalski S, Kosciolek T, Bojarski AJ. Impact of template choice on homology model efficiency in virtual screening. J Chem Inf Model. 2014;54(6):1661–8.
12. Rasmussen SGF, Choi H-J, Fung JJ, Pardon E, Casarosa P, Chae PS, et al. Structure of a nanobody-stabilized active state of the β(2) adrenoceptor. Nature. 2011;469(7329):175–80.
13. Rosenbaum DM, Zhang C, Lyons JA, Holl R, Aragao D, Arlow DH, et al. Structure and function of an irreversible agonist-β(2) adrenoceptor complex. Nature. 2011;469(7329):236–40.
14. Tang H, Wang XS, Hsieh JH, Tropsha A. Do crystal structures obviate the need for theoretical models of GPCRs for structure-based virtual screening? Proteins. 2012;80(6):1503–21.
15. Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucl Acids Res. 2011;40(Database issue):D1100–7.
16. Canvas, version 1.3, Schrödinger, LLC, New York, NY, 2010.
17. Huang N, Shoichet BK, Irwin JJ. Benchmarking sets for molecular docking. J Med Chem. 2006;49(23):6789–801.
18. Irwin JJ, Shoichet BK. ZINC - a free database of commercially available compounds for virtual screening. J Chem Inf Model. 2005;45(1):177–82.
19. Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol. 1993;234(3):779–815.
20. LigPrep, version 2.5, Schrödinger, LLC, New York, NY, 2011.
21. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: an update. SIGKDD Explorations. 2009;11:10–8.
22. Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20:273–97.
23. Wang C, Jiang Y, Ma J, Wu H, Wacker D, Katritch V, et al. Structural basis for molecular recognition at serotonin receptors. Science. 2013;340(6132):610–4.
24. Wacker D, Wang C, Katritch V, Han GW, Huang X-P, Vardy E, et al. Structural features for functional selectivity at serotonin receptors. Science. 2013;340(6132):615–9.
25. Xu F, Wu H, Katritch V, Han GW, Jacobson KA, Gao Z-G, et al. Structure of an agonist-bound human A2A adenosine receptor. Science. 2011;332(6027):322–7.
26. Warne T, Moukhametzianov R, Baker JG, Nehmé R, Edwards PC, Leslie AGW, et al. The structural basis for agonist and partial agonist action on a β(1)-adrenergic receptor. Nature. 2011;469(7329):241–4.
27. Wu B, Chien EYT, Mol CD, Fenalti G, Liu W, Katritch V, et al. Structures of the CXCR4 chemokine GPCR with small-molecule and cyclic peptide antagonists. Science. 2010;330(6007):1066–71.
28. Chien EYT, Liu W, Zhao Q, Katritch V, Han GW, Hanson MA, et al. Structure of the human dopamine D3 receptor in complex with a D2/D3 selective antagonist. Science. 2010;330(6007):1091–5.
29. Shimamura T, Shiroishi M, Weyand S, Tsujimoto H, Winter G, Katritch V, et al. Structure of the human histamine H1 receptor complex with doxepin. Nature. 2011;475(7354):65–70.
30. Haga K, Kruse AC, Asada H, Yurugi-Kobayashi T, Shiroishi M, Zhang C, et al. Structure of the human M2 muscarinic acetylcholine receptor bound to an antagonist. Nature. 2012;482(7386):547–51.
31. Kruse AC, Hu J, Pan AC, Arlow DH, Rosenbaum DM, Rosemond E, et al. Structure and dynamics of the M3 muscarinic acetylcholine receptor. Nature. 2012;482(7386):552–6.

Acknowledgements

The study was partially supported by the project “Diamentowy Grant” DI 2011 0046 41 financed by the Polish Ministry of Science and Higher Education and by a grant PRELUDIUM 2013/09/N/NZ2/01917 financed by the Polish National Science Centre (www.ncn.gov.pl). SM received funding for the preparation of a PhD thesis from the Polish National Science Centre as part of the PhD scholarship ETIUDA 2, decision DEC-2014/12/T/NZ2/00529.

Author information

Authors and Affiliations

Department of Medicinal Chemistry, Institute of Pharmacology, Polish Academy of Sciences, 12 Smętna Street, Kraków, 31-343, Poland
Stefan Mordalski, Jagna Witek, Sabina Smusz, Krzysztof Rataj & Andrzej J Bojarski
Faculty of Chemistry, Jagiellonian University, 3 R. Ingardena Street, Kraków, 30-060, Poland
Sabina Smusz

Corresponding author

Correspondence to Andrzej J Bojarski.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors designed the experiments. JW, SS, KR and SM performed the experiments. All authors analyzed the data, drew conclusions and wrote, read and approved the final manuscript.

Additional files

Additional file 1: Figure S1.

Comparison of MCC values obtained in the ML-based experiments for homology models and crystal structures for discrimination between a) actives/true inactives, b) actives/DUDs, and c) actives/ZINC. The figure presents the MCC values obtained for homology models of beta-2 adrenergic receptor (constructed on various templates) and for crystal structures of this receptor in experiments distinguishing the following class of compounds: (a) actives/true inactives, (b) actives/DUDs and (c) actives/ZINC.

Additional file 2: Figure S2.

MCC values obtained for various numbers of models included in the SIFt profile for models built on various templates for a) actives/true inactives, b) actives/DUDs, and c) actives/ZINC discrimination. The figure presents, in the form of a heat map, the MCC values obtained for various numbers of models included in the SIFt profile for homology models constructed on various templates for a) actives/true inactives, b) actives/DUDs, c) actives/ZINC compounds discrimination.

Additional file 3: Figure S3.

Difference in MCC caused by the inclusion of additional receptors in the profile for a) actives/true inactives, b) actives/DUDs, and c) actives/ZINC compounds discrimination. The figure presents the changes in MCC obtained after the inclusion of additional receptors in the SIFt profile for homology models.

Additional file 4: Figure S4.

Difference between the highest and the lowest MCC obtained for SIFt profiles constructed for various numbers of conformations. The figure presents the scale of MCC changes associated with a varying number of model conformations, in the form of differences between the highest and the lowest MCC values obtained for a given template/crystal structure.

Additional file 5: Figure S5.

Comparison of the actives/true inactives classification efficiency for beta-2 homology models constructed on activated and deactivated M2R template. The figure presents the differences between the results for beta-2 homology models that were constructed on crystal structure of the receptor with agonist (activated template) and on crystal structure in which the receptor was bound to antagonist (deactivated structure).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Mordalski, S., Witek, J., Smusz, S. et al. Multiple conformational states in retrospective virtual screening – homology models vs. crystal structures: beta-2 adrenergic receptor case study. J Cheminform 7, 13 (2015). https://doi.org/10.1186/s13321-015-0062-x

Received: 22 November 2014
Accepted: 17 March 2015
Published: 09 April 2015
DOI: https://doi.org/10.1186/s13321-015-0062-x
