Table 1

The current LODD datasets. Further information about the content and accessibility (URIs, SPARQL endpoints) of these linked datasets can be found online at [27].

| Name | Short description | Size and coverage (rounded) | Sources | Provider of original dataset | Provider of RDF version |
| --- | --- | --- | --- | --- | --- |
| DrugBank | Chemical, pharmacological and pharmaceutical drug data; data about drug targets (e.g., sequences, structure, pathways) | 767,000 triples; 4,800 drugs, 2,500 protein sequences | Aggregated from various biomedical and pharmaceutical databases | University of Alberta | Free University of Berlin |
| ClinicalTrials.gov/LinkedCT | Information about clinical trials | 9.8 million triples; 80,000 trials | Data submitted by study sponsors or their representatives | US National Institutes of Health | LinkedCT.org; University of Toronto |
| DailyMed | Information about approved prescription drugs, including FDA-approved labels (package inserts) | 164,000 triples; 4,000 drugs | Package inserts, data from the US Food and Drug Administration (FDA) | US National Library of Medicine | Free University of Berlin |
| ChEMBL | Information on drugs, e.g., activity against drug targets such as proteins, chemical properties; linked to primary literature | 24 million triples; 8,000 drug targets, 660,000 compounds | Aggregated from various biomedical and pharmaceutical databases | European Bioinformatics Institute | Uppsala University |
| Diseasome | Characteristics of disorders and disease genes linked by known disease-gene associations | 91,000 triples; 2,600 genes | Generated from data in Online Mendelian Inheritance in Man (OMIM) | Consortium of several labs | Free University of Berlin |
| TCMGeneDIT/RDF-TCM | Gene-disease-drug associations mined from literature about Chinese medicine | 117,000 triples | Mined from research articles | National Taiwan University | Oxford University |
| RxNorm | Prescription drugs, their ingredients, and national drug codes | 7.7 million triples; 166,000 unique drugs and ingredients | FDA databases | US National Library of Medicine | Stony Brook School of Medicine |
| UMLS | Unified Medical Language System (UMLS) sources available without restrictions | 55 million triples | Ontologies created by third parties | US National Library of Medicine | Stony Brook School of Medicine |
| SIDER | Reported adverse effects of marketed drugs | 193,000 triples; 63,000 adverse effect reports | Mined package inserts | European Molecular Biology Laboratory, Heidelberg | Free University of Berlin |
| STITCH | Molecular interactions between chemicals and proteins | 7.5 million chemicals, 500,000 proteins, 370 organisms | Aggregated from various biomedical and pharmaceutical databases | European Molecular Biology Laboratory, Heidelberg | Free University of Berlin |
| Medicare | The Medicare formulary | 44,500 triples; 6,800 drugs | Primary data | US Government | Free University of Berlin |
| WHO Global Health Observatory | Data and statistics for infectious diseases at country, regional, and global levels | 354,000 triples | Primary data collected by the World Health Organization | World Health Organization | Leipzig University |

Statistics about size and coverage were last checked on March 24, 2011.

Samwald et al. Journal of Cheminformatics 2011 3:19   doi:10.1186/1758-2946-3-19
