The ExperimentHub server provides easy R / Bioconductor access to large files of data.
The ExperimentHub package provides a client interface to resources stored at the ExperimentHub web service. Its functionality is similar to that of the AnnotationHub package.
library(ExperimentHub)
The ExperimentHub package is straightforward to use.
Create an ExperimentHub object:
eh = ExperimentHub()
## snapshotDate(): 2018-10-30
At this point you have already done everything you need in order to start retrieving experiment data. For most operations, using the ExperimentHub object should feel a lot like working with a familiar list or data.frame, and it has all of the functionality of a Hub object, like the AnnotationHub package's AnnotationHub object.
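As a rough illustration of the list-like and data.frame-like behaviour described above, the sketch below uses a few accessors defined for Hub objects (`length()`, `names()`, `[`, and `$`). It assumes the ExperimentHub package is installed and that the hub can be reached over the network; the variable name `first_three` is only illustrative.

```r
library(ExperimentHub)

eh <- ExperimentHub()

# Vector-like operations on the hub
length(eh)        # number of records in the hub
head(names(eh))   # record identifiers such as "EH1"

# Single-bracket subsetting returns a smaller hub object,
# not the underlying data
first_three <- eh[1:3]
first_three

# data.frame-like access to a metadata column
head(eh$title)
```

Single brackets only narrow down the metadata; nothing is downloaded until a record is requested with double brackets.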
Let's take a minute to look at the show method for the hub object eh:
eh
## ExperimentHub with 1697 records
## # snapshotDate(): 2018-10-30
## # $dataprovider: Eli and Edythe L. Broad Institute of Harvard and MIT, NA...
## # $species: Homo sapiens, Mus musculus, Danio rerio, Mus musculus (E18 mi...
## # $rdataclass: ExpressionSet, SummarizedExperiment, RaggedExperiment, dat...
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass,
## # tags, rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH1"]]'
##
## title
## EH1 | RNA-Sequencing and clinical data for 7706 tumor samples from ...
## EH166 | ERR188297
## EH167 | ERR188088
## EH168 | ERR188204
## EH169 | ERR188317
## ... ...
## EH1952 | 20181025.ZellerG_2014.marker_abundance.stool
## EH1953 | 20181025.ZellerG_2014.marker_presence.stool
## EH1954 | 20181025.ZellerG_2014.metaphlan_bugs_list.stool
## EH1955 | 20181025.ZellerG_2014.pathabundance_relab.stool
## EH1956 | 20181025.ZellerG_2014.pathcoverage.stool
The show method gives you an idea of the different types of data present in the hub: where the data come from (dataprovider), which species have samples (species), and what kinds of R data objects can be returned (rdataclass). We can take a closer look at all of the available data providers by examining the contents of dataprovider as if it were a column of a data.frame:
unique(eh$dataprovider)
## [1] "GEO"
## [2] "GEUVADIS"
## [3] "Allen Brain Atlas"
## [4] "ArrayExpress"
## [5] "Department of Psychology, Abdul Haq Campus, Federal Urdu University for Arts, Science and Technology, Karachi, Pakistan. shahiq_psy@yahoo.com"
## [6] "Department of Chemical and Biological Engineering, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden."
## [7] "INRA, Institut National de la Recherche Agronomique, US1367 Metagenopolis, 78350 Jouy en Josas, France."
## [8] "Institute of Microbiology and Infection, University of Birmingham, Birmingham, England."
## [9] "1] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby, Denmark. [2] Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark. [3]."
## [10] "1] Department of Anthropology, University of Oklahoma, Dale Hall Tower, 521 Norman, Oklahoma 73019, USA [2] Universidad Científica del Sur, Lima 18, Perú [3] City of Hope, NCI-designated Comprehensive Cancer Center, Duarte, California 91010, USA."
## [11] "Translational and Functional Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, Maryland 20892, USA."
## [12] "BGI-Shenzhen, Shenzhen 518083, China."
## [13] "1] State Key Laboratory for Diagnosis and Treatment of Infectious Disease, The First Affiliated Hospital, College of Medicine, Zhejiang University, 310003 Hangzhou, China [2] Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang University, 310003 Hangzhou, China [3]."
## [14] "Department of Pharmacy and Biotechnology, University of Bologna, Bologna 40126, Italy."
## [15] "NA"
## [16] "Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany."
## [17] "Centre for Integrative Biology, University of Trento, Trento, Italy."
## [18] "Computational Biology Institute, George Washington University , Ashburn, VA , USA ; Center for Bioinformatics and Integrative Biology, Universidad Andres Bello, Facultad de Ciencias Biologicas , Santiago , Chile."
## [19] "Genome Institute of Singapore, Singapore 138672, Singapore."
## [20] "[1] BGI-Shenzhen, Shenzhen 518083, China [2] Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200 Copenhagen, Denmark."
## [21] "Luxembourg Centre for Systems Biomedicine, 7 avenue des Hauts-Fourneaux, 4362 Esch-sur-Alzette, Luxembourg."
## [22] "Key Laboratory of Dairy Biotechnology and Engineering, Education Ministry of P. R. China, Department of Food Science and Engineering, Inner Mongolia Agricultural University, Hohhot 010018, China."
## [23] "[1] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby, Denmark. [2] Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark. [3]."
## [24] "[1] Department of Anthropology, University of Oklahoma, Dale Hall Tower, 521 Norman, Oklahoma 73019, USA [2] Universidad Cientifica del Sur, Lima 18, Peru [3] City of Hope, NCI-designated Comprehensive Cancer Center, Duarte, California 91010, USA."
## [25] "Centre de Recherche en Infectiologie, CHU de Quebec-Universite Laval, Quebec, Canada."
## [26] "Department of Microbiology and Immunology, McGill University, Montreal, Quebec, Canada."
## [27] "Division of Cancer Epidemiology & Genetics, National Cancer Institute, Bethesda, Maryland, United States of America."
## [28] "BGI-Shenzhen, Shenzhen 518083, China; China National Genebank-Shenzhen, BGI-Shenzhen, Shenzhen 518083, China."
## [29] "Department of Medicine & Therapeutics, State Key Laboratory of Digestive Disease, Institute of Digestive Disease, LKS Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong."
## [30] "[1] State Key Laboratory for Diagnosis and Treatment of Infectious Disease, The First Affiliated Hospital, College of Medicine, Zhejiang University, 310003 Hangzhou, China [2] Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang University, 310003 Hangzhou, China [3]."
## [31] "1000 Genomes Project"
## [32] "yriMulti"
## [33] "10x Genomics"
## [34] "Illumina 450 methylation assay"
## [35] "GTex"
## [36] "Eli and Edythe L. Broad Institute of Harvard and MIT"
## [37] "Harmonized Cancer Datasets Genomic Data Commons Data Portal"
## [38] "[1] 1] BGI-Shenzhen, Shenzhen, China. [2] BGI Hong Kong Research Institute, Hong Kong, China. [3] School of Bioscience and Biotechnology, South China University of Technology, Guangzhou, China. [4]., [2] 1] BGI-Shenzhen, Shenzhen, China. [2]., [3] 1] BGI-Shenzhen, Shenzhen, China. [2] Department of Biology, University of Copenhagen, Copenhagen, Denmark. [3]., [4] European Molecular Biology Laboratory, Heidelberg, Germany., [5] 1] BGI-Shenzhen, Shenzhen, China. [2] European Molecular Biology Laboratory, Heidelberg, Germany. [3] The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark., [6] INRA, Institut National de la Recherche Agronomique, Metagenopolis, Jouy en Josas, France., [7] The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark., [8] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby, Denmark., [9] Digestive System Research Unit, University Hospital Vall d'Hebron, Ciberehd, Barcelona, Spain., [10] BGI-Shenzhen, Shenzhen, China., [11] 1] Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University (KAU), Jeddah, Saudi Arabia. [2] Princess Al-Jawhara AlBrahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), Faculty of Medicine, KAU, Jeddah, Saudi Arabia., [12] 1] Princess Al-Jawhara AlBrahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), Faculty of Medicine, KAU, Jeddah, Saudi Arabia. [2] Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), Jeddah, Saudi Arabia., [13] 1] BGI-Shenzhen, Shenzhen, China. [2] Princess Al-Jawhara AlBrahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), Faculty of Medicine, KAU, Jeddah, Saudi Arabia. [3] James D. 
Watson Institute of Genome Science, Hangzhou, China., [14] 1] BGI-Shenzhen, Shenzhen, China. [2] James D. Watson Institute of Genome Science, Hangzhou, China., [15] Department of Biology, University of Copenhagen, Copenhagen, Denmark., [16] NA, [17] 1] INRA, Institut National de la Recherche Agronomique, Metagenopolis, Jouy en Josas, France. [2] Centre for Host-Microbiome Interactions, Dental Institute Central Office, King's College London, Guy's Hospital, London Bridge, UK., [18] NA, [19] 1] BGI-Shenzhen, Shenzhen, China. [2] Department of Biology, University of Copenhagen, Copenhagen, Denmark. [3] The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark. [4] Princess Al-Jawhara AlBrahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), Faculty of Medicine, KAU, Jeddah, Saudi Arabia. [5] Macau University of Science and Technology, Macau, China."
## [39] "NIH Common Fund Human Microbiome Project"
## [40] "10X Genomics"
## [41] "Array Express"
## [42] "PubMed"
## [43] "GDC"
## [44] "European Molecular Biology Laboratory"
## [45] "DKFZ"
## [46] "MDAnderson"
## [47] "EGA"
## [48] "paper supplementary"
## [49] "SMD"
## [50] "TCGA"
## [51] "[1] Centre for Integrative Biology, University of Trento, Trento, Italy., [2] Azienda Provinciale per i Servizi Sanitari, Trento, Italy."
## [52] "[1] Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA., [2] Broad Institute of MIT and Harvard, Cambridge, MA., [3] Sandia National Laboratories, Livermore, CA., [4] Wildlife Conservation Society, Suva, Fiji., [5] Edith Cowan University, Western Australia., [6] University of Helsinki, Helsinki, Finland., [7] Massachusetts General Hospital, Boston, MA., [8] Center for Microbiome, Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA."
## [53] "[1] Computational Biology Institute, George Washington University , Ashburn, VA , USA ; Center for Bioinformatics and Integrative Biology, Universidad Andres Bello, Facultad de Ciencias Biologicas , Santiago , Chile., [2] Computational Biology Institute, George Washington University , Ashburn, VA , USA., [3] Computational Biology Institute, George Washington University , Ashburn, VA , USA ; CIBIO-InBIO, Centro de Investigacao em Biodiversidade e Recursos Geneticos, Universidade do Porto , Vairao , USA ; Division of Emergency Medicine, Children's National Medical Center , Washington, D.C. , USA., [4] Stanley Neurovirology Laboratory, Johns Hopkins School of Medicine , Baltimore, MD , USA., [5] Sheppard Pratt Hospital , Baltimore, MD , USA., [6] Schroeder Statistical Consulting LLC , Ellicott City, MD , USA."
## [54] "[1] Genome Institute of Singapore, Singapore 138672, Singapore., [2] Institute of Medical Biology, Singapore 138648, Singapore., [3] Institute of Molecular and Cell Biology, Singapore 138673, Singapore., [4] Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450001, China., [5] Institute of Biomedical Studies, Baylor University, Waco, Texas 76798, USA., [6] Department of Biological Sciences, National University of Singapore, Singapore 117543., [7] Singapore Immunology Network, Singapore 138648, Singapore., [8] Division of Plastic, Reconstructive &Aesthetic Surgery, National University Health System, Singapore 119074, Singapore., [9] National Skin Centre, Singapore 308205, Singapore., [10] Department of Microbiology and Immunology, National University of Singapore, Singapore 117545, Singapore."
## [55] "[1] 1] BGI-Shenzhen, Shenzhen 518083, China [2] Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200 Copenhagen, Denmark., [2] 1] BGI-Shenzhen, Shenzhen 518083, China [2] School of Bioscience and Biotechnology, South China University of Technology, Guangzhou 510006, China., [3] BGI-Shenzhen, Shenzhen 518083, China., [4] Department of Internal Medicine, Hospital Oberndorf, Teaching Hospital of the Paracelsus Private University of Salzburg, Paracelsusstrasse 37, 5110 Oberndorf, Austria., [5] 1] BGI-Shenzhen, Shenzhen 518083, China [2] School of Bioscience and Biotechnology, South China University of Technology, Guangzhou 510006, China [3] BGI Hong Kong Research Institute, Hong Kong, China., [6] 1] BGI-Shenzhen, Shenzhen 518083, China [2] Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah 21589, Saudi Arabia., [7] 1] BGI-Shenzhen, Shenzhen 518083, China [2] The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark., [8] First Department of Internal Medicine, Medical University Innsbruck, Anichstrasse 35, 6020 Innsbruck, Austria., [9] 1] BGI-Shenzhen, Shenzhen 518083, China [2] Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200 Copenhagen, Denmark [3] Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah 21589, Saudi Arabia [4] Macau University of Science and Technology, Avenida Wai long, Taipa, Macau 999078, China."
## [56] "[1] Luxembourg Centre for Systems Biomedicine, 7 avenue des Hauts-Fourneaux, 4362 Esch-sur-Alzette, Luxembourg., [2] Integrated BioBank of Luxembourg, 6 rue Nicolas Ernest Barble, 1210 Luxembourg, Luxembourg., [3] Department of Internal Medicine II, Saarland University Medical Center, 66421 Homburg, Germany., [4] Centre Hospitalier Emile Mayrisch, Rue Emile Mayrisch, 4240 Esch-sur-Alzette, Luxembourg., [5] Clinique Pediatrique - Centre Hospitalier de Luxembourg, 4 rue Nicolas Ernest Barble, 1210 Luxembourg."
## [57] "[1] Department of Chemical and Biological Engineering, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden."
## [58] "[1] INRA, Institut National de la Recherche Agronomique, US1367 Metagenopolis, 78350 Jouy en Josas, France."
## [59] "[1] 1] BGI-Shenzhen, Shenzhen, China. [2] BGI Hong Kong Research Institute, Hong Kong, China. [3] School of Bioscience and Biotechnology, South China University of Technology, Guangzhou, China. [4]., [2] 1] BGI-Shenzhen, Shenzhen, China. [2]., [3] 1] BGI-Shenzhen, Shenzhen, China. [2] Department of Biology, University of Copenhagen, Copenhagen, Denmark. [3]., [4] European Molecular Biology Laboratory, Heidelberg, Germany., [5] 1] BGI-Shenzhen, Shenzhen, China. [2] European Molecular Biology Laboratory, Heidelberg, Germany. [3] The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark., [6] INRA, Institut National de la Recherche Agronomique, Metagenopolis, Jouy en Josas, France., [7] The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark., [8] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby, Denmark., [9] Digestive System Research Unit, University Hospital Vall d'Hebron, Ciberehd, Barcelona, Spain., [10] BGI-Shenzhen, Shenzhen, China., [11] 1] Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University (KAU), Jeddah, Saudi Arabia. [2] Princess Al-Jawhara AlBrahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), Faculty of Medicine, KAU, Jeddah, Saudi Arabia., [12] 1] Princess Al-Jawhara AlBrahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), Faculty of Medicine, KAU, Jeddah, Saudi Arabia. [2] Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), Jeddah, Saudi Arabia., [13] 1] BGI-Shenzhen, Shenzhen, China. [2] Princess Al-Jawhara AlBrahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), Faculty of Medicine, KAU, Jeddah, Saudi Arabia. [3] James D. 
Watson Institute of Genome Science, Hangzhou, China., [14] 1] BGI-Shenzhen, Shenzhen, China. [2] James D. Watson Institute of Genome Science, Hangzhou, China., [15] Department of Biology, University of Copenhagen, Copenhagen, Denmark., [16] 1] INRA, Institut National de la Recherche Agronomique, Metagenopolis, Jouy en Josas, France. [2] INRA, Institut National de la Recherche Agronomique, Unite mixte de Recherche 14121 Microbiologie de l'Alimentation au Service de la Sante, Jouy en Josas, France., [17] 1] INRA, Institut National de la Recherche Agronomique, Metagenopolis, Jouy en Josas, France. [2] Centre for Host-Microbiome Interactions, Dental Institute Central Office, King's College London, Guy's Hospital, London Bridge, UK., [18] 1] European Molecular Biology Laboratory, Heidelberg, Germany. [2] Max Delbruck Centre for Molecular Medicine, Berlin, Germany., [19] 1] BGI-Shenzhen, Shenzhen, China. [2] Department of Biology, University of Copenhagen, Copenhagen, Denmark. [3] The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark. [4] Princess Al-Jawhara AlBrahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), Faculty of Medicine, KAU, Jeddah, Saudi Arabia. [5] Macau University of Science and Technology, Macau, China."
## [60] "[1] Hypertension Center, Fuwai Hospital, State Key Laboratory of Cardiovascular Disease of China, National Center for Cardiovascular Diseases of China, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100037, China., [2] Department of Cardiology, Beijing ChaoYang Hospital, Capital Medical University, Beijing, 100020, China., [3] Beijing Key Laboratory of Hypertension, Beijing, 100020, China., [4] Computational Genomics Laboratory, Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China., [5] Novogene Bioinformatics Institute, Beijing, 100000, China., [6] Department of Cardiology, Baoding NO.1 Central Hospital, Baoding, 071000, China., [7] Department of Cardiology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an, 710061, China., [8] Department of Cardiology Kailuan General Hospital, Hebei Union University, Tangshan, 063000, China., [9] Department of Biomedical Informatics, Centre for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing, 100191, China., [10] Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204, USA., [11] Medical Research Center, Beijing ChaoYang Hospital, Capital Medical University, Beijing, 100020, China., [12] Department of Stem Cell Engineering, Texas Heart Institute, Houston, TX, 77030, USA., [13] Tongji Hospital, Huazhong University of Science and Technology, Wuhan, Hubei, 430030, China., [14] Department of Cardiology, Beijing ChaoYang Hospital, Capital Medical University, Beijing, 100020, China. yxc6229@sina.com., [15] Beijing Key Laboratory of Hypertension, Beijing, 100020, China. yxc6229@sina.com., [16] CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China. 
zhubaoli@im.ac.cn., [17] Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, 310003, China. zhubaoli@im.ac.cn., [18] Hypertension Center, Fuwai Hospital, State Key Laboratory of Cardiovascular Disease of China, National Center for Cardiovascular Diseases of China, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100037, China. caijun@fuwaihospital.org."
## [61] "[1] Key Laboratory of Dairy Biotechnology and Engineering, Education Ministry of P. R. China, Department of Food Science and Engineering, Inner Mongolia Agricultural University, Hohhot 010018, China., [2] RealBio Genomic Institute, Shanghai 200050, China., [3] State Key Laboratory for Diagnosis and Treatment of Infectious Disease, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, the First Affiliated Hospital, Zhejiang University, Hangzhou 310003, China."
## [62] "[1] Institute of Microbiology and Infection, University of Birmingham, Birmingham, England."
## [63] "[1] Institute of Clinical Nutrition, University of Hohenheim, Stuttgart, Germany., [2] Algorithms in Bioinformatics, University of Tubingen, Tubingen, Germany."
## [64] "[1] 1] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby, Denmark. [2] Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark. [3]., [2] 1] INRA, Institut National de la Recherche Agronomique, UMR 14121 MICALIS, Jouy en Josas, France. [2] INRA, Institut National de la Recherche Agronomique, US 1367 Metagenopolis, Jouy en Josas, France. [3] Department of Computer Science, Center for Bioinformatics and Computational Biology, University of Maryland, USA. [4]., [3] 1] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby, Denmark. [2] Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark., [4] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby, Denmark., [5] 1] BGI Hong Kong Research Institute, Hong Kong, China. [2] BGI-Shenzhen, Shenzhen, China. [3] School of Bioscience and Biotechnology, South China University of Technology, Guangzhou, China., [6] European Molecular Biology Laboratory, Heidelberg, Germany., [7] 1] INRA, Institut National de la Recherche Agronomique, UMR 14121 MICALIS, Jouy en Josas, France. [2] INRA, Institut National de la Recherche Agronomique, US 1367 Metagenopolis, Jouy en Josas, France., [8] 1] Commissariat a l'Energie Atomique et aux Energies Alternatives, Institut de Genomique, Evry, France. [2] Centre National de la Recherche Scientifique, Evry, France. [3] Universite d'Evry Val d'Essonne, Evry, France., [9] The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark., [10] Digestive System Research Unit, University Hospital Vall d'Hebron, Ciberehd, Barcelona, Spain., [11] 1] BGI-Shenzhen, Shenzhen, China. [2] European Molecular Biology Laboratory, Heidelberg, Germany. 
[3] The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark., [12] Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark., [13] 1] The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark. [2] Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark., [14] 1] Department of Structural Biology, VIB, Brussels, Belgium. [2] Department of Bioscience Engineering, Vrije Universiteit, Brussels, Belgium., [15] National Food Institute, Division for Epidemiology and Microbial Genomics, Technical University of Denmark, Kongens Lyngby, Denmark., [16] 1] BGI-Shenzhen, Shenzhen, China. [2] Department of Biology, University of Copenhagen, Copenhagen, Denmark., [17] 1] The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark. [2] Hagedorn Research Institute, Gentofte, Denmark. [3] Institute of Biomedical Science, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark. [4] Faculty of Health, Aarhus University, Aarhus, Denmark., [18] 1] BGI Hong Kong Research Institute, Hong Kong, China. [2] BGI-Shenzhen, Shenzhen, China., [19] 1] Department of Bioscience Engineering, Vrije Universiteit, Brussels, Belgium. [2] Department of Microbiology and Immunology, Rega Institute, KU Leuven, Belgium. [3] VIB Center for the Biology of Disease, Leuven, Belgium., [20] Section of Microbiology, Department of Biology, University of Copenhagen, Copenhagen, Denmark., [21] Laboratory of Microbiology, Wageningen University, Wageningen, The Netherlands., [22] 1] European Molecular Biology Laboratory, Heidelberg, Germany. 
[2] Department of Biological Information, Tokyo Institute of Technology, Yokohama, Japan., [23] INRA, Institut National de la Recherche Agronomique, UMR 14121 MICALIS, Jouy en Josas, France., [24] 1] European Molecular Biology Laboratory, Heidelberg, Germany. [2] Max Delbruck Centre for Molecular Medicine, Berlin, Germany., [25] 1] BGI-Shenzhen, Shenzhen, China. [2] The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark. [3] Department of Biology, University of Copenhagen, Copenhagen, Denmark. [4] Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah, Saudi Arabia., [26] 1] INRA, Institut National de la Recherche Agronomique, UMR 14121 MICALIS, Jouy en Josas, France. [2] INRA, Institut National de la Recherche Agronomique, US 1367 Metagenopolis, Jouy en Josas, France. [3] King's College London, Centre for Host-Microbiome Interactions, Dental Institute Central Office, Guy's Hospital, United Kingdom."
## [65] "[1] 1] Department of Anthropology, University of Oklahoma, Dale Hall Tower, 521 Norman, Oklahoma 73019, USA [2] Universidad Cientifica del Sur, Lima 18, Peru [3] City of Hope, NCI-designated Comprehensive Cancer Center, Duarte, California 91010, USA., [2] 1] Department of Anthropology, University of Oklahoma, Dale Hall Tower, 521 Norman, Oklahoma 73019, USA [2] Universidad Cientifica del Sur, Lima 18, Peru, [3] City of Hope, NCI-designated Comprehensive Cancer Center, Duarte, California 91010, USA., [4] Department of Anthropology, University of Oklahoma, Dale Hall Tower, 521 Norman, Oklahoma 73019, USA., [5] Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309, USA., [6] Departments of Pediatrics and Computer Science &Engineering University of California San Diego, La Jolla, CA 92093, USA., [7] Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma 73104, USA., [8] Instituto Nacional de Salud, Lima 11, Peru, [9] Old Dominion University, Norfolk, Virginia 23529, USA., [10] Universidad Cientifica del Sur, Lima 18, Peru"
## [66] "[1] Translational and Functional Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, Maryland 20892, USA., [2] 1] Dermatology Branch, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA [2]., [3] 1] Translational and Functional Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, Maryland 20892, USA [2]."
## [67] "[1] BGI-Shenzhen, Shenzhen 518083, China."
## [68] "[1] 1] State Key Laboratory for Diagnosis and Treatment of Infectious Disease, The First Affiliated Hospital, College of Medicine, Zhejiang University, 310003 Hangzhou, China [2] Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang University, 310003 Hangzhou, China [3]., [2] 1] State Key Laboratory for Diagnosis and Treatment of Infectious Disease, The First Affiliated Hospital, College of Medicine, Zhejiang University, 310003 Hangzhou, China [2]., [3] 1] Metagenopolis, Institut National de la Recherche Agronomique, 78350 Jouy en Josas, France [2]., [4] State Key Laboratory for Diagnosis and Treatment of Infectious Disease, The First Affiliated Hospital, College of Medicine, Zhejiang University, 310003 Hangzhou, China., [5] Metagenopolis, Institut National de la Recherche Agronomique, 78350 Jouy en Josas, France., [6] 1] State Key Laboratory for Diagnosis and Treatment of Infectious Disease, The First Affiliated Hospital, College of Medicine, Zhejiang University, 310003 Hangzhou, China [2] Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang University, 310003 Hangzhou, China., [7] 1] Metagenopolis, Institut National de la Recherche Agronomique, 78350 Jouy en Josas, France [2] King's College London, Centre for Host-Microbiome Interactions, Dental Institute Central Office, Guy's Hospital, London Bridge, London SE1 9RT, UK., [8] 1] Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang University, 310003 Hangzhou, China [2] Key Laboratory of Combined Multi-organ Transplantation, Ministry of Public Health, the First Affiliated Hospital, Zhejiang University, 310003 Hangzhou, China."
## [69] "[1] Department of Pharmacy and Biotechnology, University of Bologna, Bologna 40126, Italy., [2] Plant Foods in Hominin Dietary Ecology Research Group, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany. Electronic address: stephanie_schnorr@eva.mpg.de., [3] Institute of Biomedical Technologies, Italian National Research Council, Segrate, Milan 20090, Italy., [4] Metabolism, Anthropometry, and Nutrition Laboratory, Department of Anthropology, University of Nevada, Las Vegas, NV 89154-5003, USA., [5] Plant Foods in Hominin Dietary Ecology Research Group, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany., [6] Department of Pharmacy and Biotechnology, University of Bologna, Bologna 40126, Italy. Electronic address: marco.candela@unibo.it."
## [70] "[1] Centre de Recherche en Infectiologie, CHU de Quebec-Universite Laval, Quebec, Canada., [2] Institut National de Sante Publique du Quebec, Laboratoire de Sante Publique du Quebec, Montreal, Quebec, Canada."
## [71] "[1] Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA., [2] Human Food Project, 53600 Highway 118, Terlingua, TX 79852, USA., [3] The Department of Twin Research and Genetic Epidemiology, King's College London, St. Thomas' Hospital, Lambeth Palace Road, London SE1 7EH, UK., [4] Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA 94025, USA., [5] Lawson Health Research Institute and Western University, London, Ontario N6A 4V2, Canada., [6] Departments of Pediatrics and Computer Science and Engineering and Center for Microbiome Innovation, University of California, San Diego, CA 92093, USA., [7] National Institute for Medical Research, Mwanza 11101, Tanzania., [8] School of Medicine and Department of Anthropology, New York University, New York, NY, USA."
## [72] "[1] Centre for Integrative Biology, University of Trento, Trento, Italy., [2] Istituto G.B. Mattei, Comano, Italy., [3] NGS Facility, Laboratory of Biomolecular Sequence and Structure Analysis for Health, Centre for Integrative Biology, University of Trento, Trento, Italy., [4] Department of Medicine, Section of Dermatology, University of Verona, Verona, Italy."
## [73] "[1] Department of Microbiology and Immunology, McGill University, Montreal, Quebec, Canada., [2] Genome Quebec Innovation Centre, McGill University, Montreal, Quebec, Canada., [3] Jewish General Hospital, Montreal, Quebec, Canada., [4] Devil's Staircase Consulting, North Vancouver, British Columbia, Canada., [5] New York Genome Center, New York, NY, USA., [6] Department of Human Genetics, McGill University, Montreal, Quebec, Canada., [7] School of Population and Public Health, University of British Columbia, Vancouver, British Columbia, Canada. amee.manges@ubc.ca."
## [74] "[1] Division of Cancer Epidemiology & Genetics, National Cancer Institute, Bethesda, Maryland, United States of America., [2] Division of Cancer Prevention, National Cancer Institute, Bethesda, Maryland, United States of America., [3] Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany., [4] Department of Applied Tumor Biology, Institute of Pathology, University Hospital Heidelberg, Heidelberg, Germany., [5] Clinical Cooperation Unit Applied Tumor Biology, German Cancer Research Center (DKFZ), Heidelberg, Germany., [6] Molecular Medicine Partnership Unit (MMPU), University Hospital Heidelberg and European Molecular Biology Laboratory, Heidelberg, Germany., [7] Genomics Core Facility, European Molecular Biology Laboratory, Heidelberg, Germany., [8] Max Delbruck Centre for Molecular Medicine, Berlin, Germany., [9] Department of Bioinformatics Biocenter, University of Wurzburg, Wurzburg, Germany."
## [75] "[1] BGI-Shenzhen, Shenzhen 518083, China; China National Genebank-Shenzhen, BGI-Shenzhen, Shenzhen 518083, China., [2] BGI-Shenzhen, Shenzhen 518083, China; China National Genebank-Shenzhen, BGI-Shenzhen, Shenzhen 518083, China; Shenzhen Engineering Laboratory of Detection and Intervention of Human Intestinal Microbiome, BGI-Shenzhen, Shenzhen 518083, China; Macau University of Science and Technology, Taipa, Macau 999078, China., [3] BGI-Shenzhen, Shenzhen 518083, China; China National Genebank-Shenzhen, BGI-Shenzhen, Shenzhen 518083, China; Shenzhen Engineering Laboratory of Detection and Intervention of Human Intestinal Microbiome, BGI-Shenzhen, Shenzhen 518083, China., [4] BGI-Shenzhen, Shenzhen 518083, China., [5] Department of Twin Research and Genetic Epidemiology, King's College London, London SE1 7EH, UK., [6] BGI-Shenzhen, Shenzhen 518083, China; BGI Education Center, University of Chinese Academy of Sciences, Shenzhen 518083, China., [7] BGI-Shenzhen, Shenzhen 518083, China; Qingdao University-BGI Joint Innovation College, Qingdao University, Qingdao 266071, China., [8] BGI-Shenzhen, Shenzhen 518083, China; China National Genebank-Shenzhen, BGI-Shenzhen, Shenzhen 518083, China; Shenzhen Key Laboratory of Human Commensal Microorganisms and Health Research, BGI-Shenzhen, Shenzhen 518083, China., [9] BGI-Shenzhen, Shenzhen 518083, China; China National Genebank-Shenzhen, BGI-Shenzhen, Shenzhen 518083, China; BGI Education Center, University of Chinese Academy of Sciences, Shenzhen 518083, China., [10] BGI-Shenzhen, Shenzhen 518083, China; China National Genebank-Shenzhen, BGI-Shenzhen, Shenzhen 518083, China; James D. 
Watson Institute of Genome Sciences, Hangzhou 310058, China., [11] BGI-Shenzhen, Shenzhen 518083, China; Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200 Copenhagen, Denmark., [12] BGI-Shenzhen, Shenzhen 518083, China; Macau University of Science and Technology, Taipa, Macau 999078, China; Shenzhen Key Laboratory of Human Commensal Microorganisms and Health Research, BGI-Shenzhen, Shenzhen 518083, China., [13] BGI-Shenzhen, Shenzhen 518083, China; China National Genebank-Shenzhen, BGI-Shenzhen, Shenzhen 518083, China; Shenzhen Key Laboratory of Human Commensal Microorganisms and Health Research, BGI-Shenzhen, Shenzhen 518083, China. Electronic address: lijunhua@genomics.cn., [14] Department of Twin Research and Genetic Epidemiology, King's College London, London SE1 7EH, UK. Electronic address: tim.spector@kcl.ac.uk., [15] BGI-Shenzhen, Shenzhen 518083, China; China National Genebank-Shenzhen, BGI-Shenzhen, Shenzhen 518083, China; Macau University of Science and Technology, Taipa, Macau 999078, China; Shenzhen Key Laboratory of Human Commensal Microorganisms and Health Research, BGI-Shenzhen, Shenzhen 518083, China. Electronic address: jiahuijue@genomics.cn."
## [76] "[1] Department of Medicine & Therapeutics, State Key Laboratory of Digestive Disease, Institute of Digestive Disease, LKS Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong., [2] BGI-Shenzhen, Shenzhen, China., [3] Department of Biology, University of Copenhagen, Copenhagen, Denmark., [4] Department of Veterinary Disease Biology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark., [5] Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah, Saudi Arabia., [6] Department of Surgical Gastroenterology, Hvidovre Hospital, Hvidovre, Denmark., [7] National Institute of Nutrition and Seafood Research, Bergen, Norway., [8] Department of Internal Medicine, Hospital Oberndorf, Q3 Teaching Hospital of the Paracelsus Private University of Salzburg, Oberndorf, Austria., [9] First Department of Internal Medicine, Medical University Innsbruck, Innsbruck, Austria., [10] The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark., [11] Macau University of Science and Technology, Macau, China."
## [77] "[1] Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany., [2] Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany Department of Gastroenterology and LIC-EA4393-EC2M3, APHP and UPEC Universite Paris-Est Creteil, Creteil, France., [3] Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany Department of Applied Tumor Biology, Institute of Pathology University Hospital Heidelberg, Heidelberg, Germany Clinical Cooperation Unit Applied Tumor Biology, German Cancer Research Center (DKFZ), Heidelberg, Germany Molecular Medicine Partnership Unit (MMPU), University Hospital Heidelberg and European Molecular Biology Laboratory, Heidelberg, Germany., [4] Department of Gastroenterology and LIC-EA4393-EC2M3, APHP and UPEC Universite Paris-Est Creteil, Creteil, France., [5] Division of Preventive Oncology, National Center for Tumor Diseases (NCT) Heidelberg, Heidelberg, Germany German Cancer Research Center (DKFZ), Heidelberg, Germany., [6] Department of Surgery, APHP and UPEC Universite Paris-Est Creteil, Creteil, France., [7] Genomics Core Facility, European Molecular Biology Laboratory, Heidelberg, Germany., [8] Department of General, Visceral and Transplantation Surgery, University Hospital Heidelberg, Heidelberg, Germany., [9] Department of Radiology, APHP and UPEC Universite Paris-Est Creteil, Creteil, France., [10] Department of Medical Oncology, APHP and UPEC Universite Paris-Est Creteil, Creteil, France., [11] Department of Pathology and LIC-EA4393-EC2M3, APHP and UPEC Universite Paris-Est Creteil, Creteil, France., [12] Department of Biological Information, Tokyo Institute of Technology, Tokyo, Japan., [13] Department of Applied Tumor Biology, Institute of Pathology University Hospital Heidelberg, Heidelberg, Germany Clinical Cooperation Unit Applied Tumor Biology, German Cancer Research Center (DKFZ), Heidelberg, 
Germany Molecular Medicine Partnership Unit (MMPU), University Hospital Heidelberg and European Molecular Biology Laboratory, Heidelberg, Germany., [14] Division of Preventive Oncology, National Center for Tumor Diseases (NCT) Heidelberg, Heidelberg, Germany German Cancer Research Center (DKFZ), Heidelberg, Germany Fred Hutchinson Cancer Research Center (FHCRC), Seattle, WA, USA., [15] Department of Gastroenterology and LIC-EA4393-EC2M3, APHP and UPEC Universite Paris-Est Creteil, Creteil, France iradj.sobhani@hmn.aphp.fr bork@embl.de., [16] Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany Molecular Medicine Partnership Unit (MMPU), University Hospital Heidelberg and European Molecular Biology Laboratory, Heidelberg, Germany Max Delbruck Centre for Molecular Medicine, Berlin, Germany iradj.sobhani@hmn.aphp.fr bork@embl.de."
## [78] "GRCh"
## [79] "Illumina Inc"
## [80] "Wanding Zhou <zhouwanding@gmail.com>"
## [81] "Steve Horvath"
## [82] "Salk Institute"
## [83] "ICGC"
## [84] "Private"
## [85] "GEO, cbioportal"
## [86] "Robinson group (UZH)"
## [87] "SRA (SRP073808), Koh et al. (2016)"
## [88] "GEO (GSE60749), Kumar et al. (2014)"
## [89] "GEO (GSE52529), Trapnell et al. (2014)"
## [90] "10x Genomics, Zheng et al (2017)"
## [91] "10X"
## [92] "Haemosphere"
## [93] "linnarssonlab.org"
## [94] "Tabula Muris Consortium"
## [95] "Bench to Bassinet GnomEx CvDC"
## [96] "NCI_GDC"
## [97] "Swedish BioMS infrastructure"
## [98] "[1] Department of Infectious Diseases, Institute of Biomedicine, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden., [2] Department of Clinical Microbiology, Infectious Diseases, Umea University, Umea, Sweden., [3] Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden., [4] Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden., [5] Laboratory for Molecular Infection Medicine Sweden, Department of Clinical Microbiology, Bacteriology, Umea University, Umea, Sweden anders.f.johansson@umu.se."
## [99] "[1] Society of Fellows, Harvard University, Cambridge, Massachusetts, USA FAS Center for Systems Biology, Harvard University, Cambridge, Massachusetts, USA., [2] Center for Vaccine Sciences, International Centre for Diarrhoeal Disease Research, Dhaka, Bangladesh., [3] rclarocque@partners.org Peter.Turnbaugh@ucsf.edu., [4] FAS Center for Systems Biology, Harvard University, Cambridge, Massachusetts, USA rclarocque@partners.org Peter.Turnbaugh@ucsf.edu."
## [100] "[1] Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA., [2] Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA., [3] Children's Hospital, University of Helsinki and Helsinki University Hospital, 00290 Helsinki, Finland; Research Program Unit, Diabetes and Obesity, University of Helsinki, 00290 Helsinki, Finland., [4] Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Information and Computer Science, Aalto University School of Science, 02150 Espoo, Finland., [5] Steno Diabetes Center, 2820 Gentofte, Denmark; VTT Technical Research Centre of Finland, 02044 Espoo, Finland., [6] Department of Pediatrics, Jorvi Hospital, 02740 Espoo, Finland., [7] Department of Pediatrics, University of Tartu, Estonia and Tartu University Hospital, 51014 Tartu, Estonia., [8] Faculty of Pharmacy, University of Helsinki, 00290 Helsinki, Finland; VTT Technical Research Centre of Finland, 02044 Espoo, Finland., [9] Department of Information and Computer Science, Aalto University School of Science, 02150 Espoo, Finland., [10] Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA., [11] Research Program Unit, Diabetes and Obesity, University of Helsinki, 00290 Helsinki, Finland., [12] Department of Medical Microbiology, University Medical Center Groningen and University of Groningen, 9713 GZ Groningen, the Netherlands., [13] Immunogenetics Laboratory, University of Turku, 20520 Turku, Finland; Department of Clinical Microbiology, University of Eastern Finland, 70211 Kuopio, Finland., [14] Department of Lifestyle and Participation, National Institute for Health and Welfare, 00271 Helsinki, Finland; School of Health Sciences, University of Tampere, 33014 Tampere, Finland; Science Centre, 
Pirkanmaa Hospital District, 33521 Tampere, Finland., [15] Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA., [16] Children's Hospital, University of Helsinki and Helsinki University Hospital, 00290 Helsinki, Finland; Research Program Unit, Diabetes and Obesity, University of Helsinki, 00290 Helsinki, Finland; Folkhalsan Research Center, 00290 Helsinki, Finland; Department of Pediatrics, Tampere University Hospital, 33521 Tampere, Finland., [17] Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Gastrointestinal Unit and Center for the Study of Inflammatory Bowel Disease, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. Electronic address: xavier@molbio.mgh.harvard.edu."
## [101] "[1] Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany. School of Biotechnology and Biomolecular Sciences, University of New South Wales, 2052 Sydney, Australia., [2] Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany., [3] Genomics Core Facility, European Molecular Biology Laboratory, 69117 Heidelberg, Germany., [4] Department of Vascular Medicine, Academic Medical Center, 1105 AZ Amsterdam, Netherlands. Diabetes Center, Vrije University Medical Center, 1018 HV Amsterdam, Netherlands. Wallenberg Laboratory, University of Gothenburg, 41345 Gothenburg, Sweden., [5] Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland. Department of Biosciences, University of Helsinki, 00014 Helsinki, Finland., [6] Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany. Department of Applied Tumor Biology, Institute of Pathology, University Hospital Heidelberg, 69120 Heidelberg, Germany. Molecular Medicine Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69120 Heidelberg, Germany., [7] Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany. bork@embl.de willem.devos@wur.nl sunagawa@embl.de., [8] Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland. Laboratory of Microbiology, Wageningen University, 6703 HB Wageningen, Netherlands. Immunobiology Research Program, Department of Bacteriology and Immunology, University of Helsinki, 00014 Helsinki, Finland. bork@embl.de willem.devos@wur.nl sunagawa@embl.de., [9] Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany. Molecular Medicine Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69120 Heidelberg, Germany. 
Max Delbruck Centre for Molecular Medicine, 13125 Berlin, Germany. Department of Bioinformatics, Biocenter, University of Wurzburg, 97074 Wurzburg, Germany. bork@embl.de willem.devos@wur.nl sunagawa@embl.de."
## [102] "[1] NAFLD Research Center, Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Division of Epidemiology, Department of Family and Preventive Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Division of Gastroenterology, Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA. Electronic address: roloomba@ucsd.edu., [2] Human Longevity, San Diego, CA 92121, USA., [3] Human Longevity, San Diego, CA 92121, USA; J. Craig Venter Institute, La Jolla, CA 92037, USA., [4] NAFLD Research Center, Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA., [5] NAFLD Research Center, Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Division of Gastroenterology, Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA., [6] J. Craig Venter Institute, La Jolla, CA 92037, USA., [7] Liver Imaging Group, Department of Radiology, University of California, San Diego, La Jolla, CA 92093, USA., [8] J. Craig Venter Institute, Rockville, MD 20850, USA."
## [103] "[1] Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA., [2] Department of Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15213, USA., [3] Division of Newborn Medicine, Children's Hospital of Pittsburgh and Magee-Womens Hospital of UPMC, Pittsburgh, Pennsylvania 15213, USA., [4] Department of Earth and Planetary Science, University of California, Berkeley, California 94709, USA., [5] Department of Environmental Science, Policy, and Management, University of California, Berkeley, California 94720, USA., [6] Earth Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA."
## [104] "[1] Department of Molecular and Medical Pharmacology, Crump Institute for Molecular Imaging, David Geffen School of Medicine, UCLA, Los Angeles, California, USA., [2] Section of Periodontics, School of Dentistry, UCLA, Los Angeles, California, USA., [3] The Genome Institute, Washington University, St. Louis, Missouri, USA., [4] The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA., [5] huiying@mednet.ucla.edu."
## [105] "[1] Institute of Basic Research in Clinical Medicine, College of Basic Medical Science, Zhejiang Chinese Medical University, Hangzhou, 310053, China. wengcp@163.com., [2] Realbio Genomics Institute, Shanghai, 200123, China., [3] Institute of Basic Research in Clinical Medicine, College of Basic Medical Science, Zhejiang Chinese Medical University, Hangzhou, 310053, China., [4] State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Department of Infectious Diseases, the First Affiliated College of Medicine, Zhejiang University, Hangzhou, 310003, China., [5] Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang University, Hangzhou, 310003, China., [6] INRA, Institut National de la Recherche Agronomique, Metagenopolis, Jouy en Josas, 78350, France., [7] Rheumatology Division, Ambroise-Pare Hospital, AP-HP, 9, avenue Charles-de-Gaulle, 92100, Boulogne-Billancourt, France., [8] Realbio Genomics Institute, Shanghai, 200123, China. qinnan001@126.com., [9] State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Department of Infectious Diseases, the First Affiliated College of Medicine, Zhejiang University, Hangzhou, 310003, China. qinnan001@126.com., [10] Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang University, Hangzhou, 310003, China. qinnan001@126.com., [11] INRA, Institut National de la Recherche Agronomique, Metagenopolis, Jouy en Josas, 78350, France. dusko.ehrlich@jouy.inra.fr., [12] King's College London, Centre for Host-Microbiome Interactions, Dental Institute Central Office, Guy's Hospital, London Bridge, London, SE1 9RT, UK. dusko.ehrlich@jouy.inra.fr."
In the same way, you can also see data from different species inside the hub by looking at the contents of species like this:
head(unique(eh$species))
## [1] "Homo sapiens" "Mus musculus"
## [3] "Mus musculus (E18 mice)" NA
## [5] "Danio rerio"
This will also work for any of the other types of metadata present. You can learn which kinds of metadata are available by simply hitting the tab key after you type ‘eh$’. In this way you can explore for yourself what kinds of data are present in the hub right from the command line. This interface also allows you to access the hub programmatically to extract data that matches a particular set of criteria.
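Tab completion has a scripted counterpart. As a minimal sketch, assuming the hub server is reachable and that mcols() exposes roughly the same fields that tab completion offers, the available metadata columns and a per-species tally can be listed like this (the variable names are illustrative only):

```r
library(ExperimentHub)

eh <- ExperimentHub()

# Names of the metadata columns stored for every record
metadata_columns <- colnames(mcols(eh))
metadata_columns

# Number of records per species, most frequent first;
# records without a species annotation are counted under NA
species_counts <- sort(table(eh$species, useNA = "ifany"), decreasing = TRUE)
head(species_counts)
```

The counts depend on the snapshot in use, so they will differ from one Bioconductor release to the next.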
Another valuable type of metadata to pay attention to is the rdataclass.
head(unique(eh$rdataclass))
## [1] "ExpressionSet" "GAlignmentPairs"
## [3] "CellMapperList" "gds.class"
## [5] "RangedSummarizedExperiment" "GRanges"
The rdataclass allows you to see which kinds of R objects the hub will return to you. This kind of information is valuable both as a means to filter results and as a means to explore and learn about the kinds of ExperimentHub objects that are widely available for the project. Right now this is a pretty short list, but over time it should grow as we support more kinds of ExperimentHub objects via the hub.
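Filtering on rdataclass can be combined with other metadata columns. The following is a sketch rather than a prescribed workflow: it assumes that at least some human records of class SummarizedExperiment exist in the snapshot you are using, and the object names keep and human_se are made up for illustration.

```r
library(ExperimentHub)

eh <- ExperimentHub()

# %in% returns FALSE (never NA) for missing values, so records without a
# species annotation are simply excluded from the selection
keep <- eh$species %in% "Homo sapiens" &
    eh$rdataclass %in% "SummarizedExperiment"

# Logical subsetting returns a smaller ExperimentHub object
human_se <- eh[keep]
human_se

length(human_se)
```

Using %in% instead of == avoids NA values in the logical index, which matters here because some records have no species recorded.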
Now let's try getting the data files associated with the alpineData package using the query method. The query method lets you search rows for specific strings, returning an ExperimentHub instance with just the rows matching the query. The preparerclass column of metadata records which package is associated with the ExperimentHub data.
We can retrieve the records associated with alpineData with:
apData <- query(eh, "alpineData")
apData
## ExperimentHub with 4 records
## # snapshotDate(): 2018-10-30
## # $dataprovider: GEUVADIS
## # $species: Homo sapiens
## # $rdataclass: GAlignmentPairs
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass,
## # tags, rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH166"]]'
##
## title
## EH166 | ERR188297
## EH167 | ERR188088
## EH168 | ERR188204
## EH169 | ERR188317
The query has worked, and you can now see that the only data present is provided by the “alpineData” package.
You can inspect a single metadata column, such as preparerclass, directly, or retrieve all of the metadata underlying this hub object with mcols():
apData$preparerclass
## [1] "alpineData" "alpineData" "alpineData" "alpineData"
df <- mcols(apData)
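The object returned by mcols() is a DataFrame with one row per record. As a small sketch, assuming the column names shown in the hub summary above are present in your snapshot, a few columns can be pulled into an ordinary data.frame for downstream use (summary_df is an illustrative name):

```r
library(ExperimentHub)

eh <- ExperimentHub()
apData <- query(eh, "alpineData")

# One row of metadata per record, with the EH identifiers as row names
df <- mcols(apData)

# Keep a handful of columns and convert to a base data.frame
summary_df <- as.data.frame(
    df[, c("title", "species", "genome", "preparerclass")]
)
summary_df
```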
By default the show method will only display the first 5 and last 5 rows. The hub itself contains many more records:
length(eh)
## [1] 1697
Let's look at another example, where we pull down only data from the hub for the species “mus musculus”.
mm <- query(eh, "mus musculus")
mm
## ExperimentHub with 73 records
## # snapshotDate(): 2018-10-30
## # $dataprovider: Robinson group (UZH), 10X Genomics, GEO (GSE60749), Kuma...
## # $species: Mus musculus, Mus musculus (E18 mice)
## # $rdataclass: data.frame, SingleCellExperiment, character, RangedSummari...
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass,
## # tags, rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH173"]]'
##
## title
## EH173 | Pre-processed microarray data from the Affymetrix MG-U74Av2 p...
## EH552 | st100k
## EH553 | st400k
## EH557 | tasicST6
## EH1039 | Brain scRNA-seq data, 'HDF5-based 10X Genomics' format
## ... ...
## EH1656 | full_1Mneurons
## EH1689 | Brain scRNA-seq data 20k subset, 'HDF5-based 10x Genomics' fo...
## EH1690 | Brain scRNA-seq data 20k subset, 'dense matrix' format
## EH1691 | Brain scRNA-seq data 20k subset, sample (column) annotation
## EH1692 | Brain scRNA-seq data 20k subset, gene (row) annotation
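A query can also take several search terms at once. In the sketch below the second term, "SingleCellExperiment", is chosen purely for illustration. Because query() searches across the metadata columns, a term may match in a column other than the one you had in mind, so it is worth checking the result.

```r
library(ExperimentHub)

eh <- ExperimentHub()

# A record is returned only if it matches every pattern;
# matching ignores case by default
mm_sce <- query(eh, c("mus musculus", "SingleCellExperiment"))
mm_sce

# Check which species and classes the result actually contains
unique(mm_sce$species)
unique(mm_sce$rdataclass)
```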
We can also look at the ExperimentHub object in a browser using the display() function. We can then filter the ExperimentHub object using the Global search field in the top right corner of the page or the in-column search fields.
d <- display(eh)
Using ExperimentHub to retrieve data
Looking back at our alpineData file example, if we are interested in the first file, we can get its metadata using
apData
## ExperimentHub with 4 records
## # snapshotDate(): 2018-10-30
## # $dataprovider: GEUVADIS
## # $species: Homo sapiens
## # $rdataclass: GAlignmentPairs
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass,
## # tags, rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH166"]]'
##
## title
## EH166 | ERR188297
## EH167 | ERR188088
## EH168 | ERR188204
## EH169 | ERR188317
apData["EH166"]
## ExperimentHub with 1 record
## # snapshotDate(): 2018-10-30
## # names(): EH166
## # package(): alpineData
## # $dataprovider: GEUVADIS
## # $species: Homo sapiens
## # $rdataclass: GAlignmentPairs
## # $rdatadateadded: 2016-07-21
## # $title: ERR188297
## # $description: Subset of aligned reads from sample ERR188297
## # $taxonomyid: 9606
## # $genome: GRCh38
## # $sourcetype: FASTQ
## # $sourceurl: ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/ERR188297/ERR1882...
## # $sourcesize: NA
## # $tags: c("Sequencing", "RNASeq", "GeneExpression",
## # "Transcription")
## # retrieve record with 'object[["EH166"]]'
We can download the file using
apData[["EH166"]]
## see ?alpineData and browseVignettes('alpineData') for documentation
## downloading 0 resources
## loading from cache
## '/home/biocbuild//.ExperimentHub/166'
## GAlignmentPairs object with 25531 pairs, strandMode=1, and 0 metadata columns:
## seqnames strand : ranges -- ranges
## <Rle> <Rle> : <IRanges> -- <IRanges>
## [1] 1 + : 108560389-108560463 -- 108560454-108560528
## [2] 1 - : 108560454-108560528 -- 108560383-108560457
## [3] 1 + : 108560534-108600608 -- 108600626-108606236
## [4] 1 - : 108569920-108569994 -- 108569825-108569899
## [5] 1 - : 108587954-108588028 -- 108587881-108587955
## ... ... ... ... ... ... ...
## [25527] X + : 119790596-119790670 -- 119790717-119790791
## [25528] X + : 119790988-119791062 -- 119791086-119791160
## [25529] X + : 119791037-119791111 -- 119791142-119791216
## [25530] X + : 119791348-119791422 -- 119791475-119791549
## [25531] X + : 119791376-119791450 -- 119791481-119791555
## -------
## seqinfo: 194 sequences from an unspecified genome
Each file is retrieved from the ExperimentHub server and is also cached locally, so that the next time you need it, it is loaded from the cache rather than downloaded again, which should be much quicker.
Configuring ExperimentHub objects
When you create the ExperimentHub object, it will set up the object for you with some default settings. See ?ExperimentHub for ways to customize the hub source, the local cache, and other instance-specific options, and ?getExperimentHubOption to get or set package-global options for use across sessions.
If you look at the object you will see some helpful information about it such as where the data is cached and where online the hub server is set to.
eh
## ExperimentHub with 1697 records
## # snapshotDate(): 2018-10-30
## # $dataprovider: Eli and Edythe L. Broad Institute of Harvard and MIT, NA...
## # $species: Homo sapiens, Mus musculus, Danio rerio, Mus musculus (E18 mi...
## # $rdataclass: ExpressionSet, SummarizedExperiment, RaggedExperiment, dat...
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass,
## # tags, rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH1"]]'
##
## title
## EH1 | RNA-Sequencing and clinical data for 7706 tumor samples from ...
## EH166 | ERR188297
## EH167 | ERR188088
## EH168 | ERR188204
## EH169 | ERR188317
## ... ...
## EH1952 | 20181025.ZellerG_2014.marker_abundance.stool
## EH1953 | 20181025.ZellerG_2014.marker_presence.stool
## EH1954 | 20181025.ZellerG_2014.metaphlan_bugs_list.stool
## EH1955 | 20181025.ZellerG_2014.pathabundance_relab.stool
## EH1956 | 20181025.ZellerG_2014.pathcoverage.stool
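The same settings can be read directly with accessor functions instead of from the printed summary. This is a minimal sketch; the values returned depend on your installation and on the Bioconductor version in use.

```r
library(ExperimentHub)

eh <- ExperimentHub()

# Address of the hub server this object talks to
hub_url <- hubUrl(eh)
hub_url

# Directory on the local file system where downloaded resources are cached
cache_dir <- hubCache(eh)
cache_dir
```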
By default the ExperimentHub object is set to the latest snapshotDate and a snapshot version that matches the version of Bioconductor that you are using. You can inspect these settings with the appropriate accessor methods.
snapshotDate(eh)
## [1] "2018-10-30"
If you are interested in using an older version of a snapshot, you can list previous versions with possibleDates() like this:
pd <- possibleDates(eh)
pd
## [1] "2016-02-23" "2016-06-07" "2016-07-14" "2016-07-21" "2016-08-08"
## [6] "2016-10-01" "2017-06-09" "2017-08-25" "2017-10-06" "2017-10-10"
## [11] "2017-10-12" "2017-10-16" "2017-10-19" "2017-10-26" "2017-10-30"
## [16] "2017-10-29" "2018-01-08" "2018-02-02" "2018-02-09" "2018-02-22"
## [21] "2018-03-16" "2018-03-30" "2018-04-02" "2018-04-10" "2018-04-20"
## [26] "2018-04-25" "2018-04-26" "2018-04-27" "2018-05-02" "2018-05-08"
## [31] "2018-06-29" "2018-07-30" "2018-08-02" "2018-08-03" "2018-08-27"
## [36] "2018-08-29" "2018-09-05" "2018-09-07" "2018-09-11" "2018-09-19"
## [41] "2018-09-20" "2018-10-30" "2018-10-30"
Set the date like this:
snapshotDate(eh) <- pd[1]
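Rather than picking a date by position, you may want the most recent snapshot on or before a given day. The sketch below assumes a hypothetical cutoff of 2018-01-01 and only changes the snapshot when a suitable date exists. Which dates are accepted may depend on the Bioconductor version in use.

```r
library(ExperimentHub)

eh <- ExperimentHub()

# Available snapshot dates, converted so they can be compared
pd <- as.Date(possibleDates(eh))

# Hypothetical cutoff, chosen only for illustration
target <- as.Date("2018-01-01")

# Latest snapshot that is not newer than the cutoff, if there is one
candidates <- pd[pd <= target]
if (length(candidates) > 0) {
    snapshotDate(eh) <- as.character(max(candidates))
}

snapshotDate(eh)
```

If no snapshot predates the cutoff, the object is left on its default date.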
sessionInfo()
## R version 3.5.1 Patched (2018-07-12 r74967)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.5 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.8-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.8-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 parallel stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] alpineData_1.7.1 GenomicAlignments_1.18.0
## [3] Rsamtools_1.34.0 Biostrings_2.50.0
## [5] XVector_0.22.0 SummarizedExperiment_1.12.0
## [7] DelayedArray_0.8.0 BiocParallel_1.16.0
## [9] matrixStats_0.54.0 Biobase_2.42.0
## [11] GenomicRanges_1.34.0 GenomeInfoDb_1.18.0
## [13] IRanges_2.16.0 S4Vectors_0.20.0
## [15] ExperimentHub_1.8.0 AnnotationHub_2.14.0
## [17] BiocGenerics_0.28.0 BiocStyle_2.10.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.19 compiler_3.5.1
## [3] BiocManager_1.30.3 later_0.7.5
## [5] zlibbioc_1.28.0 bitops_1.0-6
## [7] tools_3.5.1 digest_0.6.18
## [9] bit_1.1-14 lattice_0.20-35
## [11] RSQLite_2.1.1 evaluate_0.12
## [13] memoise_1.1.0 pkgconfig_2.0.2
## [15] Matrix_1.2-14 shiny_1.1.0
## [17] DBI_1.0.0 curl_3.2
## [19] yaml_2.2.0 xfun_0.4
## [21] GenomeInfoDbData_1.2.0 stringr_1.3.1
## [23] httr_1.3.1 knitr_1.20
## [25] grid_3.5.1 rprojroot_1.3-2
## [27] bit64_0.9-7 R6_2.3.0
## [29] AnnotationDbi_1.44.0 rmarkdown_1.10
## [31] bookdown_0.7 blob_1.1.1
## [33] magrittr_1.5 backports_1.1.2
## [35] promises_1.0.1 htmltools_0.3.6
## [37] mime_0.6 interactiveDisplayBase_1.20.0
## [39] xtable_1.8-3 httpuv_1.4.5
## [41] stringi_1.2.4 RCurl_1.95-4.11