Public Data Repositories
Through SFARI Base, SFARI accepts, stores and distributes data from designated SFARI-supported cohorts, including SPARK, the Simons Simplex Collection (SSC), Simons Searchlight, the Autism Inpatient Collection (AIC) and Autism BrainNet (ABN). SFARI Base also houses data collected through Research Match initiatives associated with these cohorts.
At this time, other (non-cohort) data generated through SFARI-funded grants or other external funding sources cannot be submitted to SFARI Base for distribution. Investigators seeking to share such datasets should instead deposit them in an appropriate public data repository to ensure long-term accessibility, reproducibility and compliance with open science principles.
Below is a list of public data repositories relevant to SFARI’s scientific focus, including autism spectrum disorder (ASD), neurodevelopmental conditions, genomics, transcriptomics, neuroimaging and behavioral science. This list is not exhaustive, and other repositories may also be appropriate. Please share additional public data repositories with us so that we can add them to this list.
Public Repositories by Data Type
- Genomic, Epigenomic and Transcriptomic Data
- NIH dbGaP (Database of Genotypes and Phenotypes)
For controlled-access human genetic and phenotypic data.
https://www.ncbi.nlm.nih.gov/gap
- NIH SRA (Sequence Read Archive)
https://www.ncbi.nlm.nih.gov/sra
- GEO (Gene Expression Omnibus)
For microarray, bulk and single-cell RNA-seq, and other high-throughput functional genomic data.
https://www.ncbi.nlm.nih.gov/geo/
- ArrayExpress (via EMBL-EBI)
An alternative to GEO for transcriptomics and functional genomics data.
https://www.ebi.ac.uk/arrayexpress/
- Neuroimaging Data
- OpenNeuro
Public repository for MRI, EEG, MEG and iEEG data formatted according to BIDS standards.
https://openneuro.org/
- NITRC (Neuroimaging Informatics Tools and Resources Clearinghouse)
For hosting and sharing imaging data and tools, including pipelines and atlases.
https://www.nitrc.org/
- Neurophysiology Data
- DANDI
For electrophysiology, optophysiology and behavioral time-series as well as images from immunostaining experiments.
https://dandiarchive.org/
- Behavioral, Clinical and Cognitive Data
- NIH Data Archive (NDA)
Formerly the NIMH Data Archive, the NDA accepts a wide variety of behavioral, cognitive and clinical research data. It is well suited to ASD and developmental neuroscience research.
https://nda.nih.gov/
- National Sleep Research Resource (NSRR)
https://sleepdata.org/
- Proteomics and Metabolomics
- PRIDE (Proteomics IDEntifications Database)
For sharing mass spectrometry-based proteomics data.
https://www.ebi.ac.uk/pride/
- MetaboLights
A repository for metabolomics data and metadata.
https://www.ebi.ac.uk/metabolights/
- Generalist Repositories (for data not fitting in specialized domains)
- Dryad
Curated general-purpose repository for data underlying publications.
https://datadryad.org/
- Zenodo
Supports a broad range of data types and provides DOI assignment.
https://zenodo.org/
- OSF (Open Science Framework)
General-purpose repository that supports a wide range of research outputs, including datasets, preprints and supplementary materials; offers project management tools and DOI assignment.
https://osf.io/
For further guidance on choosing a repository, PLOS ONE maintains a curated list of repositories by data type:
https://journals.plos.org/plosone/s/recommended-repositories