SPARK August 2022 update: New phenotypic and genomic data available

The Simons Foundation Autism Research Initiative (SFARI) is pleased to announce that SPARK (Simons Foundation Powering Autism Research)1 has released new phenotypic and genomic data for use in research studies. Approved investigators can request the data via SFARI Base.

Phenotypic data

Phenotypic data from 296,307 individuals enrolled in SPARK are now available, including data from 117,203 individuals with autism spectrum disorder (ASD).

A summary of the available data is listed below:

  • 98,035 children with ASD (under 18 years of age)
  • 19,022 adults with ASD
  • 87,979 males with ASD
  • 29,224 females with ASD
  • 45,390 unaffected siblings
  • 7,423 enrolled twins, triplets and quadruplets
  • 17,909 multiplex families

Detailed medical and developmental history data available as part of this release include behavioral questionnaires such as the Developmental Coordination Disorder Questionnaire (DCDQ), Child Behavior Checklist (CBCL), Repetitive Behavior Scale-Revised (RBS-R), Social Communication Questionnaire (SCQ) and Vineland Adaptive Behavior Scale (Vineland-3). Intelligence quotient (IQ) data abstracted from medical records are also available for a subset of adults and children with ASD. Highlights of the most requested and latest measures are listed below:

  • 4,461 clinical IQ records
  • 18,874 Vineland-3 records
  • 7,996 CBCL/6-18 records
  • 1,644 CBCL/1.5-5 records
  • A manually derived composite variable reflecting an aggregate across many indicators of cognitive impairment.

This data release also contains a number of additional assessments such as clinical lab reports for every individual with a confirmed and returned genetic result, data from ‘baby siblings’ in multiplex families, and an Area Deprivation Index (ADI) for each available proband.

An overview of the subgroups for whom phenotypic data are currently available, as well as the demographics of the SPARK sample, can be found here.

Genomic data

In addition to phenotypic data, genomic data (whole-exome sequencing and genome-wide genotyping data) are currently available for 81,172 SPARK participants (including 12,509 genomes and 70,487 exomes). Of these, 3,568 genomes and 34,164 exomes are from individuals with ASD.

The WGS4 dataset (which was released in December 2021) contains 3,684 samples, belonging to 1,053 families (528 quads, 519 trios, 2 duos and 3 single-parent families). Of these, 1,061 are individuals with ASD, and 2,623 are unaffected individuals.

The WGS5 dataset (which was released in August 2022) contains 999 samples, belonging to 311 families (65 quads, 244 trios, 1 duo and 1 quintet). Of these, 347 are individuals with ASD, and 652 are unaffected individuals.

Both WGS5 and WGS4, along with other previously released batches of WGS, are available as part of the SPARK integrated WGS (iWGS) dataset, which comprises 12,509 samples belonging to 3,388 families (2,190 quads, 1,106 trios, 67 quintets and 6 duos).

Who can use the data?

The data are available for use by all approved researchers, regardless of SFARI funding. Research projects are not restricted to autism or other neurodevelopmental conditions. There is a six-month embargo on the genomic data, but there is no embargo on phenotypic data.

How can the data be accessed?

Researchers can log in to SFARI Base and apply to use the data. The application will be reviewed by SFARI staff, and once approved, researchers will be provided with information on how to download the data.

Researchers who have previously applied and been approved to access SPARK phenotypic data (i.e., from an earlier SPARK data release) will automatically have access to this latest data release. Simply log in to SFARI Base to view and download the latest data set.

What types of research projects are SPARK phenotypic and genomic data currently being used for?

SPARK phenotypic and genomic data are currently being used in more than 100 research studies. These studies are investigating a variety of topics relevant to autism, including genetic risk factors, sex differences, clinical assessment measures, special interests and beliefs about causes of ASD, as well as the impact of the COVID-19 pandemic on individuals with ASD and their families.

For example, several studies used data from the SPARK cohort to assess the effects of COVID-19 on access to care for persons with ASD and on the mental health of parents and children2–5. Another study developed machine-learning models to predict measures of cognitive challenges from parent-reported online surveys, with the goal of providing accurate ‘guesses’ when clinical data on IQ are unavailable6. Other recent studies have uncovered new de novo and inherited genetic risk variants for ASD7,8 as well as evidence of a female protective effect against ASD’s common inherited influences9.

Can investigators recruit SPARK families for new research studies?

In addition to accessing phenotypic and genomic data, researchers can submit an application via SFARI Base to recruit SPARK participants into investigator-initiated research studies through the SPARK Research Match program.

Applications are reviewed by a standing committee on a quarterly basis (application deadlines are March 31, June 30, September 30 and December 31). Researchers will receive further information about how to contact SPARK families once their application is approved.

The SPARK Recruitment Process Document provides answers to many frequently asked questions about the research matching program.

Additional information

For more information, please contact [email protected].


  1. SPARK Consortium. Neuron 97, 488-493 (2018) PubMed
  2. Bhat A. Autism Res. 14, 2454-2470 (2021) PubMed
  3. Kalb L.G. et al. Autism Res. 14, 2183-2188 (2021) PubMed
  4. Bal V.H. et al. Autism Res. 14, 1209-1219 (2021) PubMed
  5. White L.C. et al. J. Autism Dev. Disord. 51, 3766-3773 (2021) PubMed
  6. Chang S. et al. Autism Res. 15, 156-170 (2022) PubMed
  7. Zhou X. et al. medRxiv (2021) Preprint
  8. Wilfert A.B. et al. Nat. Genet. 53, 1125-1134 (2021) PubMed
  9. Wigdor E.M. et al. Cell Genom. 2, 100134 (2022) Article