
Over the past decade, researchers have identified hundreds of genes linked to autism and related neurodevelopmental conditions1, yet many cases remain unexplained. Standard sequencing technologies reliably detect small DNA changes, but often miss larger structural variants (SVs) such as insertions, deletions and rearrangements, especially in repetitive or complex regions of the genome2. These alterations can have substantial effects if they disrupt genes or the regulatory elements that control them3. A recent study led by SFARI investigators, including Evan E. Eichler, explores how newer genomic tools can help uncover some of these elusive mutations.
The researchers combined long-read sequencing (LRS) with emerging pangenome reference data. LRS produces much longer DNA reads than conventional short-read sequencing (SRS), allowing improved analysis of complex genomic regions4. Unlike the traditional single human reference genome, a pangenome incorporates DNA sequences from many individuals to better represent human genetic diversity.
The study included 189 individuals from 51 families affected by autism. In each case, earlier genetic testing had failed to identify a pathogenic variant: Analyses using SRS, including whole-genome, whole-exome and gene panel sequencing, had not detected mutations that could explain the condition.
To search for missed variants, the researchers constructed near-complete genome assemblies for each participant using LRS. Previous studies show that LRS can access more than 90 percent of the human genome, increasing the discovery rate for de novo mutations by roughly 30 percent and the rate for SVs by nearly 50 percent compared with SRS datasets5.
To identify variants that might contribute to autism, the researchers compared DNA from affected individuals with sequences from their family members and with reference genomes generated by the Human Pangenome Reference Consortium and related efforts6,7. Each person carries tens of thousands of SVs4, making it difficult to identify those relevant to disease. Comparing variants within families and against hundreds of long-read reference pangenomes allowed the team to filter out more than 97 percent of common SVs, leaving roughly 200 rare candidates per child.
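The filtering logic described above can be illustrated with a minimal sketch. It is not the study's pipeline: the `SV` record, its field names, the panel size and the 1 percent carrier cutoff are all hypothetical choices made for illustration. The idea is simply to drop SVs that are common in a reference panel, then separate what remains into new and inherited variants, keeping both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical record for one structural variant called in a child."""
    sv_id: str
    sv_type: str         # e.g. "DEL" or "INS"
    in_parents: bool     # True if the variant is also seen in either parent
    panel_carriers: int  # reference-panel genomes that carry the variant


def rare_candidates(svs, panel_size, max_carrier_fraction=0.01):
    """Keep SVs carried by at most max_carrier_fraction of the reference panel.

    The 0.01 default is an assumed cutoff, not a value from the study.
    """
    if panel_size <= 0:
        raise ValueError("panel_size must be positive")
    return [
        sv for sv in svs
        if sv.panel_carriers / panel_size <= max_carrier_fraction
    ]


def split_by_inheritance(svs):
    """Separate variants absent from both parents from inherited ones."""
    de_novo = [sv for sv in svs if not sv.in_parents]
    inherited = [sv for sv in svs if sv.in_parents]
    return de_novo, inherited


# Toy data: three calls from one child, checked against a 200-genome panel.
child_svs = [
    SV("sv1", "DEL", in_parents=False, panel_carriers=0),
    SV("sv2", "INS", in_parents=True, panel_carriers=150),
    SV("sv3", "DEL", in_parents=True, panel_carriers=1),
]

rare = rare_candidates(child_svs, panel_size=200)
de_novo, inherited = split_by_inheritance(rare)
```

In this toy example the common insertion `sv2` is removed, while the rare inherited deletion `sv3` is retained alongside the new deletion `sv1`, since rare inherited variants can still be relevant to disease.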
The researchers identified three pathogenic variants affecting genes already linked to neurodevelopmental disorders: SYNGAP1, which plays a key role in synaptic signaling; TBL1XR1, a regulator of transcription and chromatin remodeling; and MECP2, a regulator of gene expression associated with Rett syndrome. Two of these variants, affecting TBL1XR1 and MECP2, were found in girls whose clinical features already suggested Rett syndrome. The third variant, in SYNGAP1, was identified in a child previously classified as having idiopathic autism.
In addition to the three pathogenic variants, the researchers identified nine SV candidates that warrant further investigation. Several occurred in non-coding regulatory regions that influence when and where genes are active during development, and many had been inherited from a parent rather than arising as new mutations. Together, these findings demonstrate how LRS and expanding pangenome references can reveal previously hidden pathogenic genetic changes in people with autism and other neurodevelopmental disorders.
References
1. SFARI Gene. Human Gene database. Accessed March 5, 2026. https://gene.sfari.org/database/human-gene/
2. Wilfert AB, Turner TN, Murali SC, et al. Nat Genet. 2021;53(8):1125–1134.
3. Scott AJ, Chiang C, Hall IM. Genome Res. 2021;31(12):2249–2257.
4. Chaisson MJP, Sanders AD, Zhao X, et al. Nat Commun. 2019;10(1):1784.
5. Collins RL, Talkowski ME. Nat Rev Genet. 2025;26(7):443–462.
6. Ebert P, Audano PA, Zhu Q, et al. Science. 2021;372(6537):eabf7117.
7. Logsdon GA, Rozanski AN, Ryabov F, et al. Nature. 2024;629(8010):136–145.


