This track shows structural variants (SVs) identified by PacBio HiFi long-read sequencing of 1,027 individuals from the All of Us (AoU) Research Program. Participants self-identified as Black or African American and were sequenced to ~8x coverage. The dataset contains 541,049 SVs (444,524 insertions and 96,525 deletions) on autosomes.
SVs are annotated with population-specific allele frequencies across five ancestry groups (African, Admixed American, East Asian, European, South Asian), gene intersections from curated disease gene lists (OMIM, ACMG, cancer genes), regulatory element overlaps, and associations with eQTLs, GWAS loci, and clinical phenotypes from the AoU electronic health records.
Items are colored by SV type:
Filters are available for SV type, SV length, and population-specific allele frequencies. For insertions, the item is placed at the insertion site with a width of 1 bp; for deletions, the item spans the deleted region.
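The placement convention above can be illustrated with a minimal sketch. It assumes, hypothetically, that each SV arrives as a VCF-style record whose 1-based position is the base immediately before the event, and that the output uses BED-style 0-based, half-open coordinates. The names `BedInterval` and `sv_to_bed` are illustrative only and are not taken from the track's actual conversion scripts.

```python
from typing import NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def sv_to_bed(chrom: str, pos: int, sv_type: str, sv_len: int) -> BedInterval:
    """Map an SV to a display interval.

    Assumption: `pos` is 1-based and VCF-style, i.e. the reference base
    immediately before the inserted or deleted sequence.
    """
    if sv_type == "INS":
        # Insertions are drawn as a 1 bp item at the insertion site,
        # regardless of how long the inserted sequence is.
        return BedInterval(chrom, pos - 1, pos)
    if sv_type == "DEL":
        # Deletions span the deleted reference bases. abs() tolerates
        # callers that report deletion lengths as negative numbers.
        return BedInterval(chrom, pos, pos + abs(sv_len))
    raise ValueError(f"unsupported SV type: {sv_type!r}")


ins = sv_to_bed("chr1", 1000, "INS", 320)
dele = sv_to_bed("chr1", 1000, "DEL", -75)
print(ins, ins.end - ins.start)
print(dele, dele.end - dele.start)
```

Under these assumptions, an insertion always yields a 1 bp interval, while a deletion's interval width equals its SV length.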
The detail page shows the following annotations when available:
Garimella et al. 2025 performed PacBio HiFi long-read sequencing on 1,027 All of Us participants self-identifying as Black or African American, to ~8x per-sample coverage at HudsonAlpha Discovery. SVs (≥50 bp) were called per sample with an ensemble of three methods: two alignment-based callers, PBSV v2.6.0 (with Tandem Repeats Finder context) and Sniffles2 v2.0.6, plus the assembly-based PAV v1.2.1 (hifiasm haplotype-resolved contigs aligned to GRCh38 with minimap2 -x asm20). Per-caller VCFs were normalized, merged within and across samples, and filtered into stringent and lenient tiers. The callset was then re-genotyped across the cohort to produce the final release: 541,049 autosomal SVs (444,524 insertions, 96,525 deletions) with per-ancestry allele frequencies (AFR, AMR, EAS, EUR, SAS) and gene, regulatory, eQTL, GWAS, and EHR-phenotype annotations.
This track was built from the supplementary media-2 table of the AoU long-read sequencing preprint (doi:10.1101/2025.10.02.25336942). Access to the underlying AoU long-read data requires registration through the All of Us Research Hub.
The step-by-step build commands (download, format conversion, bigBed build) are recorded in the UCSC makeDoc for this track container: doc/hg38/lrSv.txt. The conversion scripts and autoSql schemas live in makeDb/scripts/lrSv.
This track was built from supplementary data (media-2) of the AoU long-read sequencing preprint. Access to the full AoU dataset requires registration through the All of Us Research Hub.
Thanks to Garimella et al. and the All of Us Research Program for making their structural variant annotations publicly available.
Garimella KV, Li Q, Wertz J, Lee SK, Cunial F, Huang Y, Mostovoy Y, Lorig-Roach R, English A, Su H et al. Population-scale Long-read Sequencing in the All of Us Research Program. medRxiv. 2025 Oct 5. PMID: 41256123; PMC: PMC12622093