Description

This track collection contains structural variant (SV) and copy-number variant (CNV) callsets derived from Illumina short-read sequencing. Most SV tracks in the browser now come from long-read platforms (see the companion Long-read SVs supertrack); the short-read callsets here are included as comparators so users can evaluate the extra sensitivity of long-read calls and cross-check a variant across technologies.

Available Datasets

SV length statistics (min / median / max) are computed from the svLen field of each track and are given in base pairs. In the CCDG (Abel et al.) callset, breakend (BND) translocation records carry svLen=-1 as a sentinel. Because both the minimum and the median are -1, at least half of the CCDG records carry this sentinel, and only the maximum reflects an actual SV length.
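As a minimal sketch of how the sentinel skews these summaries, the snippet below computes min / median / max over a small made-up list of svLen values, with and without the -1 entries. The function name and the toy values are illustrative assumptions and are not taken from the track's build scripts.

```python
from statistics import median

BND_SENTINEL = -1  # svLen value used for breakend records, as described above


def sv_len_summary(sv_lens, drop_sentinel=False):
    """Return (min, median, max) of svLen values, optionally ignoring the sentinel."""
    values = [v for v in sv_lens if not (drop_sentinel and v == BND_SENTINEL)]
    if not values:
        raise ValueError("no svLen values left to summarize")
    return min(values), median(values), max(values)


# Hypothetical toy data: three breakend records and two sized SVs.
toy = [-1, -1, -1, 350, 12000]
print(sv_len_summary(toy))                      # (-1, -1, 12000)
print(sv_len_summary(toy, drop_sentinel=True))  # (350, 6175.0, 12000)
```

Dropping the sentinel before summarizing is one way to make the CCDG length statistics comparable to those of callsets that do not use it.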

Dataset | N samples | Cohort / disease | Sequencing | SVs | Min | Median | Max
CCDG 17,795 | 17,795 | NHGRI CCDG + PAGE + SGDP (B38 native + B37 lifted) | Illumina short-read (LUMPY + CNVnator + svtyper) | 737,998 | -1 | -1 | 217,985,413
1KG 3202 | 3,202 | 1000 Genomes expanded cohort | Illumina short-read (GATK-SV) | 173,366 | 1 | 314 | 154,807,729
ToMMo 48K CNV | 48,874 | Japanese, general population | Illumina short-read (GATK CNV, 1 kb bins, shown as two bigWigs) | ~2M bins with CNV carriers; not comparable to per-SV counts above

CCDG 17,795 SVs (abelSv)

Site-frequency callset from 17,795 deeply sequenced genomes (Abel et al. 2020, Nature; PMID 32460305). Two public releases are combined in this track: the B38 callset (14,623 samples called natively on GRCh38) and the B37 callset (8,417 samples, lifted). The two releases share samples: their sizes sum to 23,040, which exceeds the 17,795 genomes in the cohort. Variants are colored by SV type (DEL / DUP / INV / MEI / BND) and carry per-population allele counts for eight ancestry groups plus a HIGH/LOW confidence filter.

1KG 3202 SVs (onekg3202Sr)

1000 Genomes 3202-sample Illumina short-read GATK-SV callset (Byrska-Bishop et al. 2022). 173,366 SVs across 7 classes (DEL, INS, DUP, INV, CPX, CNV, CTX) with AC/AN/AF and per-superpopulation AFs (AFR/AMR/ASN/EUR/SAN).
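The AC/AN/AF fields follow the usual VCF convention, where AF is the alternate allele count divided by the number of called alleles. A minimal sketch of that relationship follows; the function name and the example counts are hypothetical, and the per-superpopulation AFs are assumed to be derived the same way from per-group counts.

```python
def allele_frequency(ac, an):
    """Return AC / AN, or None when no alleles were genotyped (AN == 0)."""
    if ac < 0 or an < 0 or ac > an:
        raise ValueError("expected 0 <= AC <= AN")
    return ac / an if an else None


# Hypothetical site: 3,202 diploid samples, all genotyped, gives AN = 6404.
print(allele_frequency(32, 6404))
```

AN can be smaller than twice the sample count when some genotypes are missing, which is why AF is computed from AN and not from the cohort size.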

ToMMo 48K CNV SR (tommoJpCnv)

Per-1 kb-bin copy-number carrier counts from short-read whole-genome sequencing of 48,874 Japanese individuals (jMorp 48KJPN-CNV Frequency Panel, release 20230828), called with GATK CNV germline workflows. Shown as a multiWig overlay: red = samples with copy-number loss (CN<2) per bin, green = samples with gain (CN>2) per bin. This is a useful short-read point of comparison to the ToMMo 333-sample long-read SV track under the Long-read SVs supertrack.
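To make the two signals concrete, here is a small sketch of how per-bin loss and gain carrier counts could be tallied from per-sample copy numbers. The panel distributes precomputed frequencies, so this is not the track's build procedure; the function name, the toy bins, and the assumption of a diploid reference copy number of 2 (which would not hold for sex chromosomes in males) are all illustrative.

```python
def carrier_counts(copy_numbers, reference_cn=2):
    """Count samples with loss (CN < reference) and gain (CN > reference) in one bin."""
    loss = sum(1 for cn in copy_numbers if cn < reference_cn)
    gain = sum(1 for cn in copy_numbers if cn > reference_cn)
    return loss, gain


# Hypothetical per-sample copy numbers for two adjacent 1 kb bins.
bins = {
    ("chr1", 10000, 11000): [2, 2, 1, 2, 3, 2],
    ("chr1", 11000, 12000): [2, 0, 1, 2, 2, 4],
}
for (chrom, start, end), cns in bins.items():
    loss, gain = carrier_counts(cns)
    # loss corresponds to the red signal, gain to the green signal
    print(chrom, start, end, "loss:", loss, "gain:", gain)
```

Because each bin is counted independently, a single large CNV contributes to many consecutive bins, which is why these counts are not comparable to per-SV counts.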

Data Access

See the Data Access section of each subtrack's page for download links. Build documentation is at doc/hg38/srSv.txt; conversion scripts and autoSql schemas are at makeDb/scripts/srSv.

Credits

Each subtrack credits its respective upstream project; see the individual description pages.

References

See the individual subtrack description pages for the specific references.