This track shows structural variants (SVs) identified by long-read sequencing of 945 Han Chinese individuals. The dataset contains 111,288 SVs merged across samples using SURVIVOR, including 49,518 deletions, 42,300 insertions, 13,503 duplications, 5,595 inversions, and 372 translocations.
Items are colored by SV type.
Filters are available for SV type, SV length, allele frequency, and number of supporting samples. For insertions, the item is placed at the insertion site with a width of 1 bp. For translocations, only the first breakpoint is shown; the second breakpoint chromosome and position are listed in the item details.
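The placement rules above can be illustrated with a minimal sketch that maps a 1-based VCF position to a 0-based, half-open BED interval. The function name `sv_to_bed_interval` and the type labels `INS` and `TRA` are hypothetical, and the 1 bp width for translocations is an assumption (the text only states that the first breakpoint is shown). The actual conversion is the one recorded in the makeDoc scripts, not this sketch.

```python
def sv_to_bed_interval(sv_type: str, pos: int, end: int) -> tuple[int, int]:
    """Map a 1-based VCF position to a 0-based, half-open BED interval.

    Assumptions (not taken from the track's own scripts):
      * sv_type labels "INS" and "TRA" are illustrative.
      * Translocations are drawn 1 bp wide at the first breakpoint.
    """
    start = pos - 1  # VCF is 1-based, BED is 0-based
    if sv_type in ("INS", "TRA"):
        # Point-like items: a single base at the insertion site / first breakpoint.
        return (start, start + 1)
    # Deletions, duplications, inversions: span from POS to END.
    return (start, end)


if __name__ == "__main__":
    print(sv_to_bed_interval("INS", 1000, 1000))  # (999, 1000)
    print(sv_to_bed_interval("DEL", 1000, 1500))  # (999, 1500)
```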
Gong et al. 2025 performed Oxford Nanopore long-read sequencing of 945 Han Chinese individuals on PromethION instruments with R9.4 flow cells. Reads were aligned to GRCh38.p13 with NGMLR v0.2.7 using ONT-tuned parameters, and a joint-calling strategy was used to call SVs at moderate coverage: per-sample discovery with cuteSV v1.0.13, merging of breakpoints within 500 bp across individuals with SURVIVOR v1.0.6, per-sample re-genotyping of the merged set with LRcaller v1.0, and a final BCFtools merge. SVs in centromeric, pericentromeric and gap regions were filtered out, yielding 111,288 high-quality SVs: 49,518 deletions, 42,300 insertions, 13,503 duplications, 5,595 inversions and 372 translocations.
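As a rough illustration of the 500 bp merging criterion, the sketch below tests whether two calls from different individuals would be treated as the same SV. It is a simplified pairwise check, not SURVIVOR's actual algorithm. The `SvCall` class, the `breakpoints_match` function, the inclusive threshold, and the requirement that SV types agree are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SvCall:
    """Hypothetical minimal SV record used only for this illustration."""
    chrom: str
    start: int
    end: int
    sv_type: str


def breakpoints_match(a: SvCall, b: SvCall, max_dist: int = 500) -> bool:
    """Return True if both breakpoints of a and b lie within max_dist bp.

    Assumes calls must share chromosome and SV type, and that the
    distance threshold is inclusive; the real merge may differ.
    """
    return (
        a.chrom == b.chrom
        and a.sv_type == b.sv_type
        and abs(a.start - b.start) <= max_dist
        and abs(a.end - b.end) <= max_dist
    )


if __name__ == "__main__":
    x = SvCall("chr1", 10_000, 12_000, "DEL")
    y = SvCall("chr1", 10_300, 12_450, "DEL")
    z = SvCall("chr1", 10_300, 13_000, "DEL")
    print(breakpoints_match(x, y))  # True: both breakpoints within 500 bp
    print(breakpoints_match(x, z))  # False: end breakpoints differ by 1,000 bp
```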
The site-only VCF released under OMIX accession OED00945268 (OED00945268_Han_945samples_SV.vcf.gz) was converted to BED for this track.
The step-by-step build commands (download, format conversion, bigBed build) are recorded in the UCSC makeDoc for this track container: doc/hg38/lrSv.txt. The conversion scripts and autoSql schemas live in makeDb/scripts/lrSv.
The raw VCF data was obtained from the OMIX repository (accession OED00945268) at the National Genomics Data Center (NGDC), China National Center for Bioinformation.
The source VCF also encodes phased per-sample genotypes. On the detail page, the sampleList field is derived from the SURVIVOR SUPP_VEC bitmask: it is an ordered list of the 1-based indices (1 to 945) of the samples that carry each SV. The full per-sample phased VCF can be browsed as a separate track in the SVs from 945 Han Chinese entry of the Phased Variants track collection.
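The derivation of sampleList from SUPP_VEC can be sketched as follows, assuming the bitmask is a string with one `0` or `1` character per sample, in sample order. The function name `supp_vec_to_sample_list` is hypothetical and is not taken from the track's conversion scripts.

```python
def supp_vec_to_sample_list(supp_vec: str) -> list[int]:
    """Return the 1-based indices of samples flagged '1' in a SUPP_VEC string.

    Assumes one character per sample, in the same order as the samples
    were merged; any character other than '1' is treated as "not carrying".
    """
    return [i for i, flag in enumerate(supp_vec, start=1) if flag == "1"]


if __name__ == "__main__":
    # Toy 7-sample bitmask; the real field would have 945 characters.
    print(supp_vec_to_sample_list("0101100"))  # [2, 4, 5]
```

The length of the returned list gives the number of supporting samples for the SV under the same assumption.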
Thanks to Gong et al. for making their structural variant calls publicly available.
Gong J, Sun H, Wang K, Zhao Y, Huang Y, Chen Q, Qiao H, Gao Y, Zhao J, Ling Y et al. Long-read sequencing of 945 Han individuals identifies structural variants associated with phenotypic diversity and disease susceptibility. Nat Commun. 2025 Feb 10;16(1):1494. PMID: 39929826; PMC: PMC11811171