This track shows high-confidence structural variants (SVs) identified by Oxford Nanopore long-read sequencing of 3,622 Icelanders recruited through the deCODE genetics population cohort. The release contains 133,886 SVs (55,649 deletions, 75,050 insertions and 3,187 combined insertion/deletion events). Variants are site-level (no per-sample genotypes) and have been filtered to a high-confidence subset validated in the accompanying population-scale analysis.
Note that this release does not include allele counts or allele frequencies: each row represents a site that was called with high confidence in the cohort, but the number of carrier samples is not provided, so the track cannot be filtered by AF/AC.
Items are colored by SV type:
Insertions are placed at the insertion site with a width of 1 bp; deletions span the deleted interval; INSDEL events span the affected reference region and have SVLEN=0 because the reference and alternate alleles differ in both sequence and length. Filters are available for SV type and SV length.
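The display and filter rules above can be illustrated with a minimal sketch. It is not the track's actual conversion code: the `Sv` record, the `INS` and `DEL` type labels, the example coordinates and the use of the absolute value of SVLEN are all assumptions made for illustration. Only the `INSDEL` label and the SVLEN=0 convention come from the description.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass
class Sv:
    """Hypothetical site record; field names are illustrative only."""
    chrom: str
    start: int    # 0-based start of the affected reference region
    end: int      # 0-based, exclusive end of the affected reference region
    sv_type: str  # assumed labels: "INS", "DEL" or "INSDEL"
    sv_len: int   # SVLEN; 0 for INSDEL events


def display_interval(sv: Sv) -> Tuple[int, int]:
    """Return the interval drawn in the browser for one SV."""
    if sv.sv_type == "INS":
        # Insertions are drawn 1 bp wide at the insertion site.
        return (sv.start, sv.start + 1)
    # Deletions and INSDEL events span the affected reference region.
    return (sv.start, sv.end)


def passes_filters(sv: Sv, types: Optional[Set[str]] = None,
                   min_len: int = 0, max_len: Optional[int] = None) -> bool:
    """Apply an SV-type filter and an SV-length filter."""
    if types is not None and sv.sv_type not in types:
        return False
    # Assumption: deletions may carry a negative SVLEN, so compare magnitudes.
    length = abs(sv.sv_len)
    if length < min_len:
        return False
    return max_len is None or length <= max_len


# Made-up example sites, not taken from the dataset.
svs = [
    Sv("chr21", 1000, 1000, "INS", 320),
    Sv("chr21", 5000, 5450, "DEL", -450),
    Sv("chr21", 9000, 9120, "INSDEL", 0),
]

for sv in svs:
    print(sv.sv_type, display_interval(sv), passes_filters(sv, min_len=50))
```

In this sketch, any minimum-length threshold above 0 excludes INSDEL events, because their SVLEN is 0. Select them by SV type instead.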
Where a variant falls inside an annotated tandem-repeat region, the detail page also shows the coordinates of that region (TRRBEGIN / TRREND from the source VCF), which can be useful context for repeat-mediated insertions and deletions.
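For users working with the source VCF directly, the tandem-repeat coordinates can be read from the INFO column. The sketch below assumes the standard VCF layout of semicolon-separated `KEY=VALUE` pairs. The example INFO string is invented, and apart from TRRBEGIN and TRREND the keys shown are assumptions rather than a description of the file.

```python
from typing import Dict, Optional, Tuple, Union


def parse_info(info: str) -> Dict[str, Union[str, bool]]:
    """Split a VCF INFO string into a dict; flag-style keys map to True."""
    fields: Dict[str, Union[str, bool]] = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def tandem_repeat_region(info: str) -> Optional[Tuple[int, int]]:
    """Return (TRRBEGIN, TRREND) if both are present, else None."""
    fields = parse_info(info)
    if "TRRBEGIN" in fields and "TRREND" in fields:
        return int(fields["TRRBEGIN"]), int(fields["TRREND"])
    return None


# Invented INFO string for illustration; not a record from the dataset.
example = "SVTYPE=DEL;SVLEN=-450;TRRBEGIN=4980;TRREND=5500"
print(tandem_repeat_region(example))
```

The values are returned exactly as they appear in the VCF, with no coordinate conversion.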
Beyter et al. 2021 performed Oxford Nanopore long-read sequencing of 3,622 Icelanders recruited through deCODE genetics and detected a median of 22,636 SVs per individual, with medians of 13,353 insertions and 9,474 deletions (these are separate medians, so they need not sum to the overall figure). Across the cohort they derived a set of 133,886 reliably genotyped SV alleles, imputed those alleles into 166,281 chip-typed Icelanders, and tested them for association with disease and quantitative traits. Notable findings include a rare PCSK9 deletion associated with lower LDL-cholesterol and a multi-allelic 57-bp VNTR in ACAN associated with adult height. The track shown here displays the 133,886 high-confidence SV sites: 55,649 deletions, 75,050 insertions and 3,187 combined insertion/deletion events. The release is site-only (no per-sample genotypes or allele frequencies), so the track cannot be filtered by AF/AC.
The VCF ont_sv_high_confidence_SVs.sorted.vcf.gz was downloaded from the deCODE genetics LRS_SV_sets GitHub repository.
The step-by-step build commands (download, format conversion, bigBed build) are recorded in the UCSC makeDoc for this track container: doc/hg38/lrSv.txt. The conversion scripts and autoSql schemas live in makeDb/scripts/lrSv.
The data can be explored interactively in table format with the Table Browser or the Data Integrator and exported from there to spreadsheets or tab-separated tables. From scripts, the data can be accessed through our API using the track name decodeSv (track=decodeSv).
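As a minimal sketch of scripted access, the following builds a request URL for the UCSC REST API `getData/track` endpoint and fetches the JSON response. The helper names `track_url` and `fetch_track` are hypothetical, and the hg38 assembly is inferred from the download path of this track. The structure of the returned JSON is not described here, so inspect it before relying on specific keys.

```python
import json
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def track_url(chrom: str, start: int, end: int,
              genome: str = "hg38", track: str = "decodeSv") -> str:
    """Build a getData/track URL with semicolon-separated parameters."""
    params = [
        ("genome", genome),
        ("track", track),
        ("chrom", chrom),
        ("start", start),
        ("end", end),
    ]
    return API_BASE + "?" + ";".join(f"{key}={value}" for key, value in params)


def fetch_track(chrom: str, start: int, end: int) -> dict:
    """Fetch items in a region as parsed JSON (requires network access)."""
    with urlopen(track_url(chrom, start, end)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the request URL; call fetch_track(...) to retrieve the data.
    print(track_url("chr21", 0, 100000))
```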
The annotation is stored as a bigBed file that can be downloaded from our download server as decodeSv.bb. Individual regions or the whole annotation can be obtained with the bigBedToBed utility, available from our utilities page. For example, the following command writes all items on chr21 between positions 0 and 100,000,000 to standard output:
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/decodeSv.bb -chrom=chr21 -start=0 -end=100000000 stdout
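To call the utility from a script, one option is a thin wrapper around the same command line. The sketch below assumes bigBedToBed is installed and on the PATH. The function names are hypothetical, and the rows are returned as raw tab-split fields because the column layout of the bigBed file is not described here.

```python
import subprocess
from typing import List, Optional

BIGBED_URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/decodeSv.bb"


def bigbed_to_bed_cmd(chrom: Optional[str] = None,
                      start: Optional[int] = None,
                      end: Optional[int] = None,
                      source: str = BIGBED_URL) -> List[str]:
    """Build the bigBedToBed argument list; omitted options are left out."""
    cmd = ["bigBedToBed", source]
    if chrom is not None:
        cmd.append(f"-chrom={chrom}")
    if start is not None:
        cmd.append(f"-start={start}")
    if end is not None:
        cmd.append(f"-end={end}")
    cmd.append("stdout")  # write BED lines to standard output
    return cmd


def read_region(chrom: str, start: int, end: int) -> List[List[str]]:
    """Run bigBedToBed and return each BED line split on tabs."""
    result = subprocess.run(bigbed_to_bed_cmd(chrom, start, end),
                            check=True, capture_output=True, text=True)
    return [line.split("\t") for line in result.stdout.splitlines()]


if __name__ == "__main__":
    # Show the command that would be run for the chr21 example.
    print(" ".join(bigbed_to_bed_cmd("chr21", 0, 100000000)))
```

Calling `bigbed_to_bed_cmd()` with no region arguments builds the command for extracting the whole annotation.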
The original VCF is available from the deCODE genetics LRS_SV_sets GitHub repository.
Thanks to the deCODE genetics team and the Icelandic study participants for making this dataset publicly available.
Beyter D, Ingimundardottir H, Oddsson A, Eggertsson HP, Bjornsson E, Jonsson H, Atlason BA, Kristmundsdottir S, Mehringer S, Hardarson MT et al. Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits. Nat Genet. 2021 Jun;53(6):779-786. PMID: 33972781