Description

This track shows small-variant (single-nucleotide variant and short-indel) allele frequencies from 101 samples released as part of the GWAS SVatalog tool (Chirmade et al. 2026). The same 101-sample cohort underlies the structural-variant sibling track SVatalog 101 SVs in the Long-read SV collection; this track provides the companion small-variant allele frequencies that SVatalog uses to compute linkage disequilibrium between SNPs and SVs.

The callset contains approximately 8.8 million sites across the autosomes and chromosome X. Each site reports the alternate allele frequency in the 101 samples, the gnomAD v3.1 non-Finnish European allele frequency (when annotated in the source release), and a dbSNP rsID when one was available.

Display Conventions and Configuration

The track uses the standard VCF display. Variants appear as colored marks along the genome; clicking an item opens the detail page with per-site INFO fields: AF, AC, AN, the gnomAD v3.1 NFE allele frequency (GNOMAD_NFE_AF) and the dbSNP rsID (RSID).

Note on AC/AN: the source allele-frequency release only ships AF. For this track, AC and AN are synthesized by assuming the full 2 x 101 = 202-allele denominator (AN=202, AC=round(AF x 202)). The true AN is lower than 202 at sites where some samples had missing genotypes, so both values are approximations there.
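The AC/AN approximation can be sketched in Python as follows. This is a minimal illustration, not the converter script itself: the function name synth_ac_an is hypothetical, and the rounding mode is an assumption, since the description only says round(AF x 202). Python's built-in round sends exact halves to the nearest even integer, which may differ from the rounding used in the actual build.

```python
N_SAMPLES = 101          # cohort size stated in the track description
AN_FULL = 2 * N_SAMPLES  # 202 alleles, assuming no missing genotypes


def synth_ac_an(af: float, an: int = AN_FULL) -> tuple:
    """Return an approximate (AC, AN) pair from an allele frequency.

    Hypothetical helper: AN is fixed at the full-cohort denominator,
    so the result is only exact when every sample was genotyped.
    """
    if not 0.0 <= af <= 1.0:
        raise ValueError(f"allele frequency out of range: {af}")
    ac = round(af * an)  # assumed rounding mode (round-half-to-even)
    return ac, an
```

As a hypothetical illustration of the caveat: if only 95 of the 101 samples had a genotype at a site, the real denominator would be 190 alleles, while this track would still report AN=202.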

Methods

Small variants were called from 10X Genomics linked-read (paired-end short-read) whole-genome sequencing of the 101 SVatalog samples with GATK HaplotypeCaller v4.0.0.0 using default parameters. Calls were phased across the cohort with SHAPEIT v4.2.0, and per-site alternate allele frequencies were computed on the resulting joint callset. Structural variants, released as a separate lrSv subtrack, were called from long-read data and merged with these SNPs for the LD analyses reported by GWAS SVatalog.

For display here, the per-chromosome allele-frequency text files (chr{1..22,X}_allele_freq.txt) were converted to a single sites-only VCF with approximate AC/AN fields, then compressed with bgzip and indexed with tabix. The step-by-step build commands are recorded in the UCSC makeDoc doc/hg38/varFreqs.txt; the converter script lives in makeDb/scripts/varFreqs.
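The conversion step can be pictured with the sketch below. It is not the script in makeDb/scripts/varFreqs: the function name row_to_vcf_line and its argument list are hypothetical, the column layout of the source text files is not described here and is not assumed, and placing the rsID only in the INFO field (leaving the ID column as ".") is an assumption. The INFO keys are the ones listed under Display Conventions.

```python
from typing import Optional

AN_FULL = 202  # 2 x 101 alleles, as described in the AC/AN note


def row_to_vcf_line(chrom: str, pos: int, ref: str, alt: str, af: float,
                    nfe_af: Optional[float] = None,
                    rsid: Optional[str] = None) -> str:
    """Format one site as a sites-only VCF record (8 fixed columns).

    Hypothetical helper; optional annotations are omitted when absent.
    """
    info = [
        f"AF={af:g}",
        f"AC={round(af * AN_FULL)}",  # approximate; rounding mode assumed
        f"AN={AN_FULL}",
    ]
    if nfe_af is not None:
        info.append(f"GNOMAD_NFE_AF={nfe_af:g}")
    if rsid:
        info.append(f"RSID={rsid}")
    # Columns: CHROM POS ID REF ALT QUAL FILTER INFO
    return "\t".join([chrom, str(pos), ".", ref, alt, ".", ".", ";".join(info)])
```

A complete VCF would also need a header (a ##fileformat line, ##INFO definitions, and the #CHROM column line) and records sorted by position before bgzip compression and tabix indexing.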

Data Access

The VCF file for this track is available from our download server as svatalog.vcf.gz (with .tbi index). Regions can be extracted with tabix, for example with the command: tabix http://hgdownload.soe.ucsc.edu/gbdb/hg38/varFreqs/svatalog/svatalog.vcf.gz chr21:1-100000000
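Records returned by tabix are plain tab-separated VCF lines, so the INFO fields can be read with the Python standard library alone. The sketch below is one possible way to do that; the function name parse_info is hypothetical, and the example record is made up for illustration and is not a real site from this track.

```python
def parse_info(vcf_line: str) -> dict:
    """Split the INFO column (8th field) of a VCF record into a dict.

    Key=value entries map to their string value; bare flags map to True.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    info = {}
    for entry in fields[7].split(";"):
        if entry in ("", "."):  # empty or missing INFO
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


# Made-up record, for illustration only (not real data from the track).
example = "chr21\t10000000\t.\tA\tG\t.\t.\tAF=0.5;AC=101;AN=202"
info = parse_info(example)
af = float(info["AF"])
has_rsid = "RSID" in info  # RSID and GNOMAD_NFE_AF are present only when annotated
```

Values are kept as strings so that the caller decides how to convert them; AF and GNOMAD_NFE_AF would typically be cast to float, and AC and AN to int.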

The original per-chromosome allele-frequency tables and the accompanying LD statistics used by the SVatalog tool are available from the companion Zenodo deposit: zenodo.org/records/13367574. The SVatalog web tool itself is at svatalog.research.sickkids.ca.

Credits

Thanks to Chirmade, Strug and colleagues at The Hospital for Sick Children and the University of Toronto for releasing this annotated SNP frequency callset alongside the GWAS SVatalog tool.

References

Chirmade S, Wang Z, Mastromatteo S, Sanders E, Thiruvahindrapuram B, Nalpathamkalam T, Pellecchia G, Lin F, Keenan K, Patel RV et al. GWAS SVatalog: a visualization tool to aid fine-mapping of GWAS loci with structural variations. Heredity (Edinb). 2026 Mar;135(3):199-210. PMID: 41203876; PMC: PMC13031531