The MPRAVarDB track shows 239,028 variants successfully mapped to hg38 (from 242,818 total) across 18 MPRA studies compiled in the MPRAVarDB database (Jin et al., 2024). Each variant was experimentally tested in an MPRA experiment to evaluate whether it affects regulatory activity. The database covers over 30 cell lines and 30 human diseases and traits, including neurodegenerative diseases, immune disorders, melanoma, multiple myeloma, and autoimmune diseases.
Note on cell lines: The cell line shown for each variant is the reporter cell line in which the human regulatory element was assayed. Several studies used mouse cell lines (e.g. Neuro-2a, N2A, NIH/3T3, MIN6) as reporter systems for human sequences; these variants retain human (hg38) coordinates.
Note on study type: Not all studies measure transcriptional regulation in the same sense. Two of the larger contributors, Griesemer et al., 2021 (72,546 variants) and Schuster et al., 2023 (26,546 variants), test 3'UTR variants placed downstream of the reporter, where the log2 fold change between alleles reflects changes in mRNA stability, decay, RBP or miRNA binding, or translation efficiency rather than transcriptional activation. The remaining studies test 5' regulatory elements (promoters and enhancers) where log2FC reflects changes in transcription. Together, the 3'UTR studies account for 99,092 of the 239,028 variants in the track (~41%).
Items are colored by statistical significance:
Each item shows the variant name (rsID when available, otherwise chr:pos:ref>alt), the reference and alternate alleles, the associated disease or trait, cell line, log2 fold change, p-value, and FDR.
Cell-type specificity: MPRA results are typically cell-type-specific, and significance in one cell line does not imply activity in another. For example, Tewhey et al., 2016 found only modest correlation (R ≈ 0.63) between LCL and HepG2 measurements of the same eQTL variants, and McAfee et al., 2023 reported that only 205 of 1,004 HEK293-positive variants overlapped HNP-positive variants. The cell line filter can be used to narrow results to a relevant context.
Note on Kircher et al., 2019: This study contributes 44,647 variants (~19% of the track) using a saturation mutagenesis design that tests nearly every possible nucleotide substitution at each position of 20 disease-associated regulatory elements at single-base-pair resolution: 10 promoters (TERT, LDLR, HBB, HBG1, HNF4A, MSMB, PKLR, F9, FOXE1, GP1BB) and 10 enhancers (SORT1, ZRS, BCL11A, IRF4, IRF6, MYC tested with two distinct enhancers, RET, TCF7L2, and the UC88 ultraconserved enhancer). Regions over those elements show many densely-packed Kircher variants that may dominate visualization at those loci.
The log2 fold change is computed as log2(alt RNA/DNA) − log2(ref RNA/DNA). A positive value means the alternate allele drove more reporter activity than the reference allele in this assay; a negative value means the reverse. The linear allelic ratio (alt/ref) is approximately 2^log2FC: log2FC = 0.5 corresponds to roughly 1.41× allelic difference, log2FC = 1.0 to 2×, and log2FC = 2.0 to 4×. As noted in the Description section, log2FC reflects transcriptional activation for 5'-regulatory studies and steady-state mRNA abundance, decay, or translation efficiency for 3'UTR studies (Griesemer et al., 2021; Schuster et al., 2023).
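The relationship between log2 fold change and the linear allelic ratio can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above; the function names and the example RNA/DNA counts are hypothetical and are not part of the track or of MPRAVarDB.

```python
import math


def log2_fold_change(alt_rna, alt_dna, ref_rna, ref_dna):
    """log2(alt RNA/DNA) - log2(ref RNA/DNA), as defined in the text."""
    return math.log2(alt_rna / alt_dna) - math.log2(ref_rna / ref_dna)


def allelic_ratio(log2fc):
    """Convert a log2 fold change into a linear alt/ref activity ratio."""
    return 2.0 ** log2fc


# Hypothetical counts: the alt allele yields twice the RNA per DNA copy.
example = log2_fold_change(alt_rna=200, alt_dna=100, ref_rna=100, ref_dna=100)
print(f"example log2FC = {example:.2f}")

# Values mentioned in the text, plus one negative value for comparison.
for value in (0.5, 1.0, 2.0, -1.0):
    print(f"log2FC = {value:+.1f} -> {allelic_ratio(value):.2f}x")
```

A negative log2FC gives a ratio below 1, meaning the alternate allele drove less reporter activity than the reference allele.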
The following table lists the 18 MPRA studies included in MPRAVarDB, with the number of tested variants, diseases/traits, cell lines, and a brief description of the variant selection.
| Study | Variants | Disease/Trait | Cell Line(s) | Description |
|---|---|---|---|---|
| Griesemer et al., 2021 | 72,546 | NHGRI-EBI GWAS catalog | GM12878, HEK293FT, HMEC, HepG2, K562, SKNSH | 3'UTR SNPs and indels in LD with GWAS catalog variants, variants under positive selection, and rare outlier expression variants from GTEx |
| Kircher et al., 2019 | 44,647 | Various (18 diseases including diabetes, cancer, blood disorders, limb malformations) | HEK293T, HEL92.1.7, HaCaT, HeLa, HepG2, K562, LNCaP, MIN6, NIH/3T3, Neuro-2a, SK-MEL-28, SF7996 | Saturation mutagenesis of 20 disease-associated regulatory elements at single base-pair resolution |
| Abell et al., 2022 | 29,564 | eQTL (no specific disease) | GM12878 | 30,893 variants in LD with independent, common, top-ranked eQTL across 744 eGenes in the CEU cohort |
| Schuster et al., 2023 | 26,546 | Prostate cancer | PC3 | 14,497 single-nucleotide mutations enriched in oncogenic pathways and 3'UTR regulatory elements |
| Tewhey et al., 2016 | 23,430 | eQTL (no specific disease) | GM12878 | 32,373 variants associated with eQTLs in lymphoblastoid cell lines |
| Mouri et al., 2022 | 14,549 | Autoimmune diseases (Crohn's, IBD, psoriasis, MS, RA, T1D, ulcerative colitis) | Jurkat | GWAS variants from autoimmune disease loci tested for regulatory element activity in T cells |
| McAfee et al., 2023 | 10,302 | Schizophrenia | HEK293s, HNPS | 5,173 fine-mapped schizophrenia GWAS variants |
| Cooper et al., 2022 | 5,330 | Alzheimer's disease, Progressive supranuclear palsy | HEK293T | 5,706 noncoding SNVs from 25 AD and 9 PSP genome-wide significant loci |
| Long et al., 2022 | 3,980 | Melanoma | C283T, UACC903 | 1,992 risk-associated variants in tight LD (r2>0.8) from 54 melanoma risk loci |
| Myint et al., 2020 | 2,158 | Schizophrenia, Alzheimer's disease | K562, SH-SY5Y | 1,049 SZ and 30 AD variants in 64 SZ loci and 9 AD loci |
| Choi et al., 2020 | 1,664 | Melanoma | HEK293FT, UACC903 | GWAS melanoma risk variants |
| Ajore et al., 2022 | 1,582 | Multiple myeloma | L363, MOLP8 | 1,039 variants in high LD (r2>0.8) at 23 MM risk loci |
| Klein et al., 2019 | 1,119 | Osteoarthritis | Saos-2 | 1,605 SNPs in high LD (r2>0.8) at 35 lead SNPs associated with OA via GWAS |
| Lu et al., 2021 | 1,036 | Systemic lupus erythematosus | GM12878, Jurkat | 18,312 variants in tight LD (r2>0.8) with 578 GWAS index variants at 531 loci |
| Mulvey & Dougherty, 2021 | 275 | Major depressive disorder | N2A | Over 1,000 SNPs from 39 neuropsychiatric GWAS loci, selected by overlap with eQTL and histone marks |
| Ferraro et al., 2020 | 150 | Rare variant expression (no specific disease) | GM12878 | Rare variants contributing to extreme expression, allelic expression, and splicing across 49 GTEx tissues |
| Rao et al., 2021 | 88 | Alcohol use disorder | BLA, CE, NAC, SFC | SNPs in 3'UTR of 88 genes from allele-specific expression analysis (30 AUD subjects vs 30 controls) |
| Ulirsch et al., 2016 | 62 | Red blood cell traits | K562, K562+GATA1 | 2,756 variants in strong LD with 75 sentinel variants associated with RBC traits |
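Variant counts in the table are the numbers of records in this track after mapping to hg38 and sum to 239,028; figures quoted in the Description column are taken from the source publications. Of 242,818 total source variants, 239,028 were mapped to hg38; see Methods.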
Data was downloaded from the MPRAVarDB web server. Variants originally mapped to hg19 (213,689 of 242,818) were lifted to hg38 using liftOver. 114 variants could not be mapped and were excluded. The remaining variants were merged with the 29,129 natively hg38-mapped variants to produce a total of 239,028 hg38 records.
Significance thresholds across studies: The source studies in MPRAVarDB do not all use the same significance framework. Most studies apply a Benjamini-Hochberg FDR threshold (commonly 0.05 or 0.10), but some report only nominal regression p-values. For example, Tewhey et al., 2016 uses BH FDR < 0.05 to call "emVars", Griesemer et al., 2021 and McAfee et al., 2023 use BH FDR < 0.10, and Kircher et al., 2019 reports raw regression p-values rather than FDR. The track applies a uniform FDR < 0.05 / nominal p < 0.05 color cutoff for visual consistency, which is the more conservative of the FDR thresholds reported by the source studies. For any variant of interest, consult the source publication for the original significance call.
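To illustrate why the choice of cutoff matters, the following Python sketch applies a standard Benjamini-Hochberg adjustment to a small set of made-up p-values and counts how many pass at 0.05 versus 0.10. It is a generic textbook implementation, not the code used by any of the source studies or by this track, and the p-values are invented for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def count_significant(fdr_values, cutoff):
    """Number of tests whose adjusted p-value is below the cutoff."""
    return sum(1 for q in fdr_values if q < cutoff)


# Invented nominal p-values for eight hypothetical variants.
nominal = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
fdr = benjamini_hochberg(nominal)

print("significant at FDR < 0.05:", count_significant(fdr, 0.05))
print("significant at FDR < 0.10:", count_significant(fdr, 0.10))
```

In this toy example several variants pass at FDR < 0.10 but not at FDR < 0.05, which is the situation in which a source study's original call can differ from the track coloring.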
The data can be explored interactively in table format with the Table Browser or the Data Integrator and exported from there to spreadsheets or tab-separated tables. From scripts, the data can be accessed through our API, track=mpraVarDb.
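As a minimal sketch of scripted access, the Python below builds a request URL for the UCSC REST API /getData/track endpoint using the track name given above. The helper names are hypothetical, and the assumption that the JSON response holds the items in a list under the track name should be checked against the API documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(chrom, start, end, genome="hg38", track="mpraVarDb"):
    """Build a /getData/track URL for one genomic region."""
    query = urlencode({
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    })
    return f"{API_BASE}?{query}"


def fetch_track_items(chrom, start, end, track="mpraVarDb"):
    """Download items for a region (requires network access).

    Assumption: the response is a JSON object whose key named after the
    track holds a list of item records.
    """
    with urlopen(build_track_url(chrom, start, end, track=track)) as response:
        payload = json.load(response)
    return payload.get(track, [])


# Example usage (not executed here because it needs network access):
#   items = fetch_track_items("chr21", 0, 100000000)
#   print(len(items), "items returned")
print(build_track_url("chr21", 0, 100000000))
```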
For automated download and analysis, the genome annotation is stored in a bigBed file that can be downloaded from our download server. The file for this track is called mpravardb.bb. Individual regions or the whole genome annotation can be obtained using our tool bigBedToBed, which can be compiled from the source code or downloaded as a precompiled binary for your system. Instructions for downloading source code and binaries can be found here. The tool can also be used to obtain features within a given range, e.g. bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/mpra/mpravardb/mpravardb.bb -chrom=chr21 -start=0 -end=100000000 stdout
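The bigBedToBed command above can also be driven from a script. The following Python sketch assumes bigBedToBed is installed and on the PATH; the helper names are hypothetical. Only the first four BED columns (chrom, start, end, name) are interpreted, since the layout of the remaining fields is not described here and they are kept as raw strings.

```python
import subprocess

BIGBED_URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/mpra/mpravardb/mpravardb.bb"


def bigbed_command(chrom, start, end, url=BIGBED_URL):
    """Argument list mirroring the bigBedToBed example in the text."""
    return [
        "bigBedToBed", url,
        f"-chrom={chrom}", f"-start={start}", f"-end={end}",
        "stdout",
    ]


def parse_bed_line(line):
    """Split one tab-separated BED line; extra columns are left untyped."""
    fields = line.rstrip("\n").split("\t")
    return {
        "chrom": fields[0],
        "start": int(fields[1]),
        "end": int(fields[2]),
        "name": fields[3] if len(fields) > 3 else None,
        "extra": fields[4:],
    }


def read_region(chrom, start, end):
    """Run bigBedToBed for a region and parse its output.

    Requires the bigBedToBed binary and network access.
    """
    result = subprocess.run(
        bigbed_command(chrom, start, end),
        check=True, capture_output=True, text=True,
    )
    return [parse_bed_line(l) for l in result.stdout.splitlines() if l]


# Example usage (not executed here):
#   records = read_region("chr21", 0, 100000000)
print(" ".join(bigbed_command("chr21", 0, 100000000)))
```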
The original annotation source data can be downloaded from the MPRAVarDB web server.
Thanks to Tao Wang and colleagues at the University of Florida for creating and maintaining the MPRAVarDB database.
Abell NS, DeGorter MK, Gloudemans MJ, Greenwald E, Smith KS, He Z, Montgomery SB. Multiple causal variants underlie genetic associations in humans. Science. 2022 Mar 18;375(6586):1247-1254. PMID: 35298243; PMC: PMC9725108
Ajore R, Niroula A, Pertesi M, Cafaro C, Thodberg M, Went M, Bao EL, Duran-Lozano L, Lopez de Lapuente Portilla A, Olafsdottir T et al. Functional dissection of inherited non-coding variation influencing multiple myeloma risk. Nat Commun. 2022 Jan 10;13(1):151. PMID: 35013207; PMC: PMC8748989
Choi J, Zhang T, Vu A, Ablain J, Makowski MM, Colli LM, Xu M, Hennessey RC, Yin J, Rothschild H et al. Massively parallel reporter assays of melanoma risk variants identify MX2 as a gene promoting melanoma. Nat Commun. 2020 Jun 1;11(1):2718. PMID: 32483191; PMC: PMC7264232
Cooper YA, Teyssier N, Dräger NM, Guo Q, Davis JE, Sattler SM, Yang Z, Patel A, Wu S, Kosuri S et al. Functional regulatory variants implicate distinct transcriptional networks in dementia. Science. 2022 Aug 19;377(6608):eabi8654. PMID: 35981026
Ferraro NM, Strober BJ, Einson J, Abell NS, Aguet F, Barbeira AN, Brandt M, Bucan M, Castel SE, Davis JR et al. Transcriptomic signatures across human tissues identify functional rare genetic variation. Science. 2020 Sep 11;369(6509). PMID: 32913073; PMC: PMC7646251
Griesemer D, Xue JR, Reilly SK, Ulirsch JC, Kukreja K, Davis JR, Kanai M, Yang DK, Butts JC, Guney MH et al. Genome-wide functional screen of 3'UTR variants uncovers causal variants for human disease and evolution. Cell. 2021 Sep 30;184(20):5247-5260.e19. PMID: 34534445; PMC: PMC8487971
Jin W, Xia Y, Nizomov J, Liu Y, Li Z, Lu Q, Chen L. MPRAVarDB: an online database and web server for exploring regulatory effects of genetic variants. Bioinformatics. 2024 Oct 1;40(10). PMID: 39325859; PMC: PMC11464417
Kircher M, Xiong C, Martin B, Schubach M, Inoue F, Bell RJA, Costello JF, Shendure J, Ahituv N. Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nat Commun. 2019 Aug 8;10(1):3583. PMID: 31395865; PMC: PMC6687891
Klein JC, Keith A, Rice SJ, Shepherd C, Agarwal V, Loughlin J, Shendure J. Functional testing of thousands of osteoarthritis-associated variants for regulatory activity. Nat Commun. 2019 Jun 4;10(1):2434. PMID: 31164647; PMC: PMC6547687
Long E, Yin J, Funderburk KM, Xu M, Feng J, Kane A, Zhang T, Myers T, Golden A, Thakur R et al. Massively parallel reporter assays and variant scoring identified functional variants and target genes for melanoma loci and highlighted cell-type specificity. Am J Hum Genet. 2022 Dec 1;109(12):2210-2229. PMID: 36423637; PMC: PMC9748337
Lu X, Chen X, Forney C, Donmez O, Miller D, Parameswaran S, Hong T, Huang Y, Pujato M, Cazares T et al. Global discovery of lupus genetic risk variant allelic enhancer activity. Nat Commun. 2021 Mar 12;12(1):1611. PMID: 33712590; PMC: PMC7955039
McAfee JC, Lee S, Lee J, Bell JL, Krupa O, Davis J, Insigne K, Bond ML, Zhao N, Boyle AP et al. Systematic investigation of allelic regulatory activity of schizophrenia-associated common variants. Cell Genom. 2023 Oct 11;3(10):100404. PMID: 37868037; PMC: PMC10589626
Mouri K, Guo MH, de Boer CG, Lissner MM, Harten IA, Newby GA, DeBerg HA, Platt WF, Gentili M, Liu DR et al. Prioritization of autoimmune disease-associated genetic variants that perturb regulatory element activity in T cells. Nat Genet. 2022 May;54(5):603-612. PMID: 35513721; PMC: PMC9793778
Mulvey B, Dougherty JD. Transcriptional-regulatory convergence across functional MDD risk variants identified by massively parallel reporter assays. Transl Psychiatry. 2021 Jul 22;11(1):403. PMID: 34294677; PMC: PMC8298436
Myint L, Wang R, Boukas L, Hansen KD, Goff LA, Avramopoulos D. A screen of 1,049 schizophrenia and 30 Alzheimer's-associated variants for regulatory potential. Am J Med Genet B Neuropsychiatr Genet. 2020 Jan;183(1):61-73. PMID: 31503409; PMC: PMC7233147
Rao X, Thapa KS, Chen AB, Lin H, Gao H, Reiter JL, Hargreaves KA, Ipe J, Lai D, Xuei X et al. Allele-specific expression and high-throughput reporter assay reveal functional genetic variants associated with alcohol use disorders. Mol Psychiatry. 2021 Apr;26(4):1142-1151. PMID: 31477794; PMC: PMC7050407
Schuster SL, Arora S, Wladyka CL, Itagi P, Corey L, Young D, Stackhouse BL, Kollath L, Wu QV, Corey E et al. Multi-level functional genomics reveals molecular and cellular oncogenicity of patient-based 3'-untranslated region mutations. Cell Rep. 2023 Aug 29;42(8):112840. PMID: 37516102; PMC: PMC10540565
Tewhey R, Kotliar D, Park DS, Liu B, Winnicki S, Reilly SK, Andersen KG, Mikkelsen TS, Lander ES, Schaffner SF et al. Direct Identification of Hundreds of Expression-Modulating Variants using a Multiplexed Reporter Assay. Cell. 2016 Jun 2;165(6):1519-1529. PMID: 27259153; PMC: PMC4957403
Ulirsch JC, Nandakumar SK, Wang L, Giani FC, Zhang X, Rogov P, Melnikov A, McDonel P, Do R, Mikkelsen TS et al. Systematic Functional Dissection of Common Genetic Variation Affecting Red Blood Cell Traits. Cell. 2016 Jun 2;165(6):1530-1545. PMID: 27259154; PMC: PMC4893171