cartVersion cartVersion cartVersion cartVersion 0 0 0 0 0 0 0 0 0 0 0 cartVersion cartVersion cartVersion 0 cartVersion 0 cytoBand Chromosome Band bed 4 + Chromosome Bands 0 0.1 0 0 0 200 150 150 0 0 0
\ This track shows chromosome bands annotated by \ FlyBase\ (D. melanogaster version 6.02).\
\\ Thanks to FlyBase for providing these annotations.\
\ \ map 1 altColor 200,150,150\ group map\ longLabel Chromosome Bands\ priority .1\ shortLabel Chromosome Band\ track cytoBand\ type bed 4 +\ visibility hide\ cytoBandIdeo Chromosome Band (Low-res) bed 4 + Chromosome Bands (Low-resolution for Chromosome Ideogram) 1 0.1 0 0 0 200 150 150 0 0 0 map 1 altColor 200,150,150\ group map\ longLabel Chromosome Bands (Low-resolution for Chromosome Ideogram)\ priority .1\ shortLabel Chromosome Band (Low-res)\ track cytoBandIdeo\ type bed 4 +\ visibility dense\ chainDroSim2 droSim2 Chain chain droSim2 D. simulans (Sep. 2014 (ASM75419v2/droSim2)) Chained Alignments 3 1 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. simulans (Sep. 2014 (ASM75419v2/droSim2)) Chained Alignments\ otherDb droSim2\ parent insectsChainNetViewchain off\ shortLabel droSim2 Chain\ subGroups view=chain species=s000a clade=c00\ track chainDroSim2\ type chain droSim2\ cons27wayViewphyloP Basewise Conservation (phyloP) bed 4 Multiz Alignment & Conservation (27 Species) 2 1 0 0 0 127 127 127 0 0 0 compGeno 1 longLabel Multiz Alignment & Conservation (27 Species)\ parent cons27way\ shortLabel Basewise Conservation (phyloP)\ track cons27wayViewphyloP\ view phyloP\ viewLimits -3:0.5\ viewLimitsMax -4.611:0.934\ visibility full\ cons27way Conservation bed 4 Multiz Alignment & Conservation (27 Species) 0 1 0 0 0 127 127 127 0 0 0\ This track shows multiple alignments of 27 species and measurements of\ evolutionary conservation using\ two methods (phastCons and phyloP) from the\ \ PHAST package, for all 27 species.\ The multiple alignments were generated using multiz and\ other tools in the UCSC/Penn State Bioinformatics\ comparative genomics alignment pipeline.\ Conserved elements identified by phastCons are also displayed in\ this track.\
\\ PhastCons (which has been used in previous Conservation tracks) is a hidden\ Markov model-based method that estimates the probability that each\ nucleotide belongs to a conserved element, based on the multiple alignment.\ It considers not just each individual alignment column, but also its\ flanking columns. By contrast, phyloP separately measures conservation at\ individual columns, ignoring the effects of their neighbors. As a\ consequence, the phyloP plots have a less smooth appearance than the\ phastCons plots, with more "texture" at individual sites. The two methods\ have different strengths and weaknesses. PhastCons is sensitive to "runs"\ of conserved sites, and is therefore effective for picking out conserved\ elements. PhyloP, on the other hand, is more appropriate for evaluating\ signatures of selection at particular nucleotides or classes of nucleotides\ (e.g., third codon positions, or first positions of miRNA target sites).\
\\ Another important difference is that phyloP can measure acceleration\ (faster evolution than expected under neutral drift) as well as\ conservation (slower than expected evolution). In the phyloP plots, sites\ predicted to be conserved are assigned positive scores (and shown in blue),\ while sites predicted to be fast-evolving are assigned negative scores (and\ shown in red). The absolute values of the scores represent -log p-values\ under a null hypothesis of neutral evolution. The phastCons scores, by\ contrast, represent probabilities of negative selection and range between 0\ and 1.\
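The sign convention described above can be illustrated with a minimal sketch. It assumes the logarithm is base 10 (the description only says "-log"), and the function name `interpret_phylop` is hypothetical, not part of the PHAST package.

```python
def interpret_phylop(score: float) -> tuple[str, float]:
    """Return (label, p_value) for a single phyloP score.

    Assumption: |score| = -log10(p-value). The track description does not
    state the base of the logarithm, so base 10 is a guess made here.
    """
    p_value = 10.0 ** (-abs(score))
    if score > 0:
        label = "conserved"    # positive scores, drawn in blue
    elif score < 0:
        label = "accelerated"  # negative scores, drawn in red
    else:
        label = "neutral"
    return label, p_value
```

Under that assumption, a score of 2.0 corresponds to a conserved site with a p-value of about 0.01, and a score of -3.0 to a fast-evolving site with a p-value of about 0.001.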
\\ Both phastCons and phyloP treat alignment gaps and unaligned nucleotides as\ missing data.\
\ \\ Missing sequence in the assemblies is highlighted in the track display\ by regions of yellow when zoomed out and Ns displayed at base\ level (see Gap Annotation, below).
\\
\ \ Downloads for data in this track are available:\\
\ Species Release date UCSC version alignment type \ D. melanogaster Aug. 2014 BDGP Release 6 + ISO1 MT/dm6 reference \ D. simulans Apr. 2005 WUGSC 1.0/droSim1 syntenic \ D. sechellia Oct. 2005 Broad/droSec1 syntenic \ D. yakuba Jun. 2006 Flybase dyak_caf1/droYak3 syntenic \ D. erecta Feb. 2006 Agencourt CAF1/droEre2 syntenic \ D. biarmipes Mar. 2013 BCM Dbia_2.0/droBia2 syntenic \ D. suzukii Sep. 2013 BGI Dsuzukii.v01/droSuz1 syntenic \ D. ananassae Feb. 2006 Agencourt CAF1/droAna3 syntenic \ D. bipectinata Mar. 2013 BCM Dbip_2.0/droBip2 syntenic \ D. eugracilis Mar. 2013 modENCODE Deug_2.0/droEug2 syntenic \ D. elegans Mar. 2013 BCM Dele_2.0/droEle2 syntenic \ D. kikkawai Mar. 2013 BCM Dkik_2.0/droKik2 syntenic \ D. takahashii Mar. 2013 BCM Dtak_2.0/droTak2 syntenic \ D. rhopaloa Feb. 2013 modENCODE Drho_2.0/droRho2 syntenic \ D. ficusphila Mar. 2013 BCM Dfic_2.0/droFic2 syntenic \ D. pseudoobscura Apr. 2013 BCM Dpse_3.0/droPse3 syntenic \ D. persimilis Oct. 2005 Broad/droPer1 syntenic \ D. miranda Apr. 2013 U.C. Berkeley DroMir_2.2/droMir2 syntenic \ D. willistoni Aug. 2006 JCVI dwil_caf1/droWil2 syntenic \ D. virilis Feb. 2006 Agencourt CAF1/droVir3 syntenic \ D. mojavensis Feb. 2006 Agencourt CAF1/droMoj3 syntenic \ D. albomicans May. 2012 Kunming DroAlb_1.0/droAlb1 syntenic \ D. grimshawi Feb. 2006 Agencourt CAF1/droGri2 syntenic \ Musca domestica Apr. 2013 Glossina M_domestica-2.0.2/musDom2 net \ Anopheles gambiae Feb. 2003 IAGEC MOZ2/anoGam1 net \ Apis mellifera Nov. 2010 BCM Amel_4.5/apiMel4 net \ Tribolium castaneum Sep. 2005 BCM 2.0/triCas2 net
\ Table 1. Genome assemblies included in the 27-way Conservation track.\
\ The track configuration options allow the user to display the three different\ sets of scores, all, birds or vertebrate, individually or all simultaneously.\ In full and pack display modes, conservation scores are displayed as a\ wiggle track (histogram) in which the height reflects the\ value of the score.\ The conservation wiggles can be configured in a variety of ways to\ highlight different aspects of the displayed information.\ Click the Graph configuration help link for an explanation\ of the configuration options.
\\ Pairwise alignments of each species to the D. melanogaster genome are\ displayed below the conservation histogram as a grayscale density plot (in\ pack mode) or as a wiggle (in full mode) that indicates alignment quality.\ In dense display mode, conservation is shown in grayscale using\ darker values to indicate higher levels of overall conservation\ as scored by phastCons.
\\ Checkboxes on the track configuration page allow selection of the\ species to include in the pairwise display.\ Configuration buttons are available to select all of the species\ (Set all), deselect all of the species (Clear all), or\ use the default settings (Set defaults).\ Note that excluding species from the pairwise display does not alter the\ conservation score display.
\\ To view detailed information about the alignments at a specific\ position, zoom the display in to 30,000 or fewer bases, then click on\ the alignment.
\ \\ The Display chains between alignments configuration option\ enables display of gaps between alignment blocks in the pairwise alignments in\ a manner similar to the Chain track display. The following\ conventions are used:\
\ Discontinuities in the genomic context (chromosome, scaffold or region) of the\ aligned DNA in the aligning species are shown as follows:\
\ When zoomed-in to the base-level display, the track shows the base\ composition of each alignment.\ The numbers and symbols on the Gaps\ line indicate the lengths of gaps in the D. melanogaster sequence at those\ alignment positions relative to the longest non-D. melanogaster sequence.\ If there is sufficient space in the display, the size of the gap is shown.\ If the space is insufficient and the gap size is a multiple of 3, a\ "*" is displayed; other gap sizes are indicated by "+".
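The rule for the Gaps line can be sketched as follows. This is only an illustration: "sufficient space" is modelled here as a number of available character columns, which is an assumption, and `gap_label` is a hypothetical name rather than a browser function.

```python
def gap_label(gap_size: int, columns_available: int) -> str:
    """Choose what to print on the Gaps line for one gap.

    Assumption: "sufficient space" means the decimal gap size fits in
    `columns_available` character columns.
    """
    text = str(gap_size)
    if len(text) <= columns_available:
        return text  # enough room: show the actual gap size
    # Not enough room: "*" marks a multiple of 3, "+" any other size.
    return "*" if gap_size % 3 == 0 else "+"
```

For example, a 12-base gap with one column available is shown as "*", while a 10-base gap in the same space is shown as "+".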
\\ Codon translation is available in base-level display mode if the\ displayed region is identified as a coding segment. To display this annotation,\ select the species for translation from the pull-down menu in the Codon\ Translation configuration section at the top of the page. Then, select one of\ the following modes:\
\ Codon translation uses the following gene tracks as the basis for\ translation, depending on the species chosen (Table 2).\ \
\ \\
\ Table 2. Gene tracks used for codon translation.\\ Gene Track Species \ RefSeq Genes D. melanogaster \ Ensembl Genes v68 D. erecta, D. ananassae, Anopheles gambiae
\ Pairwise alignments with the D. melanogaster genome were generated for\ each species using lastz from repeat-masked genomic sequence.\ Pairwise alignments were then linked into chains using a dynamic programming\ algorithm that finds maximally scoring chains of gapless subsections\ of the alignments organized in a kd-tree.\ The scoring matrix and parameters for pairwise alignment and chaining\ were tuned for each species based on phylogenetic distance from the reference.\ High-scoring chains were then placed along the genome, with\ gaps filled by lower-scoring chains, to produce an alignment net.\ For more information about the chaining and netting process and\ parameters for each species, see the description pages for the Chain and Net\ tracks.
\\ An additional filtering step was introduced in the generation of the 27-way\ conservation track to reduce the number of paralogs and pseudogenes from the\ high-quality assemblies and the suspect alignments from the low-quality\ assemblies:\ the pairwise alignments of the bird assemblies were filtered based on synteny;\ those for the human and mouse genomes were filtered to retain only\ alignments of best quality in both the target and query ("reciprocal\ best").
\\ The resulting best-in-genome pairwise alignments\ were progressively aligned using multiz/autoMZ,\ following the tree topology diagrammed above, to produce multiple alignments.\ The multiple alignments were post-processed to\ add annotations indicating alignment gaps, genomic breaks,\ and base quality of the component sequences.\ The annotated multiple alignments, in MAF format, are available for\ bulk download.\ An alignment summary table containing an entry for each\ alignment block in each species was generated to improve\ track display performance at large scales.\ Framing tables were constructed to enable\ visualization of codons in the multiple alignment display.
\ \\ Both phastCons and phyloP are phylogenetic methods that rely\ on a tree model containing the tree topology, branch lengths representing\ evolutionary distance at neutrally evolving sites, the background distribution\ of nucleotides, and a substitution rate matrix.\ The\ all species tree model for this track was\ generated using the phyloFit program from the PHAST package\ (REV model, EM algorithm, medium precision) using multiple alignments of\ 4-fold degenerate sites extracted from the 27-way alignment\ (msa_view). The 4d sites were derived from the Xeno RefSeq gene set,\ filtered to select single-coverage long transcripts.\
\\ This same tree model was used in the phyloP calculations; however, the\ background frequencies were modified to maintain reversibility.\ The resulting tree model for\ all species.\
\\ The phastCons program computes conservation scores based on a phylo-HMM, a\ type of probabilistic model that describes both the process of DNA\ substitution at each site in a genome and the way this process changes from\ one site to the next (Felsenstein and Churchill 1996, Yang 1995, Siepel and\ Haussler 2005). PhastCons uses a two-state phylo-HMM, with a state for\ conserved regions and a state for non-conserved regions. The value plotted\ at each site is the posterior probability that the corresponding alignment\ column was "generated" by the conserved state of the phylo-HMM. These\ scores reflect the phylogeny (including branch lengths) of the species in\ question, a continuous-time Markov model of the nucleotide substitution\ process, and a tendency for conservation levels to be autocorrelated along\ the genome (i.e., to be similar at adjacent sites). The general reversible\ (REV) substitution model was used. Unlike many conservation-scoring programs,\ phastCons does not rely on a sliding window\ of fixed size; therefore, short highly-conserved regions and long moderately\ conserved regions can both obtain high scores.\ More information about\ phastCons can be found in Siepel et al. 2005.
\\ The phastCons parameters used were: expected-length=45,\ target-coverage=0.3, rho=0.3.
\ \\ The phyloP program supports several different methods for computing\ p-values of conservation or acceleration, for individual nucleotides or\ larger elements\ (\ http://compgen.cshl.edu/phast/).\ Here it was used\ to produce separate scores at each base (--wig-scores option), considering\ all branches of the phylogeny rather than a particular subtree or lineage\ (i.e., the --subtree option was not used). The scores were computed by\ performing a likelihood ratio test at each alignment column (--method LRT),\ and scores for both conservation and acceleration were produced (--mode CONACC).\
\\ The conserved elements were predicted by running phastCons with the\ --viterbi option. The predicted elements are segments of the alignment\ that are likely to have been "generated" by the conserved state of the\ phylo-HMM. Each element is assigned a log-odds score equal to its log\ probability under the conserved model minus its log probability under the\ non-conserved model. The "score" field associated with this track contains\ transformed log-odds scores, taking values between 0 and 1000. (The scores\ are transformed using a monotonic function of the form a * log(x) + b.) The\ raw log odds scores are retained in the "name" field and can be seen on the\ details page or in the browser when the track's display mode is set to\ "pack" or "full".\
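The score transformation mentioned above can be sketched as below. The description does not give the constants a and b, so they are required arguments here and any values passed in are purely illustrative. Clamping to the 0-1000 range, rounding to an integer, and returning 0 for non-positive input are also assumptions of this sketch.

```python
import math


def transformed_score(raw_log_odds: float, a: float, b: float) -> int:
    """Map a raw log-odds score to the 0..1000 "score" field.

    Uses the monotonic form a * log(x) + b from the description.
    The constants a and b are NOT given in the source; callers supply them.
    """
    if raw_log_odds <= 0:
        return 0  # log is undefined here; returning 0 is an assumption
    value = a * math.log(raw_log_odds) + b
    # Round, then clamp into the documented 0..1000 range.
    return int(min(1000, max(0, round(value))))
```

Because the function is monotonic for a > 0, ranking elements by the transformed score gives the same order as ranking them by raw log-odds, up to ties introduced by rounding and clamping.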
\ \This track was created using the following programs:\
The phylogenetic tree is based on Murphy et al. (2001) and general\ consensus in the vertebrate phylogeny community as of March 2007.\
\ \\ Felsenstein J, Churchill GA.\ A Hidden Markov Model approach to\ variation among sites in rate of evolution.\ Mol Biol Evol. 1996 Jan;13(1):93-104.\ PMID: 8583911\
\ \\ Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A.\ \ Detection of nonneutral substitution rates on mammalian phylogenies.\ Genome Res. 2010 Jan;20(1):110-21.\ PMID: 19858363; PMC: PMC2798823\
\ \\ Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K,\ Clawson H, Spieth J, Hillier LW, Richards S, et al.\ Evolutionarily conserved elements in vertebrate, insect, worm,\ and yeast genomes.\ Genome Res. 2005 Aug;15(8):1034-50.\ PMID: 16024819; PMC: PMC1182216\
\ \\ Siepel A, Haussler D.\ Phylogenetic Hidden Markov Models.\ In: Nielsen R, editor. Statistical Methods in Molecular Evolution.\ New York: Springer; 2005. pp. 325-351.\
\ \\ Yang Z.\ A space-time process model for the evolution of DNA\ sequences.\ Genetics. 1995 Feb;139(2):993-1005.\ PMID: 7713447; PMC: PMC1206396\
\ \\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM,\ Baertsch R, Rosenbloom K, Clawson H, Green ED, et al.\ Aligning multiple genomic sequences with the threaded blockset aligner.\ Genome Res. 2004 Apr;14(4):708-15.\ PMID: 15060014; PMC: PMC383317\
\ \\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\
\ \\ Harris RS.\ Improved pairwise alignment of genomic DNA.\ Ph.D. Thesis. Pennsylvania State University, USA. 2007.\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ \ \\ Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E,\ Ryder OA, Stanhope MJ, de Jong WW, Springer MS.\ Resolution of the early placental mammal radiation using Bayesian phylogenetics.\ Science. 2001 Dec 14;294(5550):2348-51.\ PMID: 11743200\
\ compGeno 1 compositeTrack on\ dragAndDrop subTracks\ group compGeno\ longLabel Multiz Alignment & Conservation (27 Species)\ priority 1\ shortLabel Conservation\ subGroup1 view Views align=Multiz_Alignments phyloP=Basewise_Conservation_(phyloP) phastcons=Element_Conservation_(phastCons) elements=Conserved_Elements\ track cons27way\ type bed 4\ visibility hide\ cons27wayViewelements Conserved Elements bed 4 Multiz Alignment & Conservation (27 Species) 1 1 0 0 0 127 127 127 0 0 0 compGeno 1 longLabel Multiz Alignment & Conservation (27 Species)\ parent cons27way\ shortLabel Conserved Elements\ track cons27wayViewelements\ view elements\ visibility dense\ cpgIslandExt CpG Islands bed 4 + CpG Islands (Islands < 300 Bases are Light Green) 3 1 0 100 0 128 228 128 0 0 0CpG islands are associated with genes, particularly housekeeping\ genes, in vertebrates. CpG islands are typically common near\ transcription start sites and may be associated with promoter\ regions. Normally a C (cytosine) base followed immediately by a \ G (guanine) base (a CpG) is rare in\ vertebrate DNA because the Cs in such an arrangement tend to be\ methylated. This methylation helps distinguish the newly synthesized\ DNA strand from the parent strand, which aids in the final stages of\ DNA proofreading after duplication. However, over evolutionary time,\ methylated Cs tend to turn into Ts because of spontaneous\ deamination. The result is that CpGs are relatively rare unless\ there is selective pressure to keep them or a region is not methylated\ for some other reason, perhaps having to do with the regulation of gene\ expression. CpG islands are regions where CpGs are present at\ significantly higher levels than is typical for the genome as a whole.
\ \\ The unmasked version of the track displays potential CpG islands\ that exist in repeat regions and would otherwise not be visible\ in the repeat masked version.\
\ \\ By default, only the masked version of the track is displayed. To view the\ unmasked version, change the visibility settings in the track controls at\ the top of this page.\
\ \CpG islands were predicted by searching the sequence one base at a\ time, scoring each dinucleotide (+17 for CG and -1 for others) and\ identifying maximally scoring segments. Each segment was then\ evaluated for the following criteria:\ \
\ The entire genome sequence, masking areas included, was\ used for the construction of the track Unmasked CpG.\ The track CpG Islands is constructed on the sequence after\ all masked sequence is removed.\
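The dinucleotide scoring described above (+17 for CG, -1 for others) can be sketched as a maximum-scoring-segment scan. This is a simplified illustration that returns only the single best segment; it is not the cpg_lh implementation, and the function name `best_cpg_segment` is hypothetical.

```python
def best_cpg_segment(seq: str, cg_score: int = 17, other_score: int = -1):
    """Return (score, start, end) of the highest-scoring segment.

    Each dinucleotide starting at position i scores +17 if it is "CG"
    and -1 otherwise. Coordinates are 0-based, end-exclusive, in bases.
    Returns (0, 0, 0) when no segment has a positive score.
    """
    seq = seq.upper()
    best = (0, 0, 0)
    run_score, run_start = 0, 0
    for i in range(len(seq) - 1):
        s = cg_score if seq[i:i + 2] == "CG" else other_score
        if run_score <= 0:
            # A non-positive running total cannot help; restart here.
            run_score, run_start = s, i
        else:
            run_score += s
        if run_score > best[0]:
            best = (run_score, run_start, i + 2)
    return best
```

For the toy sequence "ATCGCGAT", the scan selects the segment "CGCG" (bases 2 to 6), which scores 17 - 1 + 17 = 33.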
\ \The CpG count is the number of CG dinucleotides in the island. \ The Percentage CpG is the ratio of CpG nucleotide bases\ (twice the CpG count) to the length. The ratio of observed to expected \ CpG is calculated according to the formula (cited in \ Gardiner-Garden et al. (1987)):\ \
Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G)\ \ where N = length of sequence.\
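As a worked illustration of the statistics defined above, the following sketch computes the CpG count, the percentage CpG, and the observed/expected ratio for a plain sequence string. The function name `cpg_stats` and the returned dictionary keys are hypothetical.

```python
def cpg_stats(seq: str) -> dict:
    """Compute CpG count, percentage CpG and Obs/Exp CpG for a sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    num_c = seq.count("C")
    num_g = seq.count("G")
    num_cpg = seq.count("CG")  # number of CG dinucleotides
    # Percentage CpG: CpG bases (twice the CpG count) relative to length.
    per_cpg = 100.0 * 2 * num_cpg / n
    # Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G)
    if num_c and num_g:
        obs_exp = num_cpg * n / (num_c * num_g)
    else:
        obs_exp = 0.0  # ratio is undefined without both C and G
    return {"cpgNum": num_cpg, "perCpg": per_cpg, "obsExp": obs_exp}
```

For the toy sequence "CGCGATAT" (N = 8, two Cs, two Gs, two CG dinucleotides), the percentage CpG is 100 * 4 / 8 = 50 and the ratio is 2 * 8 / (2 * 2) = 4.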
\ The calculation of the track data is performed by the following command sequence:\
\ twoBitToFa assembly.2bit stdout | maskOutFa stdin hard stdout \\\ | cpg_lh /dev/stdin 2> cpg_lh.err \\\ | awk '{$2 = $2 - 1; width = $3 - $2; printf("%s\\t%d\\t%s\\t%s %s\\t%s\\t%s\\t%0.0f\\t%0.1f\\t%s\\t%s\\n", $1, $2, $3, $5, $6, width, $6, width*$7*0.01, 100.0*2*$6/width, $7, $9);}' \\\ | sort -k1,1 -k2,2n > cpgIsland.bed\\ The unmasked track data is constructed from\ twoBitToFa -noMask output for the twoBitToFa command.\ \ \
\ CpG islands and their associated tables can be explored interactively using the\ REST API, the\ Table Browser or the\ Data Integrator.\ All the tables can also be queried directly from our public MySQL\ servers, with more information available on our\ help page as well as on\ our blog.
\\ The source for the cpg_lh program can be obtained from\ src/utils/cpgIslandExt/.\ The cpg_lh program binary can be obtained from: http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/cpg_lh (choose "save file")\
\ \This track was generated using a modification of a program developed by G. Micklem and L. Hillier \ (unpublished).
\ \\ Gardiner-Garden M, Frommer M.\ \ CpG islands in vertebrate genomes.\ J Mol Biol. 1987 Jul 20;196(2):261-82.\ PMID: 3656447\
\ regulation 1 html cpgIslandSuper\ longLabel CpG Islands (Islands < 300 Bases are Light Green)\ parent cpgIslandSuper pack\ priority 1\ shortLabel CpG Islands\ track cpgIslandExt\ cons27wayViewphastcons Element Conservation (phastCons) bed 4 Multiz Alignment & Conservation (27 Species) 2 1 0 0 0 127 127 127 0 0 0 compGeno 1 longLabel Multiz Alignment & Conservation (27 Species)\ parent cons27way\ shortLabel Element Conservation (phastCons)\ track cons27wayViewphastcons\ view phastcons\ visibility full\ cons27wayViewalign Multiz Alignments bed 4 Multiz Alignment & Conservation (27 Species) 3 1 0 0 0 127 127 127 0 0 0 compGeno 1 longLabel Multiz Alignment & Conservation (27 Species)\ parent cons27way\ shortLabel Multiz Alignments\ track cons27wayViewalign\ view align\ viewUi on\ visibility pack\ ncbiRefSeq RefSeq All genePred NCBI RefSeq genes, curated and predicted (NM_*, XM_*, NR_*, XR_*, NP_*, YP_*) 1 1 12 12 120 133 133 187 0 0 0 genes 1 baseColorDefault genomicCodons\ baseColorUseCds given\ color 12,12,120\ idXref ncbiRefSeqLink mrnaAcc name\ longLabel NCBI RefSeq genes, curated and predicted (NM_*, XM_*, NR_*, XR_*, NP_*, YP_*)\ parent refSeqComposite on\ priority 1\ shortLabel RefSeq All\ track ncbiRefSeq\ ReMapDensity ReMap density bigWig ReMap density 0 1 0 0 0 127 127 127 0 0 0\ This track represents the ReMap Atlas of regulatory regions, which consists of a\ large-scale integrative analysis of all Public ChIP-seq data for transcriptional\ regulators from GEO, ArrayExpress, and ENCODE. \
\ \\ Below is a schematic diagram of the types of regulatory regions: \
\ This 4th release of ReMap (2022) presents the analysis of 1,206 quality\ controlled ChIP-seq (n=1,315 before QC) data sets from public sources (GEO,\ ENCODE). Those ChIP-seq data sets have been mapped to the dm6 Drosophila\ assembly. A data set is defined as a ChIP-seq experiment in a given series\ (e.g. GSE107059), for a given TF (e.g. Trl), in a particular biological\ condition (i.e. cell line, tissue type, disease state, or experimental conditions;\ e.g. Schneider-2). Data sets were labeled by concatenating these three pieces of\ information, such as GSE107059.Trl.Schneider-2.\
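The labeling scheme above can be sketched as a pair of helper functions. The names `remap_label` and `parse_remap_label` are hypothetical, and the parser assumes that neither the series accession nor the TF name contains a period.

```python
def remap_label(series: str, tf: str, condition: str) -> str:
    """Build a data set label of the form series.TF.condition."""
    return ".".join((series, tf, condition))


def parse_remap_label(label: str) -> tuple[str, str, str]:
    """Split a label back into (series, TF, condition).

    Assumption: only the condition may itself contain periods, so the
    label is split on the first two periods only.
    """
    series, tf, condition = label.split(".", 2)
    return series, tf, condition
```

For the example in the text, `remap_label("GSE107059", "Trl", "Schneider-2")` gives `GSE107059.Trl.Schneider-2`.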
\Those merged analyses cover a total of 550 DNA-binding proteins\ (transcriptional regulators) such as a variety of transcription factors (TFs),\ transcription co-activators (TCFs), and chromatin-remodeling factors (CRFs) for\ 16 million peaks.\
\ \ \\ Available ENCODE ChIP-seq data sets for transcriptional regulators from the\ ENCODE portal were processed with the\ standardized ReMap pipeline. The list of ENCODE data was retrieved as FASTQ files from the\ ENCODE portal\ using filters. Metadata information in JSON format and FASTQ files were retrieved using the Python\ requests module.\
\ \ \ \\ Both public and ENCODE data were processed similarly. Bowtie 2 (PMC3322381) (version 2.2.9) with options --end-to-end --sensitive was used to align all\ reads on the genome. Biological and technical\ replicates for each unique combination of GSE/TF/Cell type or Biological condition\ were used for peak calling. TFBS were identified using the MACS2 peak-calling tool\ (PMC3120977) (version 2.1.1.2) in order to follow ENCODE ChIP-seq guidelines,\ with stringent thresholds (MACS2 default thresholds, p-value: 1e-5). An input data\ set was used when available.\
\ \ \\ To assess the quality of public data sets, a score was computed based on the\ cross-correlation and the FRiP (fraction of reads in peaks) metrics developed by\ the ENCODE Consortium (https://genome.ucsc.edu/ENCODE/qualityMetrics.html). Two\ thresholds were defined for each of the two cross-correlation ratios (NSC,\ normalized strand coefficient: 1.05 and 1.10; RSC, relative strand coefficient:\ 0.8 and 1.0). Detailed descriptions of the ENCODE quality coefficients can be\ found at https://genome.ucsc.edu/ENCODE/qualityMetrics.html. The\ phantompeak tools suite was used\ (https://code.google.com/p/phantompeakqualtools/) to compute\ RSC and NSC.\
\\ Please refer to the ReMap 2022, 2020, and 2018 publications for more details\ (citation below).\
\ \ \ \\ ReMap Atlas of regulatory regions data can be explored interactively with the\ Table Browser and cross-referenced with the \ Data Integrator. For programmatic access,\ the track can be accessed using the Genome Browser's\ REST API.\ ReMap annotations can be downloaded from the\ Genome Browser's download server\ as a bigBed file. This compressed binary format can be remotely queried through\ command line utilities. Please note that some of the download files can be quite large.
\ \\ Individual BED files for specific TFs, cells/biotypes, or data sets can be\ found and downloaded on the ReMap website.\
\ \\ Chèneby J, Gheorghe M, Artufel M, Mathelier A, Ballester B.\ \ ReMap 2018: an updated atlas of regulatory regions from an integrative analysis of DNA-binding ChIP-\ seq experiments.\ Nucleic Acids Res. 2018 Jan 4;46(D1):D267-D275.\ PMID: 29126285; PMC: PMC5753247\
\\ Chèneby J, Ménétrier Z, Mestdagh M, Rosnet T, Douida A, Rhalloussi W, Bergon A, Lopez\ F, Ballester B.\ \ ReMap 2020: a database of regulatory regions from an integrative analysis of Human and Arabidopsis\ DNA-binding sequencing experiments.\ Nucleic Acids Res. 2020 Jan 8;48(D1):D180-D188.\ PMID: 31665499; PMC: PMC7145625\
\\ Griffon A, Barbier Q, Dalino J, van Helden J, Spicuglia S, Ballester B.\ \ Integrative analysis of public ChIP-seq experiments reveals a complex multi-cell regulatory\ landscape.\ Nucleic Acids Res. 2015 Feb 27;43(4):e27.\ PMID: 25477382; PMC: PMC4344487\
\\ Hammal F, de Langen P, Bergon A, Lopez F, Ballester B.\ \ ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an\ integrative analysis of DNA-binding sequencing experiments.\ Nucleic Acids Res. 2022 Jan 7;50(D1):D316-D325.\ PMID: 34751401; PMC: PMC8728178\
\ regulation 0 autoScale on\ bigDataUrl /gbdb/dm6/reMap/reMapDensity2022.bw\ html reMap\ longLabel ReMap density\ parent ReMap on\ priority 1\ shortLabel ReMap density\ track ReMapDensity\ type bigWig\ visibility hide\ unipAliSwissprot SwissProt Aln. bigPsl UCSC alignment of SwissProt proteins to genome (dark blue: main isoform, light blue: alternative isoforms) 3 1 0 0 0 127 127 127 0 0 0 genes 1 baseColorDefault genomicCodons\ baseColorTickColor contrastingColor\ baseColorUseCds given\ bigDataUrl /gbdb/dm6/uniprot/unipAliSwissprot.bb\ indelDoubleInsert on\ indelQueryInsert on\ itemRgb on\ labelFields name,acc,uniprotName,geneName,hgncSym,refSeq,refSeqProt,ensProt\ longLabel UCSC alignment of SwissProt proteins to genome (dark blue: main isoform, light blue: alternative isoforms)\ mouseOverField protFullNames\ parent uniprot\ priority 1\ searchIndex name,acc\ shortLabel SwissProt Aln.\ showDiffBasesAllScales on\ skipFields isMain\ track unipAliSwissprot\ type bigPsl\ urls acc="https://www.uniprot.org/uniprot/$$" hgncId="https://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=$$" refSeq="https://www.ncbi.nlm.nih.gov/nuccore/$$" refSeqProt="https://www.ncbi.nlm.nih.gov/protein/$$" ncbiGene="https://www.ncbi.nlm.nih.gov/gene/$$" entrezGene="https://www.ncbi.nlm.nih.gov/gene/$$" ensGene="https://www.ensembl.org/Gene/Summary?g=$$"\ visibility pack\ netDroSim2 droSim2 Net netAlign droSim2 chainDroSim2 D. simulans (Sep. 2014 (ASM75419v2/droSim2)) Alignment Net 1 2 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. simulans (Sep. 
2014 (ASM75419v2/droSim2)) Alignment Net\ otherDb droSim2\ parent insectsChainNetViewnet off\ shortLabel droSim2 Net\ subGroups view=net species=s000a clade=c00\ track netDroSim2\ type netAlign droSim2 chainDroSim2\ cons124wayViewphyloP Basewise Conservation (phyloP) bed 4 Multiz Alignment & Conservation (124 insects) 2 2 0 0 0 127 127 127 0 0 0 compGeno 1 longLabel Multiz Alignment & Conservation (124 insects)\ parent cons124way\ shortLabel Basewise Conservation (phyloP)\ track cons124wayViewphyloP\ view phyloP\ viewLimits -20.0:9.869\ viewLimitsMax -20:0.869\ visibility full\ cons124way Cons 124 Insects bed 4 Multiz Alignment & Conservation (124 insects) 2 2 0 0 0 127 127 127 0 0 0\ This track shows multiple alignments of 124 insects and measurements of\ evolutionary conservation using\ two methods (phastCons and phyloP) from the\ \ PHAST package, for all 124 species.\ The multiple alignments were generated using multiz and\ other tools in the UCSC/Penn State Bioinformatics\ comparative genomics alignment pipeline.\ Conserved elements identified by phastCons are also displayed in\ this track.\
\\ The phylogenetic tree was derived by counting the k-mers in common\ between the sequences to obtain a 'distance' matrix, then applying the\ phylip 'neighbors' operation, a simple neighbor-joining\ algorithm, to build this binary tree. This tree is not necessarily\ biologically correct, but it does serve as a useful guide tree for the\ multiz alignment procedure. See also:\ Phylip distance operations\
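The k-mer counting step can be illustrated with a minimal sketch. The description does not say how shared k-mer counts were turned into distances, so the Jaccard-style distance used here (one minus shared k-mers over all distinct k-mers) is an assumption, as are the function names and the choice of k. The resulting matrix would then be passed to a neighbor-joining program, which is not shown.

```python
from itertools import combinations


def kmer_set(seq: str, k: int) -> set[str]:
    """Return the set of distinct k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distances(seqs: dict[str, str], k: int) -> dict[tuple[str, str], float]:
    """Pairwise distances from shared k-mers.

    Assumption: distance = 1 - |shared k-mers| / |all distinct k-mers|.
    The source only says k-mers in common were counted.
    """
    sets = {name: kmer_set(seq, k) for name, seq in seqs.items()}
    dist = {}
    for a, b in combinations(sorted(sets), 2):
        union = sets[a] | sets[b]
        shared = sets[a] & sets[b]
        dist[(a, b)] = 1.0 - len(shared) / len(union) if union else 0.0
    return dist
```

Identical sequences get a distance of 0 and sequences with no k-mers in common get a distance of 1, which is the kind of input a guide-tree construction needs even when the tree is not biologically exact.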
\\ PhastCons (which has been used in previous Conservation tracks) is a hidden\ Markov model-based method that estimates the probability that each\ nucleotide belongs to a conserved element, based on the multiple alignment.\ It considers not just each individual alignment column, but also its\ flanking columns. By contrast, phyloP separately measures conservation at\ individual columns, ignoring the effects of their neighbors. As a\ consequence, the phyloP plots have a less smooth appearance than the\ phastCons plots, with more "texture" at individual sites. The two methods\ have different strengths and weaknesses. PhastCons is sensitive to "runs"\ of conserved sites, and is therefore effective for picking out conserved\ elements. PhyloP, on the other hand, is more appropriate for evaluating\ signatures of selection at particular nucleotides or classes of nucleotides\ (e.g., third codon positions, or first positions of miRNA target sites).\
\\ Another important difference is that phyloP can measure acceleration\ (faster evolution than expected under neutral drift) as well as\ conservation (slower than expected evolution). In the phyloP plots, sites\ predicted to be conserved are assigned positive scores (and shown in blue),\ while sites predicted to be fast-evolving are assigned negative scores (and\ shown in red). The absolute values of the scores represent -log p-values\ under a null hypothesis of neutral evolution. The phastCons scores, by\ contrast, represent probabilities of negative selection and range between 0\ and 1.\
\\ Both phastCons and phyloP treat alignment gaps and unaligned nucleotides as\ missing data.\
\\ See also: lastz parameters and other details, and\ chain minimum score and gap parameters used in these alignments.\
\ \\ Missing sequence in the assemblies is highlighted in the track display\ by regions of yellow when zoomed out and Ns displayed at base\ level (see Gap Annotation, below).
\\
\
\ Organism Species \Assembly name \browser or NCBI source \alignment type
\ D. melanogaster Drosophila melanogaster \Aug. 2014 (BDGP Release 6 + ISO1 MT/dm6) \Aug. 2014 (BDGP Release 6 + ISO1 MT/dm6) \reference
\ A. albimanus Anopheles albimanus \Aug. 2017 (Anop_albi_ALBI9_A_V2) \GCA_000349125.2 \net
\ A. aquasalis Anopheles aquasalis \Dec. 2017 (A_aquasalis_v1.0) \GCA_002846955.1 \net
\ A. arabiensis Anopheles arabiensis \Apr. 2013 (Anop_arab_DONG5_A_V1) \GCA_000349185.1 \net
\ A. atroparvus Anopheles atroparvus \Sep. 2013 (Anop_atro_EBRO_V1) \GCA_000473505.1 \net
\ A. christyi Anopheles christyi \Apr. 2013 (Anop_chri_ACHKN1017_V1) \GCA_000349165.1 \net
\ A. coluzzii Anopheles coluzzii \Apr. 2008 (m5) \GCA_000150765.1 \net
\ A. cracens Anopheles cracens \Apr. 2017 (ASM209184v1) \GCA_002091845.1 \net
\ A. culicifacies Anopheles culicifacies \Sep. 2013 (Anop_culi_species_A-37_1_V1) \GCA_000473375.1 \net
\ A. darlingi Anopheles darlingi \Dec. 2013 (A_darlingi_v1) \GCA_000211455.3 \net
\ A. dirus Anopheles dirus \Mar. 2013 (Anop_diru_WRAIR2_V1) \GCA_000349145.1 \net
\ A. epiroticus Anopheles epiroticus \Mar. 2013 (Anop_epir_epiroticus2_V1) \GCA_000349105.1 \net
\ A. farauti Anopheles farauti \Jan. 2014 (Anop_fara_FAR1_V2) \GCA_000473445.2 \net
\ A. farauti_No4 Anopheles farauti No. 4 \Mar. 2015 (ASM95621v1) \GCA_000956215.1 \net
\ A. funestus Anopheles funestus \Mar. 2013 (Anop_fune_FUMOZ_V1) \GCA_000349085.1 \net
\ A. gambiae Anopheles gambiae \Oct. 2006 (AgamP3/anoGam3) \Oct. 2006 (AgamP3/anoGam3) \net
\ A. gambiae_1 Anopheles gambiae str. PEST \Oct. 2006 (AgamP3) \GCF_000005575.2 \net
\ A. koliensis Anopheles koliensis \Mar. 2015 (ASM95627v1) \GCA_000956275.1 \net
\ A. maculatus Anopheles maculatus \Apr. 2017 (ASM209183v1) \GCA_002091835.1 \net
\ A. melas Anopheles melas \Jan. 2014 (Anop_mela_CM1001059_A_V2) \GCA_000473525.2 \net
\ A. mellifera Apis mellifera \04 Nov 2010 (Amel_4.5/apiMel4) \04 Nov 2010 (Amel_4.5/apiMel4) \net
\ A. merus Anopheles merus \Jan. 2014 (Anop_meru_MAF_V1) \GCA_000473845.2 \net
\ A. minimus Anopheles minimus \Mar. 2013 (Anop_mini_MINIMUS1_V1) \GCA_000349025.1 \net
\ A. nili Anopheles nili \Jul. 2013 (Anili1) \GCA_000439205.1 \net
\ A. punctulatus Anopheles punctulatus \Mar. 2015 (ASM95625v1) \GCA_000956255.1 \net
\ A. quadriannulatus Anopheles quadriannulatus \Mar. 2013 (Anop_quad_QUAD4_A_V1) \GCA_000349065.1 \net
\ A. sinensis Anopheles sinensis \Jul. 2014 (AS2) \GCA_000441895.2 \net
\ A. stephensi Anopheles stephensi \Sep. 2018 (ASM344897v1) \GCA_003448975.1 \net
\ Aedes_aegypti Aedes aegypti \Jun. 2017 (AaegL5.0) \GCF_002204515.2 \net
\ Aedes_albopictus Aedes albopictus \Jan. 2017 (canu_80X_arrow2.2) \GCF_001876365.2 \net
\ Bactrocera_dorsalis Bactrocera dorsalis \Dec. 2014 (ASM78921v2) \GCF_000789215.1 \net
\ Bactrocera_latifrons Bactrocera latifrons \Oct. 2016 (ASM185335v1) \GCF_001853355.1 \net
\ Bactrocera_oleae Bactrocera oleae \Jul. 2015 (gapfilled_joined_lt9474.gt500.covgt10) \GCF_001188975.1 \net
\ Bactrocera_tryoni Bactrocera tryoni \May 2014 (Assembly_2.2_of_Bactrocera_tryoni_genome) \GCA_000695345.1 \net
\ Belgica_antarctica Belgica antarctica \Sep. 2014 (ASM77530v1) \GCA_000775305.1 \net
\ Calliphora_vicina Calliphora vicina \Jun. 2015 (ASM101727v1) \GCA_001017275.1 \net
\ Ceratitis_capitata Ceratitis capitata \Nov. 2017 (Ccap_2.1) \GCF_000347755.3 \net
\ Chaoborus_trivitattus Chaoborus trivitattus \May 2015 (ASM101481v1) \GCA_001014815.1 \net
\ Chironomus_riparius Chironomus riparius \May 2015 (ASM101450v1) \GCA_001014505.1 \net
\ Chironomus_tentans Chironomus tentans \Nov. 2014 (CT01) \GCA_000786525.1 \net
\ Cirrula_hians Cirrula hians \May 2015 (ASM101507v1) \GCA_001015075.1 \net
\ Clogmia_albipunctata Clogmia albipunctata \May 2015 (ASM101494v1) \GCA_001014945.1 \net
\ Clunio_marinus Clunio marinus \Nov. 2016 (CLUMA_1.0) \GCA_900005825.1 \net
\ Coboldia_fuscipes Coboldia fuscipes \May 2015 (ASM101433v1) \GCA_001014335.1 \net
\ Condylostylus_patibulatus Condylostylus patibulatus \May 2015 (ASM101487v1) \GCA_001014875.1 \net
\ Culex_quinquefasciatus Culex quinquefasciatus \Apr. 2007 (CulPip1.0) \GCF_000209185.1 \net
\ Culicoides_sonorensis Culicoides sonorensis \Feb. 2018 (Cson_Genome_version_2.0) \GCA_900258525.2 \net
\ D. albomicans Drosophila albomicans \21 May 2012 (DroAlb_1.0/droAlb1) \21 May 2012 (DroAlb_1.0/droAlb1) \net
\ D. americana Drosophila americana \Oct. 2015 (D._americana_H5_strain_genome_assembly) \GCA_001245395.1 \net
\ D. ananassae Drosophila ananassae \Feb. 2006 (Agencourt CAF1/droAna3) \Feb. 2006 (Agencourt CAF1/droAna3) \syntenic
\ D. arizonae Drosophila arizonae \May 2016 (ASM165402v1) \GCF_001654025.1 \syntenic
\ D. athabasca Drosophila athabasca \Jun. 2018 (ASM318502v1) \GCA_003185025.1 \syntenic
\ D. biarmipes Drosophila biarmipes \04 Mar 2013 (Dbia_2.0/droBia2) \04 Mar 2013 (Dbia_2.0/droBia2) \syntenic
\ D. bipectinata Drosophila bipectinata \04 Mar 2013 (Dbip_2.0/droBip2) \04 Mar 2013 (Dbip_2.0/droBip2) \net
\ D. busckii Drosophila busckii \Sep. 2015 (ASM127793v1) \GCF_001277935.1 \syntenic
\ D. elegans Drosophila elegans \04 Mar 2013 (Dele_2.0/droEle2) \04 Mar 2013 (Dele_2.0/droEle2) \net
\ D. erecta Drosophila erecta \Feb. 2006 (Agencourt CAF1/droEre2) \Feb. 2006 (Agencourt CAF1/droEre2) \syntenic
\ D. eugracilis Drosophila eugracilis \04 Mar 2013 (Deug_2.0/droEug2) \04 Mar 2013 (Deug_2.0/droEug2) \net
\ D. ficusphila Drosophila ficusphila \04 Mar 2013 (Dfic_2.0/droFic2) \04 Mar 2013 (Dfic_2.0/droFic2) \net
\ D. grimshawi Drosophila grimshawi \Feb. 2006 (Agencourt CAF1/droGri2) \Feb. 2006 (Agencourt CAF1/droGri2) \syntenic
\ D. hydei Drosophila hydei \Nov. 2017 (ASM278046v1) \GCF_002780465.1 \net
\ D. kikkawai Drosophila kikkawai \04 Mar 2013 (Dkik_2.0/droKik2) \04 Mar 2013 (Dkik_2.0/droKik2) \net
\ D. miranda Drosophila miranda \19 Apr 2013 (DroMir_2.2/droMir2) \19 Apr 2013 (DroMir_2.2/droMir2) \syntenic
\ D. mojavensis Drosophila mojavensis \Feb. 2006 (Agencourt CAF1/droMoj3) \Feb. 2006 (Agencourt CAF1/droMoj3) \syntenic
\ D. montana Drosophila montana \May 2018 (ASM308661v1) \GCA_003086615.1 \net
\ D. nasuta Drosophila nasuta \Jul. 2017 (ASM222288v1) \GCA_002222885.1 \net
\ D. navojoa Drosophila navojoa \May 2016 (ASM165401v1) \GCF_001654015.1 \syntenic
\ D. novamexicana Drosophila novamexicana \Jul. 2018 (DnovRS1) \GCA_003285875.1 \syntenic
\ D. obscura Drosophila obscura \Jul. 2017 (Dobs_1.0) \GCF_002217835.1 \net
\ D. persimilis Drosophila persimilis \Oct. 2005 (Broad/droPer1) \Oct. 2005 (Broad/droPer1) \net
\ D. pseudoobscura Drosophila pseudoobscura pseudoobscura \11 Apr 2013 (Dpse_3.0/droPse3) \11 Apr 2013 (Dpse_3.0/droPse3) \syntenic
\ D. pseudoobscura_1 Drosophila pseudoobscura pseudoobscura \Apr. 2013 (Dpse_3.0) \GCF_000001765.3 \net
\ D. rhopaloa Drosophila rhopaloa \22 Feb 2013 (Drho_2.0/droRho2) \22 Feb 2013 (Drho_2.0/droRho2) \net
\ D. sechellia Drosophila sechellia \Oct. 2005 (Broad/droSec1) \Oct. 2005 (Broad/droSec1) \syntenic
\ D. serrata Drosophila serrata \Apr. 2017 (Dser1.0) \GCF_002093755.1 \net
\ D. simulans Drosophila simulans \Sep. 2014 (ASM75419v2/droSim2) \Sep. 2014 (ASM75419v2/droSim2) \syntenic
\ D. subobscura Drosophila subobscura \Nov. 2017 (Dsub_1.0) \GCA_002749795.1 \net
\ D. suzukii Drosophila suzukii \30 Sep 2013 (Dsuzukii.v01/droSuz1) \30 Sep 2013 (Dsuzukii.v01/droSuz1) \net
\ D. takahashii Drosophila takahashii \04 Mar 2013 (Dtak_2.0/droTak2) \04 Mar 2013 (Dtak_2.0/droTak2) \net
\ D. virilis Drosophila virilis \Feb. 2006 (Agencourt CAF1/droVir3) \Feb. 2006 (Agencourt CAF1/droVir3) \syntenic
\ D. willistoni Drosophila willistoni \03 Aug 2006 (dwil_caf1/droWil2) \03 Aug 2006 (dwil_caf1/droWil2) \syntenic
\ D. yakuba Drosophila yakuba \27 Jun 2006 (dyak_caf1/droYak3) \27 Jun 2006 (dyak_caf1/droYak3) \syntenic
\ Ephydra_gracilis Ephydra gracilis \May 2015 (ASM101467v1) \GCA_001014675.1 \net
\ Eristalis_dimidiata Eristalis dimidiata \May 2015 (ASM101514v1) \GCA_001015145.1 \net
\ Eutreta_diana Eutreta diana \May 2015 (ASM101511v1) \GCA_001015115.1 \net
\ Glossina_austeni Glossina austeni \May 2014 (Glossina_austeni-1.0.3) \GCA_000688735.1 \net
\ Glossina_brevipalpis Glossina brevipalpis \May 2014 (Glossina_brevipalpis_1.0.3) \GCA_000671755.1 \net
\ Glossina_fuscipes Glossina fuscipes fuscipes \May 2014 (Glossina_fuscipes-3.0.2) \GCA_000671735.1 \net
\ Glossina_morsitans_1 Glossina morsitans \May 2015 (ASM101451v1) \GCA_001014515.1 \net
\ Glossina_morsitans_2 Glossina morsitans morsitans \Mar. 2014 (ASM107743v1) \GCA_001077435.1 \net
\ Glossina_pallidipes Glossina pallidipes \May 2014 (Glossina_pallidipes-1.0.3) \GCA_000688715.1 \net
\ Glossina_palpalis_gambiensis Glossina palpalis gambiensis \Jan. 2015 (Glossina_palpalis_gambiensis-2.0.1) \GCA_000818775.1 \net
\ Haematobia_irritans Haematobia irritans \May 2018 (Hi_v1.0) \GCA_003123925.1 \net
\ Hermetia_illucens Hermetia illucens \May 2015 (ASM101489v1) \GCA_001014895.1 \net
\ Holcocephala_fusca Holcocephala fusca \May 2015 (ASM101521v1) \GCA_001015215.1 \net
\ Liriomyza_trifolii Liriomyza trifolii \May 2015 (ASM101493v1) \GCA_001014935.1 \net
\ Lucilia_cuprina Lucilia cuprina \Dec. 2017 (Lcup_2.0) \GCF_000699065.1 \net
\ Lucilia_sericata Lucilia sericata \May 2015 (ASM101483v1) \GCA_001014835.1 \net
\ Lutzomyia_longipalpis Lutzomyia longipalpis \Jun. 2012 (Llon_1.0) \GCA_000265325.1 \net
\ M. domestica Musca domestica \22 Apr 2013 (Musca_domestica-2.0.2/musDom2) \22 Apr 2013 (Musca_domestica-2.0.2/musDom2) \net
\ Mayetiola_destructor Mayetiola destructor \Oct. 2010 (Mdes_1.0) \GCA_000149185.1 \net
\ Megaselia_abdita Megaselia abdita \May 2015 (ASM101517v1) \GCA_001015175.1 \net
\ Megaselia_scalaris Megaselia scalaris \Mar. 2013 (ASM34191v2) \GCA_000341915.2 \net
\ Mochlonyx_cinctipes Mochlonyx cinctipes \May 2015 (ASM101484v1) \GCA_001014845.1 \net
\ Neobellieria_bullata Neobellieria bullata \Jun. 2015 (ASM101745v1) \GCA_001017455.1 \net
\ Paykullia_maculata Paykullia maculata \Apr. 2018 (ASM305512v1) \GCA_003055125.1 \net
\ Phlebotomus_papatasi Phlebotomus papatasi \May 2012 (Ppap_1.0) \GCA_000262795.1 \net
\ Phormia_regina Phormia regina \Sep. 2016 (ASM173554v1) \GCA_001735545.1 \net
\ Phortica_variegata Phortica variegata \May 2015 (ASM101441v1) \GCA_001014415.1 \net
\ Proctacanthus_coquilletti Proctacanthus coquilletti \Jan. 2017 (200kmer_750.trimmed) \GCA_001932985.1 \net
\ Rhagoletis_zephyria Rhagoletis zephyria \Jul. 2016 (Rhagoletis_zephyria_1.0) \GCF_001687245.1 \net
\ Sarcophagidae_BV_2014 Sarcophagidae sp. BV-2014 \Jul. 2015 (ASM104719v1) \GCA_001047195.1 \net
\ Scaptodrosophila_lebanonensis Scaptodrosophila lebanonensis \Jul. 2018 (SlebRS1) \GCA_003285725.1 \net
\ Sphyracephala_brevicornis Sphyracephala brevicornis \May 2015 (ASM101523v1) \GCA_001015235.1 \net
\ Stomoxys_calcitrans Stomoxys calcitrans \May 2015 (Stomoxys_calcitrans-1.0.1) \GCF_001015335.1 \net
\ T. castaneum Tribolium castaneum \Sep. 2005 (Baylor 2.0/triCas2) \Sep. 2005 (Baylor 2.0/triCas2) \net
\ Teleopsis_dalmanni Teleopsis dalmanni \Jul. 2017 (Tel_dalmanni_2A_v1.0) \GCA_002237135.1 \net
\ Tephritis_californica Tephritis californica \Jun. 2015 (ASM101751v1) \GCA_001017515.1 \net
\ Themira_minor Themira minor \May 2015 (ASM101457v1) \GCA_001014575.1 \net
\ Tipula_oleracea Tipula oleracea \Jun. 2015 (ASM101753v1) \GCA_001017535.1 \net
\ Trichoceridae_BV_2014 Trichoceridae sp. BV-2014 \May 2015 (ASM101442v1) \GCA_001014425.1 \net
\ Trupanea_jonesi Trupanea jonesi \May 2015 (ASM101466v1) \GCA_001014665.1 \net
\ Zaprionus_indianus Zaprionus indianus \Oct. 2016 (ZP_IN_1.0) \GCA_001752445.1 \net
\ Zeugodacus_cucurbitae Zeugodacus cucurbitae \Dec. 2014 (ASM80634v1) \GCF_000806345.1 \net
\ \ Downloads for data in this track are available:\\
\ \- \ Multiz alignments (MAF format), and phylogenetic trees\
- \ PhyloP conservation (WIG format)\
- \ PhastCons conservation (WIG format)\
Display Conventions and Configuration
\\ The track configuration options allow the user to display the different\ clade sets of scores (all, Brachycera, Nematocera or Holometabola)\ individually or all simultaneously.\ In full and pack display modes, conservation scores are displayed as a\ wiggle track (histogram) in which the height reflects the\ value of the score.\ The conservation wiggles can be configured in a variety of ways to\ highlight different aspects of the displayed information.\ Click the Graph configuration help link for an explanation\ of the configuration options.
\\ Pairwise alignments of each species to the D. melanogaster genome are\ displayed below the conservation histogram as a grayscale density plot (in\ pack mode) or as a wiggle (in full mode) that indicates alignment quality.\ In dense display mode, conservation is shown in grayscale using\ darker values to indicate higher levels of overall conservation\ as scored by phastCons.
\\ Checkboxes on the track configuration page allow selection of the\ species to include in the pairwise display.\ Configuration buttons are available to select all of the species\ (Set all), deselect all of the species (Clear all), or\ use the default settings (Set defaults).\ Note that excluding species from the pairwise display does not alter\ the conservation score display.
\\ To view detailed information about the alignments at a specific\ position, zoom the display in to 30,000 or fewer bases, then click on\ the alignment.
\ \Gap Annotation
\\ The Display chains between alignments configuration option\ enables display of gaps between alignment blocks in the pairwise alignments in\ a manner similar to the Chain track display. The following\ conventions are used:\
\
\ \- Single line: No bases in the aligned species. Possibly due to a\ lineage-specific insertion between the aligned blocks in the D. melanogaster genome\ or a lineage-specific deletion between the aligned blocks in the aligning\ species.\
- Double line: Aligning species has one or more unalignable bases in\ the gap region. Possibly due to excessive evolutionary distance between\ species or independent indels in the region between the aligned blocks in both\ species.\
- Pale yellow coloring: Aligning species has Ns in the gap region.\ Reflects uncertainty in the relationship between the DNA of both species, due\ to lack of sequence in relevant portions of the aligning species.\
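The three gap conventions above can be summarized as a small decision rule. The following is a minimal sketch, not the browser's drawing code; the function name `gap_style` and its two inputs are hypothetical, and it assumes that any N in the gap region is enough to trigger the pale yellow convention.

```python
def gap_style(unaligned_bases: int, n_bases: int) -> str:
    """Pick the display convention for a gap between two alignment blocks.

    unaligned_bases: bases the aligning species has inside the gap region
    n_bases: how many of those bases are N (assumed input, not a browser field)
    """
    if unaligned_bases < 0 or n_bases < 0 or n_bases > unaligned_bases:
        raise ValueError("inconsistent base counts")
    if unaligned_bases == 0:
        # No bases in the aligned species.
        return "single line"
    if n_bases > 0:
        # Aligning species has Ns in the gap region.
        return "pale yellow"
    # One or more unalignable bases, none of them N.
    return "double line"
```

For example, `gap_style(0, 0)` gives the single-line case and `gap_style(12, 5)` the pale yellow case.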
Genomic Breaks
\\ Discontinuities in the genomic context (chromosome, scaffold or region) of the\ aligned DNA in the aligning species are shown as follows:\
\
\ \- \ Vertical blue bar: Represents a discontinuity that persists indefinitely\ on either side, e.g. a large region of DNA on either side of the bar\ comes from a different chromosome in the aligned species due to a large scale\ rearrangement.\
- \ Green square brackets: Enclose shorter alignments consisting of DNA from\ one genomic context in the aligned species nested inside a larger chain of\ alignments from a different genomic context. The alignment within the\ brackets may represent a short misalignment, a lineage-specific insertion of a\ transposon in the D. melanogaster genome that aligns to a paralogous copy somewhere\ else in the aligned species, or other similar occurrence.\
Base Level
\\ When zoomed-in to the base-level display, the track shows the base\ composition of each alignment.\ The numbers and symbols on the Gaps\ line indicate the lengths of gaps in the D. melanogaster sequence at those\ alignment positions relative to the longest non-D. melanogaster sequence.\ If there is sufficient space in the display, the size of the gap is shown.\ If the space is insufficient and the gap size is a multiple of 3, a\ "*" is displayed; other gap sizes are indicated by "+".
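The labelling rule for the Gaps line can be written out as follows. This is an illustrative sketch only; `gap_label` and the `available_chars` parameter (the number of character cells free at that position) are hypothetical names, not part of the browser code.

```python
def gap_label(gap_size: int, available_chars: int) -> str:
    """Return the text shown on the Gaps line for one gap."""
    text = str(gap_size)
    if len(text) <= available_chars:
        # Enough room: show the gap size itself.
        return text
    # Not enough room: "*" marks a multiple of 3, "+" any other size.
    return "*" if gap_size % 3 == 0 else "+"
```

A gap of 12 bases with a single free cell is shown as `*`, while a gap of 10 bases in the same space is shown as `+`.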
\\ Codon translation is available in base-level display mode if the\ displayed region is identified as a coding segment. To display this annotation,\ select the species for translation from the pull-down menu in the Codon\ Translation configuration section at the top of the page. Then, select one of\ the following modes:\
\
\- \ No codon translation: The gene annotation is not used; the bases are\ displayed without translation.\
- \ Use default species reading frames for translation: The annotations from\ the genome displayed in the Default species to establish reading frame\ pull-down menu are used to translate all the aligned species present in the\ alignment.\
- \ Use reading frames for species if available, otherwise no translation:\ Codon translation is performed only for those species where the region is\ annotated as protein coding.\
- Use reading frames for species if available, otherwise use default species:\ Codon translation is done on those species that are annotated as being protein\ coding over the aligned region using species-specific annotation; the remaining\ species are translated using the default species annotation.\
\ Codon translation uses the following gene tracks as the basis for\ translation, depending on the species chosen (Table 2).\ \
\ \\
\ Table 2. Gene tracks used for codon translation.\
\ Gene Track \Species
\ NCBI RefSeq Genes \D. persimilis
\ Ensembl Genes v68 \D. erecta, D. ananassae, D. melanogaster
\ Xeno RefGene \D. sechellia
\ no annotations \all others
Methods
\\ Pairwise alignments with the D. melanogaster genome were generated for\ each species using lastz from repeat-masked genomic sequence.\ Pairwise alignments were then linked into chains using a dynamic programming\ algorithm that finds maximally scoring chains of gapless subsections\ of the alignments organized in a kd-tree.\ Please note the specific parameters for the alignments.\ High-scoring chains were then placed along the genome, with\ gaps filled by lower-scoring chains, to produce an alignment net.\ For more information about the chaining and netting process\ for each species, see the description pages for the Chain and Net\ tracks.
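To illustrate the chaining idea, the sketch below finds the highest-scoring chain of gapless blocks with a simple quadratic dynamic program. It is a simplification under stated assumptions: it does not use a kd-tree, the linear `gap_cost` is a hypothetical penalty, and the `Block` type is invented for the example. It is not the axtChain implementation.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    t_start: int   # start in the target (D. melanogaster) sequence
    t_end: int     # end in the target, exclusive
    q_start: int   # start in the query sequence
    q_end: int     # end in the query, exclusive
    score: float


def best_chain(blocks: List[Block], gap_cost: float = 0.1) -> Tuple[float, List[Block]]:
    """Return the score and members of the maximally scoring chain."""
    if not blocks:
        return 0.0, []
    ordered = sorted(blocks, key=lambda b: (b.t_start, b.q_start))
    best = [b.score for b in ordered]   # best chain score ending at each block
    prev = [-1] * len(ordered)          # back-pointers for traceback
    for i, cur in enumerate(ordered):
        for j in range(i):
            p = ordered[j]
            # A predecessor must end before the current block in both sequences.
            if p.t_end <= cur.t_start and p.q_end <= cur.q_start:
                gap = (cur.t_start - p.t_end) + (cur.q_start - p.q_end)
                candidate = best[j] + cur.score - gap_cost * gap
                if candidate > best[i]:
                    best[i] = candidate
                    prev[i] = j
    end = max(range(len(ordered)), key=best.__getitem__)
    chain = []
    i = end
    while i != -1:
        chain.append(ordered[i])
        i = prev[i]
    chain.reverse()
    return best[end], chain
```

Blocks that overlap or are out of order in either sequence cannot join the same chain, which is why a block from a different genomic context ends up in a separate, lower-scoring chain.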
\ \\ An additional filtering step was introduced in the generation of the\ 124-way conservation track to reduce the number of paralogs and pseudogenes\ from the high-quality assemblies and the suspect alignments from the\ low-quality assemblies: some of the pairwise alignments were\ filtered based on synteny; and some were filtered to retain only\ alignments of best quality in both the target and query ("reciprocal best").\ \ See also: D. melanogaster/dm6 124-way alignment filtering parameters.\ The column alignment type indicates the type of filtering.
\\ The resulting best-in-genome pairwise alignments\ were progressively aligned using multiz/autoMZ,\ following the tree topology diagrammed above, to produce multiple alignments.\ The multiple alignments were post-processed to\ add annotations indicating alignment gaps, genomic breaks,\ and base quality of the component sequences.\ The annotated multiple alignments, in MAF format, are available for\ bulk download.\ An alignment summary table containing an entry for each\ alignment block in each species was generated to improve\ track display performance at large scales.\ Framing tables were constructed to enable\ visualization of codons in the multiple alignment display.
\ \Phylogenetic Tree Model
\\ Both phastCons and phyloP are phylogenetic methods that rely\ on a tree model containing the tree topology, branch lengths representing\ evolutionary distance at neutrally evolving sites, the background distribution\ of nucleotides, and a substitution rate matrix.\ The\ all species tree model for this track was\ generated using the phyloFit program from the PHAST package\ (REV model, EM algorithm, medium precision) using multiple alignments of\ 4-fold degenerate sites extracted from the 124-way alignment\ (msa_view). The 4d sites were derived from the NCBI RefSeq gene set,\ filtered to select single-coverage long transcripts.\
\\ This same tree model was used in the phyloP calculations; however, the\ background frequencies were modified to maintain reversibility.\ The resulting tree model is available for\ all species.\
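As an illustration of what a 4-fold degenerate (4d) site is, the sketch below lists the third-codon positions in a coding sequence where any nucleotide encodes the same amino acid. It assumes the standard genetic code and looks at a single reference sequence only; the function name is hypothetical and this is not how msa_view is implemented.

```python
# Codon prefixes whose third position is 4-fold degenerate in the
# standard genetic code: Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_site_offsets(cds: str) -> list:
    """Return 0-based offsets of 4d sites in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # ignore a trailing partial codon
    offsets = []
    for codon_start in range(0, usable, 3):
        if cds[codon_start:codon_start + 2] in FOURFOLD_PREFIXES:
            offsets.append(codon_start + 2)
    return offsets
```

For the sequence `ATGGCTAAACTG` (codons ATG, GCT, AAA, CTG) the 4d sites are the third bases of GCT and CTG, at offsets 5 and 11.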
\PhastCons Conservation
\\ The phastCons program computes conservation scores based on a phylo-HMM, a\ type of probabilistic model that describes both the process of DNA\ substitution at each site in a genome and the way this process changes from\ one site to the next (Felsenstein and Churchill 1996, Yang 1995, Siepel and\ Haussler 2005). PhastCons uses a two-state phylo-HMM, with a state for\ conserved regions and a state for non-conserved regions. The value plotted\ at each site is the posterior probability that the corresponding alignment\ column was "generated" by the conserved state of the phylo-HMM. These\ scores reflect the phylogeny (including branch lengths) of the species in\ question, a continuous-time Markov model of the nucleotide substitution\ process, and a tendency for conservation levels to be autocorrelated along\ the genome (i.e., to be similar at adjacent sites). The general reversible\ (REV) substitution model was used. Unlike many conservation-scoring programs,\ phastCons does not rely on a sliding window\ of fixed size; therefore, short highly-conserved regions and long moderately\ conserved regions can both obtain high scores.\ More information about\ phastCons can be found in Siepel et al. 2005.
\\ The phastCons parameters used were: expected-length=45,\ target-coverage=0.3, rho=0.3.
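The sketch below shows how posterior probabilities of the conserved state can be computed for a two-state HMM with the forward-backward algorithm. It is a toy under stated assumptions: the per-column likelihoods are placeholder inputs (in phastCons they come from the tree models, with the conserved tree scaled by rho), and the mapping from expected-length and target-coverage to transition probabilities follows the parameterization described in Siepel et al. 2005 (expected length 1/mu, coverage nu/(mu+nu)). All function names are hypothetical.

```python
def transition_probs(expected_length: float, target_coverage: float):
    """Convert expected-length and target-coverage into (mu, nu)."""
    mu = 1.0 / expected_length                             # conserved -> non-conserved
    nu = mu * target_coverage / (1.0 - target_coverage)    # non-conserved -> conserved
    return mu, nu


def conserved_posteriors(lik_cons, lik_non, expected_length=45.0, target_coverage=0.3):
    """Posterior probability of the conserved state at each alignment column.

    lik_cons / lik_non: per-column likelihoods under the conserved and
    non-conserved models (placeholder values in this sketch).
    """
    n = len(lik_cons)
    if n == 0:
        return []
    mu, nu = transition_probs(expected_length, target_coverage)
    trans = [[1.0 - mu, mu], [nu, 1.0 - nu]]   # state 0 = conserved, 1 = non-conserved
    init = [target_coverage, 1.0 - target_coverage]
    emit = list(zip(lik_cons, lik_non))

    # Forward pass, rescaled at every column to avoid underflow.
    fwd = []
    for t in range(n):
        if t == 0:
            row = [init[s] * emit[0][s] for s in (0, 1)]
        else:
            row = [emit[t][s] * sum(fwd[t - 1][r] * trans[r][s] for r in (0, 1))
                   for s in (0, 1)]
        norm = sum(row)
        fwd.append([v / norm for v in row])

    # Backward pass, also rescaled.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        row = [sum(trans[s][r] * emit[t + 1][r] * bwd[t + 1][r] for r in (0, 1))
               for s in (0, 1)]
        norm = sum(row)
        bwd[t] = [v / norm for v in row]

    posteriors = []
    for t in range(n):
        cons = fwd[t][0] * bwd[t][0]
        non = fwd[t][1] * bwd[t][1]
        posteriors.append(cons / (cons + non))
    return posteriors
```

Because neighbouring columns share information through the transition probabilities, a run of well-conserved columns reinforces itself, which is the autocorrelation described above and the reason no fixed-size sliding window is needed.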
\ \PhyloP Conservation
\\ The phyloP program supports several different methods for computing\ p-values of conservation or acceleration, for individual nucleotides or\ larger elements\ (\ http://compgen.cshl.edu/phast/).\ Here it was used\ to produce separate scores at each base (--wig-scores option), considering\ all branches of the phylogeny rather than a particular subtree or lineage\ (i.e., the --subtree option was not used). The scores were computed by\ performing a likelihood ratio test at each alignment column (--method LRT),\ and scores for both conservation and acceleration were produced (--mode CONACC).\
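A rough sketch of how a signed per-base score could be derived from a likelihood ratio test is shown below. It rests on several assumptions that are not stated in this description: that the test compares a neutral model with one whose branch lengths are rescaled by a single factor, that the p-value comes from a chi-square distribution with one degree of freedom, and that the score is -log10(p) with a negative sign for acceleration. The function and parameter names are hypothetical; consult the PHAST documentation for the actual behavior.

```python
import math


def conacc_score(lnl_neutral: float, lnl_scaled: float, scale: float) -> float:
    """Signed -log10 p-value for one alignment column (illustrative only).

    lnl_neutral: log-likelihood under the neutral tree model
    lnl_scaled:  log-likelihood with branch lengths rescaled by `scale`
    scale:       fitted scale factor; < 1 suggests conservation, > 1 acceleration
    """
    stat = max(0.0, 2.0 * (lnl_scaled - lnl_neutral))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    p_value = max(p_value, 1e-300)   # guard against log10(0)
    magnitude = -math.log10(p_value)
    return magnitude if scale < 1.0 else -magnitude
```

Under these assumptions, positive values indicate conservation and negative values indicate acceleration.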
\Conserved Elements
\\ The conserved elements were predicted by running phastCons with the\ --viterbi option. The predicted elements are segments of the alignment\ that are likely to have been "generated" by the conserved state of the\ phylo-HMM. Each element is assigned a log-odds score equal to its log\ probability under the conserved model minus its log probability under the\ non-conserved model. The "score" field associated with this track contains\ transformed log-odds scores, taking values between 0 and 1000. (The scores\ are transformed using a monotonic function of the form a * log(x) + b.) The\ raw log odds scores are retained in the "name" field and can be seen on the\ details page or in the browser when the track's display mode is set to\ "pack" or "full".\
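The two scoring steps described above can be sketched as follows. The constants `a` and `b` are placeholders, since the values used for this track are not given here, and clamping the result to the 0 to 1000 range is an assumption of this sketch.

```python
import math


def log_odds(logp_conserved: float, logp_nonconserved: float) -> float:
    """Raw log-odds score of one predicted element."""
    return logp_conserved - logp_nonconserved


def browser_score(raw_log_odds: float, a: float = 100.0, b: float = 200.0) -> int:
    """Map a raw log-odds score onto 0..1000 with a * log(x) + b.

    a and b are hypothetical constants chosen only for illustration.
    """
    if raw_log_odds <= 0:
        raise ValueError("log(x) is undefined for non-positive scores")
    value = a * math.log(raw_log_odds) + b
    return int(round(min(1000.0, max(0.0, value))))
```

Because the transform is monotonic, ranking elements by the "score" field gives the same order as ranking them by the raw log-odds in the "name" field, up to ties introduced by rounding and clamping.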
\ \Credits
\This track was created using the following programs:\
\
\ \- Alignment tools: lastz (formerly blastz) and multiz by Minmei Hou, Scott Schwartz and Webb\ Miller of the Penn State Bioinformatics Group\
- Chaining and Netting: axtChain, chainNet by Jim Kent at UCSC\
- Conservation scoring: phastCons, phyloP, phyloFit, tree_doctor, msa_view and\ other programs in PHAST by\ Adam Siepel at Cold Spring Harbor Laboratory (original development\ done at the Haussler lab at UCSC).\
- MAF Annotation tools: mafAddIRows by Brian Raney, UCSC; mafAddQRows\ by Richard Burhans, Penn State; genePredToMafFrames by Mark Diekhans, UCSC\
- Tree image generator: phyloPng by Galt Barber, UCSC\
- Conservation track display: Kate Rosenbloom, Hiram Clawson (wiggle\ display), and Brian Raney (gap annotation and codon framing) at UCSC\
The phylogenetic tree is based on Murphy et al. (2001) and general\ consensus in the vertebrate phylogeny community as of March 2007.\
\ \References
\Phylip distance operations:
\\ Fan H, Ives A, Surget-Groba Y, Cannon C.\ An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data.\ BMC Genomics. 2015; 16(1): 522.\ PMID: 26169061\
\\ Bernard G, Ragan M, Chan CX.\ Recapitulating phylogenies using k-mers: from trees to networks.\ F1000Res. 2016; 5: 2789.\ PMID: 28105314\
\ \Phylo-HMMs, phastCons, and phyloP:
\\ Felsenstein J, Churchill GA.\ A Hidden Markov Model approach to\ variation among sites in rate of evolution.\ Mol Biol Evol. 1996 Jan;13(1):93-104.\ PMID: 8583911\
\ \\ Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A.\ \ Detection of nonneutral substitution rates on mammalian phylogenies.\ Genome Res. 2010 Jan;20(1):110-21.\ PMID: 19858363; PMC: PMC2798823\
\ \\ Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K,\ Clawson H, Spieth J, Hillier LW, Richards S, et al.\ Evolutionarily conserved elements in vertebrate, insect, worm,\ and yeast genomes.\ Genome Res. 2005 Aug;15(8):1034-50.\ PMID: 16024819; PMC: PMC1182216\
\ \\ Siepel A, Haussler D.\ Phylogenetic Hidden Markov Models.\ In: Nielsen R, editor. Statistical Methods in Molecular Evolution.\ New York: Springer; 2005. pp. 325-351.\
\ \\ Yang Z.\ A space-time process model for the evolution of DNA\ sequences.\ Genetics. 1995 Feb;139(2):993-1005.\ PMID: 7713447; PMC: PMC1206396\
\ \Chain/Net:
\\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \Multiz:
\\ Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM,\ Baertsch R, Rosenbloom K, Clawson H, Green ED, et al.\ Aligning multiple genomic sequences with the threaded blockset aligner.\ Genome Res. 2004 Apr;14(4):708-15.\ PMID: 15060014; PMC: PMC383317\
\ \Lastz (formerly Blastz):
\\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\
\ \\ Harris RS.\ Improved pairwise alignment of genomic DNA.\ Ph.D. Thesis. Pennsylvania State University, USA. 2007.\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ \ \Phylogenetic Tree:
\\ Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E,\ Ryder OA, Stanhope MJ, de Jong WW, Springer MS.\ Resolution of the early placental mammal radiation using Bayesian phylogenetics.\ Science. 2001 Dec 14;294(5550):2348-51.\ PMID: 11743200\
\ compGeno 1 compositeTrack on\ dragAndDrop subTracks\ group compGeno\ longLabel Multiz Alignment & Conservation (124 insects)\ priority 2\ shortLabel Cons 124 Insects\ subGroup1 view Views align=Multiz_Alignments phyloP=Basewise_Conservation_(phyloP) phastcons=Element_Conservation_(phastCons) elements=Conserved_Elements\ track cons124way\ type bed 4\ visibility full\ cons124wayViewelements Conserved Elements bed 4 Multiz Alignment & Conservation (124 insects) 1 2 0 0 0 127 127 127 0 0 0 compGeno 1 longLabel Multiz Alignment & Conservation (124 insects)\ parent cons124way\ shortLabel Conserved Elements\ track cons124wayViewelements\ view elements\ visibility dense\ cons124wayViewphastcons Element Conservation (phastCons) bed 4 Multiz Alignment & Conservation (124 insects) 2 2 0 0 0 127 127 127 0 0 0 compGeno 1 longLabel Multiz Alignment & Conservation (124 insects)\ parent cons124way\ shortLabel Element Conservation (phastCons)\ track cons124wayViewphastcons\ view phastcons\ visibility full\ cons124wayViewalign Multiz Alignments bed 4 Multiz Alignment & Conservation (124 insects) 3 2 0 0 0 127 127 127 0 0 0 compGeno 1 longLabel Multiz Alignment & Conservation (124 insects)\ parent cons124way\ shortLabel Multiz Alignments\ track cons124wayViewalign\ view align\ viewUi on\ visibility pack\ refSeqComposite NCBI RefSeq genePred RefSeq genes from NCBI 1 2 0 0 0 127 127 127 0 0 0
Description
\\ The NCBI RefSeq Genes composite track shows D. melanogaster protein-coding and non-protein-coding\ genes taken from the NCBI RNA reference sequences collection (RefSeq). All subtracks use\ coordinates provided by RefSeq, except for the UCSC RefSeq track, which UCSC produces by\ realigning the RefSeq RNAs to the genome. This realignment may result in occasional differences\ between the annotation coordinates provided by UCSC and NCBI. For RNA-seq analysis, we advise\ using NCBI aligned tables like RefSeq All or RefSeq Curated. See the \ Methods section for more details about how the different tracks were \ created.
\\ Please visit NCBI's Feedback for Gene and Reference Sequences (RefSeq) page to make suggestions, \ submit additions and corrections, or ask for help concerning RefSeq records.
\ \\ For more information on the different gene tracks, see our Genes FAQ.
\ \Display Conventions and Configuration
\\ This track is a composite track that contains differing data sets.\ To show only a selected set of subtracks, uncheck the boxes next to the tracks that you wish to \ hide. Note: Not all subtracks are available on all assemblies.
\ \ The possible subtracks include:\\
\ \- RefSeq aligned annotations and UCSC alignment of RefSeq annotations\
\\
\- \ RefSeq All – all curated and predicted annotations provided by \ RefSeq.
\- \ RefSeq Curated – subset of RefSeq All that includes only those \ annotations whose accessions begin with NM, NR, NP or YP. (NP and YP are used only for\ protein-coding genes on the mitochondrion; YP is used for human only.) They were \ manually curated, based on publications describing transcripts and manual reviews of \ evidence which includes EST and full-length cDNA alignments, protein sequences, splice sites\ and any other evidence available in databases or the scientific literature. The \ resulting sequences can differ from the genome; they exist independently \ from a particular human genome build, and so must be aligned to the genome to create a track.\ The "RefSeq Curated" track is NCBI's mapping of these transcripts to the genome.\ Another alignment track exists for these, the "UCSC RefSeq" track (see below).
\- \ RefSeq Predicted – subset of RefSeq All that includes those annotations whose \ accessions begin with XM or XR. They were predicted based on protein, cDNA, EST\ and RNA-seq alignments to the genome assembly by the NCBI Gnomon prediction software.
\- \ RefSeq Other – all other annotations produced by the RefSeq group that \ do not fit the requirements for inclusion in the RefSeq Curated or the \ RefSeq Predicted tracks. Examples are untranscribed pseudogenes or gene clusters, such as HOX or protocadherin alpha. They were manually curated from \ publications or databases but are not typical transcribed genes.
\- \ RefSeq Alignments – alignments of RefSeq RNAs to the D. melanogaster genome provided\ by the RefSeq group, following the display conventions for\ PSL tracks.
\- \ RefSeq Diffs – alignment differences between the D. melanogaster reference genome(s) \ and RefSeq curated transcripts. (Track not currently available for every assembly.)\
\- \ UCSC RefSeq – annotations generated from UCSC's realignment of RNAs with NM \ and NR accessions to the D. melanogaster genome. This track was previously known as the "RefSeq \ Genes" track.
\- \ RefSeq Select (subset, only on hg38) – Subset of RefSeq Curated, transcripts marked as \ part of the RefSeq Select dataset. \ A single Select transcript is chosen as representative for each protein-coding gene. \ See NCBI RefSeq Select. \
\- \ RefSeq HGMD (subset) – Subset of RefSeq Curated, transcripts annotated by the Human\ Gene Mutation Database. This track is only available on the human genomes hg19 and hg38.\ It is the most restricted RefSeq subset, targeting clinical diagnostics.\
\\ The RefSeq All, RefSeq Curated, RefSeq Predicted, and\ UCSC RefSeq tracks follow the display conventions for\ gene prediction tracks.\ The color shading indicates the level of review the RefSeq record has undergone:\ predicted (light), provisional (medium), or reviewed (dark), as defined by RefSeq.
\ \\
\ \
\ \Color \Level of review
\ \dark \Reviewed: the RefSeq record has been reviewed by NCBI staff or by a collaborator. The NCBI review process includes assessing available sequence data and the literature. Some RefSeq records may incorporate expanded sequence and annotation information.
\ \medium \Provisional: the RefSeq record has not yet been subject to individual review. The initial sequence-to-gene association has been established by outside collaborators or NCBI staff.
\ \light \Predicted: the RefSeq record has not yet been subject to individual review, and some aspect of the RefSeq record is predicted.
\\ The item labels and codon display properties for features within this track can be configured \ through the check-box controls at the top of the track description page. To adjust the settings \ for an individual subtrack, click the wrench icon next to the track name in the subtrack list.
\\
\ \- \ Label: By default, items are labeled by gene name. Click the appropriate Label \ option to display the accession name or OMIM identifier instead of the gene name, show all or a \ subset of these labels including the gene name, OMIM identifier and accession names, or turn off \ the label completely.
\- \ Codon coloring: This track has an optional codon coloring feature that \ allows users to quickly validate and compare gene predictions. To display codon colors, select the\ genomic codons option from the Color track by codons pull-down menu. For more \ information about this feature, go to the Coloring Gene Predictions and Annotations by Codon page.
\The RefSeq Diffs track contains five different types of inconsistency between the\ reference genome sequence and the RefSeq transcript sequences. The five types of differences are\ as follows:\
\
\ \ HGVS Terminology (Human Genome Variation Society):\ \ g. = genomic sequence ; c. = coding DNA sequence ; n. = non-coding RNA reference sequence.\ \ \- \ mismatch – aligned but mismatching bases, plus HGVS g. \ to show the genomic change required to match the transcript and HGVS c./n. \ to show the transcript change required to match the genome.
\- \ short gap – genomic gaps that are too small to be introns (arbitrary cutoff of\ \ < 45 bp), most likely insertion/deletion variants or errors, with HGVS g. and c./n. \ \ showing differences.
\- \ shift gap – shortGap items whose placement could be shifted left and/or right on\ \ the genome due to repetitive sequence, with HGVS c./n. position range of ambiguous region \ \ in transcript. Here, thin and thick lines are used -- the thin line shows the span of the\ \ repetitive sequence, and the thick line shows the rightmost shifted gap.\
\- \ double gap – genomic gaps that are long enough to be introns but that skip over \ \ transcript sequence (invisible in default setting), with HGVS c./n. deletion.
\- \ skipped – sequence at the beginning or end of a transcript that is not aligned to\ the genome\ (invisible in default setting), with HGVS c./n. deletion.
\ \\ When reporting HGVS with RefSeq sequences, to make sure that results from\ research articles can be mapped to the genome unambiguously, \ please specify the RefSeq annotation release displayed on the transcript's\ Genome Browser details page and also the RefSeq transcript ID with version\ (e.g. NM_012309.4 not NM_012309). \
\ \ \ \Methods
\\ Tracks contained in the RefSeq annotation and RefSeq RNA alignment tracks were created at UCSC using \ data from the NCBI RefSeq project. Data files were downloaded from RefSeq in GFF file format and \ converted to the genePred and PSL table formats for display in the Genome Browser. Information about\ the NCBI annotation pipeline can be found \ here.
\ \The RefSeq Diffs track is generated by UCSC using NCBI's RefSeq RNA alignments.
\\ The UCSC RefSeq Genes track is constructed using the same methods as previous RefSeq Genes tracks.\ RefSeq RNAs were aligned against the D. melanogaster genome using BLAT. Those with an alignment of\ less than 15% were discarded. When a single RNA aligned in multiple places, the alignment\ having the highest base identity was identified. Only alignments having a base identity\ level within 0.1% of the best and at least 96% base identity with the genomic sequence were\ kept.
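The identity filter described above can be sketched as follows. This is a minimal illustration in Python, not the UCSC pipeline itself: the `Alignment` tuple, its field names, the placeholder accessions, and the reading of "within 0.1% of the best" as a difference of 0.1 percentage points are all assumptions made for the example. The earlier step that discards RNAs with less than 15% aligned is not modeled.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    rna: str         # RefSeq accession (placeholder values below)
    chrom: str
    start: int
    identity: float  # percent base identity with the genome


def keep_best(alignments, near_best=0.1, min_identity=96.0):
    """Keep, per RNA, the alignments within `near_best` points of that RNA's
    best identity, provided they also reach `min_identity` percent."""
    by_rna = defaultdict(list)
    for aln in alignments:
        by_rna[aln.rna].append(aln)

    kept = []
    for hits in by_rna.values():
        best = max(h.identity for h in hits)
        kept.extend(
            h for h in hits
            if h.identity >= best - near_best and h.identity >= min_identity
        )
    return kept


example = [
    Alignment("NM_A", "chr2L", 1000, 99.8),
    Alignment("NM_A", "chr3R", 5000, 99.75),  # close enough to the best hit
    Alignment("NM_A", "chrX", 9000, 97.0),    # too far below the best hit
    Alignment("NM_B", "chr4", 200, 95.0),     # best hit, but under 96%
]
kept = keep_best(example)
```

With these inputs only the two `NM_A` alignments on chr2L and chr3R survive; `NM_B` is dropped entirely because even its best alignment falls below the 96% floor.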
\ \Data Access
\\ The raw data for these tracks can be accessed in multiple ways. It can be explored interactively \ using the REST API,\ Table Browser or\ Data Integrator. The tables can also be accessed programmatically through our\ public MySQL server or downloaded from our\ downloads server for local processing. The previous track versions are available\ in the archives of our downloads server. You can also access any RefSeq table\ entries in JSON format through our \ JSON API.
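As an illustration of programmatic access, the sketch below only builds a REST API request URL for a region of one RefSeq table. It assumes the `/getData/track` endpoint and the `genome`, `track`, `chrom`, `start` and `end` parameters as described in the REST API documentation; the helper name `track_url` and the coordinates are made up for the example, so check the API help page before relying on it.

```python
from urllib.parse import quote

API_ROOT = "https://api.genome.ucsc.edu"  # assumed public API host


def track_url(genome, track, chrom=None, start=None, end=None):
    """Build a getData/track URL; parameters are joined with ';'."""
    params = [("genome", genome), ("track", track)]
    if chrom is not None:
        params.append(("chrom", chrom))
    if start is not None and end is not None:
        params.append(("start", str(start)))
        params.append(("end", str(end)))
    query = ";".join(f"{key}={quote(value)}" for key, value in params)
    return f"{API_ROOT}/getData/track?{query}"


url = track_url("dm6", "ncbiRefSeqCurated", "chr2L", 1000000, 1100000)
# The JSON response could then be fetched with, for example,
# urllib.request.urlopen(url) followed by json.load(...).
```

Keeping the URL construction separate from the network call makes the request easy to inspect or log before any data is downloaded.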
\\ The data in the RefSeq Other and RefSeq Diffs tracks are organized in \ bigBed file format; more\ information about accessing the information in this bigBed file can be found\ below. The other subtracks are associated with database tables as follows:
\\
\- genePred format:
\\
\- RefSeq All - ncbiRefSeq
\- RefSeq Curated - ncbiRefSeqCurated
\- RefSeq Predicted - ncbiRefSeqPredicted
\- UCSC RefSeq - refGene
\- PSL format:
\\ \
\- RefSeq Alignments - ncbiRefSeqPsl
\\ The first column of each of these tables is "bin". This column is designed\ to speed up access for display in the Genome Browser, but can be safely ignored in downstream\ analysis. You can read more about the bin indexing system\ here.
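For readers curious about what the bin value encodes, the following Python sketch reproduces what is believed to be the standard binning calculation. The constants and the function name `bin_from_range` are assumptions based on the commonly described scheme (five levels, smallest bins of 128 kb, each level eight times larger); consult the bin indexing documentation for the authoritative definition.

```python
# Offsets of the first bin at each level, from the finest level to the coarsest.
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
BIN_FIRST_SHIFT = 17  # finest bins span 2**17 = 131,072 bases
BIN_NEXT_SHIFT = 3    # each coarser level spans 8 times more


def bin_from_range(start, end):
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the standard scheme")


small = bin_from_range(0, 1000)            # fits in the first finest-level bin
straddle = bin_from_range(131071, 131073)  # crosses a 128 kb boundary
```

An item that crosses a boundary between two fine bins is assigned to the next coarser level, which is why a range query only needs to inspect a handful of bins rather than every row.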
\\ The annotations in the RefSeq Other and RefSeq Diffs tracks are stored in bigBed \ files, which can be obtained from our downloads server as\ ncbiRefSeqOther.bb and \ ncbiRefSeqDiffs.bb.\ Individual regions or the whole set of genome-wide annotations can be extracted with our tool\ bigBedToBed, which can be compiled from the source code or downloaded as a precompiled\ binary for your system from the utilities directory linked below. For example, to extract only\ the annotations in a given region, you could use the following command:
\\ bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/dm6/ncbiRefSeq/ncbiRefSeqOther.bb\ -chrom=chr2L -start=1000000 -end=2000000 stdout
\\ You can download a GTF format version of the RefSeq All table from the \ GTF downloads directory.\ The genePred format tracks can also be converted to GTF format using the\ genePredToGtf utility, available from the\ utilities directory on the UCSC downloads \ server. The utility can be run from the command line like so:
\ genePredToGtf dm6 ncbiRefSeqPredicted ncbiRefSeqPredicted.gtf\\ Note that using genePredToGtf in this manner accesses our public MySQL server, and you therefore \ must set up your hg.conf as described on the MySQL page linked near the beginning of the Data Access\ section.
\\ A file containing the RNA sequences in FASTA format for all items in the RefSeq All, RefSeq Curated, \ and RefSeq Predicted tracks can be found on our downloads server\ here.
\\ Please refer to our mailing list archives for questions.
\ \\ Previous versions of the ncbiRefSeq set of tracks can be found on our archive download server.\
\ \Credits
\\ This track was produced at UCSC from data generated by scientists worldwide and curated by the\ NCBI RefSeq project.
\ \References
\\ Kent WJ.\ BLAT - the BLAST-like \ alignment tool. Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518
\\ Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J,\ Landrum MJ, McGarvey KM et al.\ RefSeq: an update on mammalian reference sequences.\ Nucleic Acids Res. 2014 Jan;42(Database issue):D756-63.\ PMID: 24259432; PMC: \ PMC3965018
\\ Pruitt KD, Tatusova T, Maglott DR.\ \ NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts \ and proteins.\ Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4.\ PMID: 15608248; PMC: PMC539979
\ genes 1 allButtonPair on\ compositeTrack on\ dataVersion /gbdb/$D/ncbiRefSeq/ncbiRefSeqVersion.txt\ dbPrefixLabels hg="HGNC" dm="FlyBase" ce="WormBase" rn="RGD" sacCer="SGD" danRer="ZFIN" mm="MGI" xenTro="XenBase"\ dbPrefixUrls hg="http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=$$" dm="http://flybase.org/reports/$$" ce="http://www.wormbase.org/db/gene/gene?name=$$" rn="https://rgd.mcw.edu/rgdweb/search/search.html?term=$$" sacCer="https://www.yeastgenome.org/locus/$$" danRer="https://zfin.org/$$" mm="http://www.informatics.jax.org/marker/$$" xenTro="https://www.xenbase.org/gene/showgene.do?method=display&geneId=$$"\ dragAndDrop subTracks\ group genes\ longLabel RefSeq genes from NCBI\ noInherit on\ priority 2\ shortLabel NCBI RefSeq\ track refSeqComposite\ type genePred\ visibility dense\ ncbiRefSeqCurated RefSeq Curated genePred NCBI RefSeq genes, curated subset (NM_*, NR_*, NP_* or YP_*) 1 2 12 12 120 133 133 187 0 0 0 genes 1 baseColorDefault genomicCodons\ baseColorUseCds given\ color 12,12,120\ idXref ncbiRefSeqLink mrnaAcc name\ longLabel NCBI RefSeq genes, curated subset (NM_*, NR_*, NP_* or YP_*)\ parent refSeqComposite on\ priority 2\ shortLabel RefSeq Curated\ track ncbiRefSeqCurated\ ReMapTFs ReMap ChIP-seq bigBed 9 + ReMap Atlas of Regulatory Regions 4 2 0 0 0 127 127 127 0 0 0Description
\\ This track represents the ReMap Atlas of regulatory regions, which consists of a\ large-scale integrative analysis of all Public ChIP-seq data for transcriptional\ regulators from GEO, ArrayExpress, and ENCODE. \
\ \\ Below is a schematic diagram of the types of regulatory regions: \
\
\ \ \- ReMap 2022 Atlas (all peaks for each analyzed data set)
\- ReMap 2022 Non-redundant peaks (merged similar target)
\- ReMap 2022 Cis Regulatory Modules
\\ \
Display Conventions and Configuration
\\
\ \- \ Each transcription factor is drawn in its own RGB color.\
\- \ ChIP-seq peak summits are represented by vertical bars.\
\- \ Hsap: A data set is defined as a ChIP/Exo-seq experiment in a given\ GEO/ArrayExpress/ENCODE series (e.g. GSE41561), for a given TF (e.g. ESR1), in\ a particular biological condition (e.g. MCF-7).\
\
Data sets are labeled with the concatenation of these three pieces of\ information (e.g. GSE41561.ESR1.MCF-7).\- \ Atha: The data set is defined as a ChIP-seq experiment in a given series\ (e.g. GSE94486), for a given target (e.g. ARR1), in a particular biological\ condition (i.e. ecotype, tissue type, experimental conditions; e.g.\ Col-0_seedling_3d-6BA-4h).\
\
Data sets are labeled with the concatenation of these three pieces of\ information (e.g. GSE94486.ARR1.Col-0_seedling_3d-6BA-4h).\Methods
\ \\ This fourth release of ReMap (2022) presents the analysis of 1,206\ quality-controlled ChIP-seq data sets (n=1,315 before QC) from public sources (GEO,\ ENCODE). These ChIP-seq data sets were mapped to the dm6 Drosophila\ assembly. A data set is defined as a ChIP-seq experiment in a given series\ (e.g. GSE107059), for a given TF (e.g. Trl), in a particular biological\ condition (i.e. cell line, tissue type, disease state, or experimental conditions;\ e.g. Schneider-2). Data sets are labeled by concatenating these three pieces of\ information, for example GSE107059.Trl.Schneider-2.\
\These merged analyses cover 550 DNA-binding proteins\ (transcriptional regulators), including transcription factors (TFs),\ transcription co-activators (TCFs), and chromatin-remodeling factors (CRFs), for\ a total of 16 million peaks.\
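The data set naming convention can be illustrated with a short Python sketch. The helper names `make_label` and `split_label` are hypothetical, and the parser assumes that the series accession and the TF name never contain a period, so that only the biological condition may; that assumption is not stated by ReMap and should be checked against the actual data.

```python
def make_label(series, tf, biotype):
    """Join series, TF and biological condition into a data set label."""
    return ".".join((series, tf, biotype))


def split_label(label):
    """Split a label back into (series, TF, condition).

    Only the first two periods are treated as separators, so any further
    periods stay inside the condition field.
    """
    series, tf, biotype = label.split(".", 2)
    return series, tf, biotype


label = make_label("GSE107059", "Trl", "Schneider-2")
parts = split_label(label)
```

Splitting on at most two separators keeps the round trip lossless even if a condition name happens to contain a period.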
\ \ \\ \
ENCODE
\\ Available ENCODE ChIP-seq data sets for transcriptional regulators from the\ ENCODE portal were processed with the\ standardized ReMap pipeline. The list of ENCODE data was retrieved as FASTQ files from the\ ENCODE portal\ using filters. Metadata information in JSON format and FASTQ files were retrieved using the Python\ requests module.\
\ \ \ \ChIP-seq processing
\\ Both Public and ENCODE data were processed similarly. Bowtie 2 (PMC3322381) (version 2.2.9) with options --end-to-end --sensitive was used to align all\ reads to the genome. Biological and technical\ replicates for each unique combination of GSE/TF/Cell type or Biological condition\ were used for peak calling. TFBS were identified using the MACS2 peak-calling tool\ (PMC3120977) (version 2.1.1.2) in order to follow ENCODE ChIP-seq guidelines,\ with stringent thresholds (MACS2 default thresholds, p-value: 1e-5). An input data\ set was used when available.\
\ \ \Quality assessment
\\ To assess the quality of public data sets, a score was computed based on the\ cross-correlation and the FRiP (fraction of reads in peaks) metrics developed by\ the ENCODE Consortium. Two\ thresholds were defined for each of the two cross-correlation ratios (NSC,\ normalized strand coefficient: 1.05 and 1.10; RSC, relative strand coefficient:\ 0.8 and 1.0). Detailed descriptions of the ENCODE quality coefficients can be\ found at https://genome.ucsc.edu/ENCODE/qualityMetrics.html. The\ phantompeak tools suite\ (https://code.google.com/p/phantompeakqualtools/) was used to compute\ RSC and NSC.\
\\ Please refer to the ReMap 2022, 2020, and 2018 publications for more details\ (citation below).\
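The way the NSC and RSC thresholds partition data sets can be sketched in Python as below. The exact ReMap scoring formula, including how FRiP enters the score, is not given in this description, so the simple tally of thresholds passed and the use of "greater than or equal" comparisons are illustrative assumptions only; the function name `thresholds_passed` is hypothetical.

```python
NSC_THRESHOLDS = (1.05, 1.10)  # normalized strand coefficient cut-offs
RSC_THRESHOLDS = (0.8, 1.0)    # relative strand coefficient cut-offs


def thresholds_passed(nsc, rsc):
    """Count how many of the four cut-offs a data set reaches (0 to 4).

    This tally is an assumption for illustration, not the published score.
    """
    passed = sum(nsc >= t for t in NSC_THRESHOLDS)
    passed += sum(rsc >= t for t in RSC_THRESHOLDS)
    return passed


borderline = thresholds_passed(nsc=1.08, rsc=1.2)  # lower NSC cut-off, both RSC
poor = thresholds_passed(nsc=1.0, rsc=0.5)         # no cut-off reached
```

A tally like this makes it easy to see which data sets sit between the lenient and the strict cut-off for either ratio.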
\ \ \ \Data Access
\\ ReMap Atlas of regulatory regions data can be explored interactively with the\ Table Browser and cross-referenced with the \ Data Integrator. For programmatic access,\ the track can be accessed using the Genome Browser's\ REST API.\ ReMap annotations can be downloaded from the\ Genome Browser's download server\ as a bigBed file. This compressed binary format can be remotely queried through\ command line utilities. Please note that some of the download files can be quite large.
\ \\ Individual BED files for specific TFs, cells/biotypes, or data sets can be\ found and downloaded on the ReMap website.\
\ \References
\ \\ Chèneby J, Gheorghe M, Artufel M, Mathelier A, Ballester B.\ \ ReMap 2018: an updated atlas of regulatory regions from an integrative analysis of DNA-binding ChIP-\ seq experiments.\ Nucleic Acids Res. 2018 Jan 4;46(D1):D267-D275.\ PMID: 29126285; PMC: PMC5753247\
\\ Chèneby J, Ménétrier Z, Mestdagh M, Rosnet T, Douida A, Rhalloussi W, Bergon A, Lopez\ F, Ballester B.\ \ ReMap 2020: a database of regulatory regions from an integrative analysis of Human and Arabidopsis\ DNA-binding sequencing experiments.\ Nucleic Acids Res. 2020 Jan 8;48(D1):D180-D188.\ PMID: 31665499; PMC: PMC7145625\
\\ Griffon A, Barbier Q, Dalino J, van Helden J, Spicuglia S, Ballester B.\ \ Integrative analysis of public ChIP-seq experiments reveals a complex multi-cell regulatory\ landscape.\ Nucleic Acids Res. 2015 Feb 27;43(4):e27.\ PMID: 25477382; PMC: PMC4344487\
\\ Hammal F, de Langen P, Bergon A, Lopez F, Ballester B.\ \ ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an\ integrative analysis of DNA-binding sequencing experiments.\ Nucleic Acids Res. 2022 Jan 7;50(D1):D316-D325.\ PMID: 34751401; PMC: PMC8728178\
\ regulation 1 bigDataUrl /gbdb/dm6/reMap/reMap2022.bb\ denseCoverage 100\ filterLabel.Biotypes Biotypes (cell lines, tissues...)\ filterLabel.FBgn FBgn ID\ filterLabel.TF Transcriptional regulators\ filterText.Biotypes *\ filterText.FBgn *\ filterText.TF *\ filterType.Biotypes multipleListOnlyOr\ filterType.FBgn multipleListOnlyOr\ filterType.TF multipleListOnlyOr\ filterValues.Biotypes adult,embryo,Kc,Kc167,larva,ML-DmBG3-c2,mushroom-body,ovarian-somatic-cell,pharate,prepupa,pupa,S2-DRSC,S2R-plus,Schneider-2,Schneider-3,second-instar,third-instar\ filterValues.FBgn FBgn0000014,FBgn0000015,FBgn0000022,FBgn0000028,FBgn0000097,FBgn0000166,FBgn0000212,FBgn0000227,FBgn0000251,FBgn0000283,FBgn0000287,FBgn0000289,FBgn0000307,FBgn0000370,FBgn0000411,FBgn0000412,FBgn0000439,FBgn0000448,FBgn0000463,FBgn0000504,FBgn0000520,FBgn0000529,FBgn0000541,FBgn0000546,FBgn0000567,FBgn0000568,FBgn0000575,FBgn0000576,FBgn0000577,FBgn0000581,FBgn0000591,FBgn0000606,FBgn0000611,FBgn0000625,FBgn0000629,FBgn0000659,FBgn0000964,FBgn0001078,FBgn0001089,FBgn0001133,FBgn0001138,FBgn0001139,FBgn0001147,FBgn0001150,FBgn0001168,FBgn0001180,FBgn0001185,FBgn0001206,FBgn0001222,FBgn0001233,FBgn0001235,FBgn0001291,FBgn0001297,FBgn0001319,FBgn0001320,FBgn0001325,FBgn0001990,FBgn0001994,FBgn0002183,FBgn0002441,FBgn0002521,FBgn0002522,FBgn0002573,FBgn0002576,FBgn0002609,FBgn0002631,FBgn0002633,FBgn0002643,FBgn0002723,FBgn0002733,FBgn0002736,FBgn0002774,FBgn0002775,FBgn0002775-FBgn0015550,FBgn0002781,FBgn0002914,FBgn0002922,FBgn0002931,FBgn0002941,FBgn0002985,FBgn0003002,FBgn0003028,FBgn0003042,FBgn0003071,FBgn0003118,FBgn0003145,FBgn0003165,FBgn0003254,FBgn0003300,FBgn0003330,FBgn0003334,FBgn0003339,FBgn0003345,FBgn0003396,FBgn0003459,FBgn0003460,FBgn0003513,FBgn0003567,FBgn0003598,FBgn0003607,FBgn0003612,FBgn0003651,FBgn0003687,FBgn0003720,FBgn0003862,FBgn0003870,FBgn0003896,FBgn0003900,FBgn0003944,FBgn0003964,FBgn0003996,FBgn0004053,FBgn0004102,FBgn0004110,FBgn0004170,FBgn0004362,FBgn0004396,FBgn000440
0,FBgn0004510,FBgn0004567,FBgn0004595,FBgn0004607,FBgn0004618,FBgn0004647,FBgn0004652,FBgn0004656,FBgn0004837,FBgn0004854,FBgn0004861-FBgn0004860,FBgn0004865,FBgn0004870,FBgn0004880,FBgn0004895,FBgn0004896,FBgn0004898,FBgn0004914,FBgn0004915,FBgn0005386,FBgn0005427,FBgn0005558,FBgn0005561,FBgn0005612,FBgn0005613,FBgn0005616,FBgn0005617,FBgn0005624,FBgn0005658,FBgn0005660,FBgn0005677,FBgn0005771,FBgn0010109,FBgn0010228,FBgn0010278,FBgn0010313,FBgn0010328,FBgn0010355,FBgn0010433,FBgn0010768,FBgn0010825,FBgn0011274,FBgn0011278,FBgn0011305,FBgn0011648,FBgn0011655,FBgn0011701,FBgn0011763,FBgn0011817,FBgn0013263,FBgn0013717,FBgn0013799,FBgn0014018,FBgn0014037,FBgn0014127,FBgn0014179,FBgn0014340,FBgn0014343,FBgn0014859,FBgn0014931,FBgn0014949,FBgn0014977,FBgn0015239,FBgn0015240,FBgn0015270,FBgn0015371,FBgn0015381,FBgn0015391,FBgn0015396,FBgn0015550,FBgn0015561,FBgn0015602,FBgn0015664,FBgn0015799,FBgn0015805,FBgn0015919,FBgn0016061,FBgn0016076,FBgn0016694,FBgn0016917,FBgn0017460,FBgn0017578,FBgn0019809,FBgn0019949,FBgn0020378,FBgn0020412,FBgn0020493,FBgn0020887,FBgn0021738,FBgn0021767,FBgn0021872,FBgn0022699,FBgn0022740,FBgn0022764,FBgn0022935,FBgn0023076,FBgn0023094,FBgn0023215,FBgn0023518,FBgn0024249,FBgn0024250,FBgn0024288,FBgn0024321,FBgn0024975,FBgn0025185,FBgn0025334,FBgn0025525,FBgn0025635,FBgn0025679,FBgn0025776,FBgn0025800,FBgn0025874,FBgn0026058,FBgn0026144,FBgn0026427,FBgn0026533,FBgn0026573,FBgn0027339,FBgn0027364,FBgn0027378,FBgn0027620,FBgn0027788,FBgn0027835,FBgn0028550,FBgn0028647,FBgn0028878,FBgn0028926,FBgn0028931,FBgn0028979,FBgn0028999,FBgn0029173,FBgn0029504,FBgn0029672,FBgn0029711,FBgn0029822,FBgn0029905,FBgn0029928,FBgn0029936,FBgn0029957,FBgn0030003,FBgn0030005,FBgn0030012,FBgn0030240,FBgn0030455,FBgn0030477,FBgn0030673,FBgn0030680,FBgn0030710,FBgn0030787,FBgn0030963,FBgn0030990,FBgn0031052,FBgn0031144,FBgn0031232,FBgn0031391,FBgn0031399,FBgn0031434,FBgn0031435,FBgn0031573,FBgn0031621,FBgn0031874,FBgn0032130,FBgn0032150,FBgn0032157,FBgn0032202,FBgn00
32209,FBgn0032223,FBgn0032295,FBgn0032321,FBgn0032401,FBgn0032473,FBgn0032491,FBgn0032493,FBgn0032512,FBgn0032600,FBgn0032651,FBgn0032730,FBgn0032763,FBgn0032815,FBgn0032816,FBgn0032817,FBgn0032979,FBgn0033155,FBgn0033183,FBgn0033185,FBgn0033186,FBgn0033252,FBgn0033358,FBgn0033459,FBgn0033491,FBgn0033569,FBgn0033581,FBgn0033616,FBgn0033627,FBgn0033749,FBgn0033762,FBgn0033782,FBgn0033934,FBgn0033971,FBgn0033993,FBgn0033998,FBgn0034012,FBgn0034051,FBgn0034062,FBgn0034096,FBgn0034114,FBgn0034120,FBgn0034217,FBgn0034240,FBgn0034379,FBgn0034520,FBgn0034534,FBgn0034570,FBgn0034599,FBgn0034726,FBgn0034821,FBgn0034853,FBgn0034878,FBgn0034946,FBgn0034961,FBgn0034970,FBgn0035036,FBgn0035137,FBgn0035144,FBgn0035157,FBgn0035238,FBgn0035407,FBgn0035414,FBgn0035518,FBgn0035625,FBgn0035687,FBgn0035690,FBgn0035702,FBgn0035713,FBgn0035721,FBgn0035769,FBgn0035824,FBgn0035849,FBgn0035902,FBgn0035903,FBgn0035997,FBgn0036004,FBgn0036124,FBgn0036179,FBgn0036274,FBgn0036285,FBgn0036294,FBgn0036396,FBgn0036398,FBgn0036423,FBgn0036661,FBgn0036746,FBgn0036791,FBgn0036804,FBgn0037027,FBgn0037051,FBgn0037085,FBgn0037093,FBgn0037206,FBgn0037275,FBgn0037379,FBgn0037436,FBgn0037445,FBgn0037446,FBgn0037475,FBgn0037555,FBgn0037617,FBgn0037618,FBgn0037619,FBgn0037621,FBgn0037634,FBgn0037669,FBgn0037670,FBgn0037672,FBgn0037698,FBgn0037717,FBgn0037722,FBgn0037746,FBgn0037751,FBgn0037794,FBgn0037831,FBgn0037876,FBgn0037920,FBgn0037921,FBgn0037922,FBgn0037923,FBgn0037931,FBgn0037937,FBgn0038047,FBgn0038197,FBgn0038244,FBgn0038252,FBgn0038301,FBgn0038316,FBgn0038402,FBgn0038418,FBgn0038472,FBgn0038548,FBgn0038549,FBgn0038550,FBgn0038592,FBgn0038741,FBgn0038765,FBgn0038766,FBgn0038767,FBgn0038805,FBgn0038833,FBgn0038851,FBgn0038852,FBgn0038978,FBgn0039039,FBgn0039044,FBgn0039114,FBgn0039120,FBgn0039169,FBgn0039209,FBgn0039225,FBgn0039329,FBgn0039509,FBgn0039602,FBgn0039683,FBgn0039712,FBgn0039733,FBgn0039740,FBgn0039743,FBgn0039808,FBgn0039860,FBgn0039937,FBgn0039938,FBgn0039946,FBgn0040022,FBgn0040066,FB
gn0040305,FBgn0040366,FBgn0040465,FBgn0040765,FBgn0040918,FBgn0040929,FBgn0041092,FBgn0041111,FBgn0041156,FBgn0041210,FBgn0042205,FBgn0042696,FBgn0043001,FBgn0043364,FBgn0043457,FBgn0043796,FBgn0044324,FBgn0046874,FBgn0050011,FBgn0050020,FBgn0050403,FBgn0050431,FBgn0051365,FBgn0051388,FBgn0051481,FBgn0051627,FBgn0052006,FBgn0052264,FBgn0052296,FBgn0052423,FBgn0053017,FBgn0053213,FBgn0053557,FBgn0085396,FBgn0085405,FBgn0085432,FBgn0085450,FBgn0086655,FBgn0087035,FBgn0250732,FBgn0259211,FBgn0259234,FBgn0259785,FBgn0259789,FBgn0259938,FBgn0260243,FBgn0260632,FBgn0260642,FBgn0260741,FBgn0260987,FBgn0261015,FBgn0261239,FBgn0261283,FBgn0261434,FBgn0261573,FBgn0261588,FBgn0261617,FBgn0261648,FBgn0261793,FBgn0262139,FBgn0262477,FBgn0262582,FBgn0262656,FBgn0262937,FBgn0262975,FBgn0263072,FBgn0263102,FBgn0263106,FBgn0263108,FBgn0263118,FBgn0263240,FBgn0263511,FBgn0263512,FBgn0263667,FBgn0263755,FBgn0263934,FBgn0264005,FBgn0264442,FBgn0264490,FBgn0264562,FBgn0264744,FBgn0264922,FBgn0264954,FBgn0265193,FBgn0265276,FBgn0265784,FBgn0266129,FBgn0266411,FBgn0266441,FBgn0267792,FBgn0267821,FBgn0270924,FBgn0278608,FBgn0283451,FBgn0283521,FBgn0284084,FBgn0285879,FBgn0287768\ filterValues.TF 
1-BP,ab,abd-A,Abd-B,ac,Acf,achi,acj6,Ada2b,ADD1,AGO2,Antp,aop,Asciz,ash1,ATbp,Atf-2,Atf3,ato,az2,bab1,bab2,barr,bcd,Bdp1,BEAF-32,B-H2,BigH1,bigmax,Blimp-1,br,Br140,brk,brm,bsh,BtbVII,btn,BuGZ,cad,Camta,Cap-H2,cato,caup,CBP,cbt,Cdk12,Cdk9,cg,CG10147,CG10209,CG10274,CG10431,CG10462,CG10543,CG10565,CG10631,CG10654,CG10669,CG10979,CG11398,CG11504,CG11617,CG11723,CG11902,CG12071,CG12104,CG12155,CG12219,CG12236,CG12299,CG1233,CG12391,CG12659,CG12744,CG12768,CG12769,CG12942,CG13123,CG13204,CG13296,CG13775,CG13894,CG14655,CG14710,CG14711,CG14965,CG15011,CG15073,CG15269,CG1529,CG15514,CG15601,CG15696,CG15710,CG1602,CG1603,CG1620,CG1647,CG16779,CG16815,CG16863,CG17186,CG17359,CG17385,CG17568,CG17801,CG17802,CG17806,CG17829,CG1792,CG18011,CG18262,CG18476,CG18599,CG18764,CG2116,CG2120,CG2202,CG2678,CG2712,CG2875,CG30020,CG3032,CG30403,CG30431,CG3065,CG31365,CG31388,CG31627,CG3163,CG32006,CG32264,CG3281,CG33017,CG33213,CG33557,CG3407,CG34367,CG34376,CG3838,CG3919,CG3995,CG4282,CG4318,CG4328,CG43347,CG44002,CG4424,CG45071,CG4617,CG4707,CG4820,CG4854,CG5180,CG5204,CG5245,CG6254,CG6276,CG6654,CG6683,CG6765,CG6808,CG6813,CG7101,CG7271,CG7368,CG7556,CG7745,CG7786,CG7839,CG7987,CG8089,CG8159,CG8281,CG8301,CG8319,CG8388,CG8478,CG8924,CG8944,CG9609,CG9705,CG9727,CG9876,CG9948,CHES-1-like,chif,chn,Chrac-16,Chro,cic,Clamp,Clk,cnc,Coop,CoRest,corto,Cp190,crc,CrebA,CrebB,Crg-1,crp,Crtc,CTCF,cwo,cyc,D,D1,D19A,D19B,da,dac,Dad,Deaf1,Dek,Dfd,Dif,disco-r,dl,Dl,Dlip3,dmrt11E,dmrt93B,dmrt99B,DnaJ-1,Dp,Dp1,dpn,dre4,Dref,dsf,Dsp1,dsx,dwg,E-bx,EcR,Eip74EF,Eip75B,Eip78C,Eip93F,Elba1,Elba2,Elba3,Elys,emc,ems,en,E-Pc,ERR,esg,esn,E-spl-m3-HLH,E-spl-m5-HLH,E-spl-m7-HLH,E-spl-m8-HLH,E-spl-m-beta-HLH,Etl1,Ets21C,Ets65A,Ets96B,Ets97D,E-var-3-9,eve,ewg,exd,exex,ey,eyg,E-z,fd102C,fd3F,fd59A,fd64A,fd96Cb,Fer1,Fer2,Fer3,fkh,foxo,FoxP,fru,fs-1-h,ftz-f1,fu2,Gal,GATAd,gcm,gcm2,gem,gfzf,gl,glu,grau,grh,grn,gro,gsb-n,gt,Gug,h,Hand,hb,HDAC1,HDAC4,her,Hey,HHEX,HIPP1,hkb,HLH54F,HmgD,HmgZ,Hmr,Hmx,Hnf4,hng1,hng2,Hr3,Hr38,
Hr39,Hr4,Hr51,Hr78,Hr83,Hr96,Hsf,Hsp83,hth,Ibf1,Ibf2,Ice1,ind,insv,Jarid2,Jasper,JIL-1,jim,jing,Jra,jumu,Kah,kay,Klf15,kmg,kn,kni,Kr,l-3-mbt,l-3-neo38,lab,lbe,Lhr,lilli,lmd,lms,lola,lov,Lpt,Lsd-1,luna,lz,M1BP,Mabi,Mad,maf-S,mago,mam,Max,Med,Meics,Mes4,MESR4,Met,mio,mirr,mle,Mlf,Mnt,mod-mdg4,mof,MRG15,Mrtf,msl-1,msl-2,msl-3,msl-3-tap,MTF-1,Myb,Myc,N,nau,NC2-beta,nej,nerfin-1,net,Neu2,NfI,Nf-YB,Nf-YC,NK7-1,nmo,noc,nom,not,Nup98-96,oc,odd,OdsH,Oli,opa,Orc2,org-1,ouib,ovo,p53,pad,pan,pb,Pc,pdm3,Pdp1,Pfk,ph,PHDP,pho,phol,Pif1A,Pif1B,pita,Plzf,pnt,prd,pros,Psc,psq,pum,pzg,Rabex-5,Rbf,Rel,repo,REPTOR,REPTOR-BP,rgr,rhi,rib,row,Rsf1,run,sage,salm,salr,sc,Sce,schlank,Scm,Scr,scrt,sd,sens,Set1,Sgf11,shep,shn,side,sima,Sin3A,Six4,slou,slp2,Smox,smt3,Snoo,so,Sox100B,Sox102F,Sox14,Sox15,Sp1,spab,Spps,sqz,SREBP,ss,Ssrp,Stat92E,stwl,sug,Su-H,su-Hw,Su-Tpl,Su-var-205,Su-var-2-10,Su-var-2-HP2,Su-var-3-7,Su-var-3-9,Su-z-12,sv,svp,Taf1,tai,tap,Tbp,TFAM,TfIIB,TfIIIC,tHMG1,tin,tio,tj,tll,toe,topi,trem,Trf2,trh,Trl,trr,trx,ttk,tup,twi,tx,Ubx,unpg,upSET,Usf,usp,velo,vri,Vsx1,Vsx2,vtd,w,wds,wek,wg,woc,Xbp1,yki,YL-1,Zaf1,zen,zf30C,zfh2,Zif,ZIPIC,zld,ZnT49B\ html reMap\ itemRgb on\ labelFields name, TF, Biotypes, FBgn\ longLabel ReMap Atlas of Regulatory Regions\ maxWindowCoverage 50000\ parent ReMap on\ priority 2\ shortLabel ReMap ChIP-seq\ showCfg on\ track ReMapTFs\ type bigBed 9 +\ urls TF="http://remap.univ-amu.fr/target_page/$$:7227" Biotypes="http://remap.univ-amu.fr/biotype_page/$$:7227"\ visibility squish\ unipAliTrembl TrEMBL Aln. 
bigPsl UCSC alignment of TrEMBL proteins to genome 0 2 0 0 0 127 127 127 0 0 0 genes 1 baseColorDefault genomicCodons\ baseColorTickColor contrastingColor\ baseColorUseCds given\ bigDataUrl /gbdb/dm6/uniprot/unipAliTrembl.bb\ indelDoubleInsert on\ indelQueryInsert on\ itemRgb on\ labelFields name,acc,uniprotName,geneName,hgncSym,refSeq,refSeqProt,ensProt\ longLabel UCSC alignment of TrEMBL proteins to genome\ mouseOverField protFullNames\ parent uniprot off\ priority 2\ searchIndex name,acc\ shortLabel TrEMBL Aln.\ showDiffBasesAllScales on\ skipFields isMain\ track unipAliTrembl\ type bigPsl\ urls acc="https://www.uniprot.org/uniprot/$$" hgncId="https://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=$$" refseq="https://www.ncbi.nlm.nih.gov/nuccore/$$" refSeqProt="https://www.ncbi.nlm.nih.gov/protein/$$" ncbiGene="https://www.ncbi.nlm.nih.gov/gene/$$" entrezGene="https://www.ncbi.nlm.nih.gov/gene/$$" ensGene="https://www.ensembl.org/Gene/Summary?g=$$"\ visibility hide\ cpgIslandExtUnmasked Unmasked CpG bed 4 + CpG Islands on All Sequence (Islands < 300 Bases are Light Green) 0 2 0 100 0 128 228 128 0 0 0Description
\ \CpG islands are associated with genes, particularly housekeeping\ genes, in vertebrates. CpG islands are typically common near\ transcription start sites and may be associated with promoter\ regions. Normally a C (cytosine) base followed immediately by a \ G (guanine) base (a CpG) is rare in\ vertebrate DNA because the Cs in such an arrangement tend to be\ methylated. This methylation helps distinguish the newly synthesized\ DNA strand from the parent strand, which aids in the final stages of\ DNA proofreading after duplication. However, over evolutionary time,\ methylated Cs tend to turn into Ts because of spontaneous\ deamination. The result is that CpGs are relatively rare unless\ there is selective pressure to keep them or a region is not methylated\ for some other reason, perhaps having to do with the regulation of gene\ expression. CpG islands are regions where CpGs are present at\ significantly higher levels than is typical for the genome as a whole.
\ \\ The unmasked version of the track displays potential CpG islands\ that exist in repeat regions and would otherwise not be visible\ in the repeat masked version.\
\ \\ By default, only the masked version of the track is displayed. To view the\ unmasked version, change the visibility settings in the track controls at\ the top of this page.\
\ \Methods
\ \CpG islands were predicted by searching the sequence one base at a\ time, scoring each dinucleotide (+17 for CG and -1 for others) and\ identifying maximally scoring segments. Each segment was then\ evaluated for the following criteria:\ \
\ \
\ \- GC content of 50% or greater
\ \- length greater than 200 bp
\ \- ratio greater than 0.6 of observed number of CG dinucleotides to the expected number on the \ \ basis of the number of Gs and Cs in the segment
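The three criteria above can be checked for a candidate segment with a few lines of Python. This is a sketch, not the cpg_lh program: the function names are hypothetical, and the observed/expected ratio uses the formula given in this section, that is, the number of CG dinucleotides times the segment length, divided by the product of the C count and the G count.

```python
def cpg_stats(seq):
    """Return length, CG dinucleotide count, GC percentage and obs/exp ratio."""
    seq = seq.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_percent = 100.0 * (c_count + g_count) / length if length else 0.0
    if c_count and g_count:
        obs_exp = cpg_count * length / (c_count * g_count)
    else:
        obs_exp = 0.0
    return {
        "length": length,
        "cpg": cpg_count,
        "gc_percent": gc_percent,
        "obs_exp": obs_exp,
    }


def passes_criteria(seq):
    """Apply the three criteria: GC >= 50%, length > 200 bp, obs/exp > 0.6."""
    stats = cpg_stats(seq)
    return (
        stats["gc_percent"] >= 50.0
        and stats["length"] > 200
        and stats["obs_exp"] > 0.6
    )


island_like = "CG" * 150  # 300 bp, all C and G, extremely CpG-rich
at_rich = "AT" * 150      # 300 bp with no C or G at all
```

The synthetic `island_like` sequence passes all three tests, while `at_rich` fails on GC content and a 100 bp run of CG repeats would fail on length alone.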
\\ The entire genome sequence, masking areas included, was\ used for the construction of the track Unmasked CpG.\ The track CpG Islands is constructed on the sequence after\ all masked sequence is removed.\
\ \The CpG count is the number of CG dinucleotides in the island. \ The Percentage CpG is the ratio of CpG nucleotide bases\ (twice the CpG count) to the length. The ratio of observed to expected \ CpG is calculated according to the formula (cited in \ Gardiner-Garden et al. (1987)):\ \
Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G)\ \ where N = length of sequence.\\ The calculation of the track data is performed by the following command sequence:\
\ twoBitToFa assembly.2bit stdout | maskOutFa stdin hard stdout \\\ | cpg_lh /dev/stdin 2> cpg_lh.err \\\ | awk '{$2 = $2 - 1; width = $3 - $2; printf("%s\\t%d\\t%s\\t%s %s\\t%s\\t%s\\t%0.0f\\t%0.1f\\t%s\\t%s\\n", $1, $2, $3, $5, $6, width, $6, width*$7*0.01, 100.0*2*$6/width, $7, $9);}' \\\ | sort -k1,1 -k2,2n > cpgIsland.bed\\ The unmasked track data is constructed from\ twoBitToFa -noMask output for the twoBitToFa command.\ \ \Data access
\\ The CpG Islands track and its associated tables can be explored interactively using the\ REST API, the\ Table Browser or the\ Data Integrator.\ All the tables can also be queried directly from our public MySQL\ servers, with more information available on our\ help page as well as on\ our blog.
\\ The source for the cpg_lh program can be obtained from\ src/utils/cpgIslandExt/.\ The cpg_lh program binary can be obtained from: http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/cpg_lh (choose "save file")\
\ \Credits
\ \This track was generated using a modification of a program developed by G. Miklem and L. Hillier \ (unpublished).
\ \References
\ \\ Gardiner-Garden M, Frommer M.\ \ CpG islands in vertebrate genomes.\ J Mol Biol. 1987 Jul 20;196(2):261-82.\ PMID: 3656447\
\ regulation 1 html cpgIslandSuper\ longLabel CpG Islands on All Sequence (Islands < 300 Bases are Light Green)\ parent cpgIslandSuper hide\ priority 2\ shortLabel Unmasked CpG\ track cpgIslandExtUnmasked\ chainDroSim1 droSim1 Chain chain droSim1 D. simulans (Apr. 2005 (WUGSC mosaic 1.0/droSim1)) Chained Alignments 3 3 0 0 0 255 255 0 1 0 0Description
\\ This track shows alignments of D. simulans (droSim1, Apr. 2005 (WUGSC mosaic 1.0/droSim1)) to the\ D. melanogaster genome using a gap scoring system that allows longer gaps than \ traditional affine gap scoring systems. It can also tolerate gaps in both \ D. simulans and D. melanogaster simultaneously. These "double-sided"\ gaps can be caused by local inversions and overlapping deletions\ in both species.
\\ The chain track displays boxes joined together by either single or \ double lines. The boxes represent aligning regions. \ Single lines indicate gaps that are largely due to a deletion in the \ D. simulans assembly or an insertion in the D. melanogaster assembly.\ Double lines represent more complex gaps that involve substantial\ sequence in both species. This may result from inversions, overlapping\ deletions, an abundance of local mutation, or an unsequenced gap in one \ species. In cases where multiple chains align over a particular region of \ the D. melanogaster genome, the chains with single-lined gaps are often due to \ processed pseudogenes, while chains with double-lined gaps are more often \ due to paralogs and unprocessed pseudogenes.
\\ In the "pack" and "full" display\ modes, the individual feature names indicate the chromosome, strand, and \ location (in thousands) of the match for each matching alignment.
\ \ \Display Conventions and Configuration
\By default, the chains to chromosome-based assemblies are colored\ based on which chromosome they map to in the aligning organism. To turn\ off the coloring, check the "off" button next to: Color\ track based on chromosome.
\\ To display only the chains of one chromosome in the aligning\ organism, enter the name of that chromosome (e.g. chr4) in the box next to: \ Filter by chromosome.
\ \Methods
\\ The blastz alignments were converted into axt format using the lavToAxt\ program. The axt alignments were fed into axtChain, which organizes all \ alignments between a single D. simulans chromosome and a single\ D. melanogaster chromosome into a group and creates a kd-tree out\ of the gapless subsections (blocks) of the alignments. A dynamic program \ was then run over the kd-trees to find the maximally scoring chains of these\ blocks. Chains scoring below a threshold were discarded; the remaining \ chains are displayed in this track.
\ \Credits
\\ Blastz was developed at Pennsylvania State University by \ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.
\\ Lineage-specific repeats were identified by Arian Smit and his\ RepeatMasker\ program.
\\ The axtChain program was developed at the University of California\ at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.\
\\ The browser display and database storage of the chains were generated\ by Robert Baertsch and Jim Kent.
\ \References
\\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\
\ \\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ compGeno 1 longLabel D. simulans (Apr. 2005 (WUGSC mosaic 1.0/droSim1)) Chained Alignments\ otherDb droSim1\ parent insectsChainNetViewchain off\ shortLabel droSim1 Chain\ subGroups view=chain species=s000b clade=c00\ track chainDroSim1\ type chain droSim1\ unipLocSignal Signal Peptide bigBed 12 + UniProt Signal Peptides 1 3 255 0 150 255 127 202 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipLocSignal.bb\ color 255,0,150\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ itemRgb off\ longLabel UniProt Signal Peptides\ parent uniprot\ priority 3\ shortLabel Signal Peptide\ track unipLocSignal\ type bigBed 12 +\ visibility dense\ netDroSim1 droSim1 Net netAlign droSim1 chainDroSim1 D. simulans (Apr. 2005 (WUGSC mosaic 1.0/droSim1)) Alignment Net 1 4 0 0 0 255 255 0 0 0 0Description
\\ This track shows the best D. simulans/D. melanogaster chain for \ every part of the D. melanogaster genome. It is useful for\ finding orthologous regions and for studying genome\ rearrangement. The D. simulans sequence used in this annotation is from\ the Apr. 2005 (WUGSC mosaic 1.0/droSim1) (droSim1) assembly.
\ \Display Conventions and Configuration
\\ In full display mode, the top-level (level 1)\ chains are the largest, highest-scoring chains that\ span this region. In many cases gaps exist in the\ top-level chain. When possible, these are filled in by\ other chains that are displayed at level 2. The gaps in \ level 2 chains may be filled by level 3 chains and so\ forth.
\\ In the graphical display, the boxes represent ungapped \ alignments; the lines represent gaps. Click\ on a box to view detailed information about the chain\ as a whole; click on a line to display information\ about the gap. The detailed information is useful in determining\ the cause of the gap or, for lower level chains, the genomic\ rearrangement.
\\ Individual items in the display are categorized as one of four types\ (other than gap):
\\
\ \- Top - the best, longest match. Displayed on level 1.\
- Syn - a line-up on the same chromosome as the gap in the level above\ it.\
- Inv - a line-up on the same chromosome as the gap above it, but in \ the opposite orientation.\
- NonSyn - a match to a chromosome different from the gap in the \ level above.\
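The four categories above can be read as a small decision rule. The following Python sketch is only an illustration of that rule as worded in the list; the function name classify_fill and its arguments are hypothetical and are not part of netSyntenic or any UCSC tool, which may apply further criteria.

```python
from typing import Optional


def classify_fill(level: int, chrom: str, strand: str,
                  parent_chrom: Optional[str] = None,
                  parent_strand: Optional[str] = None) -> str:
    """Classify a net item using the four types listed above.

    level         -- 1 for top-level chains, 2+ for chains filling a gap
    chrom, strand -- where this item matches in the aligning species
    parent_*      -- the same values for the gap in the level above
    (All parameter names are illustrative assumptions.)
    """
    if level == 1:
        return "Top"      # best, longest match, shown on level 1
    if chrom != parent_chrom:
        return "NonSyn"   # matches a different chromosome than the gap above
    if strand != parent_strand:
        return "Inv"      # same chromosome, opposite orientation
    return "Syn"          # same chromosome, same orientation
```

For example, a level 2 item on the same chromosome as its enclosing gap but on the opposite strand would be reported as Inv.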
Methods
\\ Chains were derived from blastz alignments, using the methods\ described on the chain tracks description pages, and sorted with the \ highest-scoring chains in the genome ranked first. The program\ chainNet was then used to place the chains one at a time, trimming them as \ necessary to fit into sections not already covered by a higher-scoring chain. \ During this process, a natural hierarchy emerged in which a chain that filled \ a gap in a higher-scoring chain was placed underneath that chain. The program \ netSyntenic was used to fill in information about the relationship between \ higher- and lower-level chains, such as whether a lower-level\ chain was syntenic or inverted relative to the higher-level chain. \ The program netClass was then used to fill in how much of the gaps and chains \ contained Ns (sequencing gaps) in one or both species and how much\ was filled with transposons inserted before and after the two organisms \ diverged.
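The placement step described above (highest-scoring chains first, later chains trimmed to sections not yet covered) can be sketched as a greedy loop. This is a simplified illustration in Python, not the chainNet implementation: it works on one coordinate axis only, ignores levels and minimum-size thresholds, and the names subtract, add_covered and place_chains are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the reference genome


def subtract(iv: Interval, covered: List[Interval]) -> List[Interval]:
    """Return the parts of iv not overlapped by covered (sorted, disjoint)."""
    start, end = iv
    out = []
    for c_start, c_end in covered:
        if c_end <= start or c_start >= end:
            continue                      # no overlap with this covered piece
        if c_start > start:
            out.append((start, c_start))  # uncovered stretch before the overlap
        start = max(start, c_end)
        if start >= end:
            break
    if start < end:
        out.append((start, end))
    return out


def add_covered(covered: List[Interval], pieces: List[Interval]) -> List[Interval]:
    """Merge newly placed pieces into the sorted, disjoint covered list."""
    merged: List[Interval] = []
    for s, e in sorted(covered + pieces):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def place_chains(chains: List[Tuple[float, List[Interval]]]):
    """Place chains by descending score, trimming each to uncovered sections."""
    covered: List[Interval] = []
    placed = []
    for score, blocks in sorted(chains, key=lambda c: c[0], reverse=True):
        kept: List[Interval] = []
        for block in blocks:
            kept.extend(subtract(block, covered))
        if kept:                          # chains with nothing left are dropped
            placed.append((score, kept))
            covered = add_covered(covered, kept)
    return placed
```

In this sketch a lower-scoring chain that overlaps a higher-scoring one survives only in the gap between the higher-scoring chain's blocks, which mirrors how a gap-filling chain ends up underneath the chain it fills.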
\ \Credits
\\ The chainNet, netSyntenic, and netClass programs were\ developed at the University of California\ at Santa Cruz by Jim Kent.
\\ Blastz was developed at Pennsylvania State University by\ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.
\ \References
\\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ compGeno 0 longLabel D. simulans (Apr. 2005 (WUGSC mosaic 1.0/droSim1)) Alignment Net\ otherDb droSim1\ parent insectsChainNetViewnet off\ shortLabel droSim1 Net\ subGroups view=net species=s000b clade=c00\ track netDroSim1\ type netAlign droSim1 chainDroSim1\ phyloP124way Cons 124 insects wig -20 7.532 124 insects Basewise Conservation by PhyloP 2 4 60 60 140 140 60 60 0 0 0 compGeno 0 altColor 140,60,60\ autoScale off\ color 60,60,140\ configurable on\ longLabel 124 insects Basewise Conservation by PhyloP\ maxHeightPixels 100:50:11\ noInherit on\ parent cons124wayViewphyloP on\ priority 4\ shortLabel Cons 124 insects\ spanList 1\ subGroups view=phyloP\ track phyloP124way\ type wig -20 7.532\ viewLimits -4.5:4.88\ windowingFunction mean\ unipLocExtra Extracellular bigBed 12 + UniProt Extracellular Domain 1 4 0 150 255 127 202 255 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipLocExtra.bb\ color 0,150,255\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ itemRgb off\ longLabel UniProt Extracellular Domain\ parent uniprot\ priority 4\ shortLabel Extracellular\ track unipLocExtra\ type bigBed 12 +\ visibility dense\ unipInterest Interest bigBed 12 + UniProt Regions of Interest 1 4 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipInterest.bb\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ itemRgb off\ longLabel UniProt Regions of Interest\ parent uniprot\ priority 4\ shortLabel Interest\ track unipInterest\ type bigBed 12 +\ visibility dense\ phyloP27way phyloP wig -4.711 0.934 27 insects Basewise Conservation by PhyloP 2 4 60 60 140 140 60 60 0 0 0 compGeno 0 altColor 140,60,60\ autoScale off\ color 60,60,140\ configurable on\ longLabel 27 insects Basewise Conservation by PhyloP\ maxHeightPixels 100:50:11\ noInherit on\ parent cons27wayViewphyloP\ priority 4\ shortLabel phyloP\ spanList 1\ subGroups view=phyloP\ track phyloP27way\ type wig -4.711 0.934\ viewLimits -3.107:0.934\ windowingFunction mean\ 
ncbiRefSeqOther RefSeq Other bigBed 12 + NCBI RefSeq Other Annotations (not NM_*, NR_*, XM_*, XR_*, NP_* or YP_*) 1 4 32 32 32 143 143 143 0 0 0 genes 1 bigDataUrl /gbdb/dm6/ncbiRefSeq/ncbiRefSeqOther.bb\ color 32,32,32\ labelFields gene\ longLabel NCBI RefSeq Other Annotations (not NM_*, NR_*, XM_*, XR_*, NP_* or YP_*)\ parent refSeqComposite off\ priority 4\ searchIndex name\ searchTrix /gbdb/dm6/ncbiRefSeq/ncbiRefSeqOther.ix\ shortLabel RefSeq Other\ skipEmptyFields on\ track ncbiRefSeqOther\ type bigBed 12 +\ urls GeneID="https://www.ncbi.nlm.nih.gov/gene/$$" MIM="https://www.ncbi.nlm.nih.gov/omim/612091" HGNC="https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/$$" FlyBase="http://flybase.org/reports/$$" WormBase="http://www.wormbase.org/db/gene/gene?name=$$" RGD="https://rgd.mcw.edu/rgdweb/search/search.html?term=$$" SGD="https://www.yeastgenome.org/locus/$$" miRBase="http://www.mirbase.org/cgi-bin/mirna_entry.pl?acc=$$" ZFIN="https://zfin.org/$$" MGI="http://www.informatics.jax.org/marker/$$"\ chainDroSec1 D. sechellia Chain chain droSec1 D. sechellia (Oct. 2005 (Broad/droSec1)) Chained Alignments 3 5 0 0 0 255 255 0 1 0 0Description
\\ This track shows alignments of D. sechellia (droSec1, Oct. 2005 (Broad/droSec1)) to the\ D. melanogaster genome using a gap scoring system that allows longer gaps \ than traditional affine gap scoring systems. It can also tolerate gaps in both\ D. sechellia and D. melanogaster simultaneously. These \ "double-sided" gaps can be caused by local inversions and \ overlapping deletions in both species. \
\ The chain track displays boxes joined together by either single or\ double lines. The boxes represent aligning regions.\ Single lines indicate gaps that are largely due to a deletion in the\ D. sechellia assembly or an insertion in the D. melanogaster \ assembly. Double lines represent more complex gaps that involve substantial\ sequence in both species. This may result from inversions, overlapping\ deletions, an abundance of local mutation, or an unsequenced gap in one\ species. In cases where multiple chains align over a particular region of\ the D. melanogaster genome, the chains with single-lined gaps are often \ due to processed pseudogenes, while chains with double-lined gaps are more \ often due to paralogs and unprocessed pseudogenes.
\\ In the "pack" and "full" display\ modes, the individual feature names indicate the chromosome, strand, and\ location (in thousands) of the match for each matching alignment.
\ \ \Display Conventions and Configuration
\By default, the chains to chromosome-based assemblies are colored\ based on which chromosome they map to in the aligning organism. To turn\ off the coloring, check the "off" button next to: Color\ track based on chromosome.
\\ To display only the chains of one chromosome in the aligning\ organism, enter the name of that chromosome (e.g. chr4) in the box next to: \ Filter by chromosome.
\ \Methods
\\ The D. sechellia/D. melanogaster genomes were aligned with \ blastz and converted into axt format using the lavToAxt program.\ The axt alignments were fed into axtChain, which organizes all \ alignments between a single D. sechellia chromosome and a single \ D. melanogaster chromosome into a group and creates a kd-tree out \ of the gapless subsections (blocks) of the alignments. A dynamic program \ was then run over the kd-trees to find the maximally scoring chains of these \ blocks. Chains scoring below a threshold were discarded; the remaining \ chains are displayed in this track.
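The idea of finding a maximally scoring chain of gapless blocks can be illustrated with a small dynamic program. The Python sketch below is an assumption-laden simplification, not axtChain itself: it compares all block pairs directly instead of using a kd-tree, and the linear gap_cost function with its per_base parameter is a hypothetical stand-in for the track's actual gap scoring, which allows longer gaps.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    t_start: int   # start on the reference (target) sequence
    t_end: int
    q_start: int   # start on the aligning (query) sequence
    q_end: int
    score: float   # score of this gapless block


def gap_cost(a: Block, b: Block, per_base: float = 0.1) -> float:
    """Hypothetical linear penalty for the gap between consecutive blocks."""
    return per_base * ((b.t_start - a.t_end) + (b.q_start - a.q_end))


def best_chain(blocks: List[Block]) -> Tuple[float, List[Block]]:
    """Return (score, blocks) of the highest-scoring collinear chain."""
    if not blocks:
        return 0.0, []
    order = sorted(blocks, key=lambda b: (b.t_start, b.q_start))
    best = [b.score for b in order]   # best chain score ending at each block
    prev = [-1] * len(order)          # back-pointers for the traceback
    for j, bj in enumerate(order):
        for i in range(j):
            bi = order[i]
            # bi may precede bj only if it ends before bj starts on both sequences
            if bi.t_end <= bj.t_start and bi.q_end <= bj.q_start:
                cand = best[i] + bj.score - gap_cost(bi, bj)
                if cand > best[j]:
                    best[j] = cand
                    prev[j] = i
    end = max(range(len(order)), key=best.__getitem__)
    chain = []
    k = end
    while k != -1:
        chain.append(order[k])
        k = prev[k]
    chain.reverse()
    return best[end], chain
```

The pairwise loop is quadratic in the number of blocks; the kd-tree mentioned above serves to limit which predecessor blocks need to be examined.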
\ \Credits
\\ Blastz was developed at Pennsylvania State University by \ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.
\\ Lineage-specific repeats were identified by Arian Smit and his \ RepeatMasker\ program.
\\ The axtChain program was developed at the University of California at \ Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.
\\ The browser display and database storage of the chains were generated\ by Robert Baertsch and Jim Kent.
\ \References
\\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\
\ \\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ compGeno 1 longLabel D. sechellia (Oct. 2005 (Broad/droSec1)) Chained Alignments\ otherDb droSec1\ parent insectsChainNetViewchain off\ shortLabel D. sechellia Chain\ subGroups view=chain species=s001 clade=c00\ track chainDroSec1\ type chain droSec1\ ncbiRefSeqPsl RefSeq Alignments psl RefSeq Alignments of RNAs 1 5 0 0 0 127 127 127 0 0 0 genes 1 baseColorDefault diffCodons\ baseColorUseCds table ncbiRefSeqCds\ baseColorUseSequence extFile seqNcbiRefSeq extNcbiRefSeq\ color 0,0,0\ idXref ncbiRefSeqLink mrnaAcc name\ indelDoubleInsert on\ indelQueryInsert on\ longLabel RefSeq Alignments of RNAs\ parent refSeqComposite off\ pepTable ncbiRefSeqPepTable\ priority 5\ pslSequence no\ shortLabel RefSeq Alignments\ showCdsAllScales .\ showCdsMaxZoom 10000.0\ showDiffBasesAllScales .\ showDiffBasesMaxZoom 10000.0\ track ncbiRefSeqPsl\ type psl\ unipLocTransMemb Transmembrane bigBed 12 + UniProt Transmembrane Domains 1 5 0 150 0 127 202 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipLocTransMemb.bb\ color 0,150,0\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ itemRgb off\ longLabel UniProt Transmembrane Domains\ parent uniprot\ priority 5\ shortLabel Transmembrane\ track unipLocTransMemb\ type bigBed 12 +\ visibility dense\ netDroSec1 D. sechellia Net netAlign droSec1 chainDroSec1 D. sechellia (Oct. 2005 (Broad/droSec1)) Alignment Net 1 6 0 0 0 255 255 0 0 0 0Description
\\ This track shows the best D. sechellia/D. melanogaster chain for \ every part of the D. melanogaster genome. It is useful for\ finding orthologous regions and for studying genome\ rearrangement. The D. sechellia sequence used in this annotation is \ from the Oct. 2005 (Broad/droSec1) (droSec1) assembly.
\ \Display Conventions and Configuration
\\ In full display mode, the top-level (level 1)\ chains are the largest, highest-scoring chains that\ span this region. In many cases gaps exist in the\ top-level chain. When possible, these are filled in by\ other chains that are displayed at level 2. The gaps in \ level 2 chains may be filled by level 3 chains and so\ forth.
\\ In the graphical display, the boxes represent ungapped \ alignments; the lines represent gaps. Click\ on a box to view detailed information about the chain\ as a whole; click on a line to display information\ about the gap. The detailed information is useful in determining\ the cause of the gap or, for lower level chains, the genomic\ rearrangement.
\\ Individual items in the display are categorized as one of four types\ (other than gap):
\\
\ \- Top - the best, longest match. Displayed on level 1.\
- Syn - a line-up on the same chromosome as the gap in the level above\ it.\
- Inv - a line-up on the same chromosome as the gap above it, but in \ the opposite orientation.\
- NonSyn - a match to a chromosome different from the gap in the \ level above.\
Methods
\\ Chains were derived from blastz alignments, using the methods\ described on the chain tracks description pages, and sorted with the \ highest-scoring chains in the genome ranked first. The program\ chainNet was then used to place the chains one at a time, trimming them as \ necessary to fit into sections not already covered by a higher-scoring chain. \ During this process, a natural hierarchy emerged in which a chain that filled \ a gap in a higher-scoring chain was placed underneath that chain. The program \ netSyntenic was used to fill in information about the relationship between \ higher- and lower-level chains, such as whether a lower-level\ chain was syntenic or inverted relative to the higher-level chain. \ The program netClass was then used to fill in how much of the gaps and chains \ contained Ns (sequencing gaps) in one or both species and how much\ was filled with transposons inserted before and after the two organisms \ diverged.
\ \Credits
\\ The chainNet, netSyntenic, and netClass programs were\ developed at the University of California\ at Santa Cruz by Jim Kent.
\\ Blastz was developed at Pennsylvania State University by\ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.
\\ Lineage-specific repeats were identified by Arian Smit and his program \ RepeatMasker.
\\ The browser display and database storage of the nets were made\ by Robert Baertsch and Jim Kent.
\ \References
\\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ compGeno 0 longLabel D. sechellia (Oct. 2005 (Broad/droSec1)) Alignment Net\ otherDb droSec1\ parent insectsChainNetViewnet off\ shortLabel D. sechellia Net\ subGroups view=net species=s001 clade=c00\ track netDroSec1\ type netAlign droSec1 chainDroSec1\ unipLocCytopl Cytoplasmic bigBed 12 + UniProt Cytoplasmic Domains 1 6 255 150 0 255 202 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipLocCytopl.bb\ color 255,150,0\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ itemRgb off\ longLabel UniProt Cytoplasmic Domains\ parent uniprot\ priority 6\ shortLabel Cytoplasmic\ track unipLocCytopl\ type bigBed 12 +\ visibility dense\ ncbiRefSeqGenomicDiff RefSeq Diffs bigBed 9 + Differences between NCBI RefSeq Transcripts and the Reference Genome 1 6 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/ncbiRefSeq/ncbiRefSeqGenomicDiff.bb\ itemRgb on\ longLabel Differences between NCBI RefSeq Transcripts and the Reference Genome\ parent refSeqComposite off\ priority 6\ shortLabel RefSeq Diffs\ skipEmptyFields on\ track ncbiRefSeqGenomicDiff\ type bigBed 9 +\ chainDroYak3 droYak3 Chain chain droYak3 D. yakuba (27 Jun 2006 (dyak_caf1/droYak3)) Chained Alignments 3 7 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. 
yakuba (27 Jun 2006 (dyak_caf1/droYak3)) Chained Alignments\ otherDb droYak3\ parent insectsChainNetViewchain off\ shortLabel droYak3 Chain\ subGroups view=chain species=s002 clade=c00\ track chainDroYak3\ type chain droYak3\ unipChain Chains bigBed 12 + UniProt Mature Protein Products (Polypeptide Chains) 1 7 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipChain.bb\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ longLabel UniProt Mature Protein Products (Polypeptide Chains)\ parent uniprot\ priority 7\ shortLabel Chains\ track unipChain\ type bigBed 12 +\ urls uniProtId="http://www.uniprot.org/uniprot/$$#ptm_processing" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$"\ visibility dense\ refGene UCSC RefSeq genePred refPep refMrna UCSC annotations of RefSeq RNAs (NM_* and NR_*) 1 7 12 12 120 133 133 187 0 0 0Description
\\ The RefSeq Genes track shows known D. melanogaster protein-coding and \ non-protein-coding genes taken from the NCBI RNA reference sequences \ collection (RefSeq), which were directly contributed to NCBI by\ FlyBase.\ The data underlying this track are updated weekly.
\ \ \\ Please visit the Feedback for Gene and Reference Sequences (RefSeq) page to\ make suggestions, submit additions and corrections, or ask for help concerning\ RefSeq records.\
\ \Display Conventions and Configuration
\\ This track follows the display conventions for \ gene prediction \ tracks.\ The color shading indicates the level of review the RefSeq record has \ undergone: predicted (light), provisional (medium), reviewed (dark).
\\ The item labels and display colors of features within this track can be\ configured through the controls at the top of the track description page. \ This page is accessed via the small button to the left of the track's \ graphical display or through the link on the track's control menu. \
\
\ \- Label: By default, items are labeled by gene name. Click the \ appropriate Label option to display the accession name instead of the gene\ name, show both the gene and accession names, or turn off the label \ completely.\
- Codon coloring: This track contains an optional codon coloring \ feature that allows users to quickly validate and compare gene predictions.\ To display codon colors, select the genomic codons option from the\ Color track by codons pull-down menu. Click \ here for more \ information about this feature.\
- Hide non-coding genes: By default, both the protein-coding and\ non-protein-coding genes are displayed. If you wish to see only the coding\ genes, click this box.\
Methods
\\ RefSeq RNAs were aligned against the D. melanogaster genome using blat; \ those with an alignment of less than 15% were discarded. When a single RNA \ aligned in multiple places, the alignment having the highest base identity \ was identified. Only alignments having a base identity level within 0.1% of \ the best and at least 96% base identity with the genomic sequence were kept.\
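The filtering rule in the paragraph above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the UCSC pipeline: the Aln record and filter_alignments function are hypothetical, "an alignment of less than 15%" is assumed to mean the fraction of the RNA that aligned, and "within 0.1% of the best" is assumed to be an absolute difference in base identity.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Aln(NamedTuple):
    rna: str         # RefSeq accession of the aligned RNA
    coverage: float  # fraction of the RNA aligned, 0..1 (assumed meaning)
    identity: float  # base identity with the genome, 0..1


def filter_alignments(alns: List[Aln],
                      min_cov: float = 0.15,
                      near_best: float = 0.001,
                      min_identity: float = 0.96) -> List[Aln]:
    """Keep near-best, high-identity alignments for each RNA."""
    by_rna: Dict[str, List[Aln]] = defaultdict(list)
    for a in alns:
        if a.coverage >= min_cov:          # discard alignments below 15%
            by_rna[a.rna].append(a)
    kept: List[Aln] = []
    for group in by_rna.values():
        best = max(a.identity for a in group)
        kept.extend(a for a in group
                    if a.identity >= best - near_best   # within 0.1% of best
                    and a.identity >= min_identity)     # at least 96% identity
    return kept
```

With these assumptions, an RNA that aligns in several places keeps only the placements whose identity is essentially tied with its best one.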
\ \ \Credits
\\ This track was produced at UCSC from RNA sequence data\ generated by scientists worldwide and curated by the \ NCBI RefSeq project.
\ \References
\\ Kent WJ.\ \ BLAT--the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518\
\ \\ Pruitt KD, Tatusova T, Maglott DR.\ NCBI Reference Sequence (RefSeq): a curated non-redundant\ sequence database of genomes, transcripts and proteins.\ Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4.\ PMID: 15608248; PMC: PMC539979\
\ genes 1 baseColorDefault genomicCodons\ baseColorUseCds given\ color 12,12,120\ dataVersion \ group genes\ idXref hgFixed.refLink mrnaAcc name\ longLabel UCSC annotations of RefSeq RNAs (NM_* and NR_*)\ parent refSeqComposite off\ priority 7\ shortLabel UCSC RefSeq\ track refGene\ type genePred refPep refMrna\ visibility dense\ netDroYak3 droYak3 Net netAlign droYak3 chainDroYak3 D. yakuba (27 Jun 2006 (dyak_caf1/droYak3)) Alignment Net 1 8 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. yakuba (27 Jun 2006 (dyak_caf1/droYak3)) Alignment Net\ otherDb droYak3\ parent insectsChainNetViewnet off\ shortLabel droYak3 Net\ subGroups view=net species=s002 clade=c00\ track netDroYak3\ type netAlign droYak3 chainDroYak3\ unipDisulfBond Disulf. Bonds bigBed 12 + UniProt Disulfide Bonds 1 8 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipDisulfBond.bb\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ longLabel UniProt Disulfide Bonds\ parent uniprot\ priority 8\ shortLabel Disulf. Bonds\ track unipDisulfBond\ type bigBed 12 +\ visibility dense\ unipDomain Domains bigBed 12 + UniProt Domains 1 8 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipDomain.bb\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ longLabel UniProt Domains\ parent uniprot\ priority 8\ shortLabel Domains\ track unipDomain\ type bigBed 12 +\ urls uniProtId="http://www.uniprot.org/uniprot/$$#family_and_domains" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$"\ visibility dense\ chainDroEre2 droEre2 Chain chain droEre2 D. erecta (Feb. 2006 (Agencourt CAF1/droEre2)) Chained Alignments 3 9 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. erecta (Feb. 
2006 (Agencourt CAF1/droEre2)) Chained Alignments\ otherDb droEre2\ parent insectsChainNetViewchain off\ shortLabel droEre2 Chain\ subGroups view=chain species=s003 clade=c00\ track chainDroEre2\ type chain droEre2\ unipModif AA Modifications bigBed 12 + UniProt Amino Acid Modifications 1 9 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipModif.bb\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ longLabel UniProt Amino Acid Modifications\ parent uniprot\ priority 9\ shortLabel AA Modifications\ track unipModif\ type bigBed 12 +\ urls uniProtId="http://www.uniprot.org/uniprot/$$#aaMod_section" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$"\ visibility dense\ netDroEre2 droEre2 Net netAlign droEre2 chainDroEre2 D. erecta (Feb. 2006 (Agencourt CAF1/droEre2)) Alignment Net 1 10 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. erecta (Feb. 2006 (Agencourt CAF1/droEre2)) Alignment Net\ otherDb droEre2\ parent insectsChainNetViewnet off\ shortLabel droEre2 Net\ subGroups view=net species=s003 clade=c00\ track netDroEre2\ type netAlign droEre2 chainDroEre2\ unipMut Mutations bigBed 12 + UniProt Amino Acid Mutations 1 10 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipMut.bb\ longLabel UniProt Amino Acid Mutations\ parent uniprot\ priority 10\ shortLabel Mutations\ track unipMut\ type bigBed 12 +\ urls uniProtId="http://www.uniprot.org/uniprot/$$#pathology_and_biotech" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$" variationId="http://www.uniprot.org/uniprot/$$"\ visibility dense\ chainDroTak2 droTak2 Chain chain droTak2 D. takahashii (04 Mar 2013 (Dtak_2.0/droTak2)) Chained Alignments 3 11 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. takahashii (04 Mar 2013 (Dtak_2.0/droTak2)) Chained Alignments\ otherDb droTak2\ parent insectsChainNetViewchain off\ shortLabel droTak2 Chain\ subGroups view=chain species=s004 clade=c00\ track chainDroTak2\ type chain droTak2\ unipOther Other Annot. 
bigBed 12 + UniProt Other Annotations 1 11 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipOther.bb\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ longLabel UniProt Other Annotations\ parent uniprot\ priority 11\ shortLabel Other Annot.\ track unipOther\ type bigBed 12 +\ urls uniProtId="http://www.uniprot.org/uniprot/$$#family_and_domains" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$"\ visibility dense\ unipStruct Structure bigBed 12 + UniProt Protein Primary/Secondary Structure Annotations 0 11 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipStruct.bb\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ group genes\ longLabel UniProt Protein Primary/Secondary Structure Annotations\ parent uniprot\ priority 11\ shortLabel Structure\ track unipStruct\ type bigBed 12 +\ urls uniProtId="http://www.uniprot.org/uniprot/$$#structure" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$"\ visibility hide\ netDroTak2 droTak2 Net netAlign droTak2 chainDroTak2 D. takahashii (04 Mar 2013 (Dtak_2.0/droTak2)) Alignment Net 1 12 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. takahashii (04 Mar 2013 (Dtak_2.0/droTak2)) Alignment Net\ otherDb droTak2\ parent insectsChainNetViewnet off\ shortLabel droTak2 Net\ subGroups view=net species=s004 clade=c00\ track netDroTak2\ type netAlign droTak2 chainDroTak2\ unipRepeat Repeats bigBed 12 + UniProt Repeats 1 12 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipRepeat.bb\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ longLabel UniProt Repeats\ parent uniprot\ priority 12\ shortLabel Repeats\ track unipRepeat\ type bigBed 12 +\ urls uniProtId="http://www.uniprot.org/uniprot/$$#family_and_domains" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$"\ visibility dense\ chainDroEle2 droEle2 Chain chain droEle2 D. elegans (04 Mar 2013 (Dele_2.0/droEle2)) Chained Alignments 3 13 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. 
elegans (04 Mar 2013 (Dele_2.0/droEle2)) Chained Alignments\ otherDb droEle2\ parent insectsChainNetViewchain off\ shortLabel droEle2 Chain\ subGroups view=chain species=s005 clade=c00\ track chainDroEle2\ type chain droEle2\ phastCons124way Cons 124 insects wig 0 1 124 insects conservation by PhastCons 2 13 70 130 70 130 70 70 0 0 0 compGeno 0 altColor 130,70,70\ autoScale off\ color 70,130,70\ configurable on\ longLabel 124 insects conservation by PhastCons\ maxHeightPixels 100:40:11\ noInherit on\ parent cons124wayViewphastcons off\ priority 13\ shortLabel Cons 124 insects\ spanList 1\ subGroups view=phastcons\ track phastCons124way\ type wig 0 1\ windowingFunction mean\ phastCons27way phastCons wig 0 1 27 insects conservation by PhastCons 2 13 70 130 70 130 70 70 0 0 0 compGeno 0 altColor 130,70,70\ autoScale off\ color 70,130,70\ configurable on\ longLabel 27 insects conservation by PhastCons\ maxHeightPixels 100:40:11\ noInherit on\ parent cons27wayViewphastcons\ priority 13\ shortLabel phastCons\ spanList 1\ subGroups view=phastcons\ track phastCons27way\ type wig 0 1\ windowingFunction mean\ unipConflict Seq. Conflicts bigBed 12 + UniProt Sequence Conflicts 1 13 0 0 0 127 127 127 0 0 0 genes 1 bigDataUrl /gbdb/dm6/uniprot/unipConflict.bb\ filterValues.status Manually reviewed (Swiss-Prot),Unreviewed (TrEMBL)\ longLabel UniProt Sequence Conflicts\ parent uniprot off\ priority 13\ shortLabel Seq. Conflicts\ track unipConflict\ type bigBed 12 +\ urls uniProtId="http://www.uniprot.org/uniprot/$$#Sequence_conflict_section" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$"\ visibility dense\ netDroEle2 droEle2 Net netAlign droEle2 chainDroEle2 D. elegans (04 Mar 2013 (Dele_2.0/droEle2)) Alignment Net 1 14 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. 
elegans (04 Mar 2013 (Dele_2.0/droEle2)) Alignment Net\ otherDb droEle2\ parent insectsChainNetViewnet off\ shortLabel droEle2 Net\ subGroups view=net species=s005 clade=c00\ track netDroEle2\ type netAlign droEle2 chainDroEle2\ chainDroEug2 droEug2 Chain chain droEug2 D. eugracilis (04 Mar 2013 (Deug_2.0/droEug2)) Chained Alignments 3 15 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. eugracilis (04 Mar 2013 (Deug_2.0/droEug2)) Chained Alignments\ otherDb droEug2\ parent insectsChainNetViewchain off\ shortLabel droEug2 Chain\ subGroups view=chain species=s006 clade=c00\ track chainDroEug2\ type chain droEug2\ netDroEug2 droEug2 Net netAlign droEug2 chainDroEug2 D. eugracilis (04 Mar 2013 (Deug_2.0/droEug2)) Alignment Net 1 16 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. eugracilis (04 Mar 2013 (Deug_2.0/droEug2)) Alignment Net\ otherDb droEug2\ parent insectsChainNetViewnet off\ shortLabel droEug2 Net\ subGroups view=net species=s006 clade=c00\ track netDroEug2\ type netAlign droEug2 chainDroEug2\ chainDroBia2 droBia2 Chain chain droBia2 D. biarmipes (04 Mar 2013 (Dbia_2.0/droBia2)) Chained Alignments 3 17 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. biarmipes (04 Mar 2013 (Dbia_2.0/droBia2)) Chained Alignments\ otherDb droBia2\ parent insectsChainNetViewchain off\ shortLabel droBia2 Chain\ subGroups view=chain species=s007 clade=c00\ track chainDroBia2\ type chain droBia2\ netDroBia2 droBia2 Net netAlign droBia2 chainDroBia2 D. biarmipes (04 Mar 2013 (Dbia_2.0/droBia2)) Alignment Net 1 18 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. biarmipes (04 Mar 2013 (Dbia_2.0/droBia2)) Alignment Net\ otherDb droBia2\ parent insectsChainNetViewnet off\ shortLabel droBia2 Net\ subGroups view=net species=s007 clade=c00\ track netDroBia2\ type netAlign droBia2 chainDroBia2\ chainDroRho2 droRho2 Chain chain droRho2 D. rhopaloa (22 Feb 2013 (Drho_2.0/droRho2)) Chained Alignments 3 19 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. 
rhopaloa (22 Feb 2013 (Drho_2.0/droRho2)) Chained Alignments\ otherDb droRho2\ parent insectsChainNetViewchain off\ shortLabel droRho2 Chain\ subGroups view=chain species=s008 clade=c00\ track chainDroRho2\ type chain droRho2\ netDroRho2 droRho2 Net netAlign droRho2 chainDroRho2 D. rhopaloa (22 Feb 2013 (Drho_2.0/droRho2)) Alignment Net 1 20 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. rhopaloa (22 Feb 2013 (Drho_2.0/droRho2)) Alignment Net\ otherDb droRho2\ parent insectsChainNetViewnet off\ shortLabel droRho2 Net\ subGroups view=net species=s008 clade=c00\ track netDroRho2\ type netAlign droRho2 chainDroRho2\ chainDroFic2 droFic2 Chain chain droFic2 D. ficusphila (04 Mar 2013 (Dfic_2.0/droFic2)) Chained Alignments 3 21 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. ficusphila (04 Mar 2013 (Dfic_2.0/droFic2)) Chained Alignments\ otherDb droFic2\ parent insectsChainNetViewchain off\ shortLabel droFic2 Chain\ subGroups view=chain species=s009 clade=c00\ track chainDroFic2\ type chain droFic2\ netDroFic2 droFic2 Net netAlign droFic2 chainDroFic2 D. ficusphila (04 Mar 2013 (Dfic_2.0/droFic2)) Alignment Net 1 22 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. ficusphila (04 Mar 2013 (Dfic_2.0/droFic2)) Alignment Net\ otherDb droFic2\ parent insectsChainNetViewnet off\ shortLabel droFic2 Net\ subGroups view=net species=s009 clade=c00\ track netDroFic2\ type netAlign droFic2 chainDroFic2\ chainDroSuz1 droSuz1 Chain chain droSuz1 D. suzukii (30 Sep 2013 (Dsuzukii.v01/droSuz1)) Chained Alignments 3 23 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. suzukii (30 Sep 2013 (Dsuzukii.v01/droSuz1)) Chained Alignments\ otherDb droSuz1\ parent insectsChainNetViewchain off\ shortLabel droSuz1 Chain\ subGroups view=chain species=s010 clade=c00\ track chainDroSuz1\ type chain droSuz1\ phastConsElements124way 124 insects El bed 5 . 
124 insects Conserved Elements 1 23 110 10 40 182 132 147 0 0 0 compGeno 1 color 110,10,40\ longLabel 124 insects Conserved Elements\ noInherit on\ parent cons124wayViewelements off\ priority 23\ shortLabel 124 insects El\ subGroups view=elements\ track phastConsElements124way\ type bed 5 .\ phastConsElements27way Cons Elements bed 5 . 27 insects Conserved Elements 1 23 110 10 40 182 132 147 0 0 0 compGeno 1 color 110,10,40\ longLabel 27 insects Conserved Elements\ noInherit on\ parent cons27wayViewelements on\ priority 23\ shortLabel Cons Elements\ subGroups view=elements\ track phastConsElements27way\ type bed 5 .\ netDroSuz1 droSuz1 Net netAlign droSuz1 chainDroSuz1 D. suzukii (30 Sep 2013 (Dsuzukii.v01/droSuz1)) Alignment Net 1 24 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. suzukii (30 Sep 2013 (Dsuzukii.v01/droSuz1)) Alignment Net\ otherDb droSuz1\ parent insectsChainNetViewnet off\ shortLabel droSuz1 Net\ subGroups view=net species=s010 clade=c00\ track netDroSuz1\ type netAlign droSuz1 chainDroSuz1\ chainDroKik2 droKik2 Chain chain droKik2 D. kikkawai (04 Mar 2013 (Dkik_2.0/droKik2)) Chained Alignments 3 25 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. kikkawai (04 Mar 2013 (Dkik_2.0/droKik2)) Chained Alignments\ otherDb droKik2\ parent insectsChainNetViewchain off\ shortLabel droKik2 Chain\ subGroups view=chain species=s011 clade=c00\ track chainDroKik2\ type chain droKik2\ netDroKik2 droKik2 Net netAlign droKik2 chainDroKik2 D. kikkawai (04 Mar 2013 (Dkik_2.0/droKik2)) Alignment Net 1 26 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. 
kikkawai (04 Mar 2013 (Dkik_2.0/droKik2)) Alignment Net\ otherDb droKik2\ parent insectsChainNetViewnet off\ shortLabel droKik2 Net\ subGroups view=net species=s011 clade=c00\ track netDroKik2\ type netAlign droKik2 chainDroKik2\ chainD_serrata D_serrata Chain chain D_serrata D_serrata (D_serrata) Chained Alignments 3 27 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_serrata (D_serrata) Chained Alignments\ otherDb D_serrata\ parent insectsChainNetViewchain off\ shortLabel D_serrata Chain\ subGroups view=chain species=s012 clade=c00\ track chainD_serrata\ type chain D_serrata\ chainDroAna3 droAna3 Chain chain droAna3 D. ananassae (Feb. 2006 (Agencourt CAF1/droAna3)) Chained Alignments 3 28 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. ananassae (Feb. 2006 (Agencourt CAF1/droAna3)) Chained Alignments\ otherDb droAna3\ parent insectsChainNetViewchain off\ shortLabel droAna3 Chain\ subGroups view=chain species=s013 clade=c00\ track chainDroAna3\ type chain droAna3\ netDroAna3 droAna3 Net netAlign droAna3 chainDroAna3 D. ananassae (Feb. 2006 (Agencourt CAF1/droAna3)) Alignment Net 1 29 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. ananassae (Feb. 2006 (Agencourt CAF1/droAna3)) Alignment Net\ otherDb droAna3\ parent insectsChainNetViewnet off\ shortLabel droAna3 Net\ subGroups view=net species=s013 clade=c00\ track netDroAna3\ type netAlign droAna3 chainDroAna3\ chainDroBip2 droBip2 Chain chain droBip2 D. bipectinata (04 Mar 2013 (Dbip_2.0/droBip2)) Chained Alignments 3 30 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. bipectinata (04 Mar 2013 (Dbip_2.0/droBip2)) Chained Alignments\ otherDb droBip2\ parent insectsChainNetViewchain off\ shortLabel droBip2 Chain\ subGroups view=chain species=s014 clade=c00\ track chainDroBip2\ type chain droBip2\ netDroBip2 droBip2 Net netAlign droBip2 chainDroBip2 D. bipectinata (04 Mar 2013 (Dbip_2.0/droBip2)) Alignment Net 1 31 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. 
bipectinata (04 Mar 2013 (Dbip_2.0/droBip2)) Alignment Net\ otherDb droBip2\ parent insectsChainNetViewnet off\ shortLabel droBip2 Net\ subGroups view=net species=s014 clade=c00\ track netDroBip2\ type netAlign droBip2 chainDroBip2\ chainD_obscura D_obscura Chain chain D_obscura D_obscura (D_obscura) Chained Alignments 3 32 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_obscura (D_obscura) Chained Alignments\ otherDb D_obscura\ parent insectsChainNetViewchain off\ shortLabel D_obscura Chain\ subGroups view=chain species=s015 clade=c00\ track chainD_obscura\ type chain D_obscura\ chainDroPse3 droPse3 Chain chain droPse3 D. pseudoobscura (11 Apr 2013 (Dpse_3.0/droPse3)) Chained Alignments 3 33 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. pseudoobscura (11 Apr 2013 (Dpse_3.0/droPse3)) Chained Alignments\ otherDb droPse3\ parent insectsChainNetViewchain off\ shortLabel droPse3 Chain\ subGroups view=chain species=s016a clade=c00\ track chainDroPse3\ type chain droPse3\ netDroPse3 droPse3 Net netAlign droPse3 chainDroPse3 D. pseudoobscura (11 Apr 2013 (Dpse_3.0/droPse3)) Alignment Net 1 34 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. pseudoobscura (11 Apr 2013 (Dpse_3.0/droPse3)) Alignment Net\ otherDb droPse3\ parent insectsChainNetViewnet off\ shortLabel droPse3 Net\ subGroups view=net species=s016a clade=c00\ track netDroPse3\ type netAlign droPse3 chainDroPse3\ chainDroMir2 droMir2 Chain chain droMir2 D. miranda (19 Apr 2013 (DroMir_2.2/droMir2)) Chained Alignments 3 35 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. miranda (19 Apr 2013 (DroMir_2.2/droMir2)) Chained Alignments\ otherDb droMir2\ parent insectsChainNetViewchain off\ shortLabel droMir2 Chain\ subGroups view=chain species=s017 clade=c00\ track chainDroMir2\ type chain droMir2\ netDroMir2 droMir2 Net netAlign droMir2 chainDroMir2 D. miranda (19 Apr 2013 (DroMir_2.2/droMir2)) Alignment Net 1 36 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. 
miranda (19 Apr 2013 (DroMir_2.2/droMir2)) Alignment Net\ otherDb droMir2\ parent insectsChainNetViewnet off\ shortLabel droMir2 Net\ subGroups view=net species=s017 clade=c00\ track netDroMir2\ type netAlign droMir2 chainDroMir2\ chainDroPer1 D. persimilis Chain chain droPer1 D. persimilis (Oct. 2005 (Broad/droPer1)) Chained Alignments 3 37 0 0 0 255 255 0 1 0 0
Description
\\ This track shows alignments of D. persimilis (droPer1, Oct. 2005 (Broad/droPer1)) to the\ D. melanogaster genome using a gap scoring system that allows longer gaps \ than traditional affine gap scoring systems. It can also tolerate gaps in both\ D. persimilis and D. melanogaster simultaneously. These \ "double-sided" gaps can be caused by local inversions and \ overlapping deletions in both species. \
\ The chain track displays boxes joined together by either single or\ double lines. The boxes represent aligning regions.\ Single lines indicate gaps that are largely due to a deletion in the\ D. persimilis assembly or an insertion in the D. melanogaster \ assembly. Double lines represent more complex gaps that involve substantial\ sequence in both species. This may result from inversions, overlapping\ deletions, an abundance of local mutation, or an unsequenced gap in one\ species. In cases where multiple chains align over a particular region of\ the D. melanogaster genome, the chains with single-lined gaps are often \ due to processed pseudogenes, while chains with double-lined gaps are more \ often due to paralogs and unprocessed pseudogenes.
\\ In the "pack" and "full" display\ modes, the individual feature names indicate the chromosome, strand, and\ location (in thousands) of the match for each matching alignment.
\ \ \Display Conventions and Configuration
\By default, the chains to chromosome-based assemblies are colored\ based on which chromosome they map to in the aligning organism. To turn\ off the coloring, check the "off" button next to: Color\ track based on chromosome.
\\ To display only the chains of one chromosome in the aligning\ organism, enter the name of that chromosome (e.g. chr4) in the box next to: \ Filter by chromosome.
\ \Methods
\\ The D. persimilis/D. melanogaster genomes were aligned with \ blastz and converted into axt format using the lavToAxt program.\ The axt alignments were fed into axtChain, which organizes all \ alignments between a single D. persimilis chromosome and a single \ D. melanogaster chromosome into a group and creates a kd-tree out \ of the gapless subsections (blocks) of the alignments. A dynamic program \ was then run over the kd-trees to find the maximally scoring chains of these \ blocks. Chains scoring below a threshold were discarded; the remaining \ chains are displayed in this track.
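The chaining step described above can be illustrated with a simplified sketch. This is not the axtChain implementation: it replaces the kd-tree with a plain quadratic scan, and the `Block` type, `gap_cost` function, and linear per-base penalty are hypothetical stand-ins chosen only to show how a dynamic program selects the best-scoring chain of gapless blocks.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """A gapless aligned block (hypothetical structure for illustration)."""
    t_start: int   # start on the target (D. melanogaster) sequence
    t_end: int     # end on the target, exclusive
    q_start: int   # start on the query (D. persimilis) sequence
    q_end: int     # end on the query, exclusive
    score: float


def gap_cost(a: Block, b: Block, per_base: float = 0.1) -> float:
    """Assumed linear penalty on the gap between blocks a and b.

    The real scoring system differs; this only keeps the sketch runnable.
    """
    return per_base * ((b.t_start - a.t_end) + (b.q_start - a.q_end))


def best_chain(blocks: List[Block]) -> Tuple[float, List[Block]]:
    """Return the score and members of the maximally scoring chain."""
    if not blocks:
        return 0.0, []
    order = sorted(blocks, key=lambda b: (b.t_start, b.q_start))
    best = [b.score for b in order]   # best chain score ending at each block
    prev = [-1] * len(order)          # back-pointers for traceback
    for i, b in enumerate(order):
        for j in range(i):
            a = order[j]
            # A block may follow another only if it lies after it
            # on both the target and the query.
            if a.t_end <= b.t_start and a.q_end <= b.q_start:
                cand = best[j] + b.score - gap_cost(a, b)
                if cand > best[i]:
                    best[i] = cand
                    prev[i] = j
    end = max(range(len(order)), key=best.__getitem__)
    chain = []
    i = end
    while i != -1:
        chain.append(order[i])
        i = prev[i]
    chain.reverse()
    return best[end], chain
```

A kd-tree, as used in the method above, serves to find candidate predecessor blocks without comparing every pair; the quadratic loop here trades that speed for brevity.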
\ \Credits
\\ Blastz was developed at Pennsylvania State University by \ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.
\\ Lineage-specific repeats were identified by Arian Smit and his \ RepeatMasker\ program.
\\ The axtChain program was developed at the University of California at \ Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.
\\ The browser display and database storage of the chains were generated\ by Robert Baertsch and Jim Kent.
\ \References
\\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\
\ \\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ compGeno 1 longLabel D. persimilis (Oct. 2005 (Broad/droPer1)) Chained Alignments\ otherDb droPer1\ parent insectsChainNetViewchain off\ shortLabel D. persimilis Chain\ subGroups view=chain species=s018 clade=c00\ track chainDroPer1\ type chain droPer1\ netDroPer1 D. persimilis Net netAlign droPer1 chainDroPer1 D. persimilis (Oct. 2005 (Broad/droPer1)) Alignment Net 1 38 0 0 0 255 255 0 0 0 0
Description
\\ This track shows the best D. persimilis/D. melanogaster chain for \ every part of the D. melanogaster genome. It is useful for\ finding orthologous regions and for studying genome\ rearrangement. The D. persimilis sequence used in this annotation is \ from the Oct. 2005 (Broad/droPer1) assembly.
\ \Display Conventions and Configuration
\\ In full display mode, the top-level (level 1)\ chains are the largest, highest-scoring chains that\ span this region. In many cases gaps exist in the\ top-level chain. When possible, these are filled in by\ other chains that are displayed at level 2. The gaps in \ level 2 chains may be filled by level 3 chains and so\ forth.
\\ In the graphical display, the boxes represent ungapped \ alignments; the lines represent gaps. Click\ on a box to view detailed information about the chain\ as a whole; click on a line to display information\ about the gap. The detailed information is useful in determining\ the cause of the gap or, for lower level chains, the genomic\ rearrangement.
\\ Individual items in the display are categorized as one of four types\ (other than gap):
\\
\ \- Top - the best, longest match. Displayed on level 1.\
- Syn - a line-up on the same chromosome as the gap in the level above\ it.\
- Inv - a line-up on the same chromosome as the gap above it, but in \ the opposite orientation.\
- NonSyn - a match to a chromosome different from the gap in the \ level above.\
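The four categories above can be expressed as a small decision function. This is a minimal sketch: the function name and its parameters are hypothetical, and it assumes the chromosome and strand of both the item and the gap above it are already known.

```python
def classify_net_item(level: int,
                      item_chrom: str, item_strand: str,
                      gap_chrom: str, gap_strand: str) -> str:
    """Classify a net item as Top, Syn, Inv, or NonSyn.

    `gap_chrom` and `gap_strand` describe the gap in the level above
    (assumed inputs; they are ignored for level 1 items).
    """
    if level == 1:
        return "Top"        # the best, longest match
    if item_chrom != gap_chrom:
        return "NonSyn"     # matches a different chromosome than the gap above
    if item_strand != gap_strand:
        return "Inv"        # same chromosome, opposite orientation
    return "Syn"            # same chromosome, same orientation
```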
Methods
\\ Chains were derived from blastz alignments, using the methods\ described on the chain tracks description pages, and sorted with the \ highest-scoring chains in the genome ranked first. The program\ chainNet was then used to place the chains one at a time, trimming them as \ necessary to fit into sections not already covered by a higher-scoring chain. \ During this process, a natural hierarchy emerged in which a chain that filled \ a gap in a higher-scoring chain was placed underneath that chain. The program \ netSyntenic was used to fill in information about the relationship between \ higher- and lower-level chains, such as whether a lower-level\ chain was syntenic or inverted relative to the higher-level chain. \ The program netClass was then used to fill in how much of the gaps and chains \ contained Ns (sequencing gaps) in one or both species and how much\ was filled with transposons inserted before and after the two organisms \ diverged.
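The placement step can be sketched as a greedy loop over chains sorted by score. This is an illustration under simplifying assumptions, not the chainNet program: chains are reduced to lists of half-open target intervals, the query side and the level hierarchy are ignored, and the names `subtract` and `place_chains` are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the target genome


def subtract(piece: Interval, covered: List[Interval]) -> List[Interval]:
    """Return the parts of `piece` not overlapped by any covered interval."""
    remaining = [piece]
    for c_start, c_end in covered:
        next_remaining = []
        for start, end in remaining:
            if c_end <= start or end <= c_start:
                next_remaining.append((start, end))   # no overlap
                continue
            if start < c_start:
                next_remaining.append((start, c_start))  # left remainder
            if c_end < end:
                next_remaining.append((c_end, end))      # right remainder
        remaining = next_remaining
    return remaining


def place_chains(
    chains: Dict[str, Tuple[float, List[Interval]]]
) -> Dict[str, List[Interval]]:
    """Place chains from highest to lowest score, trimming each one to
    the target sections not already covered by a higher-scoring chain."""
    covered: List[Interval] = []
    placed: Dict[str, List[Interval]] = {}
    for name, (_score, blocks) in sorted(chains.items(),
                                         key=lambda kv: -kv[1][0]):
        kept: List[Interval] = []
        for block in blocks:
            kept.extend(subtract(block, covered))
        if kept:
            placed[name] = kept
            covered.extend(kept)
    return placed
```

In this toy model a lower-scoring chain survives only where it falls into a gap between the blocks of higher-scoring chains, which is how the hierarchy of levels arises.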
\ \Credits
\\ The chainNet, netSyntenic, and netClass programs were\ developed at the University of California\ Santa Cruz by Jim Kent.
\\ Blastz was developed at Pennsylvania State University by\ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.
\\ Lineage-specific repeats were identified by Arian Smit and his program \ RepeatMasker.
\\ The browser display and database storage of the nets were made\ by Robert Baertsch and Jim Kent.
\ \References
\\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ compGeno 0 longLabel D. persimilis (Oct. 2005 (Broad/droPer1)) Alignment Net\ otherDb droPer1\ parent insectsChainNetViewnet off\ shortLabel D. persimilis Net\ subGroups view=net species=s018 clade=c00\ track netDroPer1\ type netAlign droPer1 chainDroPer1\ chainD_subobscura D_subobscura Chain chain D_subobscura D_subobscura (D_subobscura) Chained Alignments 3 39 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_subobscura (D_subobscura) Chained Alignments\ otherDb D_subobscura\ parent insectsChainNetViewchain off\ shortLabel D_subobscura Chain\ subGroups view=chain species=s019 clade=c00\ track chainD_subobscura\ type chain D_subobscura\ chainD_athabasca D_athabasca Chain chain D_athabasca D_athabasca (D_athabasca) Chained Alignments 3 40 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_athabasca (D_athabasca) Chained Alignments\ otherDb D_athabasca\ parent insectsChainNetViewchain off\ shortLabel D_athabasca Chain\ subGroups view=chain species=s020 clade=c00\ track chainD_athabasca\ type chain D_athabasca\ chainDroVir3 droVir3 Chain chain droVir3 D. virilis (Feb. 2006 (Agencourt CAF1/droVir3)) Chained Alignments 3 41 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. virilis (Feb. 2006 (Agencourt CAF1/droVir3)) Chained Alignments\ otherDb droVir3\ parent insectsChainNetViewchain off\ shortLabel droVir3 Chain\ subGroups view=chain species=s021 clade=c00\ track chainDroVir3\ type chain droVir3\ netDroVir3 droVir3 Net netAlign droVir3 chainDroVir3 D. virilis (Feb. 2006 (Agencourt CAF1/droVir3)) Alignment Net 1 42 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. virilis (Feb. 2006 (Agencourt CAF1/droVir3)) Alignment Net\ otherDb droVir3\ parent insectsChainNetViewnet off\ shortLabel droVir3 Net\ subGroups view=net species=s021 clade=c00\ track netDroVir3\ type netAlign droVir3 chainDroVir3\ chainDroWil2 droWil2 Chain chain droWil2 D. willistoni (03 Aug 2006 (dwil_caf1/droWil2)) Chained Alignments 3 43 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. 
willistoni (03 Aug 2006 (dwil_caf1/droWil2)) Chained Alignments\ otherDb droWil2\ parent insectsChainNetViewchain off\ shortLabel droWil2 Chain\ subGroups view=chain species=s022 clade=c00\ track chainDroWil2\ type chain droWil2\ netDroWil2 droWil2 Net netAlign droWil2 chainDroWil2 D. willistoni (03 Aug 2006 (dwil_caf1/droWil2)) Alignment Net 1 44 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. willistoni (03 Aug 2006 (dwil_caf1/droWil2)) Alignment Net\ otherDb droWil2\ parent insectsChainNetViewnet off\ shortLabel droWil2 Net\ subGroups view=net species=s022 clade=c00\ track netDroWil2\ type netAlign droWil2 chainDroWil2\ chainDroGri2 droGri2 Chain chain droGri2 D. grimshawi (Feb. 2006 (Agencourt CAF1/droGri2)) Chained Alignments 3 45 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. grimshawi (Feb. 2006 (Agencourt CAF1/droGri2)) Chained Alignments\ otherDb droGri2\ parent insectsChainNetViewchain off\ shortLabel droGri2 Chain\ subGroups view=chain species=s023 clade=c00\ track chainDroGri2\ type chain droGri2\ netDroGri2 droGri2 Net netAlign droGri2 chainDroGri2 D. grimshawi (Feb. 2006 (Agencourt CAF1/droGri2)) Alignment Net 1 46 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. grimshawi (Feb. 2006 (Agencourt CAF1/droGri2)) Alignment Net\ otherDb droGri2\ parent insectsChainNetViewnet off\ shortLabel droGri2 Net\ subGroups view=net species=s023 clade=c00\ track netDroGri2\ type netAlign droGri2 chainDroGri2\ chainDroMoj3 droMoj3 Chain chain droMoj3 D. mojavensis (Feb. 2006 (Agencourt CAF1/droMoj3)) Chained Alignments 3 47 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. mojavensis (Feb. 2006 (Agencourt CAF1/droMoj3)) Chained Alignments\ otherDb droMoj3\ parent insectsChainNetViewchain off\ shortLabel droMoj3 Chain\ subGroups view=chain species=s024 clade=c00\ track chainDroMoj3\ type chain droMoj3\ netDroMoj3 droMoj3 Net netAlign droMoj3 chainDroMoj3 D. mojavensis (Feb. 2006 (Agencourt CAF1/droMoj3)) Alignment Net 1 48 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. 
mojavensis (Feb. 2006 (Agencourt CAF1/droMoj3)) Alignment Net\ otherDb droMoj3\ parent insectsChainNetViewnet off\ shortLabel droMoj3 Net\ subGroups view=net species=s024 clade=c00\ track netDroMoj3\ type netAlign droMoj3 chainDroMoj3\ chainD_pseudoobscura_1 D_pseudoobscura_1 Chain chain D_pseudoobscura_1 D_pseudoobscura_1 (D_pseudoobscura_1) Chained Alignments 3 49 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_pseudoobscura_1 (D_pseudoobscura_1) Chained Alignments\ otherDb D_pseudoobscura_1\ parent insectsChainNetViewchain off\ shortLabel D_pseudoobscura_1 Chain\ subGroups view=chain species=s025 clade=c00\ track chainD_pseudoobscura_1\ type chain D_pseudoobscura_1\ chainD_novamexicana D_novamexicana Chain chain D_novamexicana D_novamexicana (D_novamexicana) Chained Alignments 3 50 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_novamexicana (D_novamexicana) Chained Alignments\ otherDb D_novamexicana\ parent insectsChainNetViewchain off\ shortLabel D_novamexicana Chain\ subGroups view=chain species=s026 clade=c00\ track chainD_novamexicana\ type chain D_novamexicana\ chainD_hydei D_hydei Chain chain D_hydei D_hydei (D_hydei) Chained Alignments 3 51 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_hydei (D_hydei) Chained Alignments\ otherDb D_hydei\ parent insectsChainNetViewchain off\ shortLabel D_hydei Chain\ subGroups view=chain species=s027 clade=c00\ track chainD_hydei\ type chain D_hydei\ chainD_americana D_americana Chain chain D_americana D_americana (D_americana) Chained Alignments 3 52 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_americana (D_americana) Chained Alignments\ otherDb D_americana\ parent insectsChainNetViewchain off\ shortLabel D_americana Chain\ subGroups view=chain species=s028 clade=c00\ track chainD_americana\ type chain D_americana\ chainD_montana D_montana Chain chain D_montana D_montana (D_montana) Chained Alignments 3 53 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_montana (D_montana) Chained Alignments\ otherDb D_montana\ parent 
insectsChainNetViewchain off\ shortLabel D_montana Chain\ subGroups view=chain species=s029 clade=c00\ track chainD_montana\ type chain D_montana\ chainDroAlb1 droAlb1 Chain chain droAlb1 D. albomicans (21 May 2012 (DroAlb_1.0/droAlb1)) Chained Alignments 3 54 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D. albomicans (21 May 2012 (DroAlb_1.0/droAlb1)) Chained Alignments\ otherDb droAlb1\ parent insectsChainNetViewchain off\ shortLabel droAlb1 Chain\ subGroups view=chain species=s030 clade=c00\ track chainDroAlb1\ type chain droAlb1\ netDroAlb1 droAlb1 Net netAlign droAlb1 chainDroAlb1 D. albomicans (21 May 2012 (DroAlb_1.0/droAlb1)) Alignment Net 1 55 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel D. albomicans (21 May 2012 (DroAlb_1.0/droAlb1)) Alignment Net\ otherDb droAlb1\ parent insectsChainNetViewnet off\ shortLabel droAlb1 Net\ subGroups view=net species=s030 clade=c00\ track netDroAlb1\ type netAlign droAlb1 chainDroAlb1\ chainScaptodrosophila_lebanonensis Scaptodrosophila_lebanonensis Chain chain Scaptodrosophila_lebanonensis Scaptodrosophila_lebanonensis (Scaptodrosophila_lebanonensis) Chained Alignments 3 56 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Scaptodrosophila_lebanonensis (Scaptodrosophila_lebanonensis) Chained Alignments\ otherDb Scaptodrosophila_lebanonensis\ parent insectsChainNetViewchain off\ shortLabel Scaptodrosophila_lebanonensis Chain\ subGroups view=chain species=s031 clade=c00\ track chainScaptodrosophila_lebanonensis\ type chain Scaptodrosophila_lebanonensis\ intronEst Spliced ESTs psl est D. melanogaster ESTs That Have Been Spliced 1 56 0 0 0 127 127 127 1 0 0Description
\\ This track shows alignments between D. melanogaster expressed sequence tags\ (ESTs) in GenBank and the genome that show signs of splicing when\ aligned against the genome. ESTs are single-read sequences, typically about \ 500 bases in length, that usually represent fragments of transcribed genes.\
\\ To be considered spliced, an EST must show \ evidence of at least one canonical intron, i.e. one that is at least\ 32 bases in length and has GT/AG ends. By requiring splicing, the level \ of contamination in the EST databases is drastically reduced\ at the expense of eliminating many genuine 3' ESTs.\ For a display of all ESTs (including unspliced), see the \ D. melanogaster EST track.
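The splicing criterion can be sketched as a check over the gaps between consecutive aligned blocks. This is a minimal illustration with assumed inputs: `genome` is a plain string, `blocks` is a sorted list of half-open genomic intervals, and only GT/AG on the given strand is tested (a minus-strand transcript would need the reverse complement, which is omitted here).

```python
from typing import List, Tuple

MIN_INTRON = 32  # minimum intron length stated in the text above


def has_canonical_intron(genome: str,
                         blocks: List[Tuple[int, int]],
                         min_len: int = MIN_INTRON) -> bool:
    """Return True if any gap between consecutive aligned blocks is at
    least `min_len` bases long and has GT/AG ends."""
    for (_, prev_end), (next_start, _) in zip(blocks, blocks[1:]):
        gap = genome[prev_end:next_start].upper()
        if len(gap) >= min_len and gap.startswith("GT") and gap.endswith("AG"):
            return True
    return False
```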
\ \Display Conventions and Configuration
\\ This track follows the display conventions for \ PSL alignment tracks. In dense display mode, darker shading\ indicates a larger number of aligned ESTs.
\\ The strand information (+/-) indicates the\ direction of the match between the EST and the matching\ genomic sequence. It bears no relationship to the direction\ of transcription of the RNA with which it might be associated.
\\ The description page for this track has a filter that can be used to change \ the display mode, alter the color, and include/exclude a subset of items \ within the track. This may be helpful when many items are shown in the track \ display, especially when only some are relevant to the current task.
\\ To use the filter:\
\
\- Type a term in one or more of the text boxes to filter the EST\ display. For example, to apply the filter to all ESTs expressed in a specific\ organ, type the name of the organ in the tissue box. To view the list of \ valid terms for each text box, consult the table in the Table Browser that \ corresponds to the factor on which you wish to filter. For example, the \ "tissue" table contains all the types of tissues that can be \ entered into the tissue text box. Wildcards may also be used in the\ filter.\
- If filtering on more than one value, choose the desired combination\ logic. If "and" is selected, only ESTs that match all filter \ criteria will be highlighted. If "or" is selected, ESTs that \ match any one of the filter criteria will be highlighted.\
- Choose the color or display characteristic that should be used to \ highlight or include/exclude the filtered items. If "exclude" is \ chosen, the browser will not display ESTs that match the filter criteria. \ If "include" is selected, the browser will display only those \ ESTs that match the filter criteria.\
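The combination logic in the steps above can be sketched as follows. This is not the browser's code: the record layout (a dict per EST), the function names, and the use of shell-style wildcards via `fnmatch` are assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def matches(est: Dict[str, str], criteria: Dict[str, str],
            logic: str = "and") -> bool:
    """Test one EST record against field -> wildcard-pattern criteria."""
    hits = [
        fnmatchcase(est.get(field, "").lower(), pattern.lower())
        for field, pattern in criteria.items()
    ]
    # "and": every criterion must match; "or": any single match suffices.
    return all(hits) if logic == "and" else any(hits)


def apply_filter(ests: List[Dict[str, str]], criteria: Dict[str, str],
                 logic: str = "and", mode: str = "include") -> List[Dict[str, str]]:
    """Keep only matching ESTs ("include") or drop them ("exclude")."""
    if mode == "include":
        return [e for e in ests if matches(e, criteria, logic)]
    if mode == "exclude":
        return [e for e in ests if not matches(e, criteria, logic)]
    raise ValueError(f"unknown mode: {mode}")
```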
\ This track may also be configured to display base labeling, a feature that\ allows the user to display all bases in the aligning sequence or only those \ that differ from the genomic sequence.\
\ \Methods
\\ To make an EST, RNA is isolated from cells and reverse\ transcribed into cDNA. Typically, the cDNA is cloned\ into a plasmid vector and a read is taken from the 5'\ and/or 3' primer. For most — but not all — ESTs, the\ reverse transcription is primed by an oligo-dT, which\ hybridizes with the poly-A tail of mature mRNA. The\ reverse transcriptase may or may not make it to the 5'\ end of the mRNA, which may or may not be degraded.
\\ In general, the 3' ESTs mark the end of transcription\ reasonably well, but the 5' ESTs may end at any point\ within the transcript. Some of the newer cap-selected\ libraries cover transcription start reasonably well. Before the \ cap-selection techniques\ emerged, some projects used random rather than poly-A\ priming in an attempt to retrieve sequence distant from the\ 3' end. These projects were successful at this, but as\ a side effect also deposited sequences from unprocessed\ mRNA and perhaps even genomic sequences into the EST databases.\ Even outside of the random-primed projects, there is a\ degree of non-mRNA contamination. Because of this, a\ single unspliced EST should be viewed with considerable\ skepticism.
\\ To generate this track, D. melanogaster ESTs from GenBank were aligned \ against the genome using blat. Note that the maximum intron length\ allowed by blat is 750,000 bases, which may eliminate some ESTs with very \ long introns that might otherwise align. When a single \ EST aligned in multiple places, the alignment having the \ highest base identity was identified. Only alignments having\ a base identity level within 0.5% of the best and at least 96% base identity \ with the genomic sequence are displayed in this track.
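The display rule in the last two sentences can be written as a short filter. This sketch assumes each alignment is a dict with an `identity` field in percent, and it reads "within 0.5% of the best" as 0.5 percentage points; both are assumptions, and the function name is hypothetical.

```python
from typing import Dict, List


def select_alignments(alignments: List[Dict[str, float]],
                      min_identity: float = 96.0,
                      near_best: float = 0.5) -> List[Dict[str, float]]:
    """Keep the alignments of one EST that are within `near_best`
    percentage points of its best identity and at least `min_identity`."""
    if not alignments:
        return []
    best = max(a["identity"] for a in alignments)
    return [
        a for a in alignments
        if a["identity"] >= min_identity and a["identity"] >= best - near_best
    ]
```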
\ \Credits
\\ This track was produced at UCSC from EST sequence data\ submitted to the international public sequence databases by \ scientists worldwide.
\ \References
\\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. \ GenBank: update.\ Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6.\ PMID: 14681350; PMC: PMC308779\
\ \\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518\
\ rna 1 baseColorUseSequence genbank\ group rna\ indelDoubleInsert on\ indelQueryInsert on\ intronGap 30\ longLabel D. melanogaster ESTs That Have Been Spliced\ priority 56\ shortLabel Spliced ESTs\ showDiffBasesAllScales .\ spectrum on\ track intronEst\ type psl est\ visibility dense\ chainD_busckii D_busckii Chain chain D_busckii D_busckii (D_busckii) Chained Alignments 3 57 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_busckii (D_busckii) Chained Alignments\ otherDb D_busckii\ parent insectsChainNetViewchain off\ shortLabel D_busckii Chain\ subGroups view=chain species=s032 clade=c00\ track chainD_busckii\ type chain D_busckii\ chainD_arizonae D_arizonae Chain chain D_arizonae D_arizonae (D_arizonae) Chained Alignments 3 58 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_arizonae (D_arizonae) Chained Alignments\ otherDb D_arizonae\ parent insectsChainNetViewchain off\ shortLabel D_arizonae Chain\ subGroups view=chain species=s033 clade=c00\ track chainD_arizonae\ type chain D_arizonae\ chainD_nasuta D_nasuta Chain chain D_nasuta D_nasuta (D_nasuta) Chained Alignments 3 59 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_nasuta (D_nasuta) Chained Alignments\ otherDb D_nasuta\ parent insectsChainNetViewchain off\ shortLabel D_nasuta Chain\ subGroups view=chain species=s034 clade=c00\ track chainD_nasuta\ type chain D_nasuta\ chainZaprionus_indianus Zaprionus_indianus Chain chain Zaprionus_indianus Zaprionus_indianus (Zaprionus_indianus) Chained Alignments 3 60 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Zaprionus_indianus (Zaprionus_indianus) Chained Alignments\ otherDb Zaprionus_indianus\ parent insectsChainNetViewchain off\ shortLabel Zaprionus_indianus Chain\ subGroups view=chain species=s035 clade=c00\ track chainZaprionus_indianus\ type chain Zaprionus_indianus\ chainD_navojoa D_navojoa Chain chain D_navojoa D_navojoa (D_navojoa) Chained Alignments 3 61 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel D_navojoa (D_navojoa) Chained Alignments\ otherDb D_navojoa\ parent 
insectsChainNetViewchain off\ shortLabel D_navojoa Chain\ subGroups view=chain species=s036 clade=c00\ track chainD_navojoa\ type chain D_navojoa\ chainPhortica_variegata Phortica_variegata Chain chain Phortica_variegata Phortica_variegata (Phortica_variegata) Chained Alignments 3 62 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Phortica_variegata (Phortica_variegata) Chained Alignments\ otherDb Phortica_variegata\ parent insectsChainNetViewchain off\ shortLabel Phortica_variegata Chain\ subGroups view=chain species=s037 clade=c00\ track chainPhortica_variegata\ type chain Phortica_variegata\ chainTeleopsis_dalmanni Teleopsis_dalmanni Chain chain Teleopsis_dalmanni Teleopsis_dalmanni (Teleopsis_dalmanni) Chained Alignments 3 63 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Teleopsis_dalmanni (Teleopsis_dalmanni) Chained Alignments\ otherDb Teleopsis_dalmanni\ parent insectsChainNetViewchain off\ shortLabel Teleopsis_dalmanni Chain\ subGroups view=chain species=s038 clade=c00\ track chainTeleopsis_dalmanni\ type chain Teleopsis_dalmanni\ chainRhagoletis_zephyria Rhagoletis_zephyria Chain chain Rhagoletis_zephyria Rhagoletis_zephyria (Rhagoletis_zephyria) Chained Alignments 3 64 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Rhagoletis_zephyria (Rhagoletis_zephyria) Chained Alignments\ otherDb Rhagoletis_zephyria\ parent insectsChainNetViewchain off\ shortLabel Rhagoletis_zephyria Chain\ subGroups view=chain species=s039 clade=c00\ track chainRhagoletis_zephyria\ type chain Rhagoletis_zephyria\ chainLucilia_cuprina Lucilia_cuprina Chain chain Lucilia_cuprina Lucilia_cuprina (Lucilia_cuprina) Chained Alignments 3 65 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Lucilia_cuprina (Lucilia_cuprina) Chained Alignments\ otherDb Lucilia_cuprina\ parent insectsChainNetViewchain off\ shortLabel Lucilia_cuprina Chain\ subGroups view=chain species=s040 clade=c00\ track chainLucilia_cuprina\ type chain Lucilia_cuprina\ chainBactrocera_latifrons Bactrocera_latifrons Chain chain 
Bactrocera_latifrons Bactrocera_latifrons (Bactrocera_latifrons) Chained Alignments 3 66 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Bactrocera_latifrons (Bactrocera_latifrons) Chained Alignments\ otherDb Bactrocera_latifrons\ parent insectsChainNetViewchain off\ shortLabel Bactrocera_latifrons Chain\ subGroups view=chain species=s041 clade=c00\ track chainBactrocera_latifrons\ type chain Bactrocera_latifrons\ chainBactrocera_oleae Bactrocera_oleae Chain chain Bactrocera_oleae Bactrocera_oleae (Bactrocera_oleae) Chained Alignments 3 67 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Bactrocera_oleae (Bactrocera_oleae) Chained Alignments\ otherDb Bactrocera_oleae\ parent insectsChainNetViewchain off\ shortLabel Bactrocera_oleae Chain\ subGroups view=chain species=s042 clade=c00\ track chainBactrocera_oleae\ type chain Bactrocera_oleae\ chainZeugodacus_cucurbitae Zeugodacus_cucurbitae Chain chain Zeugodacus_cucurbitae Zeugodacus_cucurbitae (Zeugodacus_cucurbitae) Chained Alignments 3 68 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Zeugodacus_cucurbitae (Zeugodacus_cucurbitae) Chained Alignments\ otherDb Zeugodacus_cucurbitae\ parent insectsChainNetViewchain off\ shortLabel Zeugodacus_cucurbitae Chain\ subGroups view=chain species=s043 clade=c00\ track chainZeugodacus_cucurbitae\ type chain Zeugodacus_cucurbitae\ chainPhormia_regina Phormia_regina Chain chain Phormia_regina Phormia_regina (Phormia_regina) Chained Alignments 3 69 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Phormia_regina (Phormia_regina) Chained Alignments\ otherDb Phormia_regina\ parent insectsChainNetViewchain off\ shortLabel Phormia_regina Chain\ subGroups view=chain species=s044 clade=c00\ track chainPhormia_regina\ type chain Phormia_regina\ chainCeratitis_capitata Ceratitis_capitata Chain chain Ceratitis_capitata Ceratitis_capitata (Ceratitis_capitata) Chained Alignments 3 70 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Ceratitis_capitata (Ceratitis_capitata) Chained Alignments\ otherDb Ceratitis_capitata\ 
parent insectsChainNetViewchain off\ shortLabel Ceratitis_capitata Chain\ subGroups view=chain species=s045 clade=c00\ track chainCeratitis_capitata\ type chain Ceratitis_capitata\ chainPaykullia_maculata Paykullia_maculata Chain chain Paykullia_maculata Paykullia_maculata (Paykullia_maculata) Chained Alignments 3 71 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Paykullia_maculata (Paykullia_maculata) Chained Alignments\ otherDb Paykullia_maculata\ parent insectsChainNetViewchain off\ shortLabel Paykullia_maculata Chain\ subGroups view=chain species=s046 clade=c00\ track chainPaykullia_maculata\ type chain Paykullia_maculata\ chainBactrocera_tryoni Bactrocera_tryoni Chain chain Bactrocera_tryoni Bactrocera_tryoni (Bactrocera_tryoni) Chained Alignments 3 72 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Bactrocera_tryoni (Bactrocera_tryoni) Chained Alignments\ otherDb Bactrocera_tryoni\ parent insectsChainNetViewchain off\ shortLabel Bactrocera_tryoni Chain\ subGroups view=chain species=s047 clade=c00\ track chainBactrocera_tryoni\ type chain Bactrocera_tryoni\ chainMusDom2 musDom2 Chain chain musDom2 M. domestica (22 Apr 2013 (Musca_domestica-2.0.2/musDom2)) Chained Alignments 3 73 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel M. domestica (22 Apr 2013 (Musca_domestica-2.0.2/musDom2)) Chained Alignments\ otherDb musDom2\ parent insectsChainNetViewchain off\ shortLabel musDom2 Chain\ subGroups view=chain species=s048 clade=c00\ track chainMusDom2\ type chain musDom2\ netMusDom2 musDom2 Net netAlign musDom2 chainMusDom2 M. domestica (22 Apr 2013 (Musca_domestica-2.0.2/musDom2)) Alignment Net 1 74 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel M. 
domestica (22 Apr 2013 (Musca_domestica-2.0.2/musDom2)) Alignment Net\ otherDb musDom2\ parent insectsChainNetViewnet off\ shortLabel musDom2 Net\ subGroups view=net species=s048 clade=c00\ track netMusDom2\ type netAlign musDom2 chainMusDom2\ chainBactrocera_dorsalis Bactrocera_dorsalis Chain chain Bactrocera_dorsalis Bactrocera_dorsalis (Bactrocera_dorsalis) Chained Alignments 3 75 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Bactrocera_dorsalis (Bactrocera_dorsalis) Chained Alignments\ otherDb Bactrocera_dorsalis\ parent insectsChainNetViewchain off\ shortLabel Bactrocera_dorsalis Chain\ subGroups view=chain species=s049 clade=c00\ track chainBactrocera_dorsalis\ type chain Bactrocera_dorsalis\ chainStomoxys_calcitrans Stomoxys_calcitrans Chain chain Stomoxys_calcitrans Stomoxys_calcitrans (Stomoxys_calcitrans) Chained Alignments 3 76 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Stomoxys_calcitrans (Stomoxys_calcitrans) Chained Alignments\ otherDb Stomoxys_calcitrans\ parent insectsChainNetViewchain off\ shortLabel Stomoxys_calcitrans Chain\ subGroups view=chain species=s050 clade=c00\ track chainStomoxys_calcitrans\ type chain Stomoxys_calcitrans\ chainGlossina_pallidipes Glossina_pallidipes Chain chain Glossina_pallidipes Glossina_pallidipes (Glossina_pallidipes) Chained Alignments 3 77 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Glossina_pallidipes (Glossina_pallidipes) Chained Alignments\ otherDb Glossina_pallidipes\ parent insectsChainNetViewchain off\ shortLabel Glossina_pallidipes Chain\ subGroups view=chain species=s051 clade=c00\ track chainGlossina_pallidipes\ type chain Glossina_pallidipes\ chainGlossina_fuscipes Glossina_fuscipes Chain chain Glossina_fuscipes Glossina_fuscipes (Glossina_fuscipes) Chained Alignments 3 78 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Glossina_fuscipes (Glossina_fuscipes) Chained Alignments\ otherDb Glossina_fuscipes\ parent insectsChainNetViewchain off\ shortLabel Glossina_fuscipes Chain\ subGroups view=chain species=s052 
clade=c00\ track chainGlossina_fuscipes\ type chain Glossina_fuscipes\ chainGlossina_brevipalpis Glossina_brevipalpis Chain chain Glossina_brevipalpis Glossina_brevipalpis (Glossina_brevipalpis) Chained Alignments 3 79 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Glossina_brevipalpis (Glossina_brevipalpis) Chained Alignments\ otherDb Glossina_brevipalpis\ parent insectsChainNetViewchain off\ shortLabel Glossina_brevipalpis Chain\ subGroups view=chain species=s053 clade=c00\ track chainGlossina_brevipalpis\ type chain Glossina_brevipalpis\ chainGlossina_morsitans_2 Glossina_morsitans_2 Chain chain Glossina_morsitans_2 Glossina_morsitans_2 (Glossina_morsitans_2) Chained Alignments 3 80 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Glossina_morsitans_2 (Glossina_morsitans_2) Chained Alignments\ otherDb Glossina_morsitans_2\ parent insectsChainNetViewchain off\ shortLabel Glossina_morsitans_2 Chain\ subGroups view=chain species=s054 clade=c00\ track chainGlossina_morsitans_2\ type chain Glossina_morsitans_2\ chainGlossina_austeni Glossina_austeni Chain chain Glossina_austeni Glossina_austeni (Glossina_austeni) Chained Alignments 3 81 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Glossina_austeni (Glossina_austeni) Chained Alignments\ otherDb Glossina_austeni\ parent insectsChainNetViewchain off\ shortLabel Glossina_austeni Chain\ subGroups view=chain species=s055 clade=c00\ track chainGlossina_austeni\ type chain Glossina_austeni\ chainGlossina_palpalis_gambiensis Glossina_palpalis_gambiensis Chain chain Glossina_palpalis_gambiensis Glossina_palpalis_gambiensis (Glossina_palpalis_gambiensis) Chained Alignments 3 82 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Glossina_palpalis_gambiensis (Glossina_palpalis_gambiensis) Chained Alignments\ otherDb Glossina_palpalis_gambiensis\ parent insectsChainNetViewchain off\ shortLabel Glossina_palpalis_gambiensis Chain\ subGroups view=chain species=s056 clade=c00\ track chainGlossina_palpalis_gambiensis\ type chain 
Glossina_palpalis_gambiensis\ chainEphydra_gracilis Ephydra_gracilis Chain chain Ephydra_gracilis Ephydra_gracilis (Ephydra_gracilis) Chained Alignments 3 83 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Ephydra_gracilis (Ephydra_gracilis) Chained Alignments\ otherDb Ephydra_gracilis\ parent insectsChainNetViewchain off\ shortLabel Ephydra_gracilis Chain\ subGroups view=chain species=s057 clade=c00\ track chainEphydra_gracilis\ type chain Ephydra_gracilis\ chainLucilia_sericata Lucilia_sericata Chain chain Lucilia_sericata Lucilia_sericata (Lucilia_sericata) Chained Alignments 3 84 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Lucilia_sericata (Lucilia_sericata) Chained Alignments\ otherDb Lucilia_sericata\ parent insectsChainNetViewchain off\ shortLabel Lucilia_sericata Chain\ subGroups view=chain species=s058 clade=c00\ track chainLucilia_sericata\ type chain Lucilia_sericata\ chainGlossina_morsitans_1 Glossina_morsitans_1 Chain chain Glossina_morsitans_1 Glossina_morsitans_1 (Glossina_morsitans_1) Chained Alignments 3 85 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Glossina_morsitans_1 (Glossina_morsitans_1) Chained Alignments\ otherDb Glossina_morsitans_1\ parent insectsChainNetViewchain off\ shortLabel Glossina_morsitans_1 Chain\ subGroups view=chain species=s059 clade=c00\ track chainGlossina_morsitans_1\ type chain Glossina_morsitans_1\ chainCalliphora_vicina Calliphora_vicina Chain chain Calliphora_vicina Calliphora_vicina (Calliphora_vicina) Chained Alignments 3 86 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Calliphora_vicina (Calliphora_vicina) Chained Alignments\ otherDb Calliphora_vicina\ parent insectsChainNetViewchain off\ shortLabel Calliphora_vicina Chain\ subGroups view=chain species=s060 clade=c00\ track chainCalliphora_vicina\ type chain Calliphora_vicina\ chainSphyracephala_brevicornis Sphyracephala_brevicornis Chain chain Sphyracephala_brevicornis Sphyracephala_brevicornis (Sphyracephala_brevicornis) Chained Alignments 3 87 0 0 0 255 255 0 1 0 0 
compGeno 1 longLabel Sphyracephala_brevicornis (Sphyracephala_brevicornis) Chained Alignments\ otherDb Sphyracephala_brevicornis\ parent insectsChainNetViewchain off\ shortLabel Sphyracephala_brevicornis Chain\ subGroups view=chain species=s061 clade=c00\ track chainSphyracephala_brevicornis\ type chain Sphyracephala_brevicornis\ chainProctacanthus_coquilletti Proctacanthus_coquilletti Chain chain Proctacanthus_coquilletti Proctacanthus_coquilletti (Proctacanthus_coquilletti) Chained Alignments 3 88 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Proctacanthus_coquilletti (Proctacanthus_coquilletti) Chained Alignments\ otherDb Proctacanthus_coquilletti\ parent insectsChainNetViewchain off\ shortLabel Proctacanthus_coquilletti Chain\ subGroups view=chain species=s062 clade=c00\ track chainProctacanthus_coquilletti\ type chain Proctacanthus_coquilletti\ chainHaematobia_irritans Haematobia_irritans Chain chain Haematobia_irritans Haematobia_irritans (Haematobia_irritans) Chained Alignments 3 89 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Haematobia_irritans (Haematobia_irritans) Chained Alignments\ otherDb Haematobia_irritans\ parent insectsChainNetViewchain off\ shortLabel Haematobia_irritans Chain\ subGroups view=chain species=s063 clade=c00\ track chainHaematobia_irritans\ type chain Haematobia_irritans\ chainThemira_minor Themira_minor Chain chain Themira_minor Themira_minor (Themira_minor) Chained Alignments 3 90 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Themira_minor (Themira_minor) Chained Alignments\ otherDb Themira_minor\ parent insectsChainNetViewchain off\ shortLabel Themira_minor Chain\ subGroups view=chain species=s064 clade=c00\ track chainThemira_minor\ type chain Themira_minor\ chainMegaselia_abdita Megaselia_abdita Chain chain Megaselia_abdita Megaselia_abdita (Megaselia_abdita) Chained Alignments 3 91 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Megaselia_abdita (Megaselia_abdita) Chained Alignments\ otherDb Megaselia_abdita\ parent insectsChainNetViewchain 
off\ shortLabel Megaselia_abdita Chain\ subGroups view=chain species=s065 clade=c00\ track chainMegaselia_abdita\ type chain Megaselia_abdita\ chainTephritis_californica Tephritis_californica Chain chain Tephritis_californica Tephritis_californica (Tephritis_californica) Chained Alignments 3 92 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Tephritis_californica (Tephritis_californica) Chained Alignments\ otherDb Tephritis_californica\ parent insectsChainNetViewchain off\ shortLabel Tephritis_californica Chain\ subGroups view=chain species=s066 clade=c00\ track chainTephritis_californica\ type chain Tephritis_californica\ chainCirrula_hians Cirrula_hians Chain chain Cirrula_hians Cirrula_hians (Cirrula_hians) Chained Alignments 3 93 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Cirrula_hians (Cirrula_hians) Chained Alignments\ otherDb Cirrula_hians\ parent insectsChainNetViewchain off\ shortLabel Cirrula_hians Chain\ subGroups view=chain species=s067 clade=c00\ track chainCirrula_hians\ type chain Cirrula_hians\ chainHermetia_illucens Hermetia_illucens Chain chain Hermetia_illucens Hermetia_illucens (Hermetia_illucens) Chained Alignments 3 94 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Hermetia_illucens (Hermetia_illucens) Chained Alignments\ otherDb Hermetia_illucens\ parent insectsChainNetViewchain off\ shortLabel Hermetia_illucens Chain\ subGroups view=chain species=s068 clade=c00\ track chainHermetia_illucens\ type chain Hermetia_illucens\ chainNeobellieria_bullata Neobellieria_bullata Chain chain Neobellieria_bullata Neobellieria_bullata (Neobellieria_bullata) Chained Alignments 3 95 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Neobellieria_bullata (Neobellieria_bullata) Chained Alignments\ otherDb Neobellieria_bullata\ parent insectsChainNetViewchain off\ shortLabel Neobellieria_bullata Chain\ subGroups view=chain species=s069 clade=c00\ track chainNeobellieria_bullata\ type chain Neobellieria_bullata\ chainEutreta_diana Eutreta_diana Chain chain Eutreta_diana 
Eutreta_diana (Eutreta_diana) Chained Alignments 3 96 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Eutreta_diana (Eutreta_diana) Chained Alignments\ otherDb Eutreta_diana\ parent insectsChainNetViewchain off\ shortLabel Eutreta_diana Chain\ subGroups view=chain species=s070 clade=c00\ track chainEutreta_diana\ type chain Eutreta_diana\ chainHolcocephala_fusca Holcocephala_fusca Chain chain Holcocephala_fusca Holcocephala_fusca (Holcocephala_fusca) Chained Alignments 3 97 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Holcocephala_fusca (Holcocephala_fusca) Chained Alignments\ otherDb Holcocephala_fusca\ parent insectsChainNetViewchain off\ shortLabel Holcocephala_fusca Chain\ subGroups view=chain species=s071 clade=c00\ track chainHolcocephala_fusca\ type chain Holcocephala_fusca\ chainSarcophagidae_BV_2014 Sarcophagidae_BV_2014 Chain chain Sarcophagidae_BV_2014 Sarcophagidae_BV_2014 (Sarcophagidae_BV_2014) Chained Alignments 3 98 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Sarcophagidae_BV_2014 (Sarcophagidae_BV_2014) Chained Alignments\ otherDb Sarcophagidae_BV_2014\ parent insectsChainNetViewchain off\ shortLabel Sarcophagidae_BV_2014 Chain\ subGroups view=chain species=s072 clade=c00\ track chainSarcophagidae_BV_2014\ type chain Sarcophagidae_BV_2014\ chainLiriomyza_trifolii Liriomyza_trifolii Chain chain Liriomyza_trifolii Liriomyza_trifolii (Liriomyza_trifolii) Chained Alignments 3 99 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Liriomyza_trifolii (Liriomyza_trifolii) Chained Alignments\ otherDb Liriomyza_trifolii\ parent insectsChainNetViewchain off\ shortLabel Liriomyza_trifolii Chain\ subGroups view=chain species=s073 clade=c00\ track chainLiriomyza_trifolii\ type chain Liriomyza_trifolii\ chainEristalis_dimidiata Eristalis_dimidiata Chain chain Eristalis_dimidiata Eristalis_dimidiata (Eristalis_dimidiata) Chained Alignments 3 100 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Eristalis_dimidiata (Eristalis_dimidiata) Chained Alignments\ otherDb Eristalis_dimidiata\ 
parent insectsChainNetViewchain off\ shortLabel Eristalis_dimidiata Chain\ subGroups view=chain species=s074 clade=c00\ track chainEristalis_dimidiata\ type chain Eristalis_dimidiata\ est D. melanogaster ESTs psl est D. melanogaster ESTs Including Unspliced 0 100 0 0 0 127 127 127 1 0 0Description
\\ This track shows alignments between D. melanogaster expressed sequence tags\ (ESTs) in GenBank and the genome. ESTs are single-read sequences, \ typically about 500 bases in length, that usually represent fragments of \ transcribed genes.
\ \Display Conventions and Configuration
\\ This track follows the display conventions for \ PSL alignment tracks. In dense display mode, the items that\ are more darkly shaded indicate matches of better quality.
\\ The strand information (+/-) indicates the\ direction of the match between the EST and the matching\ genomic sequence. It bears no relationship to the direction\ of transcription of the RNA with which it might be associated.
\\ The description page for this track has a filter that can be used to change \ the display mode, alter the color, and include/exclude a subset of items \ within the track. This may be helpful when many items are shown in the track \ display, especially when only some are relevant to the current task.
\\ To use the filter:\
\
\- Type a term in one or more of the text boxes to filter the EST\ display. For example, to apply the filter to all ESTs expressed in a specific\ organ, type the name of the organ in the tissue box. To view the list of \ valid terms for each text box, consult the table in the Table Browser that \ corresponds to the factor on which you wish to filter. For example, the \ "tissue" table contains all the types of tissues that can be \ entered into the tissue text box. Wildcards may also be used in the\ filter.\
- If filtering on more than one value, choose the desired combination\ logic. If "and" is selected, only ESTs that match all filter \ criteria will be highlighted. If "or" is selected, ESTs that \ match any one of the filter criteria will be highlighted.\
- Choose the color or display characteristic that should be used to \ highlight or include/exclude the filtered items. If "exclude" is \ chosen, the browser will not display ESTs that match the filter criteria. \ If "include" is selected, the browser will display only those \ ESTs that match the filter criteria.\
\ This track may also be configured to display base labeling, a feature that\ allows the user to display all bases in the aligning sequence or only those \ that differ from the genomic sequence. For more information about this option,\ click \ here.\
\ \Methods
\\ To make an EST, RNA is isolated from cells and reverse\ transcribed into cDNA. Typically, the cDNA is cloned\ into a plasmid vector and a read is taken from the 5'\ and/or 3' primer. For most — but not all — ESTs, the\ reverse transcription is primed by an oligo-dT, which\ hybridizes with the poly-A tail of mature mRNA. The\ reverse transcriptase may or may not make it to the 5'\ end of the mRNA, which may or may not be degraded.
\\ In general, the 3' ESTs mark the end of transcription\ reasonably well, but the 5' ESTs may end at any point\ within the transcript. Some of the newer cap-selected\ libraries cover transcription start reasonably well. Before the \ cap-selection techniques\ emerged, some projects used random rather than poly-A\ priming in an attempt to retrieve sequence distant from the\ 3' end. These projects were successful at this, but as\ a side effect also deposited sequences from unprocessed\ mRNA and perhaps even genomic sequences into the EST databases.\ Even outside of the random-primed projects, there is a\ degree of non-mRNA contamination. Because of this, a\ single unspliced EST should be viewed with considerable\ skepticism.
\\ To generate this track, D. melanogaster ESTs from GenBank were aligned \ against the genome using blat. Note that the maximum intron length\ allowed by blat is 750,000 bases, which may eliminate some ESTs with very \ long introns that might otherwise align. When a single \ EST aligned in multiple places, the alignment having the \ highest base identity was identified. Only alignments having\ a base identity level within 0.5% of the best and at least 96% base identity \ with the genomic sequence are displayed in this track.
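The near-best filtering rule described above can be sketched in a few lines of Python. This is only an illustration, not the UCSC pipeline: the function name `near_best_filter`, the tuple layout, and the reading of "within 0.5%" as 0.5 percentage points of base identity are all assumptions made for this example.

```python
from collections import defaultdict


def near_best_filter(alignments, min_identity=96.0, near_best=0.5):
    """Keep alignments close to the best hit for each query sequence.

    alignments: list of (query_name, identity_percent, location) tuples.
    The tuple layout is hypothetical; real PSL records carry more fields.
    """
    alignments = list(alignments)

    # Pass 1: find the highest base identity seen for each query.
    best = defaultdict(float)
    for name, identity, _location in alignments:
        best[name] = max(best[name], identity)

    # Pass 2: keep hits that meet the absolute floor and are within
    # `near_best` percentage points of that query's best hit (assumed reading).
    return [
        aln for aln in alignments
        if aln[1] >= min_identity and aln[1] >= best[aln[0]] - near_best
    ]


example = [
    ("est1", 99.0, "chr2L:100"),
    ("est1", 98.7, "chr3R:500"),
    ("est1", 97.0, "chrX:9"),
    ("est2", 95.0, "chr4:1"),
]
kept = near_best_filter(example)
```

In this toy input the two best `est1` hits are kept, the 97.0% hit is dropped for being more than 0.5 points below the best, and `est2` is dropped for falling under the 96% floor.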
\ \Credits
\\ This track was produced at UCSC from EST sequence data\ submitted to the international public sequence databases by \ scientists worldwide.
\ \References
\\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL.\ GenBank: update.\ Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6.\ PMID: 14681350; PMC: PMC308779\
\ \\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518\
\ rna 1 baseColorUseSequence genbank\ group rna\ indelDoubleInsert on\ indelQueryInsert on\ intronGap 30\ longLabel D. melanogaster ESTs Including Unspliced\ maxItems 300\ shortLabel D. melanogaster ESTs\ spectrum on\ table all_est\ track est\ type psl est\ visibility hide\ mrna D. melanogaster mRNAs psl . D. melanogaster mRNAs from GenBank 3 100 0 0 0 127 127 127 0 0 0Description
\\ The mRNA track shows alignments between D. melanogaster mRNAs\ in GenBank and the genome.
\ \Display Conventions and Configuration
\\ This track follows the display conventions for \ PSL alignment tracks. In dense display mode, the items that\ are more darkly shaded indicate matches of better quality.
\\ The description page for this track has a filter that can be used to change \ the display mode, alter the color, and include/exclude a subset of items \ within the track. This may be helpful when many items are shown in the track \ display, especially when only some are relevant to the current task.
\\ To use the filter:\
\
\- Type a term in one or more of the text boxes to filter the mRNA \ display. For example, to apply the filter to all mRNAs expressed in a specific\ organ, type the name of the organ in the tissue box. To view the list of \ valid terms for each text box, consult the table in the Table Browser that \ corresponds to the factor on which you wish to filter. For example, the \ "tissue" table contains all the types of tissues that can be \ entered into the tissue text box. Wildcards may also be used in the\ filter.\
- If filtering on more than one value, choose the desired combination\ logic. If "and" is selected, only mRNAs that match all filter \ criteria will be highlighted. If "or" is selected, mRNAs that \ match any one of the filter criteria will be highlighted.\
- Choose the color or display characteristic that should be used to \ highlight or include/exclude the filtered items. If "exclude" is \ chosen, the browser will not display mRNAs that match the filter criteria. \ If "include" is selected, the browser will display only those \ mRNAs that match the filter criteria.\
\ This track may also be configured to display codon coloring, a feature that\ allows the user to quickly compare mRNAs against the genomic sequence. For more \ information about this option, click \ here.\
\ \Methods
\\ GenBank D. melanogaster mRNAs were aligned against the genome using the \ blat program. When a single mRNA aligned in multiple places, \ the alignment having the highest base identity was found. \ Only alignments having a base identity level within 0.5% of\ the best and at least 96% base identity with the genomic sequence were kept.\
\ \Credits
\\ The mRNA track was produced at UCSC from mRNA sequence data\ submitted to the international public sequence databases by \ scientists worldwide.
\ \References
\\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL.\ GenBank: update.\ Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6.\ PMID: 14681350; PMC: PMC308779\
\ \\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518\
\ \ rna 1 baseColorDefault diffCodons\ baseColorUseCds genbank\ baseColorUseSequence genbank\ group rna\ indelDoubleInsert on\ indelPolyA on\ indelQueryInsert on\ longLabel D. melanogaster mRNAs from GenBank\ shortLabel D. melanogaster mRNAs\ showDiffBasesAllScales .\ table all_mrna\ track mrna\ type psl .\ visibility pack\ gold Assembly bed 3 + Assembly from Fragments 0 100 150 100 30 230 170 40 0 0 0Description
\\ This track shows the sequences used in the Aug. 2014 D. melanogaster genome assembly.\
\\ Genome assembly procedures are covered in the NCBI\ assembly documentation.
\
\ NCBI also provides\ specific information about this assembly.\\ The definition of this assembly is from the\ AGP file delivered with the sequence. The NCBI document\ AGP Specification describes the format of the AGP file.\
\\ In dense mode, this track depicts the contigs that make up the \ currently viewed scaffold. \ Contig boundaries are distinguished by the use of alternating gold and brown \ coloration. Where gaps\ exist between contigs, spaces are shown between the gold and brown\ blocks. The relative order and orientation of the contigs\ within a scaffold are always known; therefore, a line is drawn in the graphical\ display to bridge the blocks.
\\ Component types found in this track (with counts of that type in parentheses):\
\
\ map 1 altColor 230,170,40\ color 150,100,30\ group map\ html gold\ longLabel Assembly from Fragments\ shortLabel Assembly\ track gold\ type bed 3 +\ visibility hide\ augustusGene AUGUSTUS genePred AUGUSTUS ab initio gene predictions v3.1 0 100 12 105 0 133 180 127 0 0 0- W - whole genome shotgun (1,862)
\- O - other sequence (8)
\Description
\ \\ This track shows ab initio predictions from the program\ AUGUSTUS (version 3.1).\ The predictions are based on the genome sequence alone.\
\ \\ For more information on the different gene tracks, see our Genes FAQ.
\ \Methods
\ \\ Statistical signal models were built for splice sites, branch-point\ patterns, translation start sites, and the poly-A signal.\ Furthermore, models were built for the sequence content of\ protein-coding and non-coding regions as well as for the length distributions\ of different exon and intron types. Detailed descriptions of most of these different models\ can be found in Mario Stanke's\ dissertation.\ This track shows the most likely gene structure according to a\ Semi-Markov Conditional Random Field model.\ Alternative splicing transcripts were obtained with\ a sampling algorithm (--alternatives-from-sampling=true --sample=100 --minexonintronprob=0.2\ --minmeanexonintronprob=0.5 --maxtracks=3 --temperature=2).\
\ \\ The different models used by Augustus were trained on a number of different species-specific\ gene sets, which included 1000-2000 training gene structures. The --species option allows\ one to choose the species used for training the models. Different training species were used\ for the --species option when generating these predictions for different groups of\ assemblies.\
\ \
\ \Assembly Group: Training Species\
\- Fish: zebrafish\
\- Birds: chicken\
\- Human and all other vertebrates: human\
\- Nematodes: caenorhabditis\
\- Drosophila: fly\
\- A. mellifera: honeybee1\
\- A. gambiae: culex\
\- S. cerevisiae: saccharomyces\
\ This table describes which training species was used for a particular group of assemblies.\ When available, the most closely related training species was used.\
\ \Credits
\ \ Thanks to the\ Stanke lab\ for providing the AUGUSTUS program. The training for the chicken version was\ done by Stefanie König and the training for the\ human and zebrafish versions was done by Mario Stanke.\ \References
\ \\ Stanke M, Diekhans M, Baertsch R, Haussler D.\ \ Using native and syntenically mapped cDNA alignments to improve de novo gene finding.\ Bioinformatics. 2008 Mar 1;24(5):637-44.\ PMID: 18218656\
\ \\ Stanke M, Waack S.\ \ Gene prediction with a hidden Markov model and a new intron submodel.\ Bioinformatics. 2003 Oct;19 Suppl 2:ii215-25.\ PMID: 14534192\
\ genes 1 baseColorDefault genomicCodons\ baseColorUseCds given\ color 12,105,0\ group genes\ longLabel AUGUSTUS ab initio gene predictions v3.1\ shortLabel AUGUSTUS\ track augustusGene\ type genePred\ visibility hide\ insectsChainNetViewchain Chains bed 3 Insects Chain and Net Alignments 3 100 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Insects Chain and Net Alignments\ parent insectsChainNet\ shortLabel Chains\ spectrum on\ track insectsChainNetViewchain\ view chain\ visibility pack\ cpgIslandSuper CpG Islands bed 4 + CpG Islands (Islands < 300 Bases are Light Green) 0 100 0 100 0 128 228 128 0 0 0Description
\ \CpG islands are associated with genes, particularly housekeeping\ genes, in vertebrates. CpG islands are typically common near\ transcription start sites and may be associated with promoter\ regions. Normally a C (cytosine) base followed immediately by a \ G (guanine) base (a CpG) is rare in\ vertebrate DNA because the Cs in such an arrangement tend to be\ methylated. This methylation helps distinguish the newly synthesized\ DNA strand from the parent strand, which aids in the final stages of\ DNA proofreading after duplication. However, over evolutionary time,\ methylated Cs tend to turn into Ts because of spontaneous\ deamination. The result is that CpGs are relatively rare unless\ there is selective pressure to keep them or a region is not methylated\ for some other reason, perhaps having to do with the regulation of gene\ expression. CpG islands are regions where CpGs are present at\ significantly higher levels than is typical for the genome as a whole.
\ \\ The unmasked version of the track displays potential CpG islands\ that exist in repeat regions and would otherwise not be visible\ in the repeat masked version.\
\ \\ By default, only the masked version of the track is displayed. To view the\ unmasked version, change the visibility settings in the track controls at\ the top of this page.\
\ \Methods
\ \CpG islands were predicted by searching the sequence one base at a\ time, scoring each dinucleotide (+17 for CG and -1 for others) and\ identifying maximally scoring segments. Each segment was then\ evaluated for the following criteria:\ \
\ \
\ \- GC content of 50% or greater
\ \- length greater than 200 bp
\ \- ratio greater than 0.6 of observed number of CG dinucleotides to the expected number on the \ \ basis of the number of Gs and Cs in the segment
\\ The entire genome sequence, masking areas included, was\ used for the construction of the track Unmasked CpG.\ The track CpG Islands is constructed on the sequence after\ all masked sequence is removed.\
\ \The CpG count is the number of CG dinucleotides in the island. \ The Percentage CpG is the ratio of CpG nucleotide bases\ (twice the CpG count) to the length. The ratio of observed to expected \ CpG is calculated according to the formula (cited in \ Gardiner-Garden et al. (1987)):\ \
Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G)\ \ where N = length of sequence.\\ The calculation of the track data is performed by the following command sequence:\
\ twoBitToFa assembly.2bit stdout | maskOutFa stdin hard stdout \\\ | cpg_lh /dev/stdin 2> cpg_lh.err \\\ | awk '{$2 = $2 - 1; width = $3 - $2; printf("%s\\t%d\\t%s\\t%s %s\\t%s\\t%s\\t%0.0f\\t%0.1f\\t%s\\t%s\\n", $1, $2, $3, $5, $6, width, $6, width*$7*0.01, 100.0*2*$6/width, $7, $9);}' \\\ | sort -k1,1 -k2,2n > cpgIsland.bed\\ The unmasked track data is constructed from\ twoBitToFa -noMask output for the twoBitToFa command.\ \ \Data access
\\ The CpG islands track and its associated tables can be explored interactively using the\ REST API, the\ Table Browser or the\ Data Integrator.\ All the tables can also be queried directly from our public MySQL\ servers, with more information available on our\ help page as well as on\ our blog.
\\ The source for the cpg_lh program can be obtained from\ src/utils/cpgIslandExt/.\ The cpg_lh program binary can be obtained from: http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/cpg_lh (choose "save file")\
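The per-island statistics and the three acceptance criteria from the Methods section can be checked on a single candidate sequence with a short Python sketch. It is not the cpg_lh program and does not do the segment scoring; the function name `cpg_stats` and the returned dictionary keys are made up for this example.

```python
def cpg_stats(seq):
    """Compute CpG island statistics for one candidate sequence.

    Formulas follow the track description:
      Percentage CpG = 100 * 2 * (CpG count) / length
      Obs/Exp CpG    = (CpG count) * N / (C count * G count), N = length
    """
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact

    gc_percent = 100.0 * (c_count + g_count) / n if n else 0.0
    cpg_percent = 100.0 * 2 * cpg_count / n if n else 0.0
    # Guard against division by zero when the sequence has no C or no G.
    obs_exp = cpg_count * n / (c_count * g_count) if c_count and g_count else 0.0

    return {
        "length": n,
        "cpg_count": cpg_count,
        "gc_percent": gc_percent,
        "cpg_percent": cpg_percent,
        "obs_exp": obs_exp,
        # Criteria from the text: GC >= 50%, length > 200 bp, Obs/Exp > 0.6.
        "passes": gc_percent >= 50.0 and n > 200 and obs_exp > 0.6,
    }


island = cpg_stats("CG" * 150)   # artificial 300 bp sequence, all CpG
short = cpg_stats("ACGT")        # too short to qualify
```

The artificial `"CG" * 150` input is only there to make the arithmetic easy to follow: 150 CpGs over 300 bases with 150 Cs and 150 Gs gives Obs/Exp = 150 * 300 / (150 * 150) = 2.0.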
\ \Credits
\ \This track was generated using a modification of a program developed by G. Miklem and L. Hillier \ (unpublished).
\ \References
\ \\ Gardiner-Garden M, Frommer M.\ \ CpG islands in vertebrate genomes.\ J Mol Biol. 1987 Jul 20;196(2):261-82.\ PMID: 3656447\
\ regulation 1 altColor 128,228,128\ color 0,100,0\ group regulation\ html cpgIslandSuper\ longLabel CpG Islands (Islands < 300 Bases are Light Green)\ shortLabel CpG Islands\ superTrack on\ track cpgIslandSuper\ type bed 4 +\ crispr CRISPR bed 3 CRISPR/Cas9 Sp. Pyog. target sites 0 100 0 0 0 127 127 127 0 0 0Description
\ \\ This track shows regions of the genome within 200 bp of transcribed regions and\ DNA sequences targetable by CRISPR RNA guides using the Cas9 enzyme\ from S. pyogenes (PAM: NGG).\ CRISPR target sites were annotated with predicted specificity\ (off-target effects) and predicted efficiency (on-target cleavage) by various\ algorithms through the tool CRISPOR.\
\ \Display Conventions and Configuration
\ \\ The track "CRISPR Regions" shows the regions of the genome where target\ sites were analyzed, i.e. within 200 bp of transcribed regions as annotated by\ Ensembl transcript models.
\ \\ The track "CRISPR Targets" shows the target sites in these regions.\ The target sequence of the guide is shown with a thick (exon) bar. The PAM\ motif match (NGG) is shown with a thinner bar. Guides\ are colored to reflect both predicted specificity and efficiency. Specificity\ reflects the "uniqueness" of a 20mer sequence in the genome; the less unique a\ sequence is, the more likely it is to cleave other locations of the genome\ (off-target effects). Efficiency is the frequency of cleavage at the target\ site (on-target efficiency).
\ \Shades of gray stand for sites that are hard to target specifically, as the\ 20mer is not very unique in the genome:
\\
\ \\ impossible to target: target site has at least one identical copy in the genome and was not scored \ hard to target: so many similar sequences in the genome that alignment was stopped (possibly a repeat) \ hard to target: target site was aligned but has a low specificity score <= 50 (see below) Colors highlight targets that are specific in the genome (MIT specificity > 50) but have different predicted efficiencies:
\\
\ unable to calculate Doench/Fusi 2016 efficiency score \ low predicted cleavage: Doench/Fusi 2016 Efficiency percentile <= 30 \ medium predicted cleavage: Doench/Fusi 2016 Efficiency percentile > 30 and < 55 \ high predicted cleavage: Doench/Fusi 2016 Efficiency > 55
\ \\ Mouse-over a target site to show predicted specificity and efficiency scores:
\\
\ \ \- The MIT Specificity score summarizes all off-targets into a single number from\ 0-100. The higher the number, the fewer off-target effects are expected. We\ recommend guides with an MIT specificity > 50.
\- The efficiency score predicts whether a guide leads to strong or\ weak cleavage. According to Haeussler et al. 2016, the Doench\ 2016 Efficiency score should be used to select the guide with the highest\ cleavage efficiency when expressing guides from RNA PolIII promoters such as\ U6. Scores are given as percentiles, e.g. "70%" means that 70% of mammalian\ guides have a score equal to or lower than that of this guide. The raw score is\ also shown in parentheses after the percentile.
\- The Moreno-Mateos 2015 Efficiency\ score should be used instead of the Doench 2016 score when transcribing the\ guide in vitro with a T7 promoter, e.g. for injections in mouse, zebrafish or\ Xenopus embryos. The Moreno-Mateos score is given in percentiles and the raw value in parentheses, see the note above.
Click on a feature to show all scores and predicted off-targets with up to\ four mismatches. The Out-of-Frame score by Bae et al. 2014\ is correlated with\ the probability that mutations induced by the guide RNA will disrupt the open\ reading frame. The authors recommend out-of-frame scores > 66 to create\ knock-outs with a single guide efficiently.
\ \
Off-target sites are sorted by the CFD (Cutting Frequency Determination) \ score (Doench et al. 2016). \ The higher the CFD score, the more likely there is off-target cleavage at that site. \ Off-targets with a CFD score < 0.023 are not shown on this page, but are available when \ following the link to the external CRISPOR tool. \ When compared against experimentally validated off-targets by \ Haeussler et al. 2016, the large majority of predicted\ off-targets with CFD scores < 0.023 were false-positives.
\ \Methods
\ \Relationship between predictions and experimental data
\ \\ Like most algorithms, the MIT specificity score is not always a perfect\ predictor of off-target effects. Despite low scores, many tested guides \ caused few and/or weak off-target cleavage when tested with whole-genome assays\ (Figure 2 from Haeussler\ et al. 2016), as shown below, and the published data contains few data points\ with high specificity scores. Overall though, the assays showed that the higher\ the specificity score, the lower the off-target effects.
\ \\ \
Similarly, efficiency scoring is not very accurate: guides with low\ scores can be efficient and vice versa. As a general rule, however, the higher\ the score, the less likely that a guide is very inefficient. The\ following histograms illustrate, for each type of score, how the share of\ inefficient guides drops with increasing efficiency scores:\
\ \\ \
When reading this plot, keep in mind that both scores were evaluated on\ their own training data. Especially for the Moreno-Mateos score, the\ results are too optimistic, due to overfitting. When evaluated on independent\ datasets, the correlation of the prediction with other assays was around 25%\ lower, see Haeussler et al. 2016. At the time of\ writing, there is no independent dataset available yet to determine the\ Moreno-Mateos accuracy for each score percentile range.
\ \Track methods
\\ Exons predicted by Ensembl gene models were extended by 200 base pairs\ on each side and searched for the -NGG motif. Flanking 20mer guide sequences were\ aligned to the genome with BWA and scored with MIT Specificity scores using the\ command-line version of crispor.org. Non-unique guide sequences were skipped.\ Flanking sequences were extracted from the genome and used as input for Crispor\ efficiency scoring, available from the Crispor downloads page, which\ includes the Doench 2016, Moreno-Mateos 2015 and Bae\ 2014 algorithms, among others.\
\ \Data Access
\\ The raw data can be explored interactively with the Table Browser.\ For automated analysis, the genome annotation is stored in a bigBed file that\ can be downloaded from\ our download server.\ The files for this track are called crispr.bb and crisprDetails.tab and are located in the /gbdb/dm6/crispr directory of our downloads server. Individual\ regions or the whole genome annotation can be obtained using our tool bigBedToBed,\ which can be compiled from the source code or downloaded as a precompiled\ binary for your system. Instructions for downloading source code and binaries can be found\ here. The tool\ can also be used to obtain only features within a given range, e.g. bigBedToBed\ http://hgdownload.soe.ucsc.edu/gbdb/hg19/crispr/crispr.bb -chrom=chr21\ -start=0 -end=10000000 stdout
\ \\ The file crisprDetails.tab includes the details of the off-targets. The last\ column of the bigBed file is the offset of the respective line in\ crisprDetails.tab. E.g. if the last column is 14227033723, then the following\ command will extract the line with the corresponding off-target details:\ curl -s -r 14227033723-14227043723 http://hgdownload.soe.ucsc.edu/gbdb/hg19/crispr/crisprDetails.tab | head -n1. The off-target details can currently not be joined with the table\ browser.
\ \\ The file crisprDetails.tab is a tab-separated text file with two fields. The\ first field contains the numbers of off-targets for each mismatch count, e.g. "0,0,1,3,49" \ means 0 off-targets at zero mismatches, 0 at one mismatch, 1 at two mismatches, 3 at three and 49\ off-targets at four mismatches. The second field is a pipe-separated list of\ semicolon-separated tuples with the genome coordinates and the CFD score. E.g.\ "chr10;123376795+;42|chr5;148353274-;39" describes two off-targets, with the\ first at chr10:123376795 on the positive strand and a CFD score of 0.42.
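The two-field layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the track's own tooling: the names `OffTarget` and `parse_details_line` are hypothetical, and dividing the stored score by 100 is an assumption inferred only from the example above, where "42" corresponds to a CFD score of 0.42.

```python
from typing import List, NamedTuple, Tuple


class OffTarget(NamedTuple):
    """One predicted off-target site (hypothetical helper type)."""
    chrom: str
    pos: int
    strand: str
    cfd: float


def parse_details_line(line: str) -> Tuple[List[int], List[OffTarget]]:
    """Split one crisprDetails.tab line into mismatch counts and off-targets."""
    counts_field, _, sites_field = line.rstrip("\n").partition("\t")
    # First field: number of off-targets at 0, 1, 2, 3 and 4 mismatches.
    counts = [int(n) for n in counts_field.split(",")]
    off_targets = []
    # Second field: pipe-separated tuples of chrom;position+strand;score.
    for item in filter(None, sites_field.split("|")):
        chrom, pos_strand, score = item.split(";")
        off_targets.append(
            OffTarget(
                chrom=chrom,
                pos=int(pos_strand[:-1]),
                strand=pos_strand[-1],
                # Assumption: the score is stored as integer hundredths.
                cfd=int(score) / 100,
            )
        )
    return counts, off_targets


if __name__ == "__main__":
    example = "0,0,1,3,49\tchr10;123376795+;42|chr5;148353274-;39\n"
    counts, sites = parse_details_line(example)
    print(counts)
    for site in sites:
        print(site)
```

The index of each entry in the returned count list is the number of mismatches, so `counts[2]` is the number of off-targets with two mismatches.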
\ \Credits
\ \\ Track created by Maximilian Haeussler and Hiram Clawson, with helpful input from Jean-Paul Concordet (MNHN Paris) and Alberto Stolfi (NYU).\
\ \References
\ \\ Haeussler M, Schönig K, Eckert H, Eschstruth A, Mianné J, Renaud JB, Schneider-Maunoury S,\ Shkumatava A, Teboul L, Kent J et al.\ Evaluation of off-target and on-target scoring algorithms and integration into the\ guide RNA selection tool CRISPOR.\ Genome Biol. 2016 Jul 5;17(1):148.\ PMID: 27380939; PMC: PMC4934014\
\ \\ Bae S, Kweon J, Kim HS, Kim JS.\ \ Microhomology-based choice of Cas9 nuclease target sites.\ Nat Methods. 2014 Jul;11(7):705-6.\ PMID: 24972169\
\ \\ Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C,\ Orchard R et al.\ \ Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9.\ Nat Biotechnol. 2016 Feb;34(2):184-91.\ PMID: 26780180; PMC: PMC4744125\
\ \\ Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O\ et al.\ \ DNA targeting specificity of RNA-guided Cas9 nucleases.\ Nat Biotechnol. 2013 Sep;31(9):827-32.\ PMID: 23873081; PMC: PMC3969858\
\ \\ Moreno-Mateos MA, Vejnar CE, Beaudoin JD, Fernandez JP, Mis EK, Khokha MK, Giraldez AJ.\ \ CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo.\ Nat Methods. 2015 Oct;12(10):982-8.\ PMID: 26322839; PMC: PMC4589495\
\ genes 1 group genes\ html crispr\ longLabel CRISPR/Cas9 Sp. Pyog. target sites\ shortLabel CRISPR\ superTrack on\ track crispr\ type bed 3\ visibility hide\ crisprRanges CRISPR Regions bed 3 Genome regions processed to find CRISPR/Cas9 target sites (exons +/- 200 bp) 1 100 110 110 110 182 182 182 0 0 0Description
\ \\ This track shows regions of the genome within 200 bp of transcribed regions and\ DNA sequences targetable by CRISPR RNA guides using the Cas9 enzyme\ from S. pyogenes (PAM: NGG).\ CRISPR target sites were annotated with predicted specificity\ (off-target effects) and predicted efficiency (on-target cleavage) by various\ algorithms through the tool CRISPOR.\
\ \Display Conventions and Configuration
\ \\ The track "CRISPR Regions" shows the regions of the genome where target\ sites were analyzed, i.e. within 200 bp of transcribed regions as annotated by\ Ensembl transcript models.
\ \\ The track "CRISPR Targets" shows the target sites in these regions.\ The target sequence of the guide is shown with a thick (exon) bar. The PAM\ motif match (NGG) is shown with a thinner bar. Guides\ are colored to reflect both predicted specificity and efficiency. Specificity\ reflects the "uniqueness" of a 20mer sequence in the genome; the less unique a\ sequence is, the more likely it is to cleave other locations of the genome\ (off-target effects). Efficiency is the frequency of cleavage at the target\ site (on-target efficiency).
\ \Shades of gray stand for sites that are hard to target specifically, as the\ 20mer is not very unique in the genome:
\\
\ \\ impossible to target: target site has at least one identical copy in the genome and was not scored \ hard to target: so many similar sequences in the genome that the alignment was stopped (possibly a repeat) \ hard to target: target site was aligned but results in a low specificity score <= 50 (see below) Colors highlight targets that are specific in the genome (MIT specificity > 50) but have different predicted efficiencies:
\\
\ unable to calculate Doench/Fusi 2016 efficiency score \ low predicted cleavage: Doench/Fusi 2016 Efficiency percentile <= 30 \ medium predicted cleavage: Doench/Fusi 2016 Efficiency percentile > 30 and < 55 \ high predicted cleavage: Doench/Fusi 2016 Efficiency > 55
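The thresholds listed above can be summarized as a small decision function. This is only an illustrative sketch and not the code that colors the track: the function name and the returned labels are hypothetical, and because the list above does not say how a percentile of exactly 55 is handled, this sketch places it in the "high" class as an assumption.

```python
from typing import Optional


def guide_category(specificity: Optional[int],
                   efficiency_pct: Optional[float]) -> str:
    """Map MIT specificity and Doench/Fusi 2016 percentile to a display class.

    specificity is None when the guide was not scored (identical copy in the
    genome, or alignment stopped because of too many similar sequences).
    efficiency_pct is None when the efficiency score could not be calculated.
    """
    if specificity is None:
        return "not scored (gray)"
    if specificity <= 50:
        return "hard to target (gray)"
    if efficiency_pct is None:
        return "efficiency unavailable"
    if efficiency_pct <= 30:
        return "low predicted cleavage"
    if efficiency_pct < 55:
        return "medium predicted cleavage"
    # Assumption: exactly 55 is treated as high; the text leaves it open.
    return "high predicted cleavage"


if __name__ == "__main__":
    for spec, eff in [(None, None), (40, 80.0), (75, None), (75, 20.0),
                      (75, 40.0), (75, 90.0)]:
        print(spec, eff, "->", guide_category(spec, eff))
```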
\ \\ Mouse-over a target site to show predicted specificity and efficiency scores:
\\
\ \ \- The MIT Specificity score summarizes all off-targets into a single number from\ 0-100. The higher the number, the fewer off-target effects are expected. We\ recommend guides with an MIT specificity > 50.
\- The efficiency score tries to predict if a guide leads to rather strong or\ weak cleavage. According to Haeussler et al. 2016, the Doench\ 2016 Efficiency score should be used to select the guide with the highest\ cleavage efficiency when expressing guides from RNA PolIII Promoters such as\ U6. Scores are given as percentiles, e.g. "70%" means that 70% of mammalian\ guides have a score equal to or lower than this guide. The raw score number is\ also shown in parentheses after the percentile.
\- The Moreno-Mateos 2015 Efficiency\ score should be used instead of the Doench 2016 score when transcribing the\ guide in vitro with a T7 promoter, e.g. for injections in mouse, zebrafish or\ Xenopus embryos. The Moreno-Mateos score is given in percentiles and the raw value in parentheses, see the note above.
Click on a feature to show all scores and predicted off-targets with up to\ four mismatches. The Out-of-Frame score by Bae et al. 2014\ is correlated with\ the probability that mutations induced by the guide RNA will disrupt the open\ reading frame. The authors recommend out-of-frame scores > 66 to create\ knock-outs efficiently with a single guide.
\ \
Off-target sites are sorted by the CFD (Cutting Frequency Determination) \ score (Doench et al. 2016). \ The higher the CFD score, the more likely there is off-target cleavage at that site. \ Off-targets with a CFD score < 0.023 are not shown on this page, but are available when \ following the link to the external CRISPOR tool. \ When compared against experimentally validated off-targets by \ Haeussler et al. 2016, the large majority of predicted\ off-targets with CFD scores < 0.023 were false positives.
\ \Methods
\ \Relationship between predictions and experimental data
\ \\ Like most algorithms, the MIT specificity score is not always a perfect\ predictor of off-target effects. Despite low scores, many tested guides \ caused few and/or weak off-target cleavage when tested with whole-genome assays\ (Figure 2 from Haeussler\ et al. 2016), as shown below, and the published data contains few data points\ with high specificity scores. Overall though, the assays showed that the higher\ the specificity score, the lower the off-target effects.
\ \\ \
Similarly, efficiency scoring is not very accurate: guides with low\ scores can be efficient and vice versa. As a general rule, however, the higher\ the score, the less likely that a guide is very inefficient. The\ following histograms illustrate, for each type of score, how the share of\ inefficient guides drops with increasing efficiency scores:\
\ \\ \
When reading this plot, keep in mind that both scores were evaluated on\ their own training data. Especially for the Moreno-Mateos score, the\ results are too optimistic, due to overfitting. When evaluated on independent\ datasets, the correlation of the prediction with other assays was around 25%\ lower, see Haeussler et al. 2016. At the time of\ writing, there is no independent dataset available yet to determine the\ Moreno-Mateos accuracy for each score percentile range.
\ \Track methods
\\ Exons predicted by Ensembl gene models were extended by 200 base pairs\ on each side and searched for the -NGG motif. Flanking 20mer guide sequences were\ aligned to the genome with BWA and scored with MIT Specificity scores using the\ command-line version of crispor.org. Non-unique guide sequences were skipped.\ Flanking sequences were extracted from the genome and used as input for Crispor\ efficiency scoring, available from the Crispor downloads page, which\ includes the Doench 2016, Moreno-Mateos 2015 and Bae\ 2014 algorithms, among others.\
\ \Data Access
\\ The raw data can be explored interactively with the Table Browser.\ For automated analysis, the genome annotation is stored in a bigBed file that\ can be downloaded from\ our download server.\ The files for this track are called crispr.bb and crisprDetails.tab and are located in the /gbdb/dm6/crispr directory of our downloads server. Individual\ regions or the whole genome annotation can be obtained using our tool bigBedToBed,\ which can be compiled from the source code or downloaded as a precompiled\ binary for your system. Instructions for downloading source code and binaries can be found\ here. The tool\ can also be used to obtain only features within a given range, e.g. bigBedToBed\ http://hgdownload.soe.ucsc.edu/gbdb/hg19/crispr/crispr.bb -chrom=chr21\ -start=0 -end=10000000 stdout
\ \\ The file crisprDetails.tab includes the details of the off-targets. The last\ column of the bigBed file is the offset of the respective line in\ crisprDetails.tab. E.g. if the last column is 14227033723, then the following\ command will extract the line with the corresponding off-target details:\ curl -s -r 14227033723-14227043723 http://hgdownload.soe.ucsc.edu/gbdb/hg19/crispr/crisprDetails.tab | head -n1. The off-target details can currently not be joined with the table\ browser.
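The offset lookup described above amounts to an HTTP byte-range request followed by keeping the first line of the response. The sketch below mirrors the curl command using only the Python standard library; the function names are hypothetical, and the 10000-byte window is taken from the curl example (it is an assumption that one detail line always fits inside it).

```python
import urllib.request


def range_header(offset: int, window: int = 10000) -> str:
    """Build the Range header value for the bytes starting at a bigBed offset."""
    return f"bytes={offset}-{offset + window}"


def first_line(chunk: bytes) -> str:
    """Return the first line of a downloaded chunk, like `head -n1`."""
    return chunk.split(b"\n", 1)[0].decode("utf-8")


def fetch_details(url: str, offset: int, window: int = 10000) -> str:
    """Fetch the detail line at `offset`; performs a network request."""
    request = urllib.request.Request(
        url, headers={"Range": range_header(offset, window)}
    )
    with urllib.request.urlopen(request) as response:
        return first_line(response.read())


if __name__ == "__main__":
    # Same byte range as the curl example in the text.
    print(range_header(14227033723))
    # Offline illustration of the head -n1 step on already-downloaded bytes.
    print(first_line(b"0,0,1,3,49\tchr10;123376795+;42\nnext line"))
```

If a detail line were longer than the window, the returned line would be truncated, so a larger window may be needed for guides with very many off-targets.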
\ \\ The file crisprDetails.tab is a tab-separated text file with two fields. The\ first field contains the numbers of off-targets for each mismatch count, e.g. "0,0,1,3,49" \ means 0 off-targets at zero mismatches, 0 at one mismatch, 1 at two mismatches, 3 at three and 49\ off-targets at four mismatches. The second field is a pipe-separated list of\ semicolon-separated tuples with the genome coordinates and the CFD score. E.g.\ "chr10;123376795+;42|chr5;148353274-;39" describes two off-targets, with the\ first at chr10:123376795 on the positive strand and a CFD score of 0.42.
\ \Credits
\ \\ Track created by Maximilian Haeussler and Hiram Clawson, with helpful input from Jean-Paul Concordet (MNHN Paris) and Alberto Stolfi (NYU).\
\ \References
\ \\ Haeussler M, Schönig K, Eckert H, Eschstruth A, Mianné J, Renaud JB, Schneider-Maunoury S,\ Shkumatava A, Teboul L, Kent J et al.\ Evaluation of off-target and on-target scoring algorithms and integration into the\ guide RNA selection tool CRISPOR.\ Genome Biol. 2016 Jul 5;17(1):148.\ PMID: 27380939; PMC: PMC4934014\
\ \\ Bae S, Kweon J, Kim HS, Kim JS.\ \ Microhomology-based choice of Cas9 nuclease target sites.\ Nat Methods. 2014 Jul;11(7):705-6.\ PMID: 24972169\
\ \\ Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C,\ Orchard R et al.\ \ Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9.\ Nat Biotechnol. 2016 Feb;34(2):184-91.\ PMID: 26780180; PMC: PMC4744125\
\ \\ Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O\ et al.\ \ DNA targeting specificity of RNA-guided Cas9 nucleases.\ Nat Biotechnol. 2013 Sep;31(9):827-32.\ PMID: 23873081; PMC: PMC3969858\
\ \\ Moreno-Mateos MA, Vejnar CE, Beaudoin JD, Fernandez JP, Mis EK, Khokha MK, Giraldez AJ.\ \ CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo.\ Nat Methods. 2015 Oct;12(10):982-8.\ PMID: 26322839; PMC: PMC4589495\
\ genes 1 color 110,110,110\ html crispr\ longLabel Genome regions processed to find CRISPR/Cas9 target sites (exons +/- 200 bp)\ parent crispr\ shortLabel CRISPR Regions\ track crisprRanges\ type bed 3\ visibility dense\ crisprTargets CRISPR Targets bigBed 9 + CRISPR/Cas9 -NGG Targets 1 100 0 0 0 127 127 127 0 0 0 http://crispor.tefor.net/crispor.py?org=$D&pos=$S:${&pam=NGGDescription
\ \\ This track shows regions of the genome within 200 bp of transcribed regions and\ DNA sequences targetable by CRISPR RNA guides using the Cas9 enzyme\ from S. pyogenes (PAM: NGG).\ CRISPR target sites were annotated with predicted specificity\ (off-target effects) and predicted efficiency (on-target cleavage) by various\ algorithms through the tool CRISPOR.\
\ \Display Conventions and Configuration
\ \\ The track "CRISPR Regions" shows the regions of the genome where target\ sites were analyzed, i.e. within 200 bp of transcribed regions as annotated by\ Ensembl transcript models.
\ \\ The track "CRISPR Targets" shows the target sites in these regions.\ The target sequence of the guide is shown with a thick (exon) bar. The PAM\ motif match (NGG) is shown with a thinner bar. Guides\ are colored to reflect both predicted specificity and efficiency. Specificity\ reflects the "uniqueness" of a 20mer sequence in the genome; the less unique a\ sequence is, the more likely it is to cleave other locations of the genome\ (off-target effects). Efficiency is the frequency of cleavage at the target\ site (on-target efficiency).
\ \Shades of gray stand for sites that are hard to target specifically, as the\ 20mer is not very unique in the genome:
\\
\ \\ impossible to target: target site has at least one identical copy in the genome and was not scored \ hard to target: so many similar sequences in the genome that the alignment was stopped (possibly a repeat) \ hard to target: target site was aligned but results in a low specificity score <= 50 (see below) Colors highlight targets that are specific in the genome (MIT specificity > 50) but have different predicted efficiencies:
\\
\ unable to calculate Doench/Fusi 2016 efficiency score \ low predicted cleavage: Doench/Fusi 2016 Efficiency percentile <= 30 \ medium predicted cleavage: Doench/Fusi 2016 Efficiency percentile > 30 and < 55 \ high predicted cleavage: Doench/Fusi 2016 Efficiency > 55
\ \\ Mouse-over a target site to show predicted specificity and efficiency scores:
\\
\ \ \- The MIT Specificity score summarizes all off-targets into a single number from\ 0-100. The higher the number, the fewer off-target effects are expected. We\ recommend guides with an MIT specificity > 50.
\- The efficiency score tries to predict if a guide leads to rather strong or\ weak cleavage. According to Haeussler et al. 2016, the Doench\ 2016 Efficiency score should be used to select the guide with the highest\ cleavage efficiency when expressing guides from RNA PolIII Promoters such as\ U6. Scores are given as percentiles, e.g. "70%" means that 70% of mammalian\ guides have a score equal to or lower than this guide. The raw score number is\ also shown in parentheses after the percentile.
\- The Moreno-Mateos 2015 Efficiency\ score should be used instead of the Doench 2016 score when transcribing the\ guide in vitro with a T7 promoter, e.g. for injections in mouse, zebrafish or\ Xenopus embryos. The Moreno-Mateos score is given in percentiles and the raw value in parentheses, see the note above.
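The percentile convention in the list above ("70%" means that 70% of reference guides have a score equal to or lower than this guide) can be written out as a short calculation. This is a sketch of the definition only: the function name and the reference scores are made up for illustration, and the actual reference distribution used by the track is not reproduced here.

```python
from bisect import bisect_right
from typing import Sequence


def percentile_rank(score: float, reference_scores: Sequence[float]) -> float:
    """Share (in percent) of reference scores that are <= the given score."""
    ordered = sorted(reference_scores)
    # bisect_right counts how many reference values are <= score.
    return 100.0 * bisect_right(ordered, score) / len(ordered)


if __name__ == "__main__":
    # Hypothetical raw efficiency scores of ten reference guides.
    reference = [12, 25, 31, 40, 44, 52, 58, 63, 71, 90]
    raw = 58
    print(f"{percentile_rank(raw, reference):.0f}% ({raw})")
```

The printed form follows the display convention described above: the percentile first, then the raw score in parentheses.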
Click on a feature to show all scores and predicted off-targets with up to\ four mismatches. The Out-of-Frame score by Bae et al. 2014\ is correlated with\ the probability that mutations induced by the guide RNA will disrupt the open\ reading frame. The authors recommend out-of-frame scores > 66 to create\ knock-outs efficiently with a single guide.
\ \
Off-target sites are sorted by the CFD (Cutting Frequency Determination) \ score (Doench et al. 2016). \ The higher the CFD score, the more likely there is off-target cleavage at that site. \ Off-targets with a CFD score < 0.023 are not shown on this page, but are available when \ following the link to the external CRISPOR tool. \ When compared against experimentally validated off-targets by \ Haeussler et al. 2016, the large majority of predicted\ off-targets with CFD scores < 0.023 were false positives.
\ \Methods
\ \Relationship between predictions and experimental data
\ \\ Like most algorithms, the MIT specificity score is not always a perfect\ predictor of off-target effects. Despite low scores, many tested guides \ caused few and/or weak off-target cleavage when tested with whole-genome assays\ (Figure 2 from Haeussler\ et al. 2016), as shown below, and the published data contains few data points\ with high specificity scores. Overall though, the assays showed that the higher\ the specificity score, the lower the off-target effects.
\ \\ \
Similarly, efficiency scoring is not very accurate: guides with low\ scores can be efficient and vice versa. As a general rule, however, the higher\ the score, the less likely that a guide is very inefficient. The\ following histograms illustrate, for each type of score, how the share of\ inefficient guides drops with increasing efficiency scores:\
\ \\ \
When reading this plot, keep in mind that both scores were evaluated on\ their own training data. Especially for the Moreno-Mateos score, the\ results are too optimistic, due to overfitting. When evaluated on independent\ datasets, the correlation of the prediction with other assays was around 25%\ lower, see Haeussler et al. 2016. At the time of\ writing, there is no independent dataset available yet to determine the\ Moreno-Mateos accuracy for each score percentile range.
\ \Track methods
\\ Exons predicted by Ensembl gene models were extended by 200 base pairs\ on each side and searched for the -NGG motif. Flanking 20mer guide sequences were\ aligned to the genome with BWA and scored with MIT Specificity scores using the\ command-line version of crispor.org. Non-unique guide sequences were skipped.\ Flanking sequences were extracted from the genome and used as input for Crispor\ efficiency scoring, available from the Crispor downloads page, which\ includes the Doench 2016, Moreno-Mateos 2015 and Bae\ 2014 algorithms, among others.\
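The motif search step can be illustrated with a small scan for 20mers followed by an NGG PAM on both strands. This is a simplified sketch and not the CRISPOR pipeline: the function names are hypothetical, and alignment with BWA, uniqueness filtering and scoring are all left out.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ngg_guides(seq: str) -> List[Tuple[int, str, str, str]]:
    """Return (start, strand, guide, pam) for every 20mer followed by NGG.

    start is the 0-based forward-strand start of the 23 bp target
    (20mer guide plus 3 bp PAM).
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for i in range(len(s) - 22):
            # PAM occupies positions i+20..i+22; N can be any base.
            if s[i + 21:i + 23] == "GG":
                start = i if strand == "+" else len(s) - (i + 23)
                hits.append((start, strand, s[i:i + 20], s[i + 20:i + 23]))
    return hits


if __name__ == "__main__":
    region = "TTGACCTGAAGCTAGCTAGGATCGATCGGAGCTTAGCTAGG"
    for hit in find_ngg_guides(region):
        print(hit)
```

In a workflow like the one described above, the region passed in would be an exon extended by 200 bp on each side.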
\ \Data Access
\\ The raw data can be explored interactively with the Table Browser.\ For automated analysis, the genome annotation is stored in a bigBed file that\ can be downloaded from\ our download server.\ The files for this track are called crispr.bb and crisprDetails.tab and are located in the /gbdb/dm6/crispr directory of our downloads server. Individual\ regions or the whole genome annotation can be obtained using our tool bigBedToBed,\ which can be compiled from the source code or downloaded as a precompiled\ binary for your system. Instructions for downloading source code and binaries can be found\ here. The tool\ can also be used to obtain only features within a given range, e.g. bigBedToBed\ http://hgdownload.soe.ucsc.edu/gbdb/hg19/crispr/crispr.bb -chrom=chr21\ -start=0 -end=10000000 stdout
\ \\ The file crisprDetails.tab includes the details of the off-targets. The last\ column of the bigBed file is the offset of the respective line in\ crisprDetails.tab. E.g. if the last column is 14227033723, then the following\ command will extract the line with the corresponding off-target details:\ curl -s -r 14227033723-14227043723 http://hgdownload.soe.ucsc.edu/gbdb/hg19/crispr/crisprDetails.tab | head -n1. The off-target details can currently not be joined with the table\ browser.
\ \\ The file crisprDetails.tab is a tab-separated text file with two fields. The\ first field contains the numbers of off-targets for each mismatch count, e.g. "0,0,1,3,49" \ means 0 off-targets at zero mismatches, 0 at one mismatch, 1 at two mismatches, 3 at three and 49\ off-targets at four mismatches. The second field is a pipe-separated list of\ semicolon-separated tuples with the genome coordinates and the CFD score. E.g.\ "chr10;123376795+;42|chr5;148353274-;39" describes two off-targets, with the\ first at chr10:123376795 on the positive strand and a CFD score of 0.42.
\ \Credits
\ \\ Track created by Maximilian Haeussler and Hiram Clawson, with helpful input from Jean-Paul Concordet (MNHN Paris) and Alberto Stolfi (NYU).\
\ \References
\ \\ Haeussler M, Schönig K, Eckert H, Eschstruth A, Mianné J, Renaud JB, Schneider-Maunoury S,\ Shkumatava A, Teboul L, Kent J et al.\ Evaluation of off-target and on-target scoring algorithms and integration into the\ guide RNA selection tool CRISPOR.\ Genome Biol. 2016 Jul 5;17(1):148.\ PMID: 27380939; PMC: PMC4934014\
\ \\ Bae S, Kweon J, Kim HS, Kim JS.\ \ Microhomology-based choice of Cas9 nuclease target sites.\ Nat Methods. 2014 Jul;11(7):705-6.\ PMID: 24972169\
\ \\ Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C,\ Orchard R et al.\ \ Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9.\ Nat Biotechnol. 2016 Feb;34(2):184-91.\ PMID: 26780180; PMC: PMC4744125\
\ \\ Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O\ et al.\ \ DNA targeting specificity of RNA-guided Cas9 nucleases.\ Nat Biotechnol. 2013 Sep;31(9):827-32.\ PMID: 23873081; PMC: PMC3969858\
\ \\ Moreno-Mateos MA, Vejnar CE, Beaudoin JD, Fernandez JP, Mis EK, Khokha MK, Giraldez AJ.\ \ CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo.\ Nat Methods. 2015 Oct;12(10):982-8.\ PMID: 26322839; PMC: PMC4589495\
\ genes 1 detailsTabUrls _offset=/gbdb/$db/crispr/crisprDetails.tab\ html crispr\ itemRgb on\ longLabel CRISPR/Cas9 -NGG Targets\ mouseOverField _mouseOver\ parent crispr\ scoreLabel MIT Guide Specificity Score\ shortLabel CRISPR Targets\ track crisprTargets\ type bigBed 9 +\ url http://crispor.tefor.net/crispor.py?org=$D&pos=$S:${&pam=NGG\ urlLabel Click here to show this guide on Crispor.org, with expression oligos, validation primers and more\ visibility dense\ ensGene Ensembl Genes genePred ensPep Ensembl Genes 0 100 150 0 0 202 127 127 0 0 0Description
\ \\ These gene predictions were generated by Ensembl.\
\ \\ For more information on the different gene tracks, see our Genes FAQ.
\ \Methods
\ \\ For a description of the methods used in Ensembl gene predictions, please refer to\ Hubbard et al. (2002), also listed in the References section below. \
\ \Data access
\\ Ensembl Gene data can be explored interactively using the\ Table Browser or the\ Data Integrator. \ For local downloads, the genePred format files for dm6 are available in our\ \ downloads directory as ensGene.txt.gz or in our\ \ genes download directory in GTF format.
\ \
\ For programmatic access, the data can be queried from the \ REST API or\ directly from our public MySQL\ servers. Instructions on this method are available on our\ MySQL help page and on\ our blog.\ Previous versions of this track can be found on our archive download server.\
\ \Credits
\ \\ We would like to thank Ensembl for providing these gene annotations. For more information, please see\ Ensembl's genome annotation page.\
\ \References
\ \\ Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J,\ Curwen V, Down T et al.\ The Ensembl genome database project.\ Nucleic Acids Res. 2002 Jan 1;30(1):38-41.\ PMID: 11752248; PMC: PMC99161\
\ genes 1 color 150,0,0\ exonNumbers on\ group genes\ longLabel Ensembl Genes\ shortLabel Ensembl Genes\ track ensGene\ type genePred ensPep\ visibility hide\ evaSnp EVA SNP Release 3 bigBed 9 + Short Genetic Variants from European Variant Archive Release 3 0 100 0 0 0 127 127 127 0 0 0 https://www.ebi.ac.uk/eva/?variant&accessionID=$$Description
\\ This track contains mappings of single nucleotide variants\ and small insertions and deletions (indels)\ from the European Variation Archive\ (EVA)\ Release 3 for the D. melanogaster dm6 genome. The dbSNP database at NCBI no longer\ hosts non-human variants.\
\ \Interpreting and Configuring the Graphical Display
\\ Variants are shown as single tick marks at most zoom levels.\ When viewing the track at or near base-level resolution, the displayed\ width of the SNP variant corresponds to the width of the variant in the\ reference sequence. Insertions are indicated by a single tick mark displayed\ between two nucleotides, single nucleotide polymorphisms are displayed as the\ width of a single base, and multiple nucleotide variants are represented by a\ block that spans two or more bases. The display is set to automatically collapse to \ dense visibility when there are more than 100k variants in the window. \ When the window size is more than 250k bp, the display is switched to density graph mode.\
\ \Searching, details, and filtering
\\ Navigation to an individual variant can be accomplished by typing or copying\ the variant identifier (rsID) or the genomic coordinates into the Position/Search box on the \ Browser.
\ \\ A click on an item in the graphical display displays a page with data about\ that variant. Data fields include the Reference and Alternate Alleles, the\ class of the variant as reported by EVA, the source of the data, the amino acid\ change, if any, and the functional class as determined by UCSC's Variant Annotation\ Integrator.\
\ \Variants can be filtered using the track controls to show subsets of the \ data by either EVA Sequence Ontology (SO) term, UCSC-generated functional effect, or\ by color, which bins the UCSC functional effects into general classes.
\ \Mouse-over
\\ Mousing over an item shows the ucscClass, which is the consequence according to the\ Variant Annotation Integrator, and\ the aaChange when one is available, which is the change in amino acid in HGVS.p\ terms. Items may have multiple ucscClasses, which will all be shown in the mouse-over\ in a comma-separated list. Likewise, multiple HGVS.p terms may be shown for each rsID\ separated by spaces describing all possible AA changes.
\\ Multiple items may appear due to different variant predictions on multiple gene transcripts.\ For all organisms the gene models used were ncbiRefSeqCurated, except for mm39 which\ used ncbiRefSeqSelect.
\ \ \Track colors
\ \\ Variants are colored according to the most potentially deleterious functional effect prediction\ according to the Variant Annotation Integrator. Specific bins can be seen in the Methods section\ below.\
\ \\
\
\ \ \\ \Color \Variant Type \\ 255,0,0: Protein-altering variants and splice site variants \ 0,128,0: Synonymous codon variants \ 0,0,255: Non-coding transcript or Untranslated Region (UTR) variants \ 0,0,0: Intergenic and intronic variants
\ \Sequence ontology (SO)
\ \\ Variants are classified by EVA into one of the following sequence ontology terms:\
\ \\
\ \ \- substitution —\ A single nucleotide in the reference is replaced by another, alternate allele\
- deletion — \ One or more nucleotides are deleted. The representation in the database is to\ display one additional nucleotide in both the Reference field (Ref) and the \ Alternate Allele field (Alt). E.g. a variant that is a deletion of an A\ may be represented as Ref = GA and Alt = G.\
- insertion — \ One or more nucleotides are inserted. The representation in the database is to\ display one additional nucleotide in both the Reference field (Ref) and the \ Alternate Allele field (Alt). E.g. a variant that is an insertion of a T may \ be represented as Ref = G and Alt = GT. \
- delins — \ Similar to tandemRepeat, in that the runs of Ref and Alt Alleles are of\ different length, except that there is more than one type of nucleotide,\ e.g., Ref = CCAAAAACAAAAACA, Alt = ACAAAAAC.\
- multipleNucleotideVariant — \ More than one nucleotide is substituted by an equal number of different \ nucleotides, e.g., Ref = AA, Alt = GC.\
- sequence alteration —\ A parent term meant to signify a deviation from another sequence. Can be\ assigned to variants that have not been characterized yet.\
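The Ref/Alt conventions in the list above can be illustrated with a small heuristic that guesses the class from the two allele strings. This is only a sketch for reading the data: EVA assigns the class itself, the function name is hypothetical, and the rules below are assumptions derived from the examples in the list (it assumes Ref and Alt differ and does not try to detect "sequence alteration").

```python
def guess_variant_class(ref: str, alt: str) -> str:
    """Guess the SO-style class of a variant from its Ref and Alt alleles."""
    if len(ref) == 1 and len(alt) == 1:
        return "substitution"
    if len(ref) == len(alt):
        # Equal-length replacement of more than one nucleotide, e.g. AA > GC.
        return "multipleNucleotideVariant"
    if len(ref) > len(alt) and ref.startswith(alt):
        # Shared leading base(s) kept, the rest removed, e.g. GA > G.
        return "deletion"
    if len(alt) > len(ref) and alt.startswith(ref):
        # Shared leading base(s) kept, new bases appended, e.g. G > GT.
        return "insertion"
    # Different lengths and no shared-prefix pattern.
    return "delins"


if __name__ == "__main__":
    examples = [("A", "G"), ("GA", "G"), ("G", "GT"), ("AA", "GC"),
                ("CCAAAAACAAAAACA", "ACAAAAAC")]
    for ref, alt in examples:
        print(f"Ref={ref} Alt={alt} -> {guess_variant_class(ref, alt)}")
```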
Methods
\\ Data were downloaded from the European Variation Archive EVA release 3 (2022-02-24)\ current_ids.vcf.gz files corresponding to the proper assembly.
\\ Chromosome names were converted to UCSC-style, a few problematic variants were removed,\ and the variants passed through the\ Variant Annotation Integrator to\ predict consequence. For every organism the ncbiRefSeqCurated gene models were used to\ predict the consequences, except for mm39 which used the ncbiRefSeqSelect models.
\\ Variants were then colored according to their predicted consequence in the following fashion:\
\
\ \ \- Protein-altering variants and \ splice site variants \ - exon_loss_variant, frameshift_variant, \ inframe_deletion, inframe_insertion, initiator_codon_variant, missense_variant, \ splice_acceptor_variant, splice_donor_variant, splice_region_variant, stop_gained, \ stop_lost, coding_sequence_variant, transcript_ablation
\- Synonymous codon variants\ - synonymous_variant, stop_retained_variant
\- Non-coding transcript or\ Untranslated Region (UTR) variants\ - 5_prime_UTR_variant,\ 3_prime_UTR_variant, complex_transcript_variant, non_coding_transcript_exon_variant
\- Intergenic and intronic variants - upstream_gene_variant, downstream_gene_variant,\ intron_variant, intergenic_variant, NMD_transcript_variant, no_sequence_alteration
\ Sequence Ontology ("SO:")\ terms were converted to the variant classes, then the files were converted to BED,\ and then bigBed format.\
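The binning above can be sketched as a lookup that returns the color of the most severe bin among a variant's predicted consequences. This is an illustration rather than the code used to build the track: the names are hypothetical, the RGB values are the ones listed in this track's itemRgb filter settings, and treating the order of the four bins as decreasing severity is an assumption.

```python
from typing import Iterable, Tuple

# Bins in assumed order of decreasing severity: (label, RGB, member terms).
COLOR_BINS = [
    ("Protein-altering and splice variants", (255, 0, 0), {
        "exon_loss_variant", "frameshift_variant", "inframe_deletion",
        "inframe_insertion", "initiator_codon_variant", "missense_variant",
        "splice_acceptor_variant", "splice_donor_variant",
        "splice_region_variant", "stop_gained", "stop_lost",
        "coding_sequence_variant", "transcript_ablation"}),
    ("Synonymous variants", (0, 128, 0), {
        "synonymous_variant", "stop_retained_variant"}),
    ("Non-coding transcripts or UTR variants", (0, 0, 255), {
        "5_prime_UTR_variant", "3_prime_UTR_variant",
        "complex_transcript_variant", "non_coding_transcript_exon_variant"}),
    ("Intergenic and intronic variants", (0, 0, 0), {
        "upstream_gene_variant", "downstream_gene_variant", "intron_variant",
        "intergenic_variant", "NMD_transcript_variant",
        "no_sequence_alteration"}),
]


def variant_color(ucsc_classes: Iterable[str]) -> Tuple[str, Tuple[int, int, int]]:
    """Return (bin label, RGB) for the most severe bin hit by any term."""
    terms = set(ucsc_classes)
    for label, rgb, members in COLOR_BINS:
        if terms & members:
            return label, rgb
    raise ValueError(f"no known consequence term in {sorted(terms)}")


if __name__ == "__main__":
    # A variant annotated on two transcripts with different consequences.
    print(variant_color(["intron_variant", "missense_variant"]))
```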
\\ No functional annotations were provided by the EVA (e.g., missense, nonsense, etc).\ These were computed using UCSC's Variant Annotation Integrator (Hinrichs, et al., 2016).\ Amino-acid substitutions for missense variants are based\ on RefSeq alignments of mRNA transcripts, which do not always match the amino acids\ predicted from translating the genomic sequence. Therefore, in some instances, the\ variant and the genomic nucleotide and associated amino acid may be reversed.\ E.g., a Pro > Arg change from the perspective of the mRNA would be Arg > Pro from\ the perspective of the genomic sequence.\ For complete documentation of the processing of these tracks, read the\ \ EVA Release 3 MakeDoc.
\ \Data Access
\\ Note: It is not recommended to use LiftOver to convert SNPs between assemblies.\ More information about how to convert SNPs between assemblies can be found in the following\ FAQ entry.
\\ The data can be explored interactively with the Table Browser,\ or the Data Integrator. For automated analysis, the data may be\ queried from our REST API. Please refer to our\ mailing list archives\ for questions, or our Data Access FAQ for more\ information.
\ \\ For automated download and analysis, this annotation is stored in a bigBed file that\ can be downloaded from our download server. The file for this track is called evaSnp.bb.\ Individual regions or the whole genome annotation can be obtained using our tool\ bigBedToBed which can be compiled from the source code or downloaded as a precompiled\ binary for your system. Instructions for downloading source code and binaries can be found\ here.\ The tool can also be used to obtain only features within a given range, e.g.\
\ \
\ bigBedToBed https://hgdownload.soe.ucsc.edu/gbdb/dm6/bbi/evaSnp.bb -chrom=chr2L -start=0 -end=100000000 stdout\Credits
\\ This track was produced from the European\ Variation Archive release 3 data. Consequences were predicted using UCSC's Variant Annotation\ Integrator and NCBI's RefSeq gene models. \
\ \References
\\ Cezard T, Cunningham F, Hunt SE, Koylass B, Kumar N, Saunders G, Shen A, Silva AF,\ Tsukanov K, Venkataraman S et al. The European Variation Archive: a FAIR resource of genomic variation for all\ species. Nucleic Acids Res. 2021 Oct 28:gkab960.\ doi:10.1093/nar/gkab960.\ Epub ahead of print. PMID: 34718739; PMCID: PMC8728205.\
\\ Hinrichs AS, Raney BJ, Speir ML, Rhead B, Casper J, Karolchik D, Kuhn RM, Rosenbloom KR, Zweig AS,\ Haussler D, Kent WJ.\ UCSC Data Integrator and Variant Annotation Integrator.\ Bioinformatics. 2016 May 1;32(9):1430-2.\ PMID: 26740527; PMCID:\ PMC4848401\
\ varRep 1 bigDataUrl /gbdb/dm6/bbi/evaSnp.bb\ filterLabel.itemRgb General variant types by color grouping\ filterLabel.ucscClass Functional effect per UCSC Variant Annotation\ filterLabel.varClass Variant class from EVA SO term\ filterType.ucscClass multipleListOnlyOr\ filterValues.itemRgb 255,,0,,0|Protein-altering and splice variants,0,,128,,0|Synonymous variants,0,,0,,255|Non-coding transcripts or UTR variants,0,,0,,0|Intergenic and intronic variants\ filterValues.ucscClass downstream_gene_variant|Downstream gene variant,upstream_gene_variant|Upstream gene variant,intron_variant|Intron variant,NMD_transcript_variant|Nonsense-mediated mRNA decay (NMD) variant,5_prime_UTR_variant|5 prime UTR variant,3_prime_UTR_variant|3 prime UTR variant,missense_variant|Missense variant,synonymous_variant|Synonymous variant,non_coding_transcript_exon_variant|Non-coding transcript exon variant,no_sequence_alteration|No sequence alteration,splice_region_variant|Splice region variant,frameshift_variant|Frameshift variant,stop_gained|Stop gained,splice_acceptor_variant|Splice acceptor variant,inframe_deletion|Inframe deletion,inframe_insertion|Inframe insertion,splice_donor_variant|Splice donor variant,coding_sequence_variant|Coding sequence variant,initiator_codon_variant|Initiator codon variant,stop_lost|Stop lost,stop_retained_variant|Stop retained variant,intergenic_variant|Intergenic variant\ filterValues.varClass deletion|Deletion,delins|Deletion-Insertion,insertion|Insertion,multipleNucleotideSubstitution|Multiple nucleotide substitution,substitution|Substitution,sequence alteration|Sequence alteration\ group varRep\ itemRgb on\ longLabel Short Genetic Variants from European Variant Archive Release 3\ maxItems 1000000\ maxWindowCoverage 250000\ mouseOver $ref>$alt $ucscClass $aaChange\ shortLabel EVA SNP Release 3\ track evaSnp\ type bigBed 9 +\ url https://www.ebi.ac.uk/eva/?variant&accessionID=$$\ visibility hide\ evaSnp4 EVA SNP Release 4 bigBed 9 + Short Genetic Variants 
from European Variant Archive Release 4 0 100 0 0 0 127 127 127 0 0 0 https://www.ebi.ac.uk/eva/?variant&accessionID=$$Description
\\ This track contains mappings of single nucleotide variants\ and small insertions and deletions (indels)\ from the European Variation Archive\ (EVA)\ Release 4 for the D. melanogaster dm6 genome. The dbSNP database at NCBI no longer\ hosts non-human variants.\
\ \Interpreting and Configuring the Graphical Display
\\ Variants are shown as single tick marks at most zoom levels.\ When viewing the track at or near base-level resolution, the displayed\ width of the SNP variant corresponds to the width of the variant in the\ reference sequence. Insertions are indicated by a single tick mark displayed\ between two nucleotides, single nucleotide polymorphisms are displayed as the\ width of a single base, and multiple nucleotide variants are represented by a\ block that spans two or more bases. The display is set to automatically collapse to \ dense visibility when there are more than 100k variants in the window. \ When the window size is more than 250k bp, the display is switched to density graph mode.\
\ \Searching, details, and filtering
\\ Navigation to an individual variant can be accomplished by typing or copying\ the variant identifier (rsID) or the genomic coordinates into the Position/Search box on the \ Browser.
\ \\ A click on an item in the graphical display displays a page with data about\ that variant. Data fields include the Reference and Alternate Alleles, the\ class of the variant as reported by EVA, the source of the data, the amino acid\ change, if any, and the functional class as determined by UCSC's Variant Annotation\ Integrator.\
\ \Variants can be filtered using the track controls to show subsets of the \ data by either EVA Sequence Ontology (SO) term, UCSC-generated functional effect, or\ by color, which bins the UCSC functional effects into general classes.
\ \Mouse-over
\\ Mousing over an item shows the ucscClass, which is the consequence according to the\ Variant Annotation Integrator, and\ the aaChange when one is available, which is the change in amino acid in HGVS.p\ terms. Items may have multiple ucscClasses, which will all be shown in the mouse-over\ in a comma-separated list. Likewise, multiple HGVS.p terms may be shown for each rsID\ separated by spaces describing all possible AA changes.
\\ Multiple items may appear due to different variant predictions on multiple gene transcripts.\ For all organisms, the gene models used were NCBI RefSeq curated models when available; if not, \ Ensembl genes; or finally UCSC mappings of RefSeq if neither of the previous models was available.\
\ \Track colors
\ \\ Variants are colored according to the most potentially deleterious functional effect prediction\ according to the Variant Annotation Integrator. Specific bins can be seen in the Methods section\ below.\
\ \\
\
\ \ \\ \Color \Variant Type \\ Protein-altering variants and splice site variants \ Synonymous codon variants \ Non-coding transcript or Untranslated Region (UTR) variants \ Intergenic and intronic variants Sequence ontology (SO)
\ \\ Variants are classified by EVA into one of the following sequence ontology terms:\
\ \\
\ \ \- substitution —\ A single nucleotide in the reference is replaced by another, alternate allele\
- deletion — \ One or more nucleotides are deleted. The representation in the database is to\ display one additional nucleotide in both the Reference field (Ref) and the \ Alternate Allele field (Alt). E.g., a variant that is a deletion of an A\ may be represented as Ref = GA and Alt = G.\
- insertion — \ One or more nucleotides are inserted. The representation in the database is to\ display one additional nucleotide in both the Reference field (Ref) and the \ Alternate Allele field (Alt). E.g., a variant that is an insertion of a T may \ be represented as Ref = G and Alt = GT. \
- delins — \ Similar to tandemRepeat, in that the runs of Ref and Alt Alleles are of\ different length, except that there is more than one type of nucleotide,\ e.g., Ref = CCAAAAACAAAAACA, Alt = ACAAAAAC.\
- multipleNucleotideVariant — \ More than one nucleotide is substituted by an equal number of different \ nucleotides, e.g., Ref = AA, Alt = GC.\
- sequence alteration —\ A parent term meant to signify a deviation from another sequence. Can be\ assigned to variants that have not been characterized yet.\
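The Ref/Alt conventions listed above can be illustrated with a short Python sketch. This is not EVA's classification code: the function name `classify_variant` and its decision rules are assumptions, inferred only from the examples given in the list (a shared leading base for insertions and deletions, equal lengths for multi-nucleotide changes).

```python
def classify_variant(ref: str, alt: str) -> str:
    """Guess a variant class from VCF-style Ref/Alt strings (illustrative rules only)."""
    ref, alt = ref.upper(), alt.upper()
    # One base replaced by one different base.
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "substitution"
    # Several bases replaced by the same number of different bases.
    if len(ref) == len(alt) and len(ref) > 1 and ref != alt:
        return "multipleNucleotideVariant"
    # Indels keep the unchanged leading base(s) in both Ref and Alt.
    if len(ref) > len(alt) and ref.startswith(alt):
        return "deletion"
    if len(alt) > len(ref) and alt.startswith(ref):
        return "insertion"
    # Different lengths without a simple shared prefix.
    if len(ref) != len(alt):
        return "delins"
    # Anything else falls back to the parent term.
    return "sequence alteration"


if __name__ == "__main__":
    for ref, alt in [("GA", "G"), ("G", "GT"), ("AA", "GC")]:
        print(ref, alt, classify_variant(ref, alt))
```

The examples from the list (Ref = GA, Alt = G; Ref = G, Alt = GT; Ref = AA, Alt = GC) fall into the deletion, insertion and multiple-nucleotide classes under these rules.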
Methods
\\ Data were downloaded from the European Variation Archive EVA release 4 (2022-11-21)\ current_ids.vcf.gz files corresponding to the proper assembly.
\\ Chromosome names were converted to UCSC-style\ and the variants were passed through the\ Variant Annotation Integrator to\ predict consequence. For every organism, the NCBI RefSeq curated models were used when available, \ followed by Ensembl genes, and finally UCSC mapping of RefSeq when neither of the previous models\ was available.
\\ Variants were then colored according to their predicted consequence in the following fashion:\
\
\ \ \- Protein-altering variants and \ splice site variants \ - exon_loss_variant, frameshift_variant, \ inframe_deletion, inframe_insertion, initiator_codon_variant, missense_variant, \ splice_acceptor_variant, splice_donor_variant, splice_region_variant, stop_gained, \ stop_lost, coding_sequence_variant, transcript_ablation
\- Synonymous codon variants\ - synonymous_variant, stop_retained_variant
\- Non-coding transcript or\ Untranslated Region (UTR) variants\ - 5_prime_UTR_variant,\ 3_prime_UTR_variant, complex_transcript_variant, non_coding_transcript_exon_variant
\- Intergenic and intronic variants - upstream_gene_variant, downstream_gene_variant,\ intron_variant, intergenic_variant, NMD_transcript_variant, no_sequence_alteration
\ Sequence Ontology ("SO:")\ terms were converted to the variant classes, then the files were converted to BED,\ and then bigBed format.\
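The binning above can be sketched in Python as follows. The RGB values are taken from this track's `filterValues.itemRgb` setting and the term lists from the bins above; the function name `item_rgb` and the assumption that bins are ranked from most to least deleterious in the order listed are illustrative, not a copy of the UCSC pipeline.

```python
# Bins in the order listed above; treated here as most to least deleterious (assumption).
COLOR_BINS = [
    ((255, 0, 0), {  # protein-altering and splice site variants
        "exon_loss_variant", "frameshift_variant", "inframe_deletion",
        "inframe_insertion", "initiator_codon_variant", "missense_variant",
        "splice_acceptor_variant", "splice_donor_variant", "splice_region_variant",
        "stop_gained", "stop_lost", "coding_sequence_variant", "transcript_ablation",
    }),
    ((0, 128, 0), {"synonymous_variant", "stop_retained_variant"}),
    ((0, 0, 255), {  # non-coding transcript or UTR variants
        "5_prime_UTR_variant", "3_prime_UTR_variant",
        "complex_transcript_variant", "non_coding_transcript_exon_variant",
    }),
    ((0, 0, 0), {  # intergenic and intronic variants
        "upstream_gene_variant", "downstream_gene_variant", "intron_variant",
        "intergenic_variant", "NMD_transcript_variant", "no_sequence_alteration",
    }),
]


def item_rgb(ucsc_class: str) -> tuple:
    """Return the RGB color for a comma-separated list of consequence terms."""
    terms = {t.strip() for t in ucsc_class.split(",") if t.strip()}
    for rgb, bin_terms in COLOR_BINS:
        # The first bin that contains any of the terms decides the color.
        if terms & bin_terms:
            return rgb
    raise ValueError(f"no known consequence term in {ucsc_class!r}")
```

A variant annotated as both `intron_variant` and `missense_variant` is therefore drawn in red under this sketch, because the protein-altering bin is checked first.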
\\ No functional annotations were provided by the EVA (e.g., missense, nonsense, etc).\ These were computed using UCSC's Variant Annotation Integrator (Hinrichs et al., 2016).\ Amino-acid substitutions for missense variants are based\ on RefSeq alignments of mRNA transcripts, which do not always match the amino acids\ predicted from translating the genomic sequence. Therefore, in some instances, the\ variant and the genomic nucleotide and associated amino acid may be reversed.\ E.g., a Pro > Arg change from the perspective of the mRNA would be Arg > Pro from\ the perspective of the genomic sequence. Also, in bosTau9, galGal5, rheMac8, \ danRer10 and danRer11 the mitochondrial sequence was removed or renamed to match UCSC. \ For complete documentation of the processing of these tracks, read the\ \ EVA Release 4 MakeDoc.
\ \Data Access
\\ Note: It is not recommended to use LiftOver to convert SNPs between assemblies;\ more information about how to convert SNPs between assemblies can be found in the following\ FAQ entry.
\\ The data can be explored interactively with the Table Browser,\ or the Data Integrator. For automated analysis, the data may be\ queried from our REST API. Please refer to our\ mailing list archives\ for questions, or our Data Access FAQ for more\ information.
\ \\ For automated download and analysis, this annotation is stored in a bigBed file that\ can be downloaded from our download server. The file for this track is called evaSnp4.bb.\ Individual regions or the whole genome annotation can be obtained using our tool\ bigBedToBed which can be compiled from the source code or downloaded as a precompiled\ binary for your system. Instructions for downloading source code and binaries can be found\ here.\ The tool can also be used to obtain only features within a given range, e.g.\
\ \
\ bigBedToBed https://hgdownload.soe.ucsc.edu/gbdb/dm6/bbi/evaSnp4.bb -chrom=chr2L -start=0 -end=100000000 stdout\Credits
\\ This track was produced from the European\ Variation Archive release 4 data. Consequences were predicted using UCSC's Variant Annotation\ Integrator and NCBI's RefSeq as well as Ensembl gene models. \
\ \References
\\ Cezard T, Cunningham F, Hunt SE, Koylass B, Kumar N, Saunders G, Shen A, Silva AF,\ Tsukanov K, Venkataraman S et al. The European Variation Archive: a FAIR resource of genomic variation for all\ species. Nucleic Acids Res. 2021 Oct 28:gkab960.\ doi:10.1093/nar/gkab960.\ Epub ahead of print. PMID: 34718739; PMCID: PMC8728205.\
\\ Hinrichs AS, Raney BJ, Speir ML, Rhead B, Casper J, Karolchik D, Kuhn RM, Rosenbloom KR, Zweig AS,\ Haussler D, Kent WJ.\ UCSC Data Integrator and Variant Annotation Integrator.\ Bioinformatics. 2016 May 1;32(9):1430-2.\ PMID: 26740527; PMCID:\ PMC4848401\
\ varRep 1 bigDataUrl /gbdb/dm6/bbi/evaSnp4.bb\ filterLabel.itemRgb General variant types by color grouping\ filterLabel.ucscClass Functional effect per UCSC Variant Annotation\ filterLabel.varClass Variant class from EVA SO term\ filterType.ucscClass multipleListOnlyOr\ filterValues.itemRgb 255,,0,,0|Protein-altering and splice variants,0,,128,,0|Synonymous variants,0,,0,,255|Non-coding transcripts or UTR variants,0,,0,,0|Intergenic and intronic variants\ filterValues.ucscClass downstream_gene_variant|Downstream gene variant,upstream_gene_variant|Upstream gene variant,intron_variant|Intron variant,NMD_transcript_variant|Nonsense-mediated mRNA decay (NMD) variant,5_prime_UTR_variant|5 prime UTR variant,3_prime_UTR_variant|3 prime UTR variant,missense_variant|Missense variant,synonymous_variant|Synonymous variant,non_coding_transcript_exon_variant|Non-coding transcript exon variant,no_sequence_alteration|No sequence alteration,splice_region_variant|Splice region variant,frameshift_variant|Frameshift variant,stop_gained|Stop gained,splice_acceptor_variant|Splice acceptor variant,inframe_deletion|Inframe deletion,inframe_insertion|Inframe insertion,splice_donor_variant|Splice donor variant,coding_sequence_variant|Coding sequence variant,initiator_codon_variant|Initiator codon variant,stop_lost|Stop lost,stop_retained_variant|Stop retained variant,intergenic_variant|Intergenic variant\ filterValues.varClass deletion|Deletion,delins|Deletion-Insertion,insertion|Insertion,multipleNucleotideSubstitution|Multiple nucleotide substitution,substitution|Substitution,sequence alteration|Sequence alteration\ group varRep\ itemRgb on\ longLabel Short Genetic Variants from European Variant Archive Release 4\ maxItems 1000000\ maxWindowCoverage 250000\ mouseOver $ref>$alt $ucscClass $aaChange\ shortLabel EVA SNP Release 4\ track evaSnp4\ type bigBed 9 +\ url https://www.ebi.ac.uk/eva/?variant&accessionID=$$\ visibility hide\ evaSnp5 EVA SNP Release 5 bigBed 9 + Short Genetic 
Variants from European Variant Archive Release 5 1 100 0 0 0 127 127 127 0 0 0 https://www.ebi.ac.uk/eva/?variant&accessionID=$$Description
\\ This track contains mappings of single nucleotide variants\ and small insertions and deletions (indels)\ from the European Variation Archive\ (EVA)\ Release 5 for the D. melanogaster dm6 genome. The dbSNP database at NCBI no longer\ hosts non-human variants.\
\ \Interpreting and Configuring the Graphical Display
\\ Variants are shown as single tick marks at most zoom levels.\ When viewing the track at or near base-level resolution, the displayed\ width of the SNP variant corresponds to the width of the variant in the\ reference sequence. Insertions are indicated by a single tick mark displayed\ between two nucleotides, single nucleotide polymorphisms are displayed as the\ width of a single base, and multiple nucleotide variants are represented by a\ block that spans two or more bases. The display is set to automatically collapse to \ dense visibility when there are more than 100k variants in the window. \ When the window size is more than 250k bp, the display is switched to density graph mode.\
\ \Searching, details, and filtering
\\ Navigation to an individual variant can be accomplished by typing or copying\ the variant identifier (rsID) or the genomic coordinates into the Position/Search box on the \ Browser.
\ \\ A click on an item in the graphical display displays a page with data about\ that variant. Data fields include the Reference and Alternate Alleles, the\ class of the variant as reported by EVA, the source of the data, the amino acid\ change, if any, and the functional class as determined by UCSC's Variant Annotation\ Integrator.\
\ \Variants can be filtered using the track controls to show subsets of the \ data by either EVA Sequence Ontology (SO) term, UCSC-generated functional effect, or\ by color, which bins the UCSC functional effects into general classes.
\ \Mouse-over
\\ Mousing over an item shows the ucscClass, which is the consequence according to the\ Variant Annotation Integrator, and\ the aaChange when one is available, which is the change in amino acid in HGVS.p\ terms. Items may have multiple ucscClasses, which will all be shown in the mouse-over\ in a comma-separated list. Likewise, multiple HGVS.p terms may be shown for each rsID\ separated by spaces describing all possible AA changes.
\\ Multiple items may appear due to different variant predictions on multiple gene transcripts.\ For all organisms, the gene models used were NCBI RefSeq curated models when available; if not, \ Ensembl genes; or finally UCSC mappings of RefSeq if neither of the previous models was available.\
\ \Track colors
\ \\ Variants are colored according to the most potentially deleterious functional effect prediction\ according to the Variant Annotation Integrator. Specific bins can be seen in the Methods section\ below.\
\ \\
\
\ \ \\ \Color \Variant Type \\ Protein-altering variants and splice site variants \ Synonymous codon variants \ Non-coding transcript or Untranslated Region (UTR) variants \ Intergenic and intronic variants Sequence ontology (SO)
\ \\ Variants are classified by EVA into one of the following sequence ontology terms:\
\ \\
\ \ \- substitution —\ A single nucleotide in the reference is replaced by another, alternate allele\
- deletion — \ One or more nucleotides are deleted. The representation in the database is to\ display one additional nucleotide in both the Reference field (Ref) and the \ Alternate Allele field (Alt). E.g., a variant that is a deletion of an A\ may be represented as Ref = GA and Alt = G.\
- insertion — \ One or more nucleotides are inserted. The representation in the database is to\ display one additional nucleotide in both the Reference field (Ref) and the \ Alternate Allele field (Alt). E.g., a variant that is an insertion of a T may \ be represented as Ref = G and Alt = GT. \
- delins — \ Similar to tandemRepeat, in that the runs of Ref and Alt Alleles are of\ different length, except that there is more than one type of nucleotide,\ e.g., Ref = CCAAAAACAAAAACA, Alt = ACAAAAAC.\
- multipleNucleotideVariant — \ More than one nucleotide is substituted by an equal number of different \ nucleotides, e.g., Ref = AA, Alt = GC.\
- sequence alteration —\ A parent term meant to signify a deviation from another sequence. Can be\ assigned to variants that have not been characterized yet.\
Methods
\\ Data were downloaded from the European Variation Archive EVA release 5 (2023-9-7)\ current_ids.vcf.gz files corresponding to the proper assembly.
\\ Chromosome names were converted to UCSC-style\ and the variants were passed through the\ Variant Annotation Integrator to\ predict consequence. For every organism, the NCBI RefSeq curated models were used when available, \ followed by Ensembl genes, and finally UCSC mapping of RefSeq when neither of the previous models\ was available.
\\ Variants were then colored according to their predicted consequence in the following fashion:\
\
\ \ \- Protein-altering variants and \ splice site variants \ - exon_loss_variant, frameshift_variant, \ inframe_deletion, inframe_insertion, initiator_codon_variant, missense_variant, \ splice_acceptor_variant, splice_donor_variant, splice_region_variant, stop_gained, \ stop_lost, coding_sequence_variant, transcript_ablation
\- Synonymous codon variants\ - synonymous_variant, stop_retained_variant
\- Non-coding transcript or\ Untranslated Region (UTR) variants\ - 5_prime_UTR_variant,\ 3_prime_UTR_variant, complex_transcript_variant, non_coding_transcript_exon_variant
\- Intergenic and intronic variants - upstream_gene_variant, downstream_gene_variant,\ intron_variant, intergenic_variant, NMD_transcript_variant, no_sequence_alteration
\ Sequence Ontology ("SO:")\ terms were converted to the variant classes, then the files were converted to BED,\ and then bigBed format.\
\\ No functional annotations were provided by the EVA (e.g., missense, nonsense, etc).\ These were computed using UCSC's Variant Annotation Integrator (Hinrichs et al., 2016).\ Amino-acid substitutions for missense variants are based\ on RefSeq alignments of mRNA transcripts, which do not always match the amino acids\ predicted from translating the genomic sequence. Therefore, in some instances, the\ variant and the genomic nucleotide and associated amino acid may be reversed.\ E.g., a Pro > Arg change from the perspective of the mRNA would be Arg > Pro from\ the perspective of the genomic sequence. Also, in bosTau9, galGal5, rheMac8, \ danRer10 and danRer11 the mitochondrial sequence was removed or renamed to match UCSC. \ For complete documentation of the processing of these tracks, read the\ \ EVA Release 5 MakeDoc.
\ \Data Access
\\ Note: It is not recommended to use LiftOver to convert SNPs between assemblies;\ more information about how to convert SNPs between assemblies can be found in the following\ FAQ entry.
\\ The data can be explored interactively with the Table Browser,\ or the Data Integrator. For automated analysis, the data may be\ queried from our REST API. Please refer to our\ mailing list archives\ for questions, or our Data Access FAQ for more\ information.
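As a rough illustration of a REST API query, the Python sketch below only builds a request URL for a region of this track. The endpoint `https://api.genome.ucsc.edu/getData/track` and the parameter names `genome`, `track`, `chrom`, `start` and `end` are given here as recalled from the public API documentation and should be checked against it; the helper name `eva_track_url` and the chosen region are hypothetical.

```python
API_BASE = "https://api.genome.ucsc.edu/getData/track"  # assumed endpoint; verify in the API docs


def eva_track_url(chrom: str, start: int, end: int,
                  track: str = "evaSnp5", genome: str = "dm6") -> str:
    """Build a REST API URL for one region of an EVA SNP track (0-based, half-open)."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    params = {"genome": genome, "track": track,
              "chrom": chrom, "start": start, "end": end}
    # Parameters are joined with semicolons, the separator used in the API's own examples.
    query = ";".join(f"{key}={value}" for key, value in params.items())
    return f"{API_BASE}?{query}"


if __name__ == "__main__":
    print(eva_track_url("chr2L", 0, 100000))
    # prints https://api.genome.ucsc.edu/getData/track?genome=dm6;track=evaSnp5;chrom=chr2L;start=0;end=100000
```

The resulting URL can be fetched with any HTTP client (for example `urllib.request.urlopen`) and the response decoded as JSON.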
\ \\ For automated download and analysis, this annotation is stored in a bigBed file that\ can be downloaded from our download server. The file for this track is called evaSnp5.bb.\ Individual regions or the whole genome annotation can be obtained using our tool\ bigBedToBed which can be compiled from the source code or downloaded as a precompiled\ binary for your system. Instructions for downloading source code and binaries can be found\ here.\ The tool can also be used to obtain only features within a given range, e.g.\
\ \
\ bigBedToBed https://hgdownload.soe.ucsc.edu/gbdb/dm6/bbi/evaSnp5.bb -chrom=chr2L -start=0 -end=100000000 stdout\Credits
\\ This track was produced from the European\ Variation Archive release 5 data. Consequences were predicted using UCSC's Variant Annotation\ Integrator and NCBI's RefSeq as well as Ensembl gene models. \
\ \References
\\ Cezard T, Cunningham F, Hunt SE, Koylass B, Kumar N, Saunders G, Shen A, Silva AF,\ Tsukanov K, Venkataraman S et al. The European Variation Archive: a FAIR resource of genomic variation for all\ species. Nucleic Acids Res. 2021 Oct 28:gkab960.\ doi:10.1093/nar/gkab960.\ Epub ahead of print. PMID: 34718739; PMCID: PMC8728205.\
\\ Hinrichs AS, Raney BJ, Speir ML, Rhead B, Casper J, Karolchik D, Kuhn RM, Rosenbloom KR, Zweig AS,\ Haussler D, Kent WJ.\ UCSC Data Integrator and Variant Annotation Integrator.\ Bioinformatics. 2016 May 1;32(9):1430-2.\ PMID: 26740527; PMCID:\ PMC4848401\
\ varRep 1 bigDataUrl /gbdb/dm6/bbi/evaSnp5.bb\ filterLabel.itemRgb General variant types by color grouping\ filterLabel.ucscClass Functional effect per UCSC Variant Annotation\ filterLabel.varClass Variant class from EVA SO term\ filterType.ucscClass multipleListOnlyOr\ filterValues.itemRgb 255,,0,,0|Protein-altering and splice variants,0,,128,,0|Synonymous variants,0,,0,,255|Non-coding transcripts or UTR variants,0,,0,,0|Intergenic and intronic variants\ filterValues.ucscClass downstream_gene_variant|Downstream gene variant,upstream_gene_variant|Upstream gene variant,intron_variant|Intron variant,NMD_transcript_variant|Nonsense-mediated mRNA decay (NMD) variant,5_prime_UTR_variant|5 prime UTR variant,3_prime_UTR_variant|3 prime UTR variant,missense_variant|Missense variant,synonymous_variant|Synonymous variant,non_coding_transcript_exon_variant|Non-coding transcript exon variant,no_sequence_alteration|No sequence alteration,splice_region_variant|Splice region variant,frameshift_variant|Frameshift variant,stop_gained|Stop gained,splice_acceptor_variant|Splice acceptor variant,inframe_deletion|Inframe deletion,inframe_insertion|Inframe insertion,splice_donor_variant|Splice donor variant,coding_sequence_variant|Coding sequence variant,initiator_codon_variant|Initiator codon variant,stop_lost|Stop lost,stop_retained_variant|Stop retained variant,intergenic_variant|Intergenic variant\ filterValues.varClass deletion|Deletion,delins|Deletion-Insertion,insertion|Insertion,multipleNucleotideSubstitution|Multiple nucleotide substitution,substitution|Substitution,sequence alteration|Sequence alteration\ group varRep\ itemRgb on\ longLabel Short Genetic Variants from European Variant Archive Release 5\ maxItems 1000000\ maxWindowCoverage 250000\ mouseOver $ref>$alt $ucscClass $aaChange\ shortLabel EVA SNP Release 5\ track evaSnp5\ type bigBed 9 +\ url https://www.ebi.ac.uk/eva/?variant&accessionID=$$\ visibility dense\ gap Gap bed 3 + Gap Locations 0 100 0 0 0 127 127 127 0 
0 0Description
\\ This track shows the gaps in the Aug. 2014 D. melanogaster genome assembly.\
\\ Genome assembly procedures are covered in the NCBI\ assembly documentation.
\
\ NCBI also provides\ specific information about this assembly.\\ The definition of the gaps in this assembly is from the\ AGP file delivered with the sequence. The NCBI document\ AGP Specification describes the format of the AGP file.\
\\ Gaps are represented as black boxes in this track.\ If the relative order and orientation of the contigs on either side\ of the gap is supported by read pair data, \ it is a bridged gap and a white line is drawn \ through the black box representing the gap. \
\This assembly contains the following principal types of gaps:\
\
\ map 1 group map\ html gap\ longLabel Gap Locations\ shortLabel Gap\ track gap\ type bed 3 +\ visibility hide\ gc5BaseBw GC Percent bigWig 0 100 GC Percent in 5-Base Windows 0 100 0 0 0 128 128 128 0 0 0- other - gaps added at UCSC to annotate strings of Ns that were not marked in the AGP file (count: 572; size range: 10 - 53,860 bases)
\Description
\\ The GC percent track shows the percentage of G (guanine) and C (cytosine) bases\ in 5-base windows. High GC content is typically associated with\ gene-rich areas.\
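For readers who want to reproduce the measurement on their own sequence, here is a minimal Python sketch of GC percent in 5-base windows. It assumes non-overlapping windows and drops a trailing partial window; the function name `gc_percent_windows` is illustrative and this is not the program used to build the track.

```python
def gc_percent_windows(seq: str, window: int = 5) -> list:
    """Return the GC percentage of each consecutive, non-overlapping window."""
    seq = seq.upper()
    percents = []
    # Step through the sequence one full window at a time; a short tail is ignored.
    for i in range(0, len(seq) - window + 1, window):
        chunk = seq[i:i + window]
        gc_count = sum(base in "GC" for base in chunk)
        percents.append(100.0 * gc_count / window)
    return percents


if __name__ == "__main__":
    print(gc_percent_windows("GGCCAATATA"))  # prints [80.0, 0.0]
```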
\\ This track may be configured in a variety of ways to highlight different\ aspects of the displayed information. Click the\ "Graph configuration help"\ link for an explanation of the configuration options.\ \
Credits
\The data and presentation of this graph were prepared by\ Hiram Clawson.\
\ \ map 0 altColor 128,128,128\ autoScale Off\ color 0,0,0\ graphTypeDefault Bar\ gridDefault OFF\ group map\ html gc5Base\ longLabel GC Percent in 5-Base Windows\ maxHeightPixels 128:36:16\ shortLabel GC Percent\ track gc5BaseBw\ type bigWig 0 100\ viewLimits 30:70\ visibility hide\ windowingFunction Mean\ genscan Genscan Genes genePred genscanPep Genscan Gene Predictions 0 100 170 100 0 212 177 127 0 0 0Description
\ \\ This track shows predictions from the\ Genscan program\ written by Chris Burge.\ The predictions are based on transcriptional, translational and donor/acceptor\ splicing signals as well as the length and compositional distributions of exons,\ introns and intergenic regions.\
\ \\ For more information on the different gene tracks, see our Genes FAQ.
\ \Display Conventions and Configuration
\ \\ This track follows the display conventions for\ gene prediction\ tracks.\
\ \\ The track description page offers the following filter and configuration\ options:\
\
\ \ \- Color track by codons: Select the genomic codons option\ to color and label each codon in a zoomed-in display to facilitate validation\ and comparison of gene predictions. Go to the\ \ Coloring Gene Predictions and Annotations by Codon page for more\ information about this feature.
\Methods
\ \\ For a description of the Genscan program and the model that underlies it,\ refer to Burge and Karlin (1997) in the References section below.\ The splice site models used are described in more detail in Burge (1998)\ below.\
\ \Credits
\ \ Thanks to Chris Burge for providing the Genscan program.\ \References
\ \\ Burge C.\ Modeling Dependencies in Pre-mRNA Splicing Signals.\ In: Salzberg S, Searls D, Kasif S, editors.\ Computational Methods in Molecular Biology.\ Amsterdam: Elsevier Science; 1998. p. 127-163.\
\ \\ Burge C, Karlin S.\ \ Prediction of complete gene structures in human genomic DNA.\ J. Mol. Biol. 1997 Apr 25;268(1):78-94.\ PMID: 9149143\
\ genes 1 color 170,100,0\ group genes\ longLabel Genscan Gene Predictions\ shortLabel Genscan Genes\ track genscan\ type genePred genscanPep\ visibility hide\ ucscToINSDC INSDC bed 4 Accession at INSDC - International Nucleotide Sequence Database Collaboration 0 100 0 0 0 127 127 127 0 0 0 https://www.ncbi.nlm.nih.gov/nuccore/$$Description
\\ This track associates UCSC Genome Browser chromosome names to accession\ names from the International Nucleotide Sequence Database Collaboration (INSDC).\
\ \\ The data were downloaded from the NCBI assembly database.\
\ \Credits
\The data for this track was prepared by\ Hiram Clawson.\ \ map 1 group map\ longLabel Accession at INSDC - International Nucleotide Sequence Database Collaboration\ shortLabel INSDC\ track ucscToINSDC\ type bed 4\ url https://www.ncbi.nlm.nih.gov/nuccore/$$\ urlLabel INSDC link:\ visibility hide\ insectsChainNet Insects Chain/Net bed 3 Insects Chain and Net Alignments 0 100 0 0 0 255 255 0 0 0 0
Description
\\ This track shows regions of the genome that are alignable\ to other genomes ("chain" subtracks) or in synteny ("net" subtracks).\ The alignable parts are shown with thick blocks that look like exons. \ Non-alignable parts between these are shown like introns.\
\ \Chain Track
\\ The chain track shows alignments of a query genome sequence to the\ D. melanogaster genome using a gap scoring system that allows longer gaps \ than traditional affine gap scoring systems. It can also tolerate gaps in both\ the query sequence and D. melanogaster simultaneously. These \ "double-sided" gaps can be caused by local inversions and \ overlapping deletions in both species. \
\ The chain track displays boxes joined together by either single or\ double lines. The boxes represent aligning regions.\ Single lines indicate gaps that are largely due to a deletion in the\ query sequence assembly or an insertion in the D. melanogaster \ assembly. Double lines represent more complex gaps that involve substantial\ sequence in both species. This may result from inversions, overlapping\ deletions, an abundance of local mutation, or an unsequenced gap in one\ species. In cases where multiple chains align over a particular region of\ the D. melanogaster genome, the chains with single-lined gaps are often \ due to processed pseudogenes, while chains with double-lined gaps are more \ often due to paralogs and unprocessed pseudogenes.
\\ In the "pack" and "full" display\ modes, the individual feature names indicate the chromosome, strand, and\ location (in thousands) of the match for each matching alignment.
\ \Net Track
\\ The net track shows the best query sequence/D. melanogaster chain for \ every part of the D. melanogaster genome. It is useful for\ finding syntenic regions, possibly orthologs, and for studying genome\ rearrangement.
\ \Display Conventions and Configuration
\Chain Track
\By default, the chains to chromosome-based assemblies are colored\ based on which chromosome they map to in the aligning organism. To turn\ off the coloring, check the "off" button next to: Color\ track based on chromosome.
\\ To display only the chains of one chromosome in the aligning\ organism, enter the name of that chromosome (e.g. chr4) in the box next to: \ Filter by chromosome.
\ \Net Track
\\ In full display mode, the top-level (level 1)\ chains are the largest, highest-scoring chains that\ span this region. In many cases gaps exist in the\ top-level chain. When possible, these are filled in by\ other chains that are displayed at level 2. The gaps in \ level 2 chains may be filled by level 3 chains and so\ forth.
\\ In the graphical display, the boxes represent ungapped \ alignments; the lines represent gaps. Click\ on a box to view detailed information about the chain\ as a whole; click on a line to display information\ about the gap. The detailed information is useful in determining\ the cause of the gap or, for lower level chains, the genomic\ rearrangement.
\\ Individual items in the display are categorized as one of four types\ (other than gap):
\\
\ \- Top - the best, longest match. Displayed on level 1.
\- Syn - line-ups on the same chromosome as the gap in the level above\ it.
\- Inv - a line-up on the same chromosome as the gap above it, but in\ the opposite orientation.
\- NonSyn - a match to a chromosome different from the gap in the \ level above.
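The four item types listed above can be read as a simple decision rule. The following is a minimal sketch of that rule only, not the netSyntenic implementation: the function name, its arguments, and the string labels are hypothetical, and any distance-based criteria the real program may apply are ignored.

```python
from typing import Optional


def classify_fill(level: int,
                  parent_chrom: Optional[str],
                  parent_strand: Optional[str],
                  chrom: str,
                  strand: str) -> str:
    """Classify a net item from its level and the chain in the level above.

    parent_chrom / parent_strand describe the query-side chromosome and
    orientation of the chain whose gap this item fills (None at level 1).
    """
    if level == 1 or parent_chrom is None:
        return "top"       # best, longest match; shown on level 1
    if chrom != parent_chrom:
        return "nonSyn"    # fills the gap from a different chromosome
    if strand != parent_strand:
        return "inv"       # same chromosome, opposite orientation
    return "syn"           # same chromosome, same orientation


print(classify_fill(2, "chr2L", "+", "chr2L", "-"))
```

The chromosome test comes before the orientation test because, as described above, orientation is only compared for items on the same chromosome as the gap they fill.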
\Methods
\Chain track
\\ Transposons that have been inserted since the query sequence/D. melanogaster\ split were removed from the assemblies. The abbreviated genomes were\ aligned with lastz, and the transposons were added back in.\ The resulting alignments were converted into axt format using the lavToAxt\ program. The axt alignments were fed into axtChain, which organizes all\ alignments between a single query sequence chromosome and a single\ D. melanogaster chromosome into a group and creates a kd-tree out\ of the gapless subsections (blocks) of the alignments. A dynamic program\ was then run over the kd-trees to find the maximally scoring chains of these\ blocks.\ \ \ \ Chains scoring below a minimum score of "5000" were discarded;\ the remaining chains are displayed in this track. The linear gap\ matrix used with axtChain:
\-linearGap=loose\ \ tablesize 11\ smallSize 111\ position 1 2 3 11 111 2111 12111 32111 72111 152111 252111\ qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600\ tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600\ bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000\\ \ \Net track
\\ Chains were derived from lastz alignments, using the methods\ described on the chain tracks description pages, and sorted with the \ highest-scoring chains in the genome ranked first. The program\ chainNet was then used to place the chains one at a time, trimming them as \ necessary to fit into sections not already covered by a higher-scoring chain. \ During this process, a natural hierarchy emerged in which a chain that filled \ a gap in a higher-scoring chain was placed underneath that chain. The program \ netSyntenic was used to fill in information about the relationship between \ higher- and lower-level chains, such as whether a lower-level\ chain was syntenic or inverted relative to the higher-level chain. \ The program netClass was then used to fill in how much of the gaps and chains \ contained Ns (sequencing gaps) in one or both species and how much\ was filled with transposons inserted before and after the two organisms \ diverged.
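The placement step described above (highest-scoring chains first, each later chain trimmed to regions not yet covered) can be illustrated with a short sketch. This is a simplified assumption-laden model, not the chainNet program: each chain is reduced to a single half-open interval on the target genome, internal gaps and the resulting level hierarchy are ignored, and all function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the target genome


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for s, e in sorted(intervals):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def subtract(iv: Interval, covered: List[Interval]) -> List[Interval]:
    """Return the parts of iv not covered; covered must be sorted and disjoint."""
    start, end = iv
    pieces: List[Interval] = []
    for c_start, c_end in covered:
        if c_end <= start or c_start >= end:
            continue                      # no overlap with this covered block
        if c_start > start:
            pieces.append((start, c_start))
        start = max(start, c_end)
        if start >= end:
            break
    if start < end:
        pieces.append((start, end))
    return pieces


def place_chains(chains: List[Tuple[int, int, int]]):
    """chains: (score, start, end). Place by descending score, trimming overlaps."""
    covered: List[Interval] = []
    placed = []
    for score, start, end in sorted(chains, key=lambda c: c[0], reverse=True):
        pieces = subtract((start, end), covered)
        if pieces:                        # chain contributes something new
            placed.append((score, pieces))
            covered = merge(covered + pieces)
    return placed


print(place_chains([(7000, 50, 150), (9000, 0, 100), (5000, 60, 90)]))
```

In this toy input the 7000-scoring chain is trimmed to the part beyond position 100, and the 5000-scoring chain is dropped because a higher-scoring chain already covers its whole span.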
\ \Credits
\\ Lastz (previously known as blastz) was developed at\ Pennsylvania State University by \ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.
\\ Lineage-specific repeats were identified by Arian Smit and his \ RepeatMasker\ program.
\\ The axtChain program was developed at the University of California at \ Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.
\\ The browser display and database storage of the chains and nets were created\ by Robert Baertsch and Jim Kent.
\\ The chainNet, netSyntenic, and netClass programs were\ developed at the University of California at\ Santa Cruz by Jim Kent.
\\ \
References
\ \\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\
\ \\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ compGeno 1 altColor 255,255,0\ chainLinearGap loose\ chainMinScore 5000\ color 0,0,0\ compositeTrack on\ configurable on\ dimensions dimensionX=clade dimensionY=species\ dragAndDrop subTracks\ group compGeno\ html chainNetInsects\ longLabel Insects Chain and Net Alignments\ noInherit on\ shortLabel Insects Chain/Net\ sortOrder species=+ view=+ clade=+\ subGroup1 view Views chain=Chains net=Nets\ subGroup2 species Species s000a=D._simulans s000b=D._simulans s001=D._sechellia s002=D._yakuba s003=D._erecta s004=D._takahashii s005=D._elegans s006=D._eugracilis s007=D._biarmipes s008=D._rhopaloa s009=D._ficusphila s010=D._suzukii s011=D._kikkawai s012=D_serrata s013=D._ananassae s014=D._bipectinata s015=D_obscura s016a=D._pseudoobscura s016b=D._pseudoobscura s017=D._miranda s018=D._persimilis s019=D_subobscura s020=D_athabasca s021=D._virilis s022=D._willistoni s023=D._grimshawi s024=D._mojavensis s025=D_pseudoobscura_1 s026=D_novamexicana s027=D_hydei s028=D_americana s029=D_montana s030=D._albomicans s031=Scaptodrosophila_lebanonensis s032=D_busckii s033=D_arizonae s034=D_nasuta s035=Zaprionus_indianus s036=D_navojoa s037=Phortica_variegata s038=Teleopsis_dalmanni s039=Rhagoletis_zephyria s040=Lucilia_cuprina s041=Bactrocera_latifrons s042=Bactrocera_oleae s043=Zeugodacus_cucurbitae s044=Phormia_regina s045=Ceratitis_capitata s046=Paykullia_maculata s047=Bactrocera_tryoni s048=M._domestica s049=Bactrocera_dorsalis s050=Stomoxys_calcitrans s051=Glossina_pallidipes s052=Glossina_fuscipes s053=Glossina_brevipalpis s054=Glossina_morsitans_2 s055=Glossina_austeni s056=Glossina_palpalis_gambiensis s057=Ephydra_gracilis s058=Lucilia_sericata s059=Glossina_morsitans_1 s060=Calliphora_vicina s061=Sphyracephala_brevicornis s062=Proctacanthus_coquilletti s063=Haematobia_irritans s064=Themira_minor s065=Megaselia_abdita s066=Tephritis_californica s067=Cirrula_hians s068=Hermetia_illucens s069=Neobellieria_bullata s070=Eutreta_diana s071=Holcocephala_fusca 
s072=Sarcophagidae_BV_2014 s073=Liriomyza_trifolii s074=Eristalis_dimidiata s075=Condylostylus_patibulatus s076=Megaselia_scalaris s077=Trupanea_jonesi s078=Aedes_albopictus s079=Aedes_aegypti s080a=A._gambiae s080b=A._gambiae s081=Culex_quinquefasciatus s082=A_maculatus s083=A_merus s084=A_dirus s085=A_arabiensis s086=A_sinensis s087=A_atroparvus s088=A_epiroticus s089=A_quadriannulatus s090=A_farauti s091=A_minimus s092=A_funestus s093=A_melas s094=A_coluzzii s095=Clogmia_albipunctata s096=Phlebotomus_papatasi s097=A_culicifacies s098=A_stephensi s099=Coboldia_fuscipes s100=Culicoides_sonorensis s101=A_albimanus s102=A_christyi s103=A_cracens s104=A_aquasalis s105=Mochlonyx_cinctipes s106=A_darlingi s107=A_farauti_No4 s108=Lutzomyia_longipalpis s109=A_koliensis s110=Belgica_antarctica s111=A_punctulatus s112=Clunio_marinus s113=Mayetiola_destructor s114=Chironomus_tentans s115=Chironomus_riparius s116=A_gambiae_1 s117=A_nili s118=Chaoborus_trivitattus s119=Tipula_oleracea s120=Trichoceridae_BV_2014 s121=T._castaneum s122=A._mellifera\ subGroup3 clade Clade c00=brachycera c01=nematocera c02=holometabola\ track insectsChainNet\ type bed 3\ visibility hide\ nestedRepeats Interrupted Rpts bed 12 + Fragments of Interrupted Repeats Joined by RepeatMasker ID 0 100 0 0 0 127 127 127 1 0 0Description
\ \\ This track shows joined fragments of interrupted repeats extracted\ from the output of the \ RepeatMasker program which screens DNA sequences\ for interspersed repeats and low complexity DNA sequences using the\ \ Repbase Update library of repeats from the\ Genetic\ Information Research Institute (GIRI). Repbase Update is described in\ Jurka (2000) in the References section below.\
\ \\ The detailed annotations from RepeatMasker are in the RepeatMasker track. This\ track shows fragments of original repeat insertions which have been interrupted\ by insertions of younger repeats or through local rearrangements. The fragments\ are joined using the ID column of RepeatMasker output.\
\ \Display Conventions and Configuration
\ \\ In pack or full mode, each interrupted repeat is displayed as boxes\ (fragments) joined by horizontal lines, labeled with the repeat name.\ If all fragments are on the same strand, arrows are added to the\ horizontal line to indicate the strand. In dense or squish mode, labels\ and arrows are omitted and in dense mode, all items are collapsed to\ fit on a single row.\
\ \\ Items are shaded according to the average identity score of their\ fragments. Usually, the shade of an item is similar to the shades of\ its fragments unless some fragments are much more diverged than\ others. The score displayed above is the average identity score,\ clipped to a range of 50% - 100% and then mapped to the range\ 0 - 1000 for shading in the browser.\
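As an illustration of the shading rule above, the sketch below clips an average identity percentage to 50-100 and rescales it to 0-1000. The text does not state the exact form of the mapping, so the linear rescaling and the rounding are assumptions, and the function name is hypothetical.

```python
def shading_score(avg_identity_pct: float) -> int:
    """Clip identity to [50, 100] percent, then map linearly onto [0, 1000]."""
    clipped = min(100.0, max(50.0, avg_identity_pct))
    return round((clipped - 50.0) / 50.0 * 1000)


for pct in (30.0, 50.0, 75.0, 100.0):
    print(pct, shading_score(pct))
```

Under this assumption, any item at or below 50% identity receives the lightest shade and an item at 100% receives the darkest.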
\ \Methods
\ \\ UCSC has used the most current versions of the RepeatMasker software\ and repeat libraries available to generate these data. Note that these\ versions may be newer than those that are publicly available on the Internet.\
\ \\ Data are generated using the RepeatMasker -s flag. Additional flags\ may be used for certain organisms. See the\ FAQ for more information.\
\ \Credits
\ \\ Thanks to Arian Smit, Robert Hubley and GIRI for providing the tools and\ repeat libraries used to generate this track.\
\ \References
\ \\ Smit AFA, Hubley R, Green P.\ RepeatMasker Open-3.0.\ \ http://www.repeatmasker.org. 1996-2010.\
\ \\ Repbase Update is described in:\
\ \\ Jurka J.\ \ Repbase Update: a database and an electronic journal of repetitive elements.\ Trends Genet. 2000 Sep;16(9):418-420.\ PMID: 10973072\
\ \\ For a discussion of repeats in mammalian genomes, see:\
\ \\ Smit AF.\ \ Interspersed repeats and other mementos of transposable elements in mammalian genomes.\ Curr Opin Genet Dev. 1999 Dec;9(6):657-63.\ PMID: 10607616\
\ \\ Smit AF.\ \ The origin of interspersed repeats in the human genome.\ Curr Opin Genet Dev. 1996 Dec;6(6):743-8.\ PMID: 8994846\
\ varRep 1 exonNumbers off\ group varRep\ longLabel Fragments of Interrupted Repeats Joined by RepeatMasker ID\ shortLabel Interrupted Rpts\ track nestedRepeats\ type bed 12 +\ useScore 1\ visibility hide\ microsat Microsatellite bed 4 Microsatellites - Di-nucleotide and Tri-nucleotide Repeats 0 100 0 0 0 127 127 127 0 0 0Description
\\ This track displays regions that are likely to be useful as microsatellite\ markers. These are sequences of at least 15 perfect di-nucleotide and \ tri-nucleotide repeats and tend to be highly polymorphic in the\ population.\
\ \Methods
\\ The data shown in this track are a subset of the Simple Repeats track, \ selecting only those \ repeats of period 2 and 3, with 100% identity and no indels and with\ at least 15 copies of the repeat. The Simple Repeats track is\ created using the \ Tandem Repeats Finder. For more information about this \ program, see Benson (1999).
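The selection criteria above amount to a four-part filter over Simple Repeats records. A minimal sketch follows; the record class and its field names are hypothetical stand-ins for the corresponding Tandem Repeats Finder values (period, copy number, percent match, percent indel) and do not claim to match the actual table schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimpleRepeat:
    chrom: str
    start: int
    end: int
    period: int        # length of the repeat unit
    copy_num: float    # number of copies of the unit
    per_match: int     # percent identity between copies
    per_indel: int     # percent indels between copies


def is_microsatellite(r: SimpleRepeat) -> bool:
    """Period 2 or 3, 100% identity, no indels, at least 15 copies."""
    return (r.period in (2, 3)
            and r.per_match == 100
            and r.per_indel == 0
            and r.copy_num >= 15)


def select_microsatellites(repeats: List[SimpleRepeat]) -> List[SimpleRepeat]:
    return [r for r in repeats if is_microsatellite(r)]


example = [
    SimpleRepeat("chr2L", 100, 140, 2, 20.0, 100, 0),   # kept
    SimpleRepeat("chr2L", 500, 530, 2, 15.0, 96, 0),    # imperfect copies
    SimpleRepeat("chr3R", 900, 960, 4, 15.0, 100, 0),   # period too long
]
print([(r.chrom, r.start) for r in select_microsatellites(example)])
```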
\ \Credits
\\ Tandem Repeats Finder was written by \ Gary Benson.
\ \References
\ \\ Benson G.\ \ Tandem repeats finder: a program to analyze DNA sequences.\ Nucleic Acids Res. 1999 Jan 15;27(2):573-80.\ PMID: 9862982; PMC: PMC148217\
\ varRep 1 group varRep\ longLabel Microsatellites - Di-nucleotide and Tri-nucleotide Repeats\ shortLabel Microsatellite\ track microsat\ type bed 4\ visibility hide\ multiz124way Multiz Align wigMaf 0.0 1.0 Multiz Alignments of 124 insects 3 100 0 10 100 0 90 10 0 0 0 compGeno 1 altColor 0,90,10\ color 0, 10, 100\ defaultMaf multiz124wayDefault\ frames multiz124wayFrames\ group compGeno\ irows on\ itemFirstCharCase noChange\ longLabel Multiz Alignments of 124 insects\ noInherit on\ parent cons124wayViewalign on\ priority 100\ sGroup_Brachycera droSim2 droSec1 droYak3 droEre2 droTak2 droEle2 droEug2 droBia2 droRho2 droFic2 droSuz1 droKik2 D_serrata droAna3 droBip2 D_obscura droPse3 droMir2 droPer1 D_subobscura D_athabasca droVir3 droWil2 droGri2 droMoj3 D_pseudoobscura_1 D_novamexicana D_hydei D_americana D_montana droAlb1 Scaptodrosophila_lebanonensis D_busckii D_arizonae D_nasuta Zaprionus_indianus D_navojoa Phortica_variegata Teleopsis_dalmanni Rhagoletis_zephyria Lucilia_cuprina Bactrocera_latifrons Bactrocera_oleae Zeugodacus_cucurbitae Phormia_regina Ceratitis_capitata Paykullia_maculata Bactrocera_tryoni musDom2 Bactrocera_dorsalis Stomoxys_calcitrans Glossina_pallidipes Glossina_fuscipes Glossina_brevipalpis Glossina_morsitans_2 Glossina_austeni Glossina_palpalis_gambiensis Ephydra_gracilis Lucilia_sericata Glossina_morsitans_1 Calliphora_vicina Sphyracephala_brevicornis Proctacanthus_coquilletti Haematobia_irritans Themira_minor Megaselia_abdita Tephritis_californica Cirrula_hians Hermetia_illucens Neobellieria_bullata Eutreta_diana Holcocephala_fusca Sarcophagidae_BV_2014 Liriomyza_trifolii Eristalis_dimidiata Condylostylus_patibulatus Megaselia_scalaris Trupanea_jonesi\ sGroup_Holometabola triCas2 apiMel4\ sGroup_Nematocera Aedes_albopictus Aedes_aegypti anoGam3 Culex_quinquefasciatus A_maculatus A_merus A_dirus A_arabiensis A_sinensis A_atroparvus A_epiroticus A_quadriannulatus A_farauti A_minimus A_funestus A_melas A_coluzzii Clogmia_albipunctata 
Phlebotomus_papatasi A_culicifacies A_stephensi Coboldia_fuscipes Culicoides_sonorensis A_albimanus A_christyi A_cracens A_aquasalis Mochlonyx_cinctipes A_darlingi A_farauti_No4 Lutzomyia_longipalpis A_koliensis Belgica_antarctica A_punctulatus Clunio_marinus Mayetiola_destructor Chironomus_tentans Chironomus_riparius A_gambiae_1 A_nili Chaoborus_trivitattus Tipula_oleracea Trichoceridae_BV_2014\ shortLabel Multiz Align\ speciesCodonDefault dm6\ speciesDefaultOff A_aquasalis A_christyi A_cracens A_culicifacies A_darlingi A_epiroticus A_farauti_No4 A_funestus A_gambiae_1 A_koliensis A_maculatus A_melas A_merus A_nili A_punctulatus A_quadriannulatus A_sinensis Aedes_albopictus Bactrocera_dorsalis Bactrocera_latifrons Bactrocera_oleae Bactrocera_tryoni Belgica_antarctica Calliphora_vicina Ceratitis_capitata Chaoborus_trivitattus Chironomus_riparius Chironomus_tentans Cirrula_hians Clogmia_albipunctata Coboldia_fuscipes Condylostylus_patibulatus Culex_quinquefasciatus Culicoides_sonorensis D_americana D_hydei D_montana D_nasuta D_obscura D_pseudoobscura_1 D_serrata D_subobscura Ephydra_gracilis Eristalis_dimidiata Eutreta_diana Glossina_austeni Glossina_brevipalpis Glossina_fuscipes Glossina_morsitans_1 Glossina_morsitans_2 Glossina_pallidipes Glossina_palpalis_gambiensis Haematobia_irritans Hermetia_illucens Holcocephala_fusca Liriomyza_trifolii Lucilia_cuprina Lucilia_sericata Lutzomyia_longipalpis Mayetiola_destructor Megaselia_abdita Megaselia_scalaris Mochlonyx_cinctipes Neobellieria_bullata Paykullia_maculata Phlebotomus_papatasi Phormia_regina Phortica_variegata Proctacanthus_coquilletti Rhagoletis_zephyria Sarcophagidae_BV_2014 Sphyracephala_brevicornis Stomoxys_calcitrans Teleopsis_dalmanni Tephritis_californica Themira_minor Tipula_oleracea Trichoceridae_BV_2014 Trupanea_jonesi Zaprionus_indianus Zeugodacus_cucurbitae droAlb1 droBip2 droEle2 droEug2 droFic2 droKik2 droPer1 droRho2 droSuz1 droTak2 musDom2\ speciesDefaultOn A_albimanus A_arabiensis A_atroparvus 
A_coluzzii A_dirus A_farauti A_minimus A_stephensi Aedes_aegypti Clunio_marinus D_arizonae D_athabasca D_busckii D_navojoa D_novamexicana Scaptodrosophila_lebanonensis anoGam3 apiMel4 droAna3 droBia2 droEre2 droGri2 droMir2 droMoj3 droPse3 droSec1 droSim2 droVir3 droWil2 droYak3 triCas2\ speciesGroups Brachycera Nematocera Holometabola\ subGroups view=align\ summary multiz124waySummary\ track multiz124way\ treeImage phylo/dm6_124way.png\ type wigMaf 0.0 1.0\ multiz27way Multiz Align wigMaf 0.0 1.0 Multiz Alignments of 27 insects 3 100 0 10 100 0 90 10 0 0 0 compGeno 1 altColor 0,90,10\ color 0, 10, 100\ frames multiz27wayFrames\ group compGeno\ irows on\ itemFirstCharCase noChange\ longLabel Multiz Alignments of 27 insects\ noInherit on\ parent cons27wayViewalign on\ priority 100\ sGroup_Drosophila droSim1 droSec1 droYak3 droEre2 droBia2 droSuz1 droAna3 droBip2 droEug2 droEle2 droKik2 droTak2 droRho2 droFic2 droPse3 droPer1 droMir2 droWil2 droVir3 droMoj3 droAlb1 droGri2\ sGroup_Others musDom2 anoGam1 apiMel4 triCas2\ shortLabel Multiz Align\ speciesCodonDefault dm6\ speciesGroups Drosophila Others\ subGroups view=align\ summary multiz27waySummary\ track multiz27way\ treeImage phylo/dm6_27way.png\ type wigMaf 0.0 1.0\ insectsChainNetViewnet Nets bed 3 Insects Chain and Net Alignments 1 100 0 0 0 255 255 0 0 0 0 compGeno 1 longLabel Insects Chain and Net Alignments\ parent insectsChainNet\ shortLabel Nets\ track insectsChainNetViewnet\ view net\ visibility dense\ oreganno ORegAnno bed 4 + Regulatory elements from ORegAnno 0 100 102 102 0 178 178 127 0 0 0Description
\\ This track displays literature-curated regulatory regions, transcription\ factor binding sites, and regulatory polymorphisms from\ ORegAnno (Open Regulatory Annotation). For more detailed\ information on a particular regulatory element, follow the link to ORegAnno\ from the details page. \ \
\ \Display Conventions and Configuration
\ \The display may be filtered to show only selected region types, such as:
\ \\
\ \- regulatory regions (shown in light blue)
\- regulatory polymorphisms (shown in dark blue)
\- transcription factor binding sites (shown in orange)
\- regulatory haplotypes (shown in red)
\- miRNA binding sites (shown in blue-green)
\To exclude a region type, uncheck the appropriate box in the list at the top of \ the Track Settings page.
\ \Methods
\\ An ORegAnno record describes an experimentally proven and published regulatory\ region (promoter, enhancer, etc.), transcription factor binding site, or\ regulatory polymorphism. Each annotation must have the following attributes:\
\
\- A stable ORegAnno identifier.\
- A valid taxonomy ID from the NCBI taxonomy database.\
- A valid PubMed reference. \
- A target gene that is either user-defined, in Entrez Gene or in EnsEMBL.\
- A sequence with at least 40 flanking bases (preferably more) to allow the\ site to be mapped to any release of an associated genome.\
- At least one piece of specific experimental evidence, including the\ biological technique used to discover the regulatory sequence. (Currently\ only the evidence subtypes are supplied with the UCSC track.)\
- A positive, neutral or negative outcome based on the experimental results\ from the primary reference. (Only records with a positive outcome are currently\ included in the UCSC track.)\
\
\ The following attributes are optionally included:\
\- A transcription factor that is either user-defined, in Entrez Gene\ or in EnsEMBL.\
- A specific cell type for each piece of experimental evidence, using the\ eVOC cell type ontology.\
- A specific dataset identifier (e.g. the REDfly dataset) that allows\ external curators to manage particular annotation sets using ORegAnno's\ curation tools.\
- A "search space" sequence that specifies the region that was\ assayed, not just the regulatory sequence. \
- A dbSNP identifier and type of variant (germline, somatic or artificial)\ for regulatory polymorphisms.\
\
\ Mapping to genome coordinates is performed periodically to current genome\ builds by BLAST sequence alignment. \ The information provided in this track represents an abbreviated summary of the \ details for each ORegAnno record. Please visit the official ORegAnno entry\ (by clicking on the ORegAnno link on the details page of a specific regulatory\ element) for complete details such as evidence descriptions, comments,\ validation score history, etc.\
Credits
\\ ORegAnno core team and principal contacts: Stephen Montgomery, Obi Griffith, \ and Steven Jones from Canada's Michael Smith Genome Sciences Centre, Vancouver, \ British Columbia, Canada.
\\ The ORegAnno community (please see individual citations for various\ features): ORegAnno Citation.\ \
References
\\ Lesurf R, Cotto KC, Wang G, Griffith M, Kasaian K, Jones SJ, Montgomery SB, Griffith OL, Open\ Regulatory Annotation Consortium.\ \ ORegAnno 3.0: a community-driven resource for curated regulatory annotation.\ Nucleic Acids Res. 2016 Jan 4;44(D1):D126-32.\ PMID: 26578589; PMC: PMC4702855\
\ \\ Griffith OL, Montgomery SB, Bernier B, Chu B, Kasaian K, Aerts S, Mahony S, Sleumer MC, Bilenky M,\ Haeussler M et al.\ \ ORegAnno: an open-access community-driven resource for regulatory annotation.\ Nucleic Acids Res. 2008 Jan;36(Database issue):D107-13.\ PMID: 18006570; PMC: PMC2239002\
\ \\ Montgomery SB, Griffith OL, Sleumer MC, Bergman CM, Bilenky M, Pleasance ED, \ Prychyna Y, Zhang X, Jones SJ. \ ORegAnno: an open access database and curation system for \ literature-derived promoters, transcription factor binding sites and regulatory variation.\ Bioinformatics. 2006 Mar 1;22(5):637-40.\ PMID: 16397004\
\ \ regulation 1 color 102,102,0\ group regulation\ longLabel Regulatory elements from ORegAnno\ shortLabel ORegAnno\ track oreganno\ type bed 4 +\ visibility hide\ xenoMrna Other mRNAs psl xeno Non-D. melanogaster mRNAs from GenBank 0 100 0 0 0 127 127 127 1 0 0Description
\\ This track displays translated blat alignments of vertebrate and\ invertebrate mRNA in \ GenBank from organisms other than D. melanogaster.\ \
Display Conventions and Configuration
\\ This track follows the display conventions for \ PSL alignment tracks. In dense display mode, the items that\ are more darkly shaded indicate matches of better quality.
\\ The strand information (+/-) for this track is in two parts. The\ first + indicates the orientation of the query sequence whose\ translated protein produced the match (here always 5' to 3', hence +).\ The second + or - indicates the orientation of the matching \ translated genomic sequence. Because the two orientations of a DNA \ sequence give different predicted protein sequences, there are four \ combinations. ++ is not the same as --, nor is +- the same as -+.
\\ The description page for this track has a filter that can be used to change \ the display mode, alter the color, and include/exclude a subset of items \ within the track. This may be helpful when many items are shown in the track \ display, especially when only some are relevant to the current task.
\\ To use the filter:\
\
\- Type a term in one or more of the text boxes to filter the mRNA \ display. For example, to apply the filter to all mRNAs expressed in a specific\ organ, type the name of the organ in the tissue box. To view the list of \ valid terms for each text box, consult the table in the Table Browser that \ corresponds to the factor on which you wish to filter. For example, the \ "tissue" table contains all the types of tissues that can be \ entered into the tissue text box. Wildcards may also be used in the\ filter.\
- If filtering on more than one value, choose the desired combination\ logic. If "and" is selected, only mRNAs that match all filter \ criteria will be highlighted. If "or" is selected, mRNAs that \ match any one of the filter criteria will be highlighted.\
- Choose the color or display characteristic that should be used to \ highlight or include/exclude the filtered items. If "exclude" is \ chosen, the browser will not display mRNAs that match the filter criteria. \ If "include" is selected, the browser will display only those \ mRNAs that match the filter criteria.\
\ This track may also be configured to display codon coloring, a feature that\ allows the user to quickly compare mRNAs against the genomic sequence. For more \ information about this option, click \ here.\
\ \Methods
\\ The mRNAs were aligned against the D. melanogaster genome using translated \ blat. When a single mRNA aligned in multiple places, the alignment having the\ highest base identity was found. Only those alignments having a base \ identity level within 1% of the best and at least 25% base identity with the \ genomic sequence were kept.
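The retention rule above can be sketched as a filter over the base identities of one mRNA's alignments. This is an illustration only: it assumes "within 1% of the best" means within one percentage point of the highest identity, and the function and parameter names are hypothetical.

```python
from typing import List


def keep_alignments(identities: List[float],
                    window: float = 1.0,
                    floor: float = 25.0) -> List[float]:
    """Keep identities within `window` points of the best and at least `floor`."""
    if not identities:
        return []
    best = max(identities)
    return [i for i in identities if i >= best - window and i >= floor]


# Four alignments of one mRNA, given as percent base identity.
print(keep_alignments([98.0, 97.5, 96.0, 20.0]))
```

Both conditions are applied to every alignment, so an mRNA whose best alignment falls below the floor contributes nothing to the track.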
\ \Credits
\\ The mRNA track was produced at UCSC from mRNA sequence data\ submitted to the international public sequence databases by \ scientists worldwide.
\ \References
\\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL.\ GenBank: update.\ Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6.\ PMID: 14681350; PMC: PMC308779\
\ \\ Kent WJ.\ \ BLAT--the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518\
\ rna 1 baseColorUseCds genbank\ baseColorUseSequence genbank\ group rna\ indelDoubleInsert on\ indelQueryInsert on\ longLabel Non-D. melanogaster mRNAs from GenBank\ shortLabel Other mRNAs\ showDiffBasesAllScales .\ spectrum on\ track xenoMrna\ type psl xeno\ visibility hide\ xenoRefGene Other RefSeq genePred xenoRefPep xenoRefMrna Non-D. melanogaster RefSeq Genes 1 100 12 12 120 133 133 187 0 0 0Description
\\ This track shows known protein-coding and non-protein-coding genes \ for organisms other than D. melanogaster, taken from the NCBI RNA reference\ sequences collection (RefSeq). The data underlying this track are \ updated weekly.
\ \Display Conventions and Configuration
\\ This track follows the display conventions for \ gene prediction \ tracks.\ The color shading indicates the level of review the RefSeq record has \ undergone: predicted (light), provisional (medium), reviewed (dark).
\\ The item labels and display colors of features within this track can be\ configured through the controls at the top of the track description page. \
\
\ \- Label: By default, items are labeled by gene name. Click the \ appropriate Label option to display the accession name instead of the gene\ name, show both the gene and accession names, or turn off the label \ completely.\
- Codon coloring: This track contains an optional codon coloring \ feature that allows users to quickly validate and compare gene predictions.\ To display codon colors, select the genomic codons option from the\ Color track by codons pull-down menu. Click \ here for more \ information about this feature.\
- Hide non-coding genes: By default, both the protein-coding and\ non-protein-coding genes are displayed. If you wish to see only the coding\ genes, click this box.\
Methods
\\ The RNAs were aligned against the D. melanogaster genome using blat; those\ with an alignment of less than 15% were discarded. When a single RNA aligned \ in multiple places, the alignment having the highest base identity was \ identified. Only alignments having a base identity level within 0.5% of \ the best and at least 25% base identity with the genomic sequence were kept.\
\ \Credits
\\ This track was produced at UCSC from RNA sequence data\ generated by scientists worldwide and curated by the \ NCBI RefSeq project.
\ \References
\\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518\
\ genes 1 color 12,12,120\ group genes\ longLabel Non-D. melanogaster RefSeq Genes\ shortLabel Other RefSeq\ track xenoRefGene\ type genePred xenoRefPep xenoRefMrna\ visibility dense\ refGenePfam Pfam in RefSeq bed 12 Pfam Domains in RefSeq Genes 0 100 20 0 250 137 127 252 0 0 0 http://pfam.xfam.org/family/$$Description
\ \\ Most proteins are composed of one or more conserved functional regions called\ domains. This track shows the high-quality, manually-curated\ \ Pfam-A\ domains found in proteins associated with the RefSeq Genes transcripts.\
\ \Display Conventions and Configuration
\ \\ This track follows the display conventions for\ gene\ tracks.\
\ \Methods
\ \\ The proteins associated with the transcripts in the refGene table (see \ RefSeq Genes description page)\ are submitted to the set of Pfam-A HMMs which annotate regions within the\ predicted peptide that are recognizable as Pfam protein domains. These regions\ are then mapped to the transcripts themselves using the\ \ pslMap utility.\
\ \\ Of the several options for filtering out false positives, the "Trusted cutoff (TC)"\ threshold method is used in this track to determine significance. For more information regarding\ thresholds and scores, see the HMMER \ documentation \ and\ results interpretation \ pages.\
\ \\ Note: There is currently an undocumented but known HMMER problem which results in lessened \ sensitivity and possible missed searches for some zinc finger domains. Until a fix is released for \ HMMER /PFAM thresholds, please also consult the "UniProt Domains" subtrack of the UniProt track for \ more comprehensive zinc finger annotations.\
\ \Credits
\ \\ pslMap was written by Mark Diekhans at UCSC.\
\ \References
\ \\ Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G,\ Forslund K et al.\ The Pfam protein families database.\ Nucleic Acids Res. 2010 Jan;38(Database issue):D211-22.\ PMID: 19920124; PMC: PMC2808889\
\ genes 1 color 20,0,250\ group genes\ longLabel Pfam Domains in RefSeq Genes\ shortLabel Pfam in RefSeq\ track refGenePfam\ type bed 12\ url http://pfam.xfam.org/family/$$\ ucscToRefSeq RefSeq Acc bed 4 RefSeq Accession 0 100 0 0 0 127 127 127 0 0 0 https://www.ncbi.nlm.nih.gov/nuccore/$$Description
\\ This track associates UCSC Genome Browser chromosome names to accession\ identifiers from the NCBI Reference Sequence Database (RefSeq).\
\ \\ The data were downloaded from the NCBI assembly database.\
\ \Credits
\The data for this track was prepared by\ Hiram Clawson.\ map 1 group map\ longLabel RefSeq Accession\ shortLabel RefSeq Acc\ track ucscToRefSeq\ type bed 4\ url https://www.ncbi.nlm.nih.gov/nuccore/$$\ urlLabel RefSeq accession:\ visibility hide\ ReMap ReMap ChIP-seq bigBed 9 + ReMap Atlas of Regulatory Regions 0 100 0 0 0 127 127 127 0 0 0
Description
\\ This track represents the ReMap Atlas of regulatory regions, which consists of a\ large-scale integrative analysis of all Public ChIP-seq data for transcriptional\ regulators from GEO, ArrayExpress, and ENCODE. \
\ \\ Below is a schematic diagram of the types of regulatory regions: \
\
\ \ \- ReMap 2022 Atlas (all peaks for each analyzed data set)
\- ReMap 2022 Non-redundant peaks (merged similar target)
\- ReMap 2022 Cis Regulatory Modules
\\ \
Display Conventions and Configuration
\\
\ \- \ Each transcription factor follows a specific RGB color.\
\- \ ChIP-seq peak summits are represented by vertical bars.\
\- \ Hsap: A data set is defined as a ChIP/Exo-seq experiment in a given\ GEO/ArrayExpress/ENCODE series (e.g. GSE41561), for a given TF (e.g. ESR1), in\ a particular biological condition (e.g. MCF-7).\ Data sets are labeled with the concatenation of these three pieces of\ information (e.g. GSE41561.ESR1.MCF-7).\
\- \ Atha: The data set is defined as a ChIP-seq experiment in a given series\ (e.g. GSE94486), for a given target (e.g. ARR1), in a particular biological\ condition (i.e. ecotype, tissue type, experimental conditions; e.g.\ Col-0_seedling_3d-6BA-4h).\ Data sets are labeled with the concatenation of these three pieces of\ information (e.g. GSE94486.ARR1.Col-0_seedling_3d-6BA-4h).\
\ \Methods
\ \\ This 4th release of ReMap (2022) presents the analysis of 1,206 quality\ controlled ChIP-seq (n=1,315 before QCs) data sets from public sources (GEO,\ ENCODE). Those ChIP-seq data sets have been mapped to the dm6 drosophila\ assembly. The data set is defined as a ChIP-seq experiment in a given series\ (e.g. GSE107059), for a given TF (e.g. Trl), in a particular biological\ condition (i.e. cell line, tissue type, disease state, or experimental conditions;\ e.g. Schneider-2). Data sets were labeled by concatenating these three pieces of\ information, such as GSE107059.Trl.Schneider-2.\
\Those merged analyses cover a total of 550 DNA-binding proteins\ (transcriptional regulators) such as a variety of transcription factors (TFs),\ transcription co-activators (TCFs), and chromatin-remodeling factors (CRFs) for\ 16 million peaks.\
\ \ \\ \
ENCODE
\\ Available ENCODE ChIP-seq data sets for transcriptional regulators from the\ ENCODE portal were processed with the\ standardized ReMap pipeline. The list of ENCODE data was retrieved as FASTQ files from the\ ENCODE portal\ using filters. Metadata information in JSON format and FASTQ files were retrieved using the Python\ requests module.\
\ \ \ \ChIP-seq processing
\\ Both public and ENCODE data were processed in the same way. Bowtie 2 (PMC3322381), version 2.2.9, was used with the options --end-to-end --sensitive to align all\ reads to the genome. Biological and technical\ replicates for each unique combination of GSE/TF/cell type or biological condition\ were used for peak calling. Transcription factor binding sites (TFBS) were identified using the MACS2 peak-calling tool\ (PMC3120977), version 2.1.1.2, in order to follow ENCODE ChIP-seq guidelines,\ with stringent thresholds (MACS2 default thresholds, p-value: 1e-5). An input data\ set was used when available.\
\ \ \Quality assessment
\\ To assess the quality of public data sets, a score was computed based on the\ cross-correlation and the FRiP (fraction of reads in peaks) metrics developed by\ the ENCODE Consortium (https://genome.ucsc.edu/ENCODE/qualityMetrics.html). Two\ thresholds were defined for each of the two cross-correlation ratios (NSC,\ normalized strand coefficient: 1.05 and 1.10; RSC, relative strand coefficient:\ 0.8 and 1.0). Detailed descriptions of the ENCODE quality coefficients can be\ found at https://genome.ucsc.edu/ENCODE/qualityMetrics.html. The\ phantompeak tools suite was used\ (https://code.google.com/p/phantompeakqualtools/) to compute\ RSC and NSC.\
\\ Please refer to the ReMap 2022, 2020, and 2018 publications for more details\ (citation below).\
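As a rough illustration of the two-level thresholds quoted above, the following Python sketch counts how many NSC and RSC thresholds a data set reaches. The text does not say how ReMap combines these thresholds with FRiP into its final score, so the tally, the function names, and the use of "greater than or equal" are assumptions made only for this example.

```python
# Thresholds quoted in the text for the two cross-correlation ratios.
NSC_THRESHOLDS = (1.05, 1.10)
RSC_THRESHOLDS = (0.8, 1.0)


def thresholds_passed(value: float, thresholds: tuple) -> int:
    """Count how many thresholds the value reaches (assumed: >=)."""
    return sum(value >= t for t in thresholds)


def cross_correlation_tally(nsc: float, rsc: float) -> int:
    """Hypothetical tally from 0 to 4; not the published ReMap score."""
    return (thresholds_passed(nsc, NSC_THRESHOLDS)
            + thresholds_passed(rsc, RSC_THRESHOLDS))


print(cross_correlation_tally(nsc=1.07, rsc=0.9))
```

FRiP is left out on purpose because no FRiP cutoff is given in this description.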
\ \ \ \Data Access
\\ ReMap Atlas of regulatory regions data can be explored interactively with the\ Table Browser and cross-referenced with the \ Data Integrator. For programmatic access,\ the track can be accessed using the Genome Browser's\ REST API.\ ReMap annotations can be downloaded from the\ Genome Browser's download server\ as a bigBed file. This compressed binary format can be remotely queried through\ command line utilities. Please note that some of the download files can be quite large.
\ \\ Individual BED files for specific TFs, cells/biotypes, or data sets can be\ found and downloaded on the ReMap website.\
\ \References
\ \\ Chèneby J, Gheorghe M, Artufel M, Mathelier A, Ballester B.\ \ ReMap 2018: an updated atlas of regulatory regions from an integrative analysis of DNA-binding ChIP-\ seq experiments.\ Nucleic Acids Res. 2018 Jan 4;46(D1):D267-D275.\ PMID: 29126285; PMC: PMC5753247\
\\ Chèneby J, Ménétrier Z, Mestdagh M, Rosnet T, Douida A, Rhalloussi W, Bergon A, Lopez\ F, Ballester B.\ \ ReMap 2020: a database of regulatory regions from an integrative analysis of Human and Arabidopsis\ DNA-binding sequencing experiments.\ Nucleic Acids Res. 2020 Jan 8;48(D1):D180-D188.\ PMID: 31665499; PMC: PMC7145625\
\\ Griffon A, Barbier Q, Dalino J, van Helden J, Spicuglia S, Ballester B.\ \ Integrative analysis of public ChIP-seq experiments reveals a complex multi-cell regulatory\ landscape.\ Nucleic Acids Res. 2015 Feb 27;43(4):e27.\ PMID: 25477382; PMC: PMC4344487\
\\ Hammal F, de Langen P, Bergon A, Lopez F, Ballester B.\ \ ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an\ integrative analysis of DNA-binding sequencing experiments.\ Nucleic Acids Res. 2022 Jan 7;50(D1):D316-D325.\ PMID: 34751401; PMC: PMC8728178\
\ regulation 1 compositeTrack on\ group regulation\ html reMap\ longLabel ReMap Atlas of Regulatory Regions\ noScoreFilter on\ shortLabel ReMap ChIP-seq\ track ReMap\ type bigBed 9 +\ visibility hide\ simpleRepeat Simple Repeats bed 4 + Simple Tandem Repeats by TRF 0 100 0 0 0 127 127 127 0 0 0Description
\\ This track displays simple tandem repeats (possibly imperfect repeats) located\ by Tandem Repeats\ Finder (TRF) which is specialized for this purpose. These repeats can\ occur within coding regions of genes and may be quite\ polymorphic. Repeat expansions are sometimes associated with specific\ diseases.
\ \Methods
\\ For more information about the TRF program, see Benson (1999).\
\ \Credits
\\ TRF was written by \ Gary Benson.
\ \References
\ \\ Benson G.\ \ Tandem repeats finder: a program to analyze DNA sequences.\ Nucleic Acids Res. 1999 Jan 15;27(2):573-80.\ PMID: 9862982; PMC: PMC148217\
\ varRep 1 group varRep\ longLabel Simple Tandem Repeats by TRF\ shortLabel Simple Repeats\ track simpleRepeat\ type bed 4 +\ visibility hide\ uniprot UniProt bigBed 12 + UniProt SwissProt/TrEMBL Protein Annotations 0 100 0 0 0 127 127 127 0 0 0Description
\ \\ This track shows protein sequences and annotations on them from the UniProt/SwissProt database,\ mapped to genomic coordinates. \
\\ UniProt/SwissProt data has been curated from scientific publications by the UniProt staff;\ UniProt/TrEMBL data has been predicted by various computational algorithms.\ The annotations are divided into multiple subtracks, based on their "feature type" in UniProt.\ The first two subtracks below - one for SwissProt, one for TrEMBL - show the\ alignments of protein sequences to the genome; all other tracks below are the protein annotations\ mapped through these alignments to the genome.\
\ \\
\\ \Track Name \Description \\ \UCSC Alignment, SwissProt = curated protein sequences \Protein sequences from SwissProt mapped to the genome. All other\ tracks are (start,end) SwissProt annotations on these sequences mapped\ through this alignment. Even protein sequences without a single curated \ annotation (splice isoforms) are visible in this track. Each UniProt protein \ has one main isoform, which is colored in dark. Alternative isoforms are \ sequences that do not have annotations on them and are colored in light-blue. \ They can be hidden with the TrEMBL/Isoform filter (see below). \ \UCSC Alignment, TrEMBL = predicted protein sequences \Protein sequences from TrEMBL mapped to the genome. All other tracks\ below are (start,end) TrEMBL annotations mapped to the genome using\ this track. This track is hidden by default. To show it, click its\ checkbox on the track configuration page. \ \UniProt Signal Peptides \Regions found in proteins destined to be secreted, generally cleaved from mature protein. \\ \UniProt Extracellular Domains \Protein domains with the comment "Extracellular". \\ \UniProt Transmembrane Domains \Protein domains of the type "Transmembrane". \\ \UniProt Cytoplasmic Domains \Protein domains with the comment "Cytoplasmic". \\ \UniProt Polypeptide Chains \Polypeptide chain in mature protein after post-processing. \\ \UniProt Regions of Interest \Regions that have been experimentally defined, such as the role of a region in mediating protein-protein interactions or some other biological process. \\ \UniProt Domains \Protein domains, zinc finger regions and topological domains. \\ \UniProt Disulfide Bonds \Disulfide bonds. \\ \UniProt Amino Acid Modifications \Glycosylation sites, modified residues and lipid moiety-binding regions. \\ \UniProt Amino Acid Mutations \Mutagenesis sites and sequence variants. \\ \UniProt Protein Primary/Secondary Structure Annotations \Beta strands, helices, coiled-coil regions and turns. 
\\ \UniProt Sequence Conflicts \Differences between Genbank sequences and the UniProt sequence. \\ \UniProt Repeats \Regions of repeated sequence motifs or repeated domains. \\ \UniProt Other Annotations \All other annotations, e.g. compositional bias \\ For consistency and convenience for users of mutation-related tracks,\ the subtrack "UniProt/SwissProt Variants" is a copy of the track\ "UniProt Variants" in the track group "Phenotype and Literature", or \ "Variation and Repeats", depending on the assembly.\
\ \Display Conventions and Configuration
\ \\ Genomic locations of UniProt/SwissProt annotations are labeled with a short name for\ the type of annotation (e.g. "glyco", "disulf bond", "Signal peptide"\ etc.). Clicking an annotation shows the full annotation and provides a link to the UniProt/SwissProt\ record for more details. TrEMBL annotations are always shown in \ light blue, except in the Signal Peptides,\ Extracellular Domains, Transmembrane Domains, and Cytoplasmic Domains subtracks.
\ \\ Mouse over a feature to see the full UniProt annotation comment. For variants, the mouse over will\ show the full name of the UniProt disease acronym.\
\ \\ The subtracks for domains related to subcellular location are sorted from outside to inside of \ the cell: Signal peptide, \ extracellular, \ transmembrane, and cytoplasmic.\
\ \\ Features in the "UniProt Modifications" (modified residues) track are drawn in \ light green. Disulfide bonds are shown in \ dark grey. Topological domains\ are shown in maroon and zinc finger regions in \ olive green.\
\ \\ Duplicate annotations are removed as far as possible: if a TrEMBL annotation\ has the same genome position and same feature type, comment, disease and\ mutated amino acids as a SwissProt annotation, it is not shown again. Two\ annotations mapped through different protein sequence alignments but with the same genome\ coordinates are only shown once.
\ \On the configuration page of this track, you can choose to hide any TrEMBL annotations.\ This filter will also hide the UniProt alternative isoform protein sequences because\ both types of information are less relevant to most users. Please contact us if you\ want more detailed filtering features.
\ \Note that for the human hg38 assembly and SwissProt annotations, there\ also is a public\ track hub prepared by UniProt itself, with \ genome annotations maintained by UniProt using their own mapping\ method based on those Gencode/Ensembl gene models that are annotated in UniProt\ for a given protein. For proteins that differ from the genome, UniProt's mapping method\ will, in most cases, map a protein and its annotations to an unexpected location\ (see below for details on UCSC's mapping method).
\ \Methods
\ \\ Briefly, UniProt protein sequences were aligned to the transcripts associated\ with the protein, the top-scoring alignments were retained, and the result was\ projected to the genome through a transcript-to-genome alignment.\ Depending on the genome, the transcript-genome alignment was either\ provided by the source database (NCBI RefSeq), created at UCSC (UCSC RefSeq) or\ derived from the transcripts (Ensembl/Augustus). The transcript set is NCBI\ RefSeq for hg38 and UCSC RefSeq for hg19 (due to alt/fix haplotype misplacements \ in the NCBI RefSeq set on hg19). For other genomes, RefSeq, Ensembl and Augustus \ are tried, in this order. The resulting protein-genome alignments of this process \ are available in the file formats for liftOver or pslMap from our data archive\ (see "Data Access" section below).\
\ \An important step of the mapping process protein -> transcript ->\ genome is filtering the alignment from protein to transcript. Due to\ differences between the UniProt proteins and the transcripts (proteins were\ made many years before the transcripts were made, and human genomes have\ variants), the transcript with the highest BLAST score when aligning the\ protein to all transcripts is not always the correct transcript for a protein\ sequence. Therefore, the protein sequence is aligned to only a very short list\ of one or sometimes more transcripts, selected by a three-step procedure:\
\
\ \ \- Use transcripts directly annotated by UniProt: for organisms that have a RefSeq transcript track,\ proteins are aligned to the RefSeq transcripts that are annotated\ by UniProt for this particular protein.\
- Use transcripts for NCBI Gene ID annotated by UniProt: If no transcripts are annotated on the\ protein, or the annotated ones have been deprecated by NCBI, but a NCBI Gene ID is\ annotated, the RefSeq transcripts for this Gene ID are used. This can result in multiple matching transcripts for a protein.\
- Use best matching transcript: If no NCBI Gene is\ annotated, then BLAST scores are used to pick the transcripts. There can be multiple transcripts for one\ protein, as their coding sequences can be identical. All transcripts within 1% of the highest observed BLAST score are used.\
\ For strategies 2 and 3, many of the transcripts found do not differ in coding\ sequence, so the resulting alignments on the genome will be identical.\ Therefore, any identical alignments are removed in a final filtering step. The\ details page of these alignments will contain a list of all transcripts that\ result in the same protein-genome alignment. On hg38, only a handful of edge\ cases (pseudogenes, very recently added proteins) remain in 2023 where strategy\ 3 has to be used.
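The "within 1% of the highest observed BLAST score" rule used in the third strategy can be sketched in a few lines of Python. This is only an illustration, not the UCSC pipeline code: the function name `keep_near_top` and the transcript names are made up, and "within 1%" is interpreted here as a score of at least 99% of the top score.

```python
from typing import List, Tuple


def keep_near_top(scored: List[Tuple[str, float]],
                  tolerance: float = 0.01) -> List[str]:
    """Return the names whose score is within `tolerance` of the best score.

    Assumption: "within 1%" means score >= top_score * (1 - 0.01).
    """
    if not scored:
        return []
    top = max(score for _, score in scored)
    cutoff = top * (1 - tolerance)
    return [name for name, score in scored if score >= cutoff]


# Hypothetical transcripts with made-up alignment scores.
hits = [("tx1", 1000.0), ("tx2", 995.0), ("tx3", 980.0)]
print(keep_near_top(hits))
```

Transcripts with identical coding sequences would tie at the top score, which is why more than one transcript can survive this filter.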
\ \In other words, when an NCBI or UCSC RefSeq track is used for the mapping and to align a\ protein sequence to the correct transcript, we use a three stage process:\
\
\ \- If UniProt has annotated a given RefSeq transcript for a given protein\ sequence, the protein is aligned to this transcript. Any difference in the\ version suffix is tolerated in this comparison. \
- If no transcript is annotated or the transcript cannot be found in the\ NCBI/UCSC RefSeq track, the UniProt-annotated NCBI Gene ID is resolved to a\ set of NCBI RefSeq transcript IDs via the most current version of NCBI\ genes tables. Only the top match of the resulting alignments and all\ others within 1% of its score are used for the mapping.\
- If no transcript can be found after step (2), the protein is aligned to all transcripts;\ the top match and all others within 1% of its score are used.\
This system was designed to resolve the problem of incorrect mappings of\ proteins, mostly on hg38, due to differences between the SwissProt\ sequences and the genome reference sequence, which has changed since the\ proteins were defined. The problem is most pronounced for gene families\ composed of either very repetitive or very similar proteins. To make sure that\ the alignments always go to the best chromosome location, all _alt and _fix\ reference patch sequences are ignored for the alignment, so the patches are\ entirely free of UniProt annotations. Please contact us if you have feedback on\ this process or example edge cases. We are not aware of a way to evaluate the\ results completely and in an automated manner.
\\ Proteins were aligned to transcripts with TBLASTN, converted to PSL, filtered\ with pslReps (93% query coverage, keep alignments within top 1% score), lifted to genome\ positions with pslMap and filtered again with pslReps. UniProt annotations were\ obtained from the UniProt XML file. The UniProt annotations were then mapped to the\ genome through the alignment described above using the pslMap program. This approach\ draws heavily on the LS-SNP pipeline by Mark Diekhans.\ Like all Genome Browser source code, the main script used to build this track\ can be found on Github.\
\ \Older releases
\\ This track is automatically updated on an ongoing basis, every 2-3 months.\ The current version name is always shown on the track details page; it includes the\ release of UniProt, the version of the transcript set and a unique MD5 that is\ based on the protein sequences, the transcript sequences, the mapping file\ between both and the transcript-genome alignment. The exact transcript\ that was used for the alignment is shown when clicking a protein alignment\ in one of the two alignment tracks.\
\ \\ For reproducibility of older analysis results and for manual inspection, previous versions of this track\ are available for browsing in the form of the UCSC UniProt Archive Track Hub (click this link to connect the hub now). The underlying data of\ all releases of this track (past and current) can be obtained from our downloads server, including the UniProt\ protein-to-genome alignment.
\ \Data Access
\ \\ The raw data of the current track can be explored interactively with the\ Table Browser, or the\ Data Integrator.\ For automated analysis, the genome annotation is stored in a bigBed file that \ can be downloaded from the\ download server.\ The exact filenames can be found in the \ track configuration file. \ Annotations can be converted to ASCII text by our tool bigBedToBed\ which can be compiled from the source code or downloaded as a precompiled\ binary for your system. Instructions for downloading source code and binaries can be found\ here.\ The tool can also be used to obtain only features within a given range, for example:\
\ bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/dm6/uniprot/unipStruct.bb -chrom=chr4 -start=0 -end=1000000 stdout \
\ Please refer to our\ mailing list archives\ for questions, or our\ Data Access FAQ\ for more information. \ \ \\ \
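For use in scripts, the range query shown above can be wrapped in a small Python helper. This is a sketch under assumptions: the helper names are hypothetical, the bigBedToBed binary is assumed to be on the PATH, and only the -chrom, -start, and -end options from the example command are used.

```python
import subprocess
from typing import List


def bigbed_region_cmd(url: str, chrom: str, start: int, end: int,
                      out: str = "stdout") -> List[str]:
    """Build the bigBedToBed argument list for one genomic range."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return ["bigBedToBed", url,
            f"-chrom={chrom}", f"-start={start}", f"-end={end}", out]


def fetch_region(url: str, chrom: str, start: int, end: int) -> str:
    """Run bigBedToBed and return its BED output as text.

    Requires the bigBedToBed binary and network access to the URL.
    """
    cmd = bigbed_region_cmd(url, chrom, start, end)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


cmd = bigbed_region_cmd(
    "http://hgdownload.soe.ucsc.edu/gbdb/dm6/uniprot/unipStruct.bb",
    "chr4", 0, 1000000)
print(" ".join(cmd))
```

Building the command as a list avoids shell quoting issues. Only `fetch_region` actually runs the external tool.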
Lifting from UniProt to genome coordinates in pipelines
\To facilitate mapping protein coordinates to the genome, we provide the\ alignment files in formats that are suitable for our command line tools. Our\ command line programs liftOver or pslMap can be used to map\ coordinates on protein sequences to genome coordinates. The filenames are\ unipToGenome.over.chain.gz (liftOver) and unipToGenomeLift.psl.gz (pslMap).
\ \Example commands:\
\ wget -q https://hgdownload.soe.ucsc.edu/goldenPath/archive/hg38/uniprot/2022_03/unipToGenome.over.chain.gz\ wget -q https://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/liftOver\ chmod a+x liftOver\ echo 'Q99697 1 10 annotationOnProtein' > prot.bed\ liftOver prot.bed unipToGenome.over.chain.gz genome.bed\ cat genome.bed\\ \ \Credits
\ \\ This track was created by Maximilian Haeussler at UCSC, with a lot of input from Chris\ Lee, Mark Diekhans and Brian Raney, feedback from the UniProt staff, Alejo\ Mujica, Regeneron Pharmaceuticals and Pia Riestra, GeneDx. Thanks to UniProt for making all data\ available for download.\
\ \References
\ \\ UniProt Consortium.\ \ Reorganizing the protein space at the Universal Protein Resource (UniProt).\ Nucleic Acids Res. 2012 Jan;40(Database issue):D71-5.\ PMID: 22102590; PMC: PMC3245120\
\ \\ Yip YL, Scheib H, Diemand AV, Gattiker A, Famiglietti LM, Gasteiger E, Bairoch A.\ \ The Swiss-Prot variant page and the ModSNP database: a resource for sequence and structure\ information on human protein variants.\ Hum Mutat. 2004 May;23(5):464-70.\ PMID: 15108278\
\ genes 1 allButtonPair on\ compositeTrack on\ dataVersion /gbdb/$D/uniprot/version.txt\ exonNumbers off\ group genes\ hideEmptySubtracks on\ itemRgb on\ longLabel UniProt SwissProt/TrEMBL Protein Annotations\ mouseOverField comments\ shortLabel UniProt\ track uniprot\ type bigBed 12 +\ urls uniProtId="http://www.uniprot.org/uniprot/$$#section_features" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$"\ visibility hide\ spMut UniProt Variants bigBed 12 + UniProt/SwissProt Amino Acid Substitutions 0 100 0 0 0 127 127 127 0 0 0Description
\ \\\ \NOTE:
\ This track is intended for use primarily by physicians and other\ professionals concerned with genetic disorders, by genetics researchers, and\ by advanced students in science and medicine. While the genome browser database\ is open to the public, users seeking information about a personal medical or\ genetic condition are urged to consult with a qualified physician for\ diagnosis and for answers to personal questions.\ This track shows the genomic positions of natural and artificial amino acid variants\ in the UniProt/SwissProt database.\ The data has been curated from scientific publications by the UniProt staff.\
\ \Display Conventions and Configuration
\ \\ Genomic locations of UniProt/SwissProt variants are labeled with the amino acid\ change at a given position and, if known, the abbreviated disease name. A\ "?" is used if there is no disease annotated at this location, but the\ protein is described as being linked to only a single disease in UniProt.\
\ \\ Mouse over a mutation to see the UniProt comments.\
\ \\ Artificially-introduced mutations are colored green and naturally-occurring variants are colored\ red. For full information about a particular variant, click the "UniProt variant" linkout. \ The "UniProt record" linkout lists all variants of a particular protein sequence.\ The "Source articles" linkout lists the articles in PubMed that originally described\ the variant(s) and were used as evidence by the UniProt curators.\
\ \Methods
\ \\ UniProt sequences were aligned to RefSeq sequences first with BLAT, then lifted\ to genome positions with pslMap. UniProt variants were parsed from the UniProt\ XML file. The variants were then mapped to the genome through the alignment\ using the pslMap program. This mapping approach\ draws heavily on the LS-SNP pipeline by Mark Diekhans. The complete script is\ part of the kent source tree and is located in src/hg/utils/uniprotMutations. \
\ \Data Access
\ \\ The raw data can be explored interactively with the\ Table Browser, or the\ Data Integrator.\ For automated analysis, the genome annotation is stored in a bigBed file that\ can be downloaded from the\ download server.\ The underlying data file for this track is called spMut.bb. Individual \ regions or the whole genome annotation can be obtained using our tool bigBedToBed \ which can be compiled from the source code or downloaded as a precompiled binary\ for your system. Instructions for downloading source code and binaries can be found\ here. \ The tool can also be used to obtain only features within a given range, for example:\
\ \ \
\ bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/dm6/bbi/uniprot/spMut.bb -chrom=chr4 -start=0 -end=1000000 stdout \
\ Please refer to our\ mailing list archives\ for questions, or our\ Data Access FAQ\ for more information. \Credits
\ \\ This track was created by Maximilian Haeussler, with advice from Mark Diekhans and Brian Raney.\
\ \References
\ \\ UniProt Consortium.\ \ Reorganizing the protein space at the Universal Protein Resource (UniProt).\ Nucleic Acids Res. 2012 Jan;40(Database issue):D71-5.\ PMID: 22102590; PMC: PMC3245120\
\ \\ Yip YL, Scheib H, Diemand AV, Gattiker A, Famiglietti LM, Gasteiger E, Bairoch A.\ \ The Swiss-Prot variant page and the ModSNP database: a resource for sequence and structure\ information on human protein variants.\ Hum Mutat. 2004 May;23(5):464-70.\ PMID: 15108278\
\ varRep 1 bigDataUrl /gbdb/dm6/uniprot/unipMut.bb\ exonNumbers off\ group varRep\ itemRgb on\ longLabel UniProt/SwissProt Amino Acid Substitutions\ maxWindowCoverage 10000000\ mouseOverField comments\ noScoreFilter on\ shortLabel UniProt Variants\ track spMut\ type bigBed 12 +\ urls variationId="http://www.uniprot.org/uniprot/$$" uniProtId="http://www.uniprot.org/uniprot/$$" pmids="https://www.ncbi.nlm.nih.gov/pubmed/$$"\ visibility hide\ windowmaskerSdust WM + SDust bed 3 Genomic Intervals Masked by WindowMasker + SDust 0 100 0 0 0 127 127 127 0 0 0Description
\ \\ This track depicts masked sequence as determined by\ WindowMasker. The\ WindowMasker tool is included in the NCBI C++ toolkit. The source code\ for the entire toolkit is available from the NCBI\ \ FTP site.\
\ \Methods
\ \\ To create this track, WindowMasker was run with the following parameters:\
\ windowmasker -mk_counts true -input dm6.fa -output wm_counts\ windowmasker -ustat wm_counts -sdust true -input dm6.fa -output repeats.bed\\ The repeats.bed (BED3) file was loaded into the "windowmaskerSdust" table for\ this track.\ \ \References
\ \\ Morgulis A, Gertz EM, Schäffer AA, Agarwala R.\ WindowMasker: window-based masker for sequenced genomes.\ Bioinformatics. 2006 Jan 15;22(2):134-41.\ PMID: 16287941\
\ varRep 1 group varRep\ longLabel Genomic Intervals Masked by WindowMasker + SDust\ shortLabel WM + SDust\ track windowmaskerSdust\ type bed 3\ visibility hide\ chainCondylostylus_patibulatus Condylostylus_patibulatus Chain chain Condylostylus_patibulatus Condylostylus_patibulatus (Condylostylus_patibulatus) Chained Alignments 3 101 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Condylostylus_patibulatus (Condylostylus_patibulatus) Chained Alignments\ otherDb Condylostylus_patibulatus\ parent insectsChainNetViewchain off\ shortLabel Condylostylus_patibulatus Chain\ subGroups view=chain species=s075 clade=c00\ track chainCondylostylus_patibulatus\ type chain Condylostylus_patibulatus\ chainMegaselia_scalaris Megaselia_scalaris Chain chain Megaselia_scalaris Megaselia_scalaris (Megaselia_scalaris) Chained Alignments 3 102 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Megaselia_scalaris (Megaselia_scalaris) Chained Alignments\ otherDb Megaselia_scalaris\ parent insectsChainNetViewchain off\ shortLabel Megaselia_scalaris Chain\ subGroups view=chain species=s076 clade=c00\ track chainMegaselia_scalaris\ type chain Megaselia_scalaris\ chainTrupanea_jonesi Trupanea_jonesi Chain chain Trupanea_jonesi Trupanea_jonesi (Trupanea_jonesi) Chained Alignments 3 103 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Trupanea_jonesi (Trupanea_jonesi) Chained Alignments\ otherDb Trupanea_jonesi\ parent insectsChainNetViewchain off\ shortLabel Trupanea_jonesi Chain\ subGroups view=chain species=s077 clade=c00\ track chainTrupanea_jonesi\ type chain Trupanea_jonesi\ chainAedes_albopictus Aedes_albopictus Chain chain Aedes_albopictus Aedes_albopictus (Aedes_albopictus) Chained Alignments 3 104 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Aedes_albopictus (Aedes_albopictus) Chained Alignments\ otherDb Aedes_albopictus\ parent insectsChainNetViewchain off\ shortLabel Aedes_albopictus Chain\ subGroups view=chain species=s078 clade=c01\ track chainAedes_albopictus\ type chain Aedes_albopictus\ 
chainAedes_aegypti Aedes_aegypti Chain chain Aedes_aegypti Aedes_aegypti (Aedes_aegypti) Chained Alignments 3 105 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Aedes_aegypti (Aedes_aegypti) Chained Alignments\ otherDb Aedes_aegypti\ parent insectsChainNetViewchain off\ shortLabel Aedes_aegypti Chain\ subGroups view=chain species=s079 clade=c01\ track chainAedes_aegypti\ type chain Aedes_aegypti\ chainAnoGam3 A. gambiae Chain chain anoGam3 A. gambiae (Oct. 2006 (AgamP3/anoGam3)) Chained Alignments 3 106 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A. gambiae (Oct. 2006 (AgamP3/anoGam3)) Chained Alignments\ otherDb anoGam3\ parent insectsChainNetViewchain off\ shortLabel A. gambiae Chain\ subGroups view=chain species=s080a clade=c01\ track chainAnoGam3\ type chain anoGam3\ netAnoGam3 A. gambiae Net netAlign anoGam3 chainAnoGam3 A. gambiae (Oct. 2006 (AgamP3/anoGam3)) Alignment Net 1 107 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel A. gambiae (Oct. 2006 (AgamP3/anoGam3)) Alignment Net\ otherDb anoGam3\ parent insectsChainNetViewnet off\ shortLabel A. gambiae Net\ subGroups view=net species=s080a clade=c01\ track netAnoGam3\ type netAlign anoGam3 chainAnoGam3\ chainAnoGam1 A. gambiae Chain chain anoGam1 A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Chained Alignments 3 108 0 0 0 255 255 0 1 0 0Description
\\ This track shows alignments of A. gambiae (anoGam1, Feb. 2003 (IAGEC MOZ2/anoGam1)) to the\ D. melanogaster genome using a gap scoring system that allows longer gaps \ than traditional affine gap scoring systems. It can also tolerate gaps in both\ A. gambiae and D. melanogaster simultaneously. These \ "double-sided" gaps can be caused by local inversions and \ overlapping deletions in both species. The A. gambiae sequence is \ from the \ MOZ2 assembly.
\\ The chain track displays boxes joined together by either single or\ double lines. The boxes represent aligning regions.\ Single lines indicate gaps that are largely due to a deletion in the\ A. gambiae assembly or an insertion in the D. melanogaster \ assembly. Double lines represent more complex gaps that involve substantial\ sequence in both species. This may result from inversions, overlapping\ deletions, an abundance of local mutation, or an unsequenced gap in one\ species. In cases where multiple chains align over a particular region of\ the D. melanogaster genome, the chains with single-lined gaps are often \ due to processed pseudogenes, while chains with double-lined gaps are more \ often due to paralogs and unprocessed pseudogenes.
\\ In the "pack" and "full" display\ modes, the individual feature names indicate the chromosome, strand, and\ location (in thousands) of the match for each matching alignment.
\ \ \Display Conventions and Configuration
\By default, the chains to chromosome-based assemblies are colored\ based on which chromosome they map to in the aligning organism. To turn\ off the coloring, check the "off" button next to: Color\ track based on chromosome.
\\ To display only the chains of one chromosome in the aligning\ organism, enter the name of that chromosome (e.g. chr4) in the box next to: \ Filter by chromosome.
\ \Methods
\\ The A. gambiae/D. melanogaster genomes were aligned with \ blastz and converted into axt format using the lavToAxt program.\ The axt alignments were fed into axtChain, which organizes all \ alignments between a single A. gambiae chromosome and a single \ D. melanogaster chromosome into a group and creates a kd-tree out \ of the gapless subsections (blocks) of the alignments. A dynamic program \ was then run over the kd-trees to find the maximally scoring chains of these \ blocks. Chains scoring below a threshold were discarded; the remaining \ chains are displayed in this track.
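The chaining step described above can be illustrated with a deliberately simplified dynamic program in Python. This is not axtChain: it compares all pairs of blocks in quadratic time instead of using a kd-tree, and it uses a flat, made-up `gap_penalty` rather than the real gap scoring system. The block layout `(t_start, t_end, q_start, q_end, score)` is also an assumption for the example.

```python
from typing import List, Optional, Tuple

# (target_start, target_end, query_start, query_end, score), half-open ranges
Block = Tuple[int, int, int, int, float]


def best_chain(blocks: List[Block],
               gap_penalty: float = 0.0) -> Tuple[float, List[int]]:
    """Return (score, block indices) of the maximally scoring chain.

    A block may follow another only if it starts at or after the end of
    the previous block in BOTH target and query coordinates.
    """
    if not blocks:
        return 0.0, []
    order = sorted(range(len(blocks)), key=lambda i: blocks[i][0])
    best = {}   # best chain score ending at each block
    prev = {}   # predecessor of each block in its best chain
    for pos, j in enumerate(order):
        t_start, _, q_start, _, score = blocks[j]
        best[j], prev[j] = score, None
        for i in order[:pos]:
            _, i_t_end, _, i_q_end, _ = blocks[i]
            if i_t_end <= t_start and i_q_end <= q_start:
                candidate = best[i] - gap_penalty + score
                if candidate > best[j]:
                    best[j], prev[j] = candidate, i
    end = max(best, key=best.get)
    chain: List[int] = []
    k: Optional[int] = end
    while k is not None:       # walk predecessors back to the chain start
        chain.append(k)
        k = prev[k]
    chain.reverse()
    return best[end], chain


blocks = [(0, 10, 0, 10, 5.0), (20, 30, 20, 30, 4.0),
          (12, 18, 50, 60, 3.0), (40, 50, 35, 45, 6.0)]
print(best_chain(blocks))
```

In this toy input the third block is off-diagonal in the query, so the best chain skips it and joins the other three.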
\ \Credits
\\ Blastz was developed at Pennsylvania State University by \ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.
\\ Lineage-specific repeats were identified by Arian Smit and his \ RepeatMasker\ program.
\\ The axtChain program was developed at the University of California at \ Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.
\\ The browser display and database storage of the chains were generated\ by Robert Baertsch and Jim Kent.
\ \References
\\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\
\ \\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ compGeno 1 longLabel A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Chained Alignments\ otherDb anoGam1\ parent insectsChainNetViewchain off\ shortLabel A. gambiae Chain\ subGroups view=chain species=s080b clade=c01\ track chainAnoGam1\ type chain anoGam1\ netAnoGam1 A. gambiae Net netAlign anoGam1 chainAnoGam1 A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Alignment Net 1 109 0 0 0 255 255 0 0 0 0Description
\\ This track shows the best A. gambiae/D. melanogaster chain for \ every part of the D. melanogaster genome. It is useful for\ finding orthologous regions and for studying genome\ rearrangement. The A. gambiae sequence used in this annotation is \ from the Feb. 2003 (IAGEC MOZ2/anoGam1) (anoGam1) assembly.
\ \Display Conventions and Configuration
\\ In full display mode, the top-level (level 1)\ chains are the largest, highest-scoring chains that\ span this region. In many cases gaps exist in the\ top-level chain. When possible, these are filled in by\ other chains that are displayed at level 2. The gaps in \ level 2 chains may be filled by level 3 chains and so\ forth.
\\ In the graphical display, the boxes represent ungapped \ alignments; the lines represent gaps. Click\ on a box to view detailed information about the chain\ as a whole; click on a line to display information\ about the gap. The detailed information is useful in determining\ the cause of the gap or, for lower level chains, the genomic\ rearrangement.
\\ Individual items in the display are categorized as one of four types\ (other than gap):
\\
\ \- Top - the best, longest match. Displayed on level 1.\
- Syn - line-ups on the same chromosome as the gap in the level above\ it.\
- Inv - a line-up on the same chromosome as the gap above it, but in \ the opposite orientation.\
- NonSyn - a match to a chromosome different from the gap in the \ level above.\
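The four item types listed above can be expressed as a small decision function. This Python sketch only restates the definitions in the list. The function name and arguments are hypothetical, and the real netSyntenic program works on full net files rather than on these few fields.

```python
from typing import Optional


def classify_fill(level: int, chrom: str, strand: str,
                  parent_chrom: Optional[str] = None,
                  parent_strand: Optional[str] = None) -> str:
    """Classify a net item relative to the gap it fills in the level above."""
    if level == 1:
        return "Top"      # best, longest match, displayed on level 1
    if chrom != parent_chrom:
        return "NonSyn"   # matches a different chromosome than the gap above
    if strand != parent_strand:
        return "Inv"      # same chromosome, opposite orientation
    return "Syn"          # same chromosome, same orientation


print(classify_fill(2, "chr2L", "-", parent_chrom="chr2L", parent_strand="+"))
```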
Methods
\\ Chains were derived from blastz alignments, using the methods\ described on the chain tracks description pages, and sorted with the \ highest-scoring chains in the genome ranked first. The program\ chainNet was then used to place the chains one at a time, trimming them as \ necessary to fit into sections not already covered by a higher-scoring chain. \ During this process, a natural hierarchy emerged in which a chain that filled \ a gap in a higher-scoring chain was placed underneath that chain. The program \ netSyntenic was used to fill in information about the relationship between \ higher- and lower-level chains, such as whether a lower-level\ chain was syntenic or inverted relative to the higher-level chain. \ The program netClass was then used to fill in how much of the gaps and chains \ contained Ns (sequencing gaps) in one or both species and how much\ was filled with transposons inserted before and after the two organisms \ diverged.
\ \Credits
\\ The chainNet, netSyntenic, and netClass programs were\ developed at the University of California,\ Santa Cruz by Jim Kent.
\\ Blastz was developed at Pennsylvania State University by\ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.
\\ Lineage-specific repeats were identified by Arian Smit and his program \ RepeatMasker.
\\ The browser display and database storage of the nets were made\ by Robert Baertsch and Jim Kent.
\ \References
\\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\
\ \\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\
\ compGeno 0 longLabel A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Alignment Net\ otherDb anoGam1\ parent insectsChainNetViewnet off\ shortLabel A. gambiae Net\ subGroups view=net species=s080b clade=c01\ track netAnoGam1\ type netAlign anoGam1 chainAnoGam1\ chainCulex_quinquefasciatus Culex_quinquefasciatus Chain chain Culex_quinquefasciatus Culex_quinquefasciatus (Culex_quinquefasciatus) Chained Alignments 3 110 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Culex_quinquefasciatus (Culex_quinquefasciatus) Chained Alignments\ otherDb Culex_quinquefasciatus\ parent insectsChainNetViewchain off\ shortLabel Culex_quinquefasciatus Chain\ subGroups view=chain species=s081 clade=c01\ track chainCulex_quinquefasciatus\ type chain Culex_quinquefasciatus\ chainA_maculatus A_maculatus Chain chain A_maculatus A_maculatus (A_maculatus) Chained Alignments 3 111 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_maculatus (A_maculatus) Chained Alignments\ otherDb A_maculatus\ parent insectsChainNetViewchain off\ shortLabel A_maculatus Chain\ subGroups view=chain species=s082 clade=c01\ track chainA_maculatus\ type chain A_maculatus\ chainA_merus A_merus Chain chain A_merus A_merus (A_merus) Chained Alignments 3 112 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_merus (A_merus) Chained Alignments\ otherDb A_merus\ parent insectsChainNetViewchain off\ shortLabel A_merus Chain\ subGroups view=chain species=s083 clade=c01\ track chainA_merus\ type chain A_merus\ chainA_dirus A_dirus Chain chain A_dirus A_dirus (A_dirus) Chained Alignments 3 113 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_dirus (A_dirus) Chained Alignments\ otherDb A_dirus\ parent insectsChainNetViewchain off\ shortLabel A_dirus Chain\ subGroups view=chain species=s084 clade=c01\ track chainA_dirus\ type chain A_dirus\ chainA_arabiensis A_arabiensis Chain chain A_arabiensis A_arabiensis (A_arabiensis) Chained Alignments 3 114 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_arabiensis (A_arabiensis) Chained Alignments\ otherDb 
A_arabiensis\ parent insectsChainNetViewchain off\ shortLabel A_arabiensis Chain\ subGroups view=chain species=s085 clade=c01\ track chainA_arabiensis\ type chain A_arabiensis\ chainA_sinensis A_sinensis Chain chain A_sinensis A_sinensis (A_sinensis) Chained Alignments 3 115 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_sinensis (A_sinensis) Chained Alignments\ otherDb A_sinensis\ parent insectsChainNetViewchain off\ shortLabel A_sinensis Chain\ subGroups view=chain species=s086 clade=c01\ track chainA_sinensis\ type chain A_sinensis\ chainA_atroparvus A_atroparvus Chain chain A_atroparvus A_atroparvus (A_atroparvus) Chained Alignments 3 116 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_atroparvus (A_atroparvus) Chained Alignments\ otherDb A_atroparvus\ parent insectsChainNetViewchain off\ shortLabel A_atroparvus Chain\ subGroups view=chain species=s087 clade=c01\ track chainA_atroparvus\ type chain A_atroparvus\ chainA_epiroticus A_epiroticus Chain chain A_epiroticus A_epiroticus (A_epiroticus) Chained Alignments 3 117 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_epiroticus (A_epiroticus) Chained Alignments\ otherDb A_epiroticus\ parent insectsChainNetViewchain off\ shortLabel A_epiroticus Chain\ subGroups view=chain species=s088 clade=c01\ track chainA_epiroticus\ type chain A_epiroticus\ chainA_quadriannulatus A_quadriannulatus Chain chain A_quadriannulatus A_quadriannulatus (A_quadriannulatus) Chained Alignments 3 118 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_quadriannulatus (A_quadriannulatus) Chained Alignments\ otherDb A_quadriannulatus\ parent insectsChainNetViewchain off\ shortLabel A_quadriannulatus Chain\ subGroups view=chain species=s089 clade=c01\ track chainA_quadriannulatus\ type chain A_quadriannulatus\ chainA_farauti A_farauti Chain chain A_farauti A_farauti (A_farauti) Chained Alignments 3 119 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_farauti (A_farauti) Chained Alignments\ otherDb A_farauti\ parent insectsChainNetViewchain off\ shortLabel 
A_farauti Chain\ subGroups view=chain species=s090 clade=c01\ track chainA_farauti\ type chain A_farauti\ chainA_minimus A_minimus Chain chain A_minimus A_minimus (A_minimus) Chained Alignments 3 120 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_minimus (A_minimus) Chained Alignments\ otherDb A_minimus\ parent insectsChainNetViewchain off\ shortLabel A_minimus Chain\ subGroups view=chain species=s091 clade=c01\ track chainA_minimus\ type chain A_minimus\ chainA_funestus A_funestus Chain chain A_funestus A_funestus (A_funestus) Chained Alignments 3 121 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_funestus (A_funestus) Chained Alignments\ otherDb A_funestus\ parent insectsChainNetViewchain off\ shortLabel A_funestus Chain\ subGroups view=chain species=s092 clade=c01\ track chainA_funestus\ type chain A_funestus\ chainA_melas A_melas Chain chain A_melas A_melas (A_melas) Chained Alignments 3 122 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_melas (A_melas) Chained Alignments\ otherDb A_melas\ parent insectsChainNetViewchain off\ shortLabel A_melas Chain\ subGroups view=chain species=s093 clade=c01\ track chainA_melas\ type chain A_melas\ chainA_coluzzii A_coluzzii Chain chain A_coluzzii A_coluzzii (A_coluzzii) Chained Alignments 3 123 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_coluzzii (A_coluzzii) Chained Alignments\ otherDb A_coluzzii\ parent insectsChainNetViewchain off\ shortLabel A_coluzzii Chain\ subGroups view=chain species=s094 clade=c01\ track chainA_coluzzii\ type chain A_coluzzii\ chainClogmia_albipunctata Clogmia_albipunctata Chain chain Clogmia_albipunctata Clogmia_albipunctata (Clogmia_albipunctata) Chained Alignments 3 124 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Clogmia_albipunctata (Clogmia_albipunctata) Chained Alignments\ otherDb Clogmia_albipunctata\ parent insectsChainNetViewchain off\ shortLabel Clogmia_albipunctata Chain\ subGroups view=chain species=s095 clade=c01\ track chainClogmia_albipunctata\ type chain Clogmia_albipunctata\ 
chainPhlebotomus_papatasi Phlebotomus_papatasi Chain chain Phlebotomus_papatasi Phlebotomus_papatasi (Phlebotomus_papatasi) Chained Alignments 3 125 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Phlebotomus_papatasi (Phlebotomus_papatasi) Chained Alignments\ otherDb Phlebotomus_papatasi\ parent insectsChainNetViewchain off\ shortLabel Phlebotomus_papatasi Chain\ subGroups view=chain species=s096 clade=c01\ track chainPhlebotomus_papatasi\ type chain Phlebotomus_papatasi\ chainA_culicifacies A_culicifacies Chain chain A_culicifacies A_culicifacies (A_culicifacies) Chained Alignments 3 126 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_culicifacies (A_culicifacies) Chained Alignments\ otherDb A_culicifacies\ parent insectsChainNetViewchain off\ shortLabel A_culicifacies Chain\ subGroups view=chain species=s097 clade=c01\ track chainA_culicifacies\ type chain A_culicifacies\ chainA_stephensi A_stephensi Chain chain A_stephensi A_stephensi (A_stephensi) Chained Alignments 3 127 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_stephensi (A_stephensi) Chained Alignments\ otherDb A_stephensi\ parent insectsChainNetViewchain off\ shortLabel A_stephensi Chain\ subGroups view=chain species=s098 clade=c01\ track chainA_stephensi\ type chain A_stephensi\ chainCoboldia_fuscipes Coboldia_fuscipes Chain chain Coboldia_fuscipes Coboldia_fuscipes (Coboldia_fuscipes) Chained Alignments 3 128 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Coboldia_fuscipes (Coboldia_fuscipes) Chained Alignments\ otherDb Coboldia_fuscipes\ parent insectsChainNetViewchain off\ shortLabel Coboldia_fuscipes Chain\ subGroups view=chain species=s099 clade=c01\ track chainCoboldia_fuscipes\ type chain Coboldia_fuscipes\ chainCulicoides_sonorensis Culicoides_sonorensis Chain chain Culicoides_sonorensis Culicoides_sonorensis (Culicoides_sonorensis) Chained Alignments 3 129 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Culicoides_sonorensis (Culicoides_sonorensis) Chained Alignments\ otherDb Culicoides_sonorensis\ parent 
insectsChainNetViewchain off\ shortLabel Culicoides_sonorensis Chain\ subGroups view=chain species=s100 clade=c01\ track chainCulicoides_sonorensis\ type chain Culicoides_sonorensis\ chainA_albimanus A_albimanus Chain chain A_albimanus A_albimanus (A_albimanus) Chained Alignments 3 130 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_albimanus (A_albimanus) Chained Alignments\ otherDb A_albimanus\ parent insectsChainNetViewchain off\ shortLabel A_albimanus Chain\ subGroups view=chain species=s101 clade=c01\ track chainA_albimanus\ type chain A_albimanus\ chainA_christyi A_christyi Chain chain A_christyi A_christyi (A_christyi) Chained Alignments 3 131 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_christyi (A_christyi) Chained Alignments\ otherDb A_christyi\ parent insectsChainNetViewchain off\ shortLabel A_christyi Chain\ subGroups view=chain species=s102 clade=c01\ track chainA_christyi\ type chain A_christyi\ chainA_cracens A_cracens Chain chain A_cracens A_cracens (A_cracens) Chained Alignments 3 132 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_cracens (A_cracens) Chained Alignments\ otherDb A_cracens\ parent insectsChainNetViewchain off\ shortLabel A_cracens Chain\ subGroups view=chain species=s103 clade=c01\ track chainA_cracens\ type chain A_cracens\ chainA_aquasalis A_aquasalis Chain chain A_aquasalis A_aquasalis (A_aquasalis) Chained Alignments 3 133 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_aquasalis (A_aquasalis) Chained Alignments\ otherDb A_aquasalis\ parent insectsChainNetViewchain off\ shortLabel A_aquasalis Chain\ subGroups view=chain species=s104 clade=c01\ track chainA_aquasalis\ type chain A_aquasalis\ chainMochlonyx_cinctipes Mochlonyx_cinctipes Chain chain Mochlonyx_cinctipes Mochlonyx_cinctipes (Mochlonyx_cinctipes) Chained Alignments 3 134 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Mochlonyx_cinctipes (Mochlonyx_cinctipes) Chained Alignments\ otherDb Mochlonyx_cinctipes\ parent insectsChainNetViewchain off\ shortLabel Mochlonyx_cinctipes Chain\ 
subGroups view=chain species=s105 clade=c01\ track chainMochlonyx_cinctipes\ type chain Mochlonyx_cinctipes\ chainA_darlingi A_darlingi Chain chain A_darlingi A_darlingi (A_darlingi) Chained Alignments 3 135 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_darlingi (A_darlingi) Chained Alignments\ otherDb A_darlingi\ parent insectsChainNetViewchain off\ shortLabel A_darlingi Chain\ subGroups view=chain species=s106 clade=c01\ track chainA_darlingi\ type chain A_darlingi\ chainA_farauti_No4 A_farauti_No4 Chain chain A_farauti_No4 A_farauti_No4 (A_farauti_No4) Chained Alignments 3 136 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_farauti_No4 (A_farauti_No4) Chained Alignments\ otherDb A_farauti_No4\ parent insectsChainNetViewchain off\ shortLabel A_farauti_No4 Chain\ subGroups view=chain species=s107 clade=c01\ track chainA_farauti_No4\ type chain A_farauti_No4\ chainLutzomyia_longipalpis Lutzomyia_longipalpis Chain chain Lutzomyia_longipalpis Lutzomyia_longipalpis (Lutzomyia_longipalpis) Chained Alignments 3 137 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Lutzomyia_longipalpis (Lutzomyia_longipalpis) Chained Alignments\ otherDb Lutzomyia_longipalpis\ parent insectsChainNetViewchain off\ shortLabel Lutzomyia_longipalpis Chain\ subGroups view=chain species=s108 clade=c01\ track chainLutzomyia_longipalpis\ type chain Lutzomyia_longipalpis\ chainA_koliensis A_koliensis Chain chain A_koliensis A_koliensis (A_koliensis) Chained Alignments 3 138 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_koliensis (A_koliensis) Chained Alignments\ otherDb A_koliensis\ parent insectsChainNetViewchain off\ shortLabel A_koliensis Chain\ subGroups view=chain species=s109 clade=c01\ track chainA_koliensis\ type chain A_koliensis\ chainBelgica_antarctica Belgica_antarctica Chain chain Belgica_antarctica Belgica_antarctica (Belgica_antarctica) Chained Alignments 3 139 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Belgica_antarctica (Belgica_antarctica) Chained Alignments\ otherDb Belgica_antarctica\ 
parent insectsChainNetViewchain off\ shortLabel Belgica_antarctica Chain\ subGroups view=chain species=s110 clade=c01\ track chainBelgica_antarctica\ type chain Belgica_antarctica\ chainA_punctulatus A_punctulatus Chain chain A_punctulatus A_punctulatus (A_punctulatus) Chained Alignments 3 140 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_punctulatus (A_punctulatus) Chained Alignments\ otherDb A_punctulatus\ parent insectsChainNetViewchain off\ shortLabel A_punctulatus Chain\ subGroups view=chain species=s111 clade=c01\ track chainA_punctulatus\ type chain A_punctulatus\ chainClunio_marinus Clunio_marinus Chain chain Clunio_marinus Clunio_marinus (Clunio_marinus) Chained Alignments 3 141 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Clunio_marinus (Clunio_marinus) Chained Alignments\ otherDb Clunio_marinus\ parent insectsChainNetViewchain off\ shortLabel Clunio_marinus Chain\ subGroups view=chain species=s112 clade=c01\ track chainClunio_marinus\ type chain Clunio_marinus\ chainMayetiola_destructor Mayetiola_destructor Chain chain Mayetiola_destructor Mayetiola_destructor (Mayetiola_destructor) Chained Alignments 3 142 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Mayetiola_destructor (Mayetiola_destructor) Chained Alignments\ otherDb Mayetiola_destructor\ parent insectsChainNetViewchain off\ shortLabel Mayetiola_destructor Chain\ subGroups view=chain species=s113 clade=c01\ track chainMayetiola_destructor\ type chain Mayetiola_destructor\ chainChironomus_tentans Chironomus_tentans Chain chain Chironomus_tentans Chironomus_tentans (Chironomus_tentans) Chained Alignments 3 143 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Chironomus_tentans (Chironomus_tentans) Chained Alignments\ otherDb Chironomus_tentans\ parent insectsChainNetViewchain off\ shortLabel Chironomus_tentans Chain\ subGroups view=chain species=s114 clade=c01\ track chainChironomus_tentans\ type chain Chironomus_tentans\ chainChironomus_riparius Chironomus_riparius Chain chain Chironomus_riparius 
Chironomus_riparius (Chironomus_riparius) Chained Alignments 3 144 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Chironomus_riparius (Chironomus_riparius) Chained Alignments\ otherDb Chironomus_riparius\ parent insectsChainNetViewchain off\ shortLabel Chironomus_riparius Chain\ subGroups view=chain species=s115 clade=c01\ track chainChironomus_riparius\ type chain Chironomus_riparius\ chainA_gambiae_1 A_gambiae_1 Chain chain A_gambiae_1 A_gambiae_1 (A_gambiae_1) Chained Alignments 3 145 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_gambiae_1 (A_gambiae_1) Chained Alignments\ otherDb A_gambiae_1\ parent insectsChainNetViewchain off\ shortLabel A_gambiae_1 Chain\ subGroups view=chain species=s116 clade=c01\ track chainA_gambiae_1\ type chain A_gambiae_1\ chainA_nili A_nili Chain chain A_nili A_nili (A_nili) Chained Alignments 3 146 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A_nili (A_nili) Chained Alignments\ otherDb A_nili\ parent insectsChainNetViewchain off\ shortLabel A_nili Chain\ subGroups view=chain species=s117 clade=c01\ track chainA_nili\ type chain A_nili\ chainChaoborus_trivitattus Chaoborus_trivitattus Chain chain Chaoborus_trivitattus Chaoborus_trivitattus (Chaoborus_trivitattus) Chained Alignments 3 147 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Chaoborus_trivitattus (Chaoborus_trivitattus) Chained Alignments\ otherDb Chaoborus_trivitattus\ parent insectsChainNetViewchain off\ shortLabel Chaoborus_trivitattus Chain\ subGroups view=chain species=s118 clade=c01\ track chainChaoborus_trivitattus\ type chain Chaoborus_trivitattus\ chainTipula_oleracea Tipula_oleracea Chain chain Tipula_oleracea Tipula_oleracea (Tipula_oleracea) Chained Alignments 3 148 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Tipula_oleracea (Tipula_oleracea) Chained Alignments\ otherDb Tipula_oleracea\ parent insectsChainNetViewchain off\ shortLabel Tipula_oleracea Chain\ subGroups view=chain species=s119 clade=c01\ track chainTipula_oleracea\ type chain Tipula_oleracea\ 
chainTrichoceridae_BV_2014 Trichoceridae_BV_2014 Chain chain Trichoceridae_BV_2014 Trichoceridae_BV_2014 (Trichoceridae_BV_2014) Chained Alignments 3 149 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel Trichoceridae_BV_2014 (Trichoceridae_BV_2014) Chained Alignments\ otherDb Trichoceridae_BV_2014\ parent insectsChainNetViewchain off\ shortLabel Trichoceridae_BV_2014 Chain\ subGroups view=chain species=s120 clade=c01\ track chainTrichoceridae_BV_2014\ type chain Trichoceridae_BV_2014\ rmsk RepeatMasker rmsk Repeating Elements by RepeatMasker 1 149.1 0 0 0 127 127 127 1 0 0 Description
\\ This track was created using Arian Smit's RepeatMasker program, which screens DNA sequences \ for interspersed repeats and low-complexity DNA sequences. The program\ outputs a detailed annotation of the repeats that are present in the\ query sequence (represented by this track), as well as a modified version\ of the query sequence in which all the annotated repeats have been masked\ (generally available on the\ Downloads page). RepeatMasker uses \ the Repbase Update library of repeats from the \ Genetic \ Information Research Institute (GIRI). \ Repbase Update is described in Jurka J. (2000), listed in the References section below.
\ \Display Conventions and Configuration
\\ In full display mode, this track displays up to ten different classes of repeats:\
\
\- Short interspersed nuclear elements (SINE), which include ALUs\
- Long interspersed nuclear elements (LINE)\
- Long terminal repeat elements (LTR), which include retroposons\
- DNA repeat elements (DNA)\
- Simple repeats (microsatellites)\
- Low complexity repeats\
- Satellite repeats\
- RNA repeats (including RNA, tRNA, rRNA, snRNA, scRNA)\
- Other repeats, which include class RC (Rolling Circle)\
- Unknown\
\ The level of color shading in the graphical display reflects the amount of \ base mismatch, base deletion, and base insertion associated with a repeat \ element. The higher the combined number of these, the lighter the shading.
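As a rough illustration of the shading rule above, the sketch below maps the combined mismatch, deletion, and insertion amounts to a gray level, with larger totals giving lighter shades. The function name `shade`, the per-thousand units, the cap `max_total`, and the linear scale are assumptions chosen for this example; the browser's actual color mapping is not stated in this description.

```python
def shade(milli_div, milli_del, milli_ins, max_total=500, lightest=200):
    """Return a gray level: 0 is darkest, `lightest` is the palest shade used.

    Inputs are hypothetical per-thousand counts of mismatched, deleted, and
    inserted bases for one repeat element. Totals above max_total are capped.
    """
    total = min(milli_div + milli_del + milli_ins, max_total)
    return round(lightest * total / max_total)


# A perfect copy is darkest; a heavily diverged copy is lightest.
print(shade(0, 0, 0))        # darkest
print(shade(100, 25, 25))    # intermediate
print(shade(400, 100, 100))  # capped at the lightest shade
```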
\ \Methods
\\ UCSC has used the most current versions of the RepeatMasker software \ and repeat libraries available to generate these data. Note that these \ versions may be newer than those that are publicly available on the Internet. \
\\ Data are generated using the RepeatMasker -s flag. Additional flags\ may be used for certain organisms. Repeats are soft-masked. Alignments may \ extend through repeats, but are not permitted to initiate in them. \ See the \ FAQ for \ more information.
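Soft-masking conventionally means that repeat bases are written in lowercase while the rest of the sequence stays uppercase. The sketch below shows one way to recover the masked intervals from such a sequence; the function name `soft_masked_intervals` and the example sequence are made up for illustration and are not part of any UCSC tool.

```python
def soft_masked_intervals(seq):
    """Return half-open (start, end) runs of lowercase (soft-masked) bases."""
    runs = []
    start = None
    for i, base in enumerate(seq):
        if base.islower():
            if start is None:
                start = i            # a masked run begins here
        elif start is not None:
            runs.append((start, i))  # the run ended at the previous base
            start = None
    if start is not None:
        runs.append((start, len(seq)))
    return runs


# Hypothetical soft-masked sequence: two lowercase repeat runs.
example_seq = "ACGTacgtacGGTTaa"
intervals = soft_masked_intervals(example_seq)
masked_bases = sum(end - start for start, end in intervals)
print(intervals, masked_bases / len(example_seq))
```

Because the original bases are preserved, an aligner can still extend through a lowercase run, which is consistent with the rule above that alignments may pass through repeats but not start in them.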
\ \Credits
\\ Thanks to Arian Smit and GIRI\ for providing the tools and repeat libraries used to generate this track.
\ \References
\\ Jurka J.\ Repbase update: a database and an electronic journal of repetitive elements.\ Trends Genet. 2000 Sep;16(9):418-20.\ PMID: 10973072\
\ varRep 0 canPack off\ group varRep\ longLabel Repeating Elements by RepeatMasker\ priority 149.1\ shortLabel RepeatMasker\ spectrum on\ track rmsk\ type rmsk\ visibility dense\ chainTriCas2 triCas2 Chain chain triCas2 T. castaneum (Sep. 2005 (Baylor 2.0/triCas2)) Chained Alignments 3 150 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel T. castaneum (Sep. 2005 (Baylor 2.0/triCas2)) Chained Alignments\ otherDb triCas2\ parent insectsChainNetViewchain off\ shortLabel triCas2 Chain\ subGroups view=chain species=s121 clade=c02\ track chainTriCas2\ type chain triCas2\ netTriCas2 triCas2 Net netAlign triCas2 chainTriCas2 T. castaneum (Sep. 2005 (Baylor 2.0/triCas2)) Alignment Net 1 151 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel T. castaneum (Sep. 2005 (Baylor 2.0/triCas2)) Alignment Net\ otherDb triCas2\ parent insectsChainNetViewnet off\ shortLabel triCas2 Net\ subGroups view=net species=s121 clade=c02\ track netTriCas2\ type netAlign triCas2 chainTriCas2\ chainApiMel4 apiMel4 Chain chain apiMel4 A. mellifera (04 Nov 2010 (Amel_4.5/apiMel4)) Chained Alignments 3 152 0 0 0 255 255 0 1 0 0 compGeno 1 longLabel A. mellifera (04 Nov 2010 (Amel_4.5/apiMel4)) Chained Alignments\ otherDb apiMel4\ parent insectsChainNetViewchain off\ shortLabel apiMel4 Chain\ subGroups view=chain species=s122 clade=c02\ track chainApiMel4\ type chain apiMel4\ netApiMel4 apiMel4 Net netAlign apiMel4 chainApiMel4 A. mellifera (04 Nov 2010 (Amel_4.5/apiMel4)) Alignment Net 1 153 0 0 0 255 255 0 0 0 0 compGeno 0 longLabel A. mellifera (04 Nov 2010 (Amel_4.5/apiMel4)) Alignment Net\ otherDb apiMel4\ parent insectsChainNetViewnet off\ shortLabel apiMel4 Net\ subGroups view=net species=s122 clade=c02\ track netApiMel4\ type netAlign apiMel4 chainApiMel4\