cartVersion cartVersion cartVersion cartVersion 0 0 0 0 0 0 0 0 0 0 0 cartVersion cartVersion cartVersion 0 cartVersion 0 cytoBand Chromosome Band bed 4 + Chromosome Bands 0 0.1 0 0 0 200 150 150 0 0 0

Description

This track shows D. melanogaster cytogenetic locations, mapped to the release 3 genome sequence by Aubrey de Grey of FlyBase, July 2002.

Credits

We would like to thank FlyBase and Aubrey de Grey for providing this information.

map 1 altColor 200,150,150\ group map\ longLabel Chromosome Bands\ priority .1\ shortLabel Chromosome Band\ track cytoBand\ type bed 4 +\ visibility hide\ cytoBandIdeo Chromosome Band (Low-res) bed 4 + Chromosome Bands (Low-resolution for Chromosome Ideogram) 1 0.1 0 0 0 200 150 150 0 0 0 map 1 altColor 200,150,150\ group map\ longLabel Chromosome Bands (Low-resolution for Chromosome Ideogram)\ priority .1\ shortLabel Chromosome Band (Low-res)\ track cytoBandIdeo\ type bed 4 +\ visibility dense\ refGene RefSeq Genes genePred refPep refMrna RefSeq Genes 1 2 12 12 120 133 133 187 0 0 0

Description

The RefSeq Genes track shows known D. melanogaster protein-coding and non-protein-coding genes taken from the NCBI RNA reference sequences collection (RefSeq). The data underlying this track are updated weekly.

Please visit the Feedback for Gene and Reference Sequences (RefSeq) page to make suggestions, submit additions and corrections, or ask for help concerning RefSeq records.

Display Conventions and Configuration

This track follows the display conventions for gene prediction tracks. The color shading indicates the level of review the RefSeq record has undergone: predicted (light), provisional (medium), reviewed (dark).

The item labels and display colors of features within this track can be configured through the controls at the top of the track description page. This page is accessed via the small button to the left of the track's graphical display or through the link on the track's control menu.

Methods

RefSeq RNAs were aligned against the D. melanogaster genome using blat; those with an alignment of less than 15% were discarded. When a single RNA aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.1% of the best and at least 96% base identity with the genomic sequence were kept.

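The filtering rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the UCSC pipeline: the `Alignment` record, its field names, and the accessions in the usage note are hypothetical; "an alignment of less than 15%" is read here as query coverage; and "within 0.1% of the best" is read as percentage points of base identity.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical summary of one blat hit (not a real PSL record)."""
    query: str       # RNA accession
    chrom: str       # target chromosome
    identity: float  # percent base identity, 0-100
    coverage: float  # percent of the query that aligned, 0-100 (assumed meaning)


def filter_near_best(alignments, min_coverage=15.0, min_identity=96.0, near_best=0.1):
    """Keep, per query, hits close to the best identity and above the floor."""
    by_query = {}
    for aln in alignments:
        # Discard hits that align too little of the query.
        if aln.coverage < min_coverage:
            continue
        by_query.setdefault(aln.query, []).append(aln)

    kept = []
    for hits in by_query.values():
        best = max(h.identity for h in hits)
        for h in hits:
            # Must be near the best hit for this query AND pass the absolute floor.
            if h.identity >= min_identity and best - h.identity <= near_best:
                kept.append(h)
    return kept
```

For example, a query with hits at 99.5%, 99.45%, and 98.0% identity would keep the first two; a query whose best hit is 95.0% would keep none. Passing `near_best=0.5` would mimic the looser threshold described for the EST and mRNA tracks.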

Credits

This track was produced at UCSC from RNA sequence data generated by scientists worldwide and curated by the NCBI RefSeq project.

References

Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518

Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4. PMID: 15608248; PMC: PMC539979

\ genes 1 baseColorUseCds given\ color 12,12,120\ group genes\ idXref hgFixed.refLink mrnaAcc name\ longLabel RefSeq Genes\ priority 2\ shortLabel RefSeq Genes\ track refGene\ type genePred refPep refMrna\ visibility dense\ pscreen BDGP Insertions bed 6 + BDGP Gene Disruption Project P-element Insertion Locations 0 12 50 50 250 152 152 252 0 0 0 http://flypush.imgen.bcm.tmc.edu/pscreen/plqlsearch2.cgi?gene=&mapping=&chromosome=Any&bg=$$&remarks=&orderby=Chromosome&submit=%20%20%20%20Start%20Query%20%20%20%20

Description

This track shows the locations of P transposable element insertions from P-Screen, the online database of the BDGP Gene Disruption Project.

Triangular arrows indicate the approximate positions in the reference sequence at which P elements have been inserted. The direction of the arrow corresponds to the orientation of the insertion: right-pointing arrows indicate forward orientation; left-pointing arrows show reverse orientation. The item name indicates the strain that has a P element disruption at that location. When a stock order number is available for the strain, a link is provided to the Bloomington stock center where the strain can be ordered. The project's strain library contains more than 7140 strains disrupting at least 5362 different genes, corresponding to 39% of the 13,666 currently annotated Drosophila genes.

Methods

See the P-Screen homepage for materials and methods. See also a list of publications on the Gene Disruption Project.

Credits

Thanks to the BDGP Gene Disruption Project for providing these data. Please cite the Bellen et al. reference below when using strains from the collection.

References

Bellen HJ, Levis RW, Liao G, He Y, Carlson JW, Tsang G, Evans-Holm M, Hiesinger PR, Schulze KL, Rubin GM et al. The BDGP gene disruption project: single transposon insertions associated with 40% of Drosophila genes. Genetics. 2004 Jun;167(2):761-81. PMID: 15238527; PMC: PMC1470905

\ map 1 color 50,50,250\ group map\ longLabel BDGP Gene Disruption Project P-element Insertion Locations\ noScoreFilter .\ priority 12\ shortLabel BDGP Insertions\ track pscreen\ type bed 6 +\ url http://flypush.imgen.bcm.tmc.edu/pscreen/plqlsearch2.cgi?gene=&mapping=&chromosome=Any&bg=$$&remarks=&orderby=Chromosome&submit=%20%20%20%20Start%20Query%20%20%20%20\ urlLabel GDP Strain ID:\ visibility hide\ bdgpGene FlyBase Genes genePred bdgpGenePep Protein-Coding Genes from FlyBase 3 34 0 100 180 127 177 217 0 0 0

Description

This track shows protein-coding genes annotated by FlyBase (version 3.2).

Credits

Thanks to FlyBase for providing these annotations.

\ \ genes 1 color 0,100,180\ directUrl /cgi-bin/hgGene?hgg_gene=%s&hgg_chrom=%s&hgg_start=%d&hgg_end=%d&hgg_type=%s&db=%s\ group genes\ hgGene on\ hgsid on\ longLabel Protein-Coding Genes from FlyBase\ priority 34\ shortLabel FlyBase Genes\ track bdgpGene\ type genePred bdgpGenePep\ visibility pack\ bdgpNonCoding FlyBase Non-coding genePred Non-Coding Genes from FlyBase 3 34.5 30 130 210 142 192 232 0 0 0

Description

This track shows non-coding genes annotated by FlyBase (version 3.2).

Credits

Thanks to FlyBase for providing these annotations.

\ \ genes 1 color 30,130,210\ group genes\ longLabel Non-Coding Genes from FlyBase\ priority 34.5\ shortLabel FlyBase Non-coding\ track bdgpNonCoding\ type genePred\ visibility pack\ intronEst Spliced ESTs psl est D. melanogaster ESTs That Have Been Spliced 1 56 0 0 0 127 127 127 1 0 0

Description

This track shows alignments between D. melanogaster expressed sequence tags (ESTs) in GenBank and the genome that show signs of splicing when aligned against the genome. ESTs are single-read sequences, typically about 500 bases in length, that usually represent fragments of transcribed genes.

To be considered spliced, an EST must show evidence of at least one canonical intron, i.e. one that is at least 32 bases in length and has GT/AG ends. By requiring splicing, the level of contamination in the EST databases is drastically reduced at the expense of eliminating many genuine 3' ESTs. For a display of all ESTs (including unspliced), see the D. melanogaster EST track.

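As a rough illustration of the splicing criterion above, the following Python sketch tests whether any gap between consecutive aligned blocks looks like a canonical intron. The function name, the block representation (sorted, half-open genomic intervals), and the optional check for the reverse-strand signature (CT/AC) are assumptions made for this example; the track's actual implementation is not described here and may apply further conditions.

```python
def has_canonical_intron(genome, blocks, min_intron=32, check_reverse=True):
    """Return True if any gap between aligned blocks looks like a GT/AG intron.

    genome: target sequence as a string.
    blocks: sorted list of (start, end) half-open genomic intervals covered
            by the EST alignment (hypothetical representation).
    """
    for (_, left_end), (right_start, _) in zip(blocks, blocks[1:]):
        gap = genome[left_end:right_start].upper()
        if len(gap) < min_intron:
            continue  # too short to count as an intron
        if gap.startswith("GT") and gap.endswith("AG"):
            return True
        # Assumption: a GT/AG intron of a minus-strand transcript reads CT...AC
        # on the forward strand, so accept that signature as well.
        if check_reverse and gap.startswith("CT") and gap.endswith("AC"):
            return True
    return False
```

A two-block alignment separated by a 34-base gap beginning with GT and ending with AG would pass, while the same alignment with a 14-base gap would not.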

Display Conventions and Configuration

This track follows the display conventions for PSL alignment tracks. In dense display mode, darker shading indicates a larger number of aligned ESTs.

The strand information (+/-) indicates the direction of the match between the EST and the matching genomic sequence. It bears no relationship to the direction of transcription of the RNA with which it might be associated.

The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task.

To use the filter:

  1. Type a term in one or more of the text boxes to filter the EST display. For example, to apply the filter to all ESTs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Wildcards may also be used in the filter.
  2. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only ESTs that match all filter criteria will be highlighted. If "or" is selected, ESTs that match any one of the filter criteria will be highlighted.
  3. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display ESTs that match the filter criteria. If "include" is selected, the browser will display only those ESTs that match the filter criteria.

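The combination logic in the steps above can be mimicked with a small Python sketch. It models only the "and"/"or" choice and the include/exclude modes (not color highlighting), treats each EST as a plain dictionary, and uses shell-style wildcards from the standard `fnmatch` module; the field names, the record layout, and the wildcard syntax are assumptions for illustration and are not taken from the browser's code.

```python
from fnmatch import fnmatchcase


def matches(est, criteria, logic="and"):
    """Test one EST record (dict of field -> value) against wildcard criteria."""
    hits = [
        fnmatchcase(est.get(field, "").lower(), pattern.lower())
        for field, pattern in criteria.items()
    ]
    # "and": every criterion must match; "or": any single match is enough.
    return all(hits) if logic == "and" else any(hits)


def apply_filter(ests, criteria, logic="and", mode="include"):
    """Return the ESTs that would remain displayed under the given mode."""
    if mode == "include":
        return [e for e in ests if matches(e, criteria, logic)]
    if mode == "exclude":
        return [e for e in ests if not matches(e, criteria, logic)]
    raise ValueError("mode must be 'include' or 'exclude'")
```

With criteria `{"tissue": "he*", "library": "GH"}`, "and" logic in include mode keeps only records whose tissue starts with "he" and whose library is "GH", whereas "or" logic keeps records that satisfy either condition.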

This track may also be configured to display base labeling, a feature that allows the user to display all bases in the aligning sequence or only those that differ from the genomic sequence. For more information about this option, click here.

Methods

To make an EST, RNA is isolated from cells and reverse transcribed into cDNA. Typically, the cDNA is cloned into a plasmid vector and a read is taken from the 5' and/or 3' primer. For most ESTs, though not all, the reverse transcription is primed by an oligo-dT, which hybridizes with the poly-A tail of mature mRNA. The reverse transcriptase may or may not make it to the 5' end of the mRNA, which may or may not be degraded.

In general, the 3' ESTs mark the end of transcription reasonably well, but the 5' ESTs may end at any point within the transcript. Some of the newer cap-selected libraries cover transcription start reasonably well. Before the cap-selection techniques emerged, some projects used random rather than poly-A priming in an attempt to retrieve sequence distant from the 3' end. These projects were successful at this, but as a side effect also deposited sequences from unprocessed mRNA and perhaps even genomic sequences into the EST databases. Even outside of the random-primed projects, there is a degree of non-mRNA contamination. Because of this, a single unspliced EST should be viewed with considerable skepticism.

To generate this track, D. melanogaster ESTs from GenBank were aligned against the genome using blat. Note that the maximum intron length allowed by blat is 750,000 bases, which may eliminate some ESTs with very long introns that might otherwise align. When a single EST aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.5% of the best and at least 96% base identity with the genomic sequence are displayed in this track.

Credits

This track was produced at UCSC from EST sequence data submitted to the international public sequence databases by scientists worldwide.

References

Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779

Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518

\ rna 1 baseColorUseSequence genbank\ group rna\ indelDoubleInsert on\ indelQueryInsert on\ intronGap 30\ longLabel D. melanogaster ESTs That Have Been Spliced\ priority 56\ shortLabel Spliced ESTs\ showDiffBasesAllScales .\ spectrum on\ track intronEst\ type psl est\ visibility dense\ flyreg FlyReg bed 4 + FlyReg: Drosophila DNase I Footprint Database 0 80 100 50 0 177 152 127 0 0 0

Description

This track shows DNase I Footprint data from FlyReg version 1.0. FlyReg provides access to results of the systematic curation and genome annotation of 1,350 DNase I footprints for the fruitfly D. melanogaster reported in Bergman, C.M. et al. (see below).

When available, a footprint motif is also displayed, based on a MEME matrix computed by Dan Pollard on the set of footprints for this factor.

Credits

Thanks to Casey Bergman for providing the FlyReg data. If used in published work, please cite Bergman CM, Carlson JW, Celniker SE. Drosophila DNase I footprint database: a systematic genome annotation of transcription factor binding sites in the fruitfly, Drosophila melanogaster. Bioinformatics. 2005 Apr 15;21(8):1747-9. PMID: 15572468

Thanks to Dan Pollard for providing the footprint motif matrices.

\ regulation 1 color 100,50,0\ group regulation\ longLabel FlyReg: Drosophila DNase I Footprint Database\ priority 80\ shortLabel FlyReg\ track flyreg\ type bed 4 +\ visibility hide\ est D. melanogaster ESTs psl est D. melanogaster ESTs Including Unspliced 0 100 0 0 0 127 127 127 1 0 0

Description

This track shows alignments between D. melanogaster expressed sequence tags (ESTs) in GenBank and the genome. ESTs are single-read sequences, typically about 500 bases in length, that usually represent fragments of transcribed genes.

Display Conventions and Configuration

This track follows the display conventions for PSL alignment tracks. In dense display mode, the items that are more darkly shaded indicate matches of better quality.

The strand information (+/-) indicates the direction of the match between the EST and the matching genomic sequence. It bears no relationship to the direction of transcription of the RNA with which it might be associated.

The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task.

To use the filter:

  1. Type a term in one or more of the text boxes to filter the EST display. For example, to apply the filter to all ESTs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Wildcards may also be used in the filter.
  2. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only ESTs that match all filter criteria will be highlighted. If "or" is selected, ESTs that match any one of the filter criteria will be highlighted.
  3. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display ESTs that match the filter criteria. If "include" is selected, the browser will display only those ESTs that match the filter criteria.

This track may also be configured to display base labeling, a feature that allows the user to display all bases in the aligning sequence or only those that differ from the genomic sequence. For more information about this option, click here.

Methods

To make an EST, RNA is isolated from cells and reverse transcribed into cDNA. Typically, the cDNA is cloned into a plasmid vector and a read is taken from the 5' and/or 3' primer. For most ESTs, though not all, the reverse transcription is primed by an oligo-dT, which hybridizes with the poly-A tail of mature mRNA. The reverse transcriptase may or may not make it to the 5' end of the mRNA, which may or may not be degraded.

In general, the 3' ESTs mark the end of transcription reasonably well, but the 5' ESTs may end at any point within the transcript. Some of the newer cap-selected libraries cover transcription start reasonably well. Before the cap-selection techniques emerged, some projects used random rather than poly-A priming in an attempt to retrieve sequence distant from the 3' end. These projects were successful at this, but as a side effect also deposited sequences from unprocessed mRNA and perhaps even genomic sequences into the EST databases. Even outside of the random-primed projects, there is a degree of non-mRNA contamination. Because of this, a single unspliced EST should be viewed with considerable skepticism.

To generate this track, D. melanogaster ESTs from GenBank were aligned against the genome using blat. Note that the maximum intron length allowed by blat is 750,000 bases, which may eliminate some ESTs with very long introns that might otherwise align. When a single EST aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.5% of the best and at least 96% base identity with the genomic sequence are displayed in this track.

Credits

This track was produced at UCSC from EST sequence data submitted to the international public sequence databases by scientists worldwide.

References

Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779

Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518

\ rna 1 baseColorUseSequence genbank\ group rna\ indelDoubleInsert on\ indelQueryInsert on\ intronGap 30\ longLabel D. melanogaster ESTs Including Unspliced\ maxItems 300\ shortLabel D. melanogaster ESTs\ spectrum on\ table all_est\ track est\ type psl est\ visibility hide\ mrna D. melanogaster mRNAs psl . D. melanogaster mRNAs from GenBank 3 100 0 0 0 127 127 127 1 0 0

Description

The mRNA track shows alignments between D. melanogaster mRNAs in GenBank and the genome.

Display Conventions and Configuration

This track follows the display conventions for PSL alignment tracks. In dense display mode, the items that are more darkly shaded indicate matches of better quality.

The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task.

To use the filter:

  1. Type a term in one or more of the text boxes to filter the mRNA display. For example, to apply the filter to all mRNAs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Wildcards may also be used in the filter.
  2. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only mRNAs that match all filter criteria will be highlighted. If "or" is selected, mRNAs that match any one of the filter criteria will be highlighted.
  3. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display mRNAs that match the filter criteria. If "include" is selected, the browser will display only those mRNAs that match the filter criteria.

This track may also be configured to display codon coloring, a feature that allows the user to quickly compare mRNAs against the genomic sequence. For more information about this option, click here.

Methods

GenBank D. melanogaster mRNAs were aligned against the genome using the blat program. When a single mRNA aligned in multiple places, the alignment having the highest base identity was found. Only alignments having a base identity level within 0.5% of the best and at least 96% base identity with the genomic sequence were kept.

Credits

The mRNA track was produced at UCSC from mRNA sequence data submitted to the international public sequence databases by scientists worldwide.

References

Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779

Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518

\ \ rna 1 baseColorDefault diffCodons\ baseColorUseCds genbank\ baseColorUseSequence genbank\ group rna\ indelDoubleInsert on\ indelPolyA on\ indelQueryInsert on\ longLabel D. melanogaster mRNAs from GenBank\ shortLabel D. melanogaster mRNAs\ showDiffBasesAllScales .\ spectrum on\ table all_mrna\ track mrna\ type psl .\ visibility pack\ augustusGene AUGUSTUS genePred AUGUSTUS ab initio gene predictions v3.1 0 100 12 105 0 133 180 127 0 0 0

Description

This track shows ab initio predictions from the program AUGUSTUS (version 3.1). The predictions are based on the genome sequence alone.

For more information on the different gene tracks, see our Genes FAQ.

Methods

Statistical signal models were built for splice sites, branch-point patterns, translation start sites, and the poly-A signal. Furthermore, models were built for the sequence content of protein-coding and non-coding regions as well as for the length distributions of different exon and intron types. Detailed descriptions of most of these different models can be found in Mario Stanke's dissertation. This track shows the most likely gene structure according to a Semi-Markov Conditional Random Field model. Alternative splicing transcripts were obtained with a sampling algorithm (--alternatives-from-sampling=true --sample=100 --minexonintronprob=0.2 --minmeanexonintronprob=0.5 --maxtracks=3 --temperature=2).

The different models used by Augustus were trained on a number of different species-specific gene sets, which included 1000-2000 training gene structures. The --species option allows one to choose the species used for training the models. Different training species were used for the --species option when generating these predictions for different groups of assemblies.

Assembly Group                    Training Species
Fish                              zebrafish
Birds                             chicken
Human and all other vertebrates   human
Nematodes                         caenorhabditis
Drosophila                        fly
A. mellifera                      honeybee1
A. gambiae                        culex
S. cerevisiae                     saccharomyces

This table describes which training species was used for a particular group of assemblies. When available, the closest related training species was used.

Credits

Thanks to the Stanke lab for providing the AUGUSTUS program. The training for the chicken version was done by Stefanie König and the training for the human and zebrafish versions was done by Mario Stanke.

References

Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008 Mar 1;24(5):637-44. PMID: 18218656

Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003 Oct;19 Suppl 2:ii215-25. PMID: 14534192

\ genes 1 baseColorDefault genomicCodons\ baseColorUseCds given\ color 12,105,0\ group genes\ longLabel AUGUSTUS ab initio gene predictions v3.1\ shortLabel AUGUSTUS\ track augustusGene\ type genePred\ visibility hide\ mzDy1Dp2Ag1_phast Conservation wigMaf 0.0 1.0 D.mel./D.yakuba/D.pseudo./A.gambiae Multiz Alignments & phastCons Scores 3 100 0 0 0 127 127 127 0 0 0

Description

This track shows a measure of evolutionary conservation in three Drosophila species and mosquito (A. gambiae), based on a phylogenetic hidden Markov model (phastCons). Multiz alignments of the following assemblies were used to generate this annotation:

In full display mode, this track shows the overall conservation score across all species as well as pairwise alignments of D. yakuba, D. pseudoobscura, and A. gambiae, each aligned to the D. melanogaster genome. The pairwise alignments are shown in dense display mode using a grayscale density gradient. The checkboxes in the track configuration section allow the exclusion of species from the pairwise display; however, this does not remove them from the conservation score display.

When zoomed in to the base-display level, the track shows the base composition of each alignment. The numbers and symbols on the Gaps line indicate the lengths of gaps in the D. melanogaster sequence at those alignment positions relative to the longest non-D. melanogaster sequence. If there is sufficient space in the display, the size of the gap is shown; if not, and if the gap size is a multiple of 3, a "*" is displayed, otherwise "+" is shown. To view detailed information about the alignments at a specific position, zoom in the display to 30,000 or fewer bases, then click on the alignment.

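The rule for the Gaps line can be written as a tiny Python function. This is a sketch of the rule as worded above, not the browser's drawing code; the function name and the `chars_available` parameter (how many characters fit at that position) are assumptions introduced for the example.

```python
def gap_label(gap_size, chars_available):
    """Return the text shown on the Gaps line for one gap."""
    text = str(gap_size)
    if len(text) <= chars_available:
        return text  # enough room: show the gap length itself
    # Not enough room: "*" marks a multiple of 3, "+" anything else.
    return "*" if gap_size % 3 == 0 else "+"
```

For instance, a 12-base gap with room for only one character is shown as "*", while a 10-base gap in the same space is shown as "+". Gaps whose length is a multiple of 3 are the ones that would not shift a reading frame, which is presumably why they are marked differently.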

This track may be configured in a variety of ways to highlight different aspects of the displayed information. Click the Graph configuration help link for an explanation of the configuration options.

Methods

Best-in-genome blastz pairwise alignments were multiply aligned using multiz, beginning with D. melanogaster-D. yakuba alignments and subsequently adding in D. pseudoobscura and A. gambiae. The resulting multiple alignments were then assigned conservation scores by phastCons.

The phastCons program computes conservation scores based on a phylo-HMM, a type of probabilistic model that describes both the process of DNA substitution at each site in a genome and the way this process changes from one site to the next (Felsenstein and Churchill 1996, Yang 1995, Siepel and Haussler 2005). PhastCons uses a two-state phylo-HMM, with a state for conserved regions and a state for non-conserved regions. The value plotted at each site is the posterior probability that the corresponding alignment column was "generated" by the conserved state of the phylo-HMM. These scores reflect the phylogeny (including branch lengths) of the species in question, a continuous-time Markov model of the nucleotide substitution process, and a tendency for conservation levels to be autocorrelated along the genome (i.e., to be similar at adjacent sites). The general reversible (REV) substitution model was used. Note that, unlike many conservation-scoring programs, phastCons does not rely on a sliding window of fixed size, so short highly-conserved regions and long moderately conserved regions can both obtain high scores. More information about phastCons can be found in Siepel et al. (2005).

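To make the idea of a per-site posterior concrete, here is a minimal forward-backward sketch for a generic two-state HMM in Python. It is not phastCons: the per-column likelihoods under each state are taken as given inputs (in a phylo-HMM they would come from the phylogenetic substitution model), and the transition and initial probabilities are arbitrary placeholder values chosen for illustration.

```python
def conserved_posteriors(lik_cons, lik_non, stay_cons=0.9, stay_non=0.95, init_cons=0.3):
    """Posterior probability of the conserved state at each alignment column.

    lik_cons[i], lik_non[i]: likelihood of column i under the conserved and
    non-conserved state (assumed precomputed). State 0 = conserved, 1 = not.
    """
    n = len(lik_cons)
    emit = list(zip(lik_cons, lik_non))
    trans = [[stay_cons, 1.0 - stay_cons], [1.0 - stay_non, stay_non]]
    init = [init_cons, 1.0 - init_cons]

    # Forward pass, rescaled at every column to avoid underflow.
    fwd = []
    for i in range(n):
        if i == 0:
            f = [init[s] * emit[0][s] for s in (0, 1)]
        else:
            prev = fwd[-1]
            f = [emit[i][s] * sum(prev[r] * trans[r][s] for r in (0, 1)) for s in (0, 1)]
        total = sum(f)
        fwd.append([x / total for x in f])

    # Backward pass, also rescaled; the last column starts at 1 for both states.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        b = [sum(trans[s][r] * emit[i + 1][r] * bwd[i + 1][r] for r in (0, 1)) for s in (0, 1)]
        total = sum(b)
        bwd[i] = [x / total for x in b]

    # Combine and normalize; per-column scaling factors cancel out here.
    posteriors = []
    for f, b in zip(fwd, bwd):
        joint = [f[0] * b[0], f[1] * b[1]]
        posteriors.append(joint[0] / (joint[0] + joint[1]))
    return posteriors
```

Because the score at each column depends on its neighbors through the transition probabilities rather than on a fixed window, a short run of strongly conserved columns and a long run of moderately conserved columns can both yield high posteriors, which matches the behavior described above.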

PhastCons currently treats alignment gaps as missing data, which sometimes has the effect of producing undesirably high conservation scores in gappy regions of the alignment. We are looking at several possible ways of improving the handling of alignment gaps.

Credits

This track was created at UCSC using the following programs:

References

Phylo-HMMs and phastCons:

Felsenstein J, Churchill GA. A Hidden Markov Model approach to variation among sites in rate of evolution. Mol Biol Evol. 1996 Jan;13(1):93-104. PMID: 8583911

Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50. PMID: 16024819; PMC: PMC1182216

Siepel A, Haussler D. Phylogenetic Hidden Markov Models. In: Nielsen R, editor. Statistical Methods in Molecular Evolution. New York: Springer; 2005. pp. 325-351.

Yang Z. A space-time process model for the evolution of DNA sequences. Genetics. 1995 Feb;139(2):993-1005. PMID: 7713447; PMC: PMC1206396

Chain/Net:

Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784

Multiz:

Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004 Apr;14(4):708-15. PMID: 15060014; PMC: PMC383317

Blastz:

Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468

Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961

\ compGeno 1 autoScale Off\ group compGeno\ longLabel D.mel./D.yakuba/D.pseudo./A.gambiae Multiz Alignments & phastCons Scores\ maxHeightPixels 100:40:11\ pairwise myp2a\ priority 100\ shortLabel Conservation\ speciesOrder droYak1 dp2 anoGam1\ track mzDy1Dp2Ag1_phast\ type wigMaf 0.0 1.0\ visibility pack\ wiggle mzDy1Dp2Ag1_phast_wig\ yLineOnOff Off\ gap Gap bed 3 + Gap Locations 1 100 0 0 0 127 127 127 0 0 0

Description

This track depicts gaps (represented by black boxes) in the D. melanogaster genome sequence. An assembly region is designated as a gap if the sequence contains a series of Ns. The minimum number of Ns that constitute a gap varies among assemblies.
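
The definition above amounts to finding runs of N in the sequence. A minimal Python sketch follows; the function name and the `min_run` parameter are hypothetical, and since the minimum run length varies among assemblies, the default of 1 is only a placeholder rather than the value used for this assembly.

```python
import re


def find_gaps(sequence, min_run=1):
    """Return (start, end) half-open intervals of runs of at least min_run Ns."""
    pattern = re.compile(r"[Nn]{%d,}" % min_run)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]
```

For the string "ACGTNNNNNACGTNNAC", a minimum run of 3 reports only the five-base run starting at offset 4, while a minimum run of 1 also reports the two-base run starting at offset 13.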

\ \ map 1 group map\ longLabel Gap Locations\ shortLabel Gap\ track gap\ type bed 3 +\ visibility dense\ gcPercent GC Percent bed 4 + Percentage GC in 20,000-Base Windows 0 100 0 0 0 127 127 127 1 0 0

Description

The GC percent track shows the percentage of G (guanine) and C (cytosine) bases in each 20,000-base window. Windows with high GC content are drawn more darkly than windows with low GC content. High GC content is typically associated with gene-rich areas.

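The windowed calculation can be sketched in Python as follows. The sketch assumes non-overlapping windows and excludes non-ACGT characters (such as N) from the denominator; neither detail is stated in the description above, so both should be read as assumptions, and the function name is hypothetical.

```python
def gc_percent_windows(sequence, window=20000):
    """Return (start, end, percent_gc) for consecutive non-overlapping windows."""
    results = []
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window].upper()
        gc = chunk.count("G") + chunk.count("C")
        acgt = gc + chunk.count("A") + chunk.count("T")
        # Windows with no called bases (e.g. all N) are reported as 0.0.
        percent = 100.0 * gc / acgt if acgt else 0.0
        results.append((start, start + len(chunk), percent))
    return results
```

The last window may be shorter than the others when the sequence length is not a multiple of the window size.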

Credits

This track was generated at UCSC.

map 1 group map\ longLabel Percentage GC in 20,000-Base Windows\ shortLabel GC Percent\ spectrum on\ track gcPercent\ type bed 4 +\ visibility hide\ genscan Genscan Genes genePred genscanPep Genscan Gene Predictions 1 100 170 100 0 212 177 127 0 0 0

Description

This track shows predictions from the Genscan program written by Chris Burge. The predictions are based on transcriptional, translational and donor/acceptor splicing signals as well as the length and compositional distributions of exons, introns and intergenic regions.

For more information on the different gene tracks, see our Genes FAQ.

Display Conventions and Configuration

This track follows the display conventions for gene prediction tracks.

The track description page offers the following filter and configuration options:

Methods

For a description of the Genscan program and the model that underlies it, refer to Burge and Karlin (1997) in the References section below. The splice site models used are described in more detail in Burge (1998) below.

Credits

Thanks to Chris Burge for providing the Genscan program.

References

Burge C. Modeling Dependencies in Pre-mRNA Splicing Signals. In: Salzberg S, Searls D, Kasif S, editors. Computational Methods in Molecular Biology. Amsterdam: Elsevier Science; 1998. p. 127-163.

Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 1997 Apr 25;268(1):78-94. PMID: 9149143

\ genes 1 color 170,100,0\ group genes\ longLabel Genscan Gene Predictions\ shortLabel Genscan Genes\ track genscan\ type genePred genscanPep\ visibility dense\ blastHg16KG Human Proteins psl protein Human Proteins (hg16) Mapped by Chained tBLASTn 0 100 0 0 0 127 127 127 0 0 0

Description

\

\ This track contains tBLASTn alignments of the peptides from the predicted \ and known genes identified in the hg16 Known Genes track.\

\ \

Methods

\

\ First, the predicted proteins from the human Known Genes track were aligned \ with the human genome using the Blat program to discover exon boundaries. \ Next, the amino acid sequences that make up each exon were aligned with the \ D. melanogaster sequence using the tBLASTn program.\ Finally, the putative D. melanogaster exons were chained together using an \ organism-specific maximum gap size but no gap penalty. The single best exon \ chains extending over more than 60% of the query protein were included. Exon \ chains that extended over 60% of the query and matched at least 60% of the \ protein's amino acids were also included.

\ \

Credits

\

\ tBLASTn is part of the NCBI BLAST tool set. For more information on BLAST, see\ Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ.\ Basic local alignment search tool.\ J Mol Biol. 1990 Oct 5;215(3):403-410.

\

\ Blat was written by Jim Kent. The remaining utilities \ used to produce this track were written by Jim Kent or Brian Raney.

\ genes 1 blastRef hg16.blastKGRef00\ colorChromDefault off\ group genes\ longLabel Human Proteins (hg16) Mapped by Chained tBLASTn\ pred hg16.blastKGPep00\ shortLabel Human Proteins\ track blastHg16KG\ type psl protein\ visibility hide\ microsat Microsatellite bed 4 Microsatellites - Di-nucleotide and Tri-nucleotide Repeats 0 100 0 0 0 127 127 127 0 0 0

Description

\

\ This track displays regions that are likely to be useful as microsatellite\ markers. These are sequences of at least 15 perfect di-nucleotide and \ tri-nucleotide repeats and tend to be highly polymorphic in the\ population.\

\ \

Methods

\

\ The data shown in this track are a subset of the Simple Repeats track, \ selecting only those \ repeats of period 2 and 3, with 100% identity and no indels and with\ at least 15 copies of the repeat. The Simple Repeats track is\ created using the \ Tandem Repeats Finder. For more information about this \ program, see Benson (1999).
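The selection rule above amounts to a simple filter over Simple Repeats records. The sketch below is an assumption-laden illustration: the `Repeat` record and its field names are hypothetical stand-ins for the per-repeat values reported by Tandem Repeats Finder, not the actual table schema.

```python
from typing import NamedTuple


class Repeat(NamedTuple):
    # Hypothetical record; field names are illustrative only.
    chrom: str
    start: int
    end: int
    period: int          # length of the repeat unit
    copies: float        # number of copies of the unit
    percent_match: int   # percent identity between copies
    percent_indel: int   # percent of indels between copies


def is_microsatellite(r):
    """Apply the criteria from the text: period 2 or 3, perfect copies, >= 15 copies."""
    return (
        r.period in (2, 3)
        and r.percent_match == 100
        and r.percent_indel == 0
        and r.copies >= 15
    )
```

Filtering a list of records is then `[r for r in repeats if is_microsatellite(r)]`.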

\ \

Credits

\

\ Tandem Repeats Finder was written by \ Gary Benson.

\ \

References

\ \

\ Benson G.\ \ Tandem repeats finder: a program to analyze DNA sequences.\ Nucleic Acids Res. 1999 Jan 15;27(2):573-80.\ PMID: 9862982; PMC: PMC148217\

\ varRep 1 group varRep\ longLabel Microsatellites - Di-nucleotide and Tri-nucleotide Repeats\ shortLabel Microsatellite\ track microsat\ type bed 4\ visibility hide\ miRNA miRNA bed 8 . MicroRNAs from miRBase 0 100 255 64 64 255 159 159 1 0 0 http://www.mirbase.org/cgi-bin/mirna_entry.pl?acc=$$

Description

\

\ The miRNA track shows microRNAs from\ miRBase.\

\ \

Display Conventions and Configuration

\

\ Mature miRNAs (miRs) are represented by \ thick blocks. The predicted stem-loop portions of the primary transcripts\ are indicated by thinner blocks. miRNAs in the sense orientation are shown in\ black; those in the reverse orientation are colored grey. When a single \ precursor produces two mature miRs from its 5' and 3' parts, it is displayed \ twice with the two different positions of the mature miR.

\

\ To display only those items that exceed a specific unnormalized score, enter\ a minimum score between 0 and 1000 in the text box at the top of the track \ description page.\

\ \

Methods

\

\ Mature and precursor miRNAs from the miRNA Registry were\ aligned against the genome using blat.\ The extents of the precursor sequences were not generally known, and were\ predicted based on base-paired hairpin structure. \ miRBase is described in Griffiths-Jones, S. et al. (2006).\ The miRNA Registry is\ described in Griffiths-Jones, S. (2004) and Weber, M.J. (2005) in the \ References section below.

\ \

Credits

\

\ \ This track was created by Michel Weber of \ Laboratoire de Biologie Moléculaire Eucaryote,\ CNRS Université Paul Sabatier\ (Toulouse, France), Yves Quentin of Laboratoire de Microbiologie et Génétique\ Moléculaires (Toulouse, France) and Sam Griffiths-Jones of\ \ The Wellcome Trust Sanger Institute\ (Cambridge, UK).\

\ \

References

\ \

\ When making use of these data, please cite:\

\ \

\ Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ.\ \ miRBase: microRNA sequences, targets and gene nomenclature.\ Nucleic Acids Res. 2006 Jan 1;34(Database issue):D140-4.\ PMID: 16381832; PMC: PMC1347474\

\ \

\ Griffiths-Jones S.\ The microRNA Registry.\ Nucleic Acids Res. 2004 Jan 1;32(Database issue):D109-11.\ PMID: 14681370; PMC: PMC308757\

\ \

\ Weber MJ.\ New human and mouse microRNA genes found by homology search.\ FEBS J. 2005 Jan;272(1):59-73.\ PMID: 15634332\

\ \

\ The following publication provides guidelines on miRNA annotation:\
Ambros V, Bartel B, Bartel DP, Burge CB, Carrington JC, Chen X,\ Dreyfuss G, Eddy SR, Griffiths-Jones S, Marshall M et al.\ \ A uniform system for microRNA annotation.\ RNA. 2003 Mar;9(3):277-9.\ PMID: 12592000; PMC: PMC1370393\

\ \

\ For more information on blat, see\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518\

\ genes 1 color 255,64,64\ group genes\ longLabel MicroRNAs from miRBase\ shortLabel miRNA\ track miRNA\ type bed 8 .\ url http://www.mirbase.org/cgi-bin/mirna_entry.pl?acc=$$\ urlLabel miRBase:\ useScore 1\ visibility hide\ xenoRefGene Other RefSeq genePred xenoRefPep xenoRefMrna Non-D. melanogaster RefSeq Genes 1 100 12 12 120 133 133 187 0 0 0

Description

\

\ This track shows known protein-coding and non-protein-coding genes \ for organisms other than D. melanogaster, taken from the NCBI RNA reference\ sequences collection (RefSeq). The data underlying this track are \ updated weekly.

\ \

Display Conventions and Configuration

\

\ This track follows the display conventions for \ gene prediction \ tracks.\ The color shading indicates the level of review the RefSeq record has \ undergone: predicted (light), provisional (medium), reviewed (dark).

\

\ The item labels and display colors of features within this track can be\ configured through the controls at the top of the track description page. \

\ \

Methods

\

\ The RNAs were aligned against the D. melanogaster genome using blat; those\ with an alignment of less than 15% were discarded. When a single RNA aligned \ in multiple places, the alignment having the highest base identity was \ identified. Only alignments having a base identity level within 0.5% of \ the best and at least 25% base identity with the genomic sequence were kept.\
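One way to read the filtering rules above is sketched below. The interpretation is an assumption: "an alignment of less than 15%" is taken to mean the fraction of the RNA that aligned, and "within 0.5% of the best" is taken as an absolute difference in base identity. The function name and tuple layout are hypothetical.

```python
def keep_alignments(alns, min_cover=0.15, near_best=0.005, min_identity=0.25):
    """Filter the alignments of a single RNA.

    alns: list of (name, coverage, identity) tuples, with coverage and
    identity given as fractions between 0 and 1 (assumed layout).
    """
    # Discard alignments that cover too little of the RNA.
    covered = [a for a in alns if a[1] >= min_cover]
    if not covered:
        return []
    # Find the highest base identity among the remaining alignments.
    best = max(a[2] for a in covered)
    # Keep those close to the best that also meet the identity floor.
    return [a for a in covered if a[2] >= best - near_best and a[2] >= min_identity]
```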

\ \

Credits

\

\ This track was produced at UCSC from RNA sequence data\ generated by scientists worldwide and curated by the \ NCBI RefSeq project.

\ \

References

\

\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518\

\ genes 1 color 12,12,120\ group genes\ longLabel Non-D. melanogaster RefSeq Genes\ shortLabel Other RefSeq\ track xenoRefGene\ type genePred xenoRefPep xenoRefMrna\ visibility dense\ genomicSuperDups Segmental Dups bed 6 + Duplications of >1000 Bases of Non-RepeatMasked Sequence 0 100 0 0 0 127 127 127 0 0 0

Description

\

\ This track shows regions detected as putative genomic duplications within the\ golden path. The following display conventions are used to distinguish\ levels of similarity:\

\ For a region to be included in the track, at least 1 Kb of the total \ sequence (containing at least 500 bp of non-RepeatMasked sequence) had to \ align and a sequence identity of at least 90% was required.

\ \

Methods

\

\ Segmental duplications play an important role in both genomic disease \ and gene evolution. This track displays an analysis of the global \ organization of these long-range segments of identity in genomic sequence.\

\ \

Large recent duplications (>= 1 kb and >= 90% identity) were detected\ by identifying high-copy repeats, removing these repeats from the genomic \ sequence ("fuguization") and searching all sequence for similarity. The\ repeats were then reinserted into the pairwise alignments, the ends of \ alignments trimmed, and global alignments were generated.\ For a full description of the "fuguization" detection method, see Bailey\ et al., 2001. This method has become\ known as WGAC (whole-genome assembly comparison); for example, see Bailey \ et al., 2002.\ \

Credits

\

\ These data were provided by Ginger Cheng, Xinwei She,\ Archana Raja,\ Tin Louie and\ Evan Eichler \ at the University of Washington.

\ \

References

\

\ Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, \ Myers EW, Li PW, Eichler EE.\ Recent segmental duplications in the human genome.\ Science. 2002 Aug 9;297(5583):1003-7.\ PMID: 12169732\

\ \

\ Bailey JA, Yavor AM, Massa HF, Trask BJ, Eichler EE.\ Segmental duplications: organization and impact within the \ current human genome project assembly.\ Genome Res. 2001 Jun;11(6):1005-17.\ PMID: 11381028; PMC: PMC311093\

\ varRep 1 group varRep\ longLabel Duplications of >1000 Bases of Non-RepeatMasked Sequence\ noScoreFilter .\ shortLabel Segmental Dups\ track genomicSuperDups\ type bed 6 +\ visibility hide\ simpleRepeat Simple Repeats bed 4 + Simple Tandem Repeats by TRF 0 100 0 0 0 127 127 127 0 0 0

Description

\

\ This track displays simple tandem repeats (possibly imperfect repeats) located\ by Tandem Repeats\ Finder (TRF), which is specialized for this purpose. These repeats can\ occur within coding regions of genes and may be quite\ polymorphic. Repeat expansions are sometimes associated with specific\ diseases.

\ \

Methods

\

\ For more information about the TRF program, see Benson (1999).\

\ \

Credits

\

\ TRF was written by \ Gary Benson.

\ \

References

\ \

\ Benson G.\ \ Tandem repeats finder: a program to analyze DNA sequences.\ Nucleic Acids Res. 1999 Jan 15;27(2):573-80.\ PMID: 9862982; PMC: PMC148217\

\ varRep 1 group varRep\ longLabel Simple Tandem Repeats by TRF\ shortLabel Simple Repeats\ track simpleRepeat\ type bed 4 +\ visibility hide\ twinscan Twinscan genePred twinscanPep Twinscan Gene Predictions Using D. melanogaster/pseudoobscura Homology 0 100 0 100 100 127 177 177 0 0 0

Description

\

\ The Twinscan program predicts genes in a manner similar to Genscan, except \ that Twinscan takes advantage of genome comparisons to improve gene prediction\ accuracy. More information and a web server can be found at\ http://mblab.wustl.edu/.

\ \

Display Conventions and Configuration

\

\ This track follows the display conventions for \ gene prediction \ tracks.

\

\ The track description page offers the following filter and configuration\ options:\

\ \

Methods

\

\ The Twinscan algorithm is described in Korf, I. et al. (2001) in the\ References section below.

\ \

Credits

\

\ Thanks to Michael Brent's Computational Genomics Group at Washington \ University St. Louis for providing these data.

\ \

References

\

\ Korf I, Flicek P, Duan D, Brent MR.\ Integrating genomic homology into gene structure prediction.\ Bioinformatics. 2001;17 Suppl 1:S140-8.\ PMID: 11473003\

\ genes 1 color 0,100,100\ group genes\ longLabel Twinscan Gene Predictions Using D. melanogaster/pseudoobscura Homology\ shortLabel Twinscan\ track twinscan\ type genePred twinscanPep\ visibility hide\ blatFugu Fugu Blat psl xeno Fugu (Aug. 2002 (JGI 3.0/fr1)) Translated Blat Alignments 1 113 0 60 120 200 220 255 1 0 0

Description

\

\ The Fugu v.3.0 whole genome shotgun assembly was provided by the\ US DOE Joint \ Genome Institute (JGI). The assembly was constructed with the JGI\ assembler, JAZZ, from paired end sequencing reads produced at JGI, Myriad \ Genetics, and Celera Genomics, resulting in a sequence coverage of 5.7X. All \ reads are plasmid, cosmid, or BAC end-sequences, with the predominant coverage\ derived from 2 Kb insert plasmids. This assembly contains 20,379\ scaffolds totaling 319 million base pairs. The largest 679 scaffolds\ total 160 million base pairs.

\

\ The strand information (+/-) for this track is in two parts. The\ first + or - indicates the orientation of the query sequence whose\ translated protein produced the match. The second + or - indicates the\ orientation of the matching translated genomic sequence. Because the two\ orientations of a DNA sequence give different predicted protein sequences,\ there are four combinations. ++ is not the same as --; nor is +- the same\ as -+.

\ \

Methods

\

\ The alignments were made with blat in translated protein mode requiring two\ nearby 4-mer matches to trigger a detailed alignment. The D. melanogaster\ genome was masked with RepeatMasker and Tandem Repeats Finder before \ running blat.

\ \

Credits

\

\ The 3.0 draft from the\ \ JGI Fugu rubripes website was used in the\ UCSC Genome Browser Fugu blat alignments. These data were freely provided \ by the JGI for use in this publication only.

\ \

References

\

\ Kent WJ.\ \ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.\ PMID: 11932250; PMC: PMC187518\

\ compGeno 1 altColor 200,220,255\ color 0,60,120\ colorChromDefault off\ group compGeno\ longLabel Fugu (Aug. 2002 (JGI 3.0/fr1)) Translated Blat Alignments\ otherDb fr1\ priority 113\ shortLabel Fugu Blat\ spectrum on\ track blatFugu\ type psl xeno\ visibility dense\ anophelesEcores Anopheles Ecores bed 12 . D. melanogaster/A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Evolutionary Conserved Regions 1 132.05 0 60 120 200 220 255 0 0 0

Description

\ This track shows Evolutionary Conserved Regions computed by the \ Exofish (Roest Crollius et al., 2000) program at\ Genoscope.\ Each singleton block corresponds to an "ecore"; two blocks connected by a thin line \ correspond to an "ecotig", a set of colinear ecores in a syntenic region. \ \

Methods

\ Genome-wide sequence comparisons were done at the protein-coding level between the genome sequences\ of the fruitfly, Drosophila melanogaster, and the mosquito, Anopheles gambiae, to\ detect evolutionarily conserved regions (ECORES). See Jaillon et al., 2003. \ The sequence versions used in the comparison were \ Ensembl Drosophila v.16.3a.1 (Jan. 2003, the same as BDGP Release 3.1 \ used by the UCSC Genome Browser) and \ Ensembl Anopheles v.16.2.1. \ \

Credits

\

\ Thanks to Olivier Jaillon at Genoscope for contributing the data. \

\ \

References

\

\ Jaillon O, Dossat C, Eckenberg R, Eiglmeier K, Segurens B, Aury JM, Roth CW, Scarpelli C, Brey PT,\ Weissenbach J et al.\ \ Assessing the Drosophila melanogaster and Anopheles gambiae genome annotations using genome-wide\ sequence comparisons.\ Genome Res. 2003 Jul;13(7):1595-9.\ PMID: 12840038; PMC: PMC403732\

\ \

\ Roest Crollius H, Jaillon O, Bernot A, Dasilva C, Bouneau L, Fischer C, Fizames C, Wincker P,\ Brottier P, Quétier F et al.\ \ Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA\ sequence.\ Nat Genet. 2000 Jun;25(2):235-8.\ PMID: 10835645\

\ compGeno 1 altColor 200,220,255\ color 0,60,120\ group compGeno\ longLabel D. melanogaster/A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Evolutionary Conserved Regions\ otherDb anoGam1\ priority 132.05\ shortLabel Anopheles Ecores\ track anophelesEcores\ type bed 12 .\ visibility dense\ chainAnoGam1 A. gambiae Chain chain anoGam1 A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Chained Alignments 0 132.1 100 50 0 255 240 200 1 0 0

Description

\

\ This track shows alignments of A. gambiae (anoGam1, Feb. 2003 (IAGEC MOZ2/anoGam1)) to the\ D. melanogaster genome using a gap scoring system that allows longer gaps \ than traditional affine gap scoring systems. It can also tolerate gaps in both\ A. gambiae and D. melanogaster simultaneously. These \ "double-sided" gaps can be caused by local inversions and \ overlapping deletions in both species. The A. gambiae sequence is \ from the \ MOZ2 assembly.

\

\ The chain track displays boxes joined together by either single or\ double lines. The boxes represent aligning regions.\ Single lines indicate gaps that are largely due to a deletion in the\ A. gambiae assembly or an insertion in the D. melanogaster \ assembly. Double lines represent more complex gaps that involve substantial\ sequence in both species. This may result from inversions, overlapping\ deletions, an abundance of local mutation, or an unsequenced gap in one\ species. In cases where multiple chains align over a particular region of\ the D. melanogaster genome, the chains with single-lined gaps are often \ due to processed pseudogenes, while chains with double-lined gaps are more \ often due to paralogs and unprocessed pseudogenes.

\

\ In the "pack" and "full" display\ modes, the individual feature names indicate the chromosome, strand, and\ location (in thousands) of the match for each matching alignment.

\ \ \

Display Conventions and Configuration

\

By default, the chains to chromosome-based assemblies are colored\ based on which chromosome they map to in the aligning organism. To turn\ off the coloring, check the "off" button next to: Color\ track based on chromosome.

\

\ To display only the chains of one chromosome in the aligning\ organism, enter the name of that chromosome (e.g. chr4) in the box next to: \ Filter by chromosome.

\ \

Methods

\

\ The A. gambiae/D. melanogaster genomes were aligned with \ blastz and converted into axt format using the lavToAxt program.\ The axt alignments were fed into axtChain, which organizes all \ alignments between a single A. gambiae chromosome and a single \ D. melanogaster chromosome into a group and creates a kd-tree out \ of the gapless subsections (blocks) of the alignments. A dynamic program \ was then run over the kd-trees to find the maximally scoring chains of these \ blocks. The following matrix was used:

\

\ \ \ \ \ \
        A      C      G      T
A      91    -90    -25   -100
C     -90    100   -100    -25
G     -25   -100    100    -90
T    -100    -25    -90     91

\ Chains scoring below a threshold were discarded; the remaining \ chains are displayed in this track.
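To make the substitution matrix concrete, the sketch below loads the sixteen values listed in this track's matrix setting (row-major, in A, C, G, T order) and scores one gapless block. It is an illustration only: the helper names are hypothetical, and gap costs, which axtChain also applies when joining blocks, are not modeled.

```python
# Values copied from this track's matrix setting, row-major over A, C, G, T.
MATRIX_VALUES = "91,-90,-25,-100,-90,100,-100,-25,-25,-100,100,-90,-100,-25,-90,91"
BASES = "ACGT"


def build_matrix(values=MATRIX_VALUES, bases=BASES):
    """Return a dict mapping (row_base, column_base) to its substitution score."""
    nums = [int(v) for v in values.split(",")]
    n = len(bases)
    if len(nums) != n * n:
        raise ValueError("expected %d matrix values, got %d" % (n * n, len(nums)))
    return {(bases[i], bases[j]): nums[i * n + j] for i in range(n) for j in range(n)}


def score_block(a, b, matrix):
    """Sum the per-column scores of two equal-length, gapless aligned sequences."""
    if len(a) != len(b):
        raise ValueError("a gapless block needs sequences of equal length")
    return sum(matrix[(x, y)] for x, y in zip(a.upper(), b.upper()))
```

Bases outside A, C, G and T (for example N) raise a `KeyError` in this sketch; a real scorer would need a rule for them.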

\ \

Credits

\

\ Blastz was developed at Pennsylvania State University by \ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.

\

\ Lineage-specific repeats were identified by Arian Smit and his \ RepeatMasker\ program.

\

\ The axtChain program was developed at the University of California at \ Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.

\

\ The browser display and database storage of the chains were generated\ by Robert Baertsch and Jim Kent.

\ \

References

\

\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\

\ \

\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\

\ \

\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\

\ compGeno 1 altColor 255,240,200\ color 100,50,0\ group compGeno\ longLabel A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Chained Alignments\ matrix 16 91,-90,-25,-100,-90,100,-100,-25,-25,-100,100,-90,-100,-25,-90,91\ matrixHeader A, C, G, T\ otherDb anoGam1\ priority 132.1\ shortLabel A. gambiae Chain\ spectrum on\ track chainAnoGam1\ type chain anoGam1\ visibility hide\ netAnoGam1 A. gambiae Net netAlign anoGam1 chainAnoGam1 A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Alignment Net 0 132.2 0 0 0 127 127 127 1 0 0

Description

\

\ This track shows the best A. gambiae/D. melanogaster chain for \ every part of the D. melanogaster genome. It is useful for\ finding orthologous regions and for studying genome\ rearrangement. The A. gambiae sequence used in this annotation is \ from the Feb. 2003 (IAGEC MOZ2/anoGam1) (anoGam1) assembly.

\ \

Display Conventions and Configuration

\

\ In full display mode, the top-level (level 1)\ chains are the largest, highest-scoring chains that\ span this region. In many cases gaps exist in the\ top-level chain. When possible, these are filled in by\ other chains that are displayed at level 2. The gaps in \ level 2 chains may be filled by level 3 chains and so\ forth.

\

\ In the graphical display, the boxes represent ungapped \ alignments; the lines represent gaps. Click\ on a box to view detailed information about the chain\ as a whole; click on a line to display information\ about the gap. The detailed information is useful in determining\ the cause of the gap or, for lower level chains, the genomic\ rearrangement.

\

\ Individual items in the display are categorized as one of four types\ (other than gap):

\

\ \

Methods

\

\ Chains were derived from blastz alignments, using the methods\ described on the chain tracks description pages, and sorted with the \ highest-scoring chains in the genome ranked first. The program\ chainNet was then used to place the chains one at a time, trimming them as \ necessary to fit into sections not already covered by a higher-scoring chain. \ During this process, a natural hierarchy emerged in which a chain that filled \ a gap in a higher-scoring chain was placed underneath that chain. The program \ netSyntenic was used to fill in information about the relationship between \ higher- and lower-level chains, such as whether a lower-level\ chain was syntenic or inverted relative to the higher-level chain. \ The program netClass was then used to fill in how much of the gaps and chains \ contained Ns (sequencing gaps) in one or both species and how much\ was filled with transposons inserted before and after the two organisms \ diverged.
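The greedy placement step can be illustrated with a much simplified sketch. It is not the chainNet program: each chain is treated as one solid interval on the D. melanogaster sequence, so gaps inside chains and the resulting level hierarchy are not modeled, and the function name and tuple layout are hypothetical.

```python
def place_chains(chains):
    """Place chains in descending score order, trimming to uncovered sequence.

    chains: list of (name, score, start, end) with half-open coordinates.
    Returns a list of (name, start, end) pieces that were actually placed.
    """
    covered = []  # disjoint intervals already claimed by higher-scoring chains
    placed = []
    for name, score, start, end in sorted(chains, key=lambda c: -c[1]):
        pieces = [(start, end)]
        for cs, ce in covered:
            remaining = []
            for ps, pe in pieces:
                if ce <= ps or cs >= pe:
                    remaining.append((ps, pe))      # no overlap, keep whole piece
                else:
                    if ps < cs:
                        remaining.append((ps, cs))  # part left of the covered interval
                    if ce < pe:
                        remaining.append((ce, pe))  # part right of the covered interval
            pieces = remaining
        placed.extend((name, s, e) for s, e in pieces)
        covered.extend(pieces)
    return placed
```

A chain that lies entirely inside sequence claimed by a higher-scoring chain produces no pieces here, whereas in the real net it could still fill a gap in that chain at the next level down.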

\ \

Credits

\

\ The chainNet, netSyntenic, and netClass programs were\ developed at the University of California\ Santa Cruz by Jim Kent.

\

\ Blastz was developed at Pennsylvania State University by\ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.

\

\ Lineage-specific repeats were identified by Arian Smit and his program \ RepeatMasker.

\

\ The browser display and database storage of the nets were made\ by Robert Baertsch and Jim Kent.

\ \

References

\

\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\

\ \

\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\

\ compGeno 0 group compGeno\ longLabel A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Alignment Net\ otherDb anoGam1\ priority 132.2\ shortLabel A. gambiae Net\ spectrum on\ track netAnoGam1\ type netAlign anoGam1 chainAnoGam1\ visibility hide\ chainDp2 D. pseudo. Chain chain dp2 D. pseudoobscura (Aug. 2003 (Baylor freeze1/dp2)) Chained Alignments 0 138 100 50 0 255 240 200 1 0 0

Description

\

\ This track shows alignments of D. pseudoobscura (dp2, Aug. 2003 (Baylor freeze1/dp2)) to the\ D. melanogaster genome using a gap scoring system that allows longer gaps \ than traditional affine gap scoring systems. It can also tolerate gaps in both\ D. pseudoobscura and D. melanogaster simultaneously. These \ "double-sided" gaps can be caused by local inversions and \ overlapping deletions in both species. \

\ The chain track displays boxes joined together by either single or\ double lines. The boxes represent aligning regions.\ Single lines indicate gaps that are largely due to a deletion in the\ D. pseudoobscura assembly or an insertion in the D. melanogaster \ assembly. Double lines represent more complex gaps that involve substantial\ sequence in both species. This may result from inversions, overlapping\ deletions, an abundance of local mutation, or an unsequenced gap in one\ species. In cases where multiple chains align over a particular region of\ the D. melanogaster genome, the chains with single-lined gaps are often \ due to processed pseudogenes, while chains with double-lined gaps are more \ often due to paralogs and unprocessed pseudogenes.

\

\ In the "pack" and "full" display\ modes, the individual feature names indicate the chromosome, strand, and\ location (in thousands) of the match for each matching alignment.

\ \ \

Display Conventions and Configuration

\

By default, the chains to chromosome-based assemblies are colored\ based on which chromosome they map to in the aligning organism. To turn\ off the coloring, check the "off" button next to: Color\ track based on chromosome.

\

\ To display only the chains of one chromosome in the aligning\ organism, enter the name of that chromosome (e.g. chr4) in the box next to: \ Filter by chromosome.

\ \

Methods

\

\ The D. pseudoobscura/D. melanogaster genomes were aligned with \ blastz and converted into axt format using the lavToAxt program.\ The axt alignments were fed into axtChain, which organizes all \ alignments between a single D. pseudoobscura chromosome and a single \ D. melanogaster chromosome into a group and creates a kd-tree out \ of the gapless subsections (blocks) of the alignments. A dynamic program \ was then run over the kd-trees to find the maximally scoring chains of these \ blocks. Chains scoring below a threshold were discarded; the remaining \ chains are displayed in this track.

\ \

Credits

\

\ Blastz was developed at Pennsylvania State University by \ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.

\

\ Lineage-specific repeats were identified by Arian Smit and his \ RepeatMasker\ program.

\

\ The axtChain program was developed at the University of California at \ Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.

\

\ The browser display and database storage of the chains were generated\ by Robert Baertsch and Jim Kent.

\ \

References

\

\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\

\ \

\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\

\ \

\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\

\ compGeno 1 altColor 255,240,200\ color 100,50,0\ group compGeno\ longLabel D. pseudoobscura (Aug. 2003 (Baylor freeze1/dp2)) Chained Alignments\ otherDb dp2\ priority 138\ shortLabel D. pseudo. Chain\ spectrum on\ track chainDp2\ type chain dp2\ visibility hide\ netDp2 D. pseudo. Net netAlign dp2 chainDp2 D. pseudoobscura (Aug. 2003 (Baylor freeze1/dp2)) Alignment Net 0 138.1 0 0 0 127 127 127 1 0 0

Description

\

\ This track shows the best D. pseudoobscura/D. melanogaster chain for \ every part of the D. melanogaster genome. It is useful for\ finding orthologous regions and for studying genome\ rearrangement. The D. pseudoobscura sequence used in this annotation is \ from the Aug. 2003 (Baylor freeze1/dp2) (dp2) assembly.

\ \

Display Conventions and Configuration

\

\ In full display mode, the top-level (level 1)\ chains are the largest, highest-scoring chains that\ span this region. In many cases gaps exist in the\ top-level chain. When possible, these are filled in by\ other chains that are displayed at level 2. The gaps in \ level 2 chains may be filled by level 3 chains and so\ forth.

\

\ In the graphical display, the boxes represent ungapped \ alignments; the lines represent gaps. Click\ on a box to view detailed information about the chain\ as a whole; click on a line to display information\ about the gap. The detailed information is useful in determining\ the cause of the gap or, for lower level chains, the genomic\ rearrangement.

\

\ Individual items in the display are categorized as one of four types\ (other than gap):

\

\ \

Methods

\

\ Chains were derived from blastz alignments, using the methods\ described on the chain tracks description pages, and sorted with the \ highest-scoring chains in the genome ranked first. The program\ chainNet was then used to place the chains one at a time, trimming them as \ necessary to fit into sections not already covered by a higher-scoring chain. \ During this process, a natural hierarchy emerged in which a chain that filled \ a gap in a higher-scoring chain was placed underneath that chain. The program \ netSyntenic was used to fill in information about the relationship between \ higher- and lower-level chains, such as whether a lower-level\ chain was syntenic or inverted relative to the higher-level chain. \ The program netClass was then used to fill in how much of the gaps and chains \ contained Ns (sequencing gaps) in one or both species and how much\ was filled with transposons inserted before and after the two organisms \ diverged.

\ \

Credits

\

\ The Aug. 2003 dp2 data were obtained from the Freeze 1 assembly produced by \ the Human Genome Sequencing Center at \ Baylor College of Medicine.

\

\ The chainNet, netSyntenic, and netClass programs were\ developed at the University of California\ Santa Cruz by Jim Kent.

\

\ Blastz was developed at Pennsylvania State University by\ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.

\

\ Lineage-specific repeats were identified by Arian Smit and his program \ RepeatMasker.

\

\ The browser display and database storage of the nets were made\ by Robert Baertsch and Jim Kent.

\ \

References

\

\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\

\ \

\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\

\ compGeno 0 group compGeno\ longLabel D. pseudoobscura (Aug. 2003 (Baylor freeze1/dp2)) Alignment Net\ otherDb dp2\ priority 138.1\ shortLabel D. pseudo. Net\ spectrum on\ track netDp2\ type netAlign dp2 chainDp2\ visibility hide\ chainDroYak1 D. yakuba Chain chain droYak1 D. yakuba (Apr. 2004 (WUGSC 1.0/droYak1)) Chained Alignments 0 141 100 50 0 255 240 200 1 0 0

Description

\

\ This track shows alignments of D. yakuba (droYak1, Apr. 2004 (WUGSC 1.0/droYak1)) to the\ D. melanogaster genome using a gap scoring system that allows longer gaps \ than traditional affine gap scoring systems. It can also tolerate gaps in both\ D. yakuba and D. melanogaster simultaneously. These \ "double-sided" gaps can be caused by local inversions and \ overlapping deletions in both species.

\

\ The chain track displays boxes joined together by either single or\ double lines. The boxes represent aligning regions.\ Single lines indicate gaps that are largely due to a deletion in the\ D. yakuba assembly or an insertion in the D. melanogaster \ assembly. Double lines represent more complex gaps that involve substantial\ sequence in both species. This may result from inversions, overlapping\ deletions, an abundance of local mutation, or an unsequenced gap in one\ species. In cases where multiple chains align over a particular region of\ the D. melanogaster genome, the chains with single-lined gaps are often \ due to processed pseudogenes, while chains with double-lined gaps are more \ often due to paralogs and unprocessed pseudogenes.

\

\ In the "pack" and "full" display\ modes, the individual feature names indicate the chromosome, strand, and\ location (in thousands) of the match for each matching alignment.
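As a small illustration of this naming convention, the sketch below builds a label from a chromosome, a strand, and a start position rounded down to thousands. The function name and the exact spacing of the label are assumptions made for the example, not a specification of the browser's output.

```python
def chain_label(chrom, strand, start):
    """Build a label from chromosome, strand, and start in thousands.

    The layout "chrom strand NNNk" is an assumed format for illustration.
    """
    return f"{chrom} {strand} {start // 1000}k"


print(chain_label("chr2L", "+", 1234567))
```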

\ \ \

Display Conventions and Configuration

\

By default, the chains to chromosome-based assemblies are colored\ based on which chromosome they map to in the aligning organism. To turn\ off the coloring, check the "off" button next to: Color\ track based on chromosome.

\

\ To display only the chains of one chromosome in the aligning\ organism, enter the name of that chromosome (e.g. chr4) in the box next to: \ Filter by chromosome.

\ \

Methods

\

\ The D. yakuba and D. melanogaster genomes were aligned with \ blastz and converted into axt format using the lavToAxt program.\ The axt alignments were fed into axtChain, which organizes all \ alignments between a single D. yakuba chromosome and a single \ D. melanogaster chromosome into a group and creates a kd-tree out \ of the gapless subsections (blocks) of the alignments. A dynamic program \ was then run over the kd-trees to find the maximally scoring chains of these \ blocks. Chains scoring below a threshold were discarded; the remaining \ chains are displayed in this track.
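The chaining step can be sketched as a simple dynamic program. This is not the axtChain implementation: it replaces the kd-tree with a plain quadratic scan over all earlier blocks, and the linear gap cost is a hypothetical stand-in for the real gap scoring. Each block is assumed to be a (target_start, query_start, length, score) tuple on a single chromosome pair and strand.

```python
def best_chain(blocks, gap_cost=lambda dt, dq: 0.1 * (dt + dq)):
    """Return (score, chain) for the highest-scoring colinear chain.

    blocks: list of (t_start, q_start, length, score) gapless blocks.
    gap_cost: hypothetical penalty for the target/query gap between
    two consecutive blocks in a chain.
    """
    blocks = sorted(blocks)
    n = len(blocks)
    if n == 0:
        return 0.0, []
    best = [0.0] * n   # best chain score ending at block i
    prev = [-1] * n    # predecessor of block i in that chain
    for i, (ts, qs, _length, score) in enumerate(blocks):
        best[i] = score
        for j in range(i):
            pts, pqs, plen, _ = blocks[j]
            dt = ts - (pts + plen)
            dq = qs - (pqs + plen)
            # The predecessor must end before block i in both sequences.
            if dt >= 0 and dq >= 0:
                cand = best[j] + score - gap_cost(dt, dq)
                if cand > best[i]:
                    best[i] = cand
                    prev[i] = j
    end = max(range(n), key=best.__getitem__)
    top_score = best[end]
    chain = []
    while end != -1:
        chain.append(blocks[end])
        end = prev[end]
    chain.reverse()
    return top_score, chain


blocks = [(0, 0, 10, 10.0), (20, 20, 10, 10.0), (15, 100, 10, 3.0)]
score, chain = best_chain(blocks)
print(score, chain)
```

The quadratic scan is adequate for a handful of blocks; a spatial index such as the kd-tree described above is what makes the search for predecessors practical on whole chromosomes.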

\ \

Credits

\

\ Blastz was developed at Pennsylvania State University by \ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.

\

\ Lineage-specific repeats were identified by Arian Smit and his \ RepeatMasker\ program.

\

\ The axtChain program was developed at the University of California at \ Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.

\

\ The browser display and database storage of the chains were generated\ by Robert Baertsch and Jim Kent.

\ \

References

\

\ Chiaromonte F, Yap VB, Miller W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput. 2002:115-26.\ PMID: 11928468\

\ \

\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\

\ \

\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\

\ compGeno 1 altColor 255,240,200\ color 100,50,0\ group compGeno\ longLabel D. yakuba (Apr. 2004 (WUGSC 1.0/droYak1)) Chained Alignments\ otherDb droYak1\ priority 141\ shortLabel D. yakuba Chain\ spectrum on\ track chainDroYak1\ type chain droYak1\ visibility hide\ netDroYak1 D. yakuba Net netAlign droYak1 chainDroYak1 D. yakuba (Apr. 2004 (WUGSC 1.0/droYak1)) Alignment Net 0 141.1 0 0 0 127 127 127 1 0 0

Description

\

\ This track shows the best D. yakuba/D. melanogaster chain for \ every part of the D. melanogaster genome. It is useful for\ finding orthologous regions and for studying genome\ rearrangement. The D. yakuba sequence used in this annotation is \ from the Apr. 2004 (WUGSC 1.0/droYak1) assembly.

\ \

Display Conventions and Configuration

\

\ In full display mode, the top-level (level 1)\ chains are the largest, highest-scoring chains that\ span this region. In many cases gaps exist in the\ top-level chain. When possible, these are filled in by\ other chains that are displayed at level 2. The gaps in \ level 2 chains may be filled by level 3 chains and so\ forth.

\

\ In the graphical display, the boxes represent ungapped \ alignments; the lines represent gaps. Click\ on a box to view detailed information about the chain\ as a whole; click on a line to display information\ about the gap. The detailed information is useful in determining\ the cause of the gap or, for lower level chains, the genomic\ rearrangement.

\

\ Individual items in the display are categorized as one of four types\ (other than gap):

\

\ \

Methods

\

\ Chains were derived from blastz alignments, using the methods\ described on the chain tracks description pages, and sorted with the \ highest-scoring chains in the genome ranked first. The program\ chainNet was then used to place the chains one at a time, trimming them as \ necessary to fit into sections not already covered by a higher-scoring chain. \ During this process, a natural hierarchy emerged in which a chain that filled \ a gap in a higher-scoring chain was placed underneath that chain. The program \ netSyntenic was used to fill in information about the relationship between \ higher- and lower-level chains, such as whether a lower-level\ chain was syntenic or inverted relative to the higher-level chain. \ The program netClass was then used to fill in how much of the gaps and chains \ contained Ns (sequencing gaps) in one or both species and how much\ was filled with transposons inserted before and after the two organisms \ diverged.
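The placement step can be sketched as a greedy pass over score-sorted chains. This is a simplified illustration, not the chainNet program: each chain is treated as one solid interval on the D. melanogaster sequence, so the filling of gaps inside a higher-scoring chain and the resulting level hierarchy are not modeled. The function names and the (name, score, start, end) tuple layout are hypothetical.

```python
def subtract(interval, covered):
    """Return the parts of a half-open interval not already covered.

    covered: sorted list of disjoint half-open (start, end) intervals.
    """
    start, end = interval
    pieces = []
    for c_start, c_end in covered:
        if c_end <= start or c_start >= end:
            continue
        if c_start > start:
            pieces.append((start, c_start))
        start = max(start, c_end)
        if start >= end:
            break
    if start < end:
        pieces.append((start, end))
    return pieces


def place_chains(chains):
    """Place chains highest score first, trimming each to uncovered sequence.

    chains: list of (name, score, start, end) on the target sequence.
    Returns a list of (name, start, end) pieces that were placed.
    """
    covered = []
    placed = []
    for name, _score, start, end in sorted(chains, key=lambda c: -c[1]):
        pieces = subtract((start, end), covered)
        placed.extend((name, p_start, p_end) for p_start, p_end in pieces)
        covered.extend(pieces)
        covered.sort()
    return placed


chains = [("A", 1000, 0, 100), ("B", 500, 50, 150), ("C", 200, 20, 60)]
print(place_chains(chains))
```

In this example chain B is trimmed to the part that chain A does not cover, and chain C is dropped because it lies entirely within sequence that is already covered.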

\ \

Credits

\

\ The chainNet, netSyntenic, and netClass programs were\ developed at the University of California at\ Santa Cruz by Jim Kent.

\

\ Blastz was developed at Pennsylvania State University by\ Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.

\

\ Lineage-specific repeats were identified by Arian Smit and his program \ RepeatMasker.

\

\ The browser display and database storage of the nets were made\ by Robert Baertsch and Jim Kent.

\ \

References

\

\ Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.\ Evolution's cauldron:\ duplication, deletion, and rearrangement in the mouse and human genomes.\ Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.\ PMID: 14500911; PMC: PMC208784\

\ \

\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,\ Haussler D, Miller W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 2003 Jan;13(1):103-7.\ PMID: 12529312; PMC: PMC430961\

\ compGeno 0 group compGeno\ longLabel D. yakuba (Apr. 2004 (WUGSC 1.0/droYak1)) Alignment Net\ otherDb droYak1\ priority 141.1\ shortLabel D. yakuba Net\ spectrum on\ track netDroYak1\ type netAlign droYak1 chainDroYak1\ visibility hide\ rmsk RepeatMasker rmsk Repeating Elements by RepeatMasker 1 149.1 0 0 0 127 127 127 1 0 0

Description

\

\ This track was created by using Arian Smit's RepeatMasker program, which screens DNA sequences \ for interspersed repeats and low-complexity DNA sequences. The program\ outputs a detailed annotation of the repeats that are present in the\ query sequence (represented by this track), as well as a modified version\ of the query sequence in which all the annotated repeats have been masked\ (generally available on the\ Downloads page). RepeatMasker uses \ the Repbase Update library of repeats from the \ Genetic \ Information Research Institute (GIRI). \ Repbase Update is described in Jurka (2000), listed in the References section below.

\ \

Display Conventions and Configuration

\

\ In full display mode, this track displays up to ten different classes of repeats:\

\

\ The level of color shading in the graphical display reflects the amount of \ base mismatch, base deletion, and base insertion associated with a repeat \ element. The higher the combined number of these, the lighter the shading.
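The shading rule can be illustrated with a minimal sketch that maps the combined mismatch, deletion, and insertion counts to a grey level, where 0 is darkest and larger values are lighter. The inputs are assumed to be per-thousand-base counts; the cap of 500 and the lightest grey value of 200 are hypothetical choices for the example and are not the browser's actual scale.

```python
def repeat_grey_level(milli_div, milli_del, milli_ins, cap=500, lightest=200):
    """Map combined per-thousand mismatch/deletion/insertion counts to grey.

    Returns an integer from 0 (darkest) to `lightest`; `cap` and
    `lightest` are hypothetical parameters chosen for illustration.
    """
    total = milli_div + milli_del + milli_ins
    clipped = min(max(total, 0), cap)
    return round(lightest * clipped / cap)


print(repeat_grey_level(10, 5, 5), repeat_grey_level(100, 50, 50))
```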

\ \

Methods

\

\ UCSC has used the most current versions of the RepeatMasker software \ and repeat libraries available to generate these data. Note that these \ versions may be newer than those that are publicly available on the Internet. \

\

\ Data are generated using the RepeatMasker -s flag. Additional flags\ may be used for certain organisms. Repeats are soft-masked. Alignments may \ extend through repeats, but are not permitted to initiate in them. \ See the \ FAQ for \ more information.
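Because soft-masking marks repeats by writing them in lower case, the masked regions can be recovered directly from the sequence. The sketch below lists the lower-case runs in a DNA string as half-open, zero-based intervals; the function name is hypothetical and the input is assumed to be a plain sequence string with no FASTA header.

```python
import re


def soft_masked_intervals(seq):
    """Return half-open (start, end) intervals of lower-case runs."""
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", seq)]


seq = "ACGTacgtNNacGT"
intervals = soft_masked_intervals(seq)
masked = sum(end - start for start, end in intervals)
print(intervals, masked / len(seq))
```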

\ \

Credits

\

\ Thanks to Arian Smit and GIRI\ for providing the tools and repeat libraries used to generate this track.

\ \

References

\

\ Jurka J.\ Repbase update: a database and an electronic journal of repetitive elements.\ Trends Genet. 2000 Sep;16(9):418-20.\ PMID: 10973072\

\ varRep 0 canPack off\ group varRep\ longLabel Repeating Elements by RepeatMasker\ priority 149.1\ shortLabel RepeatMasker\ spectrum on\ track rmsk\ type rmsk\ visibility dense\