Zebrafish
Danio rerio
Photo courtesy of NHGRI (Press Photos)

The Dec. 2008 zebrafish (Danio rerio) Zv8 assembly (CAAK00000000.5) was produced by the Wellcome Trust Sanger Institute, UK. For more information about this assembly, see Zv8 in the NCBI Assembly database.

Sample position queries

A genome position can be specified by the accession number of a sequenced genomic region, an mRNA or EST, a chromosomal coordinate range, or keywords from the GenBank description of an mRNA. The following list shows examples of valid position queries for the zebrafish genome. Note that some position queries (e.g. "huntington") may return matches to the mRNA records of other species. In these cases, the mRNAs are mapped to their homologs in zebrafish. See the User's Guide for more information.

Request:   Genome Browser Response:

chr1   Displays all of chromosome 1
chr1:1-200000   Displays the first 200,000 bases of chromosome 1
chr1:100000+2000   Displays a region of chromosome 1 that spans 2000 bases, starting at position 100000

U30710   Displays region containing zebrafish mRNA with GenBank accession number U30710
AA658622   Displays region containing zebrafish EST with GenBank accession AA658622
ENSDART00000025573   Displays region containing Ensembl gene prediction transcript ENSDART00000025573

p53   Lists mRNAs related to the p53 tumor suppressor
pseudogene mRNA   Lists transcribed pseudogenes, but not cDNAs, in GenBank
homeobox caudal   Lists mRNAs for caudal homeobox genes in GenBank
zinc finger   Lists many zinc finger mRNAs
kruppel zinc finger   Lists only kruppel-like zinc fingers
huntington   Lists mRNAs associated with Huntington's disease

porter   Lists mRNAs deposited by scientists named Porter
Amsterdam,A.   Lists mRNAs deposited by co-author A. Amsterdam

Use this last format for author queries. Although GenBank requires the search format Amsterdam A, internally it uses the format Amsterdam,A.
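
The coordinate and author formats in the list above can be illustrated with a short Python sketch. It is not the Genome Browser's own parser: the names Region, parse_position and author_query are hypothetical, coordinates are assumed to be 1-based and inclusive, only chr-prefixed sequence names are handled, and the author conversion covers only the single-initial case shown in the example.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: Optional[int]  # 1-based, inclusive; None means "whole chromosome"
    end: Optional[int]    # 1-based, inclusive; None means "whole chromosome"


# Accepts chrN, chrN:start-end, or chrN:start+length (hypothetical pattern).
_POSITION = re.compile(r"(chr\w+)(?::(\d+)([-+])(\d+))?")


def parse_position(query: str) -> Optional[Region]:
    """Return a Region for a coordinate query, or None for any other query type."""
    match = _POSITION.fullmatch(query.strip())
    if match is None:
        return None  # accession, keyword, or author query
    chrom, first, separator, second = match.groups()
    if first is None:
        return Region(chrom, None, None)
    start = int(first)
    if separator == "-":
        end = int(second)
    else:
        # "start+length": a span of `length` bases beginning at `start`.
        end = start + int(second) - 1
    return Region(chrom, start, end)


def author_query(genbank_style: str) -> str:
    """Convert 'Amsterdam A' to the internal form 'Amsterdam,A.' (single initial only)."""
    surname, initial = genbank_style.strip().rsplit(" ", 1)
    return f"{surname},{initial}."
```

Under these assumptions, parse_position("chr1:100000+2000") yields a region from 100000 to 101999, which spans exactly 2000 bases, and any query that is not a chr-prefixed coordinate returns None so that it can be treated as an accession or keyword search.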

Assembly details

The Zv8 assembly consists of 1,481,241,295 bp in 11,623 scaffolds with a coverage of 6.5-7x. Two major changes were made in the assembly process to overcome problems in previous assemblies:

  • The fingerprint contig (FPC) order and orientation were reorganized through more careful use of the existing genetic maps, heat shock, MGH and T51 radiation hybrid maps, resulting in the identification and removal of several haplotypic duplications.
  • A whole-genome shotgun (WGS) assembly with more coverage was used in Zv8 to fill in gaps between finished BAC/fosmid FPC sequence.

For more details about the Zv8 assembly, see the Sanger Institute page for the Danio rerio Sequencing Project.

In this assembly, scaffolds that are based on clone contigs or that could be associated with chromosome placements through marker information are named Zv8_scaffoldn, where n is a number. WGS contigs that could not be placed on chromosomes are named Zv8_NAn. In previous assemblies, these unplaced contigs were discarded unless they were longer than 5 kb (or 2 kb if features were present). For the Zv8 assembly, all Zv8_NA contigs longer than 2 kb were retained, which increased the number of contigs. Linkage group numbers were translated directly into chromosome numbers (e.g. linkage group 1 = chromosome 1). The zebrafish mitochondrial sequence is also available as the virtual chromosome chrM.
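
The naming scheme and retention thresholds described above can be sketched in Python as follows. This is an illustration only, not code from the assembly pipeline: the function names and category labels are hypothetical, "kb" is assumed to mean 1000 bp, and the thresholds are treated as strict "greater than" comparisons as worded in the text.

```python
import re


def classify_sequence(name: str) -> str:
    """Classify a sequence name according to the naming scheme described above."""
    if name == "chrM":
        return "mitochondrial"
    if re.fullmatch(r"chr\d+", name):
        return "chromosome"
    if re.fullmatch(r"Zv8_scaffold\d+", name):
        return "scaffold"
    if re.fullmatch(r"Zv8_NA\d+", name):
        return "unplaced WGS contig"
    return "unknown"


def retained_in_previous(length_bp: int, has_features: bool) -> bool:
    """Earlier rule: keep unplaced contigs > 5 kb, or > 2 kb if features are present."""
    return length_bp > 5000 or (has_features and length_bp > 2000)


def retained_in_zv8(length_bp: int) -> bool:
    """Zv8 rule: keep every unplaced contig longer than 2 kb."""
    return length_bp > 2000
```

For example, a 3 kb unplaced contig without features fails retained_in_previous but passes retained_in_zv8, which is the kind of contig that accounts for the larger contig count in this release.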

The Sanger Institute notes that this assembly release is still preliminary. The regions of the assembly covered by WGS contigs are of lower quality. The assembly contains misjoins, misassemblies, and artificial duplications due to the retention of haplotypic sequences. During the generation of the Zv8 assembly, particular attention was paid to improving the order of the clone path.

The zebrafish data and annotations can be downloaded from the UCSC FTP site or Downloads page. This data set has specific conditions for use. The danRer6 annotation tracks were generated by UCSC and collaborators worldwide. See the Credits page for a detailed list of the organizations and individuals who contributed to this release.


GenBank Pipeline Details

For the purposes of the GenBank alignment pipeline, this assembly is considered to be: well-ordered.