This page contains links to sequence and annotation downloads for the genome assemblies featured in the UCSC Genome Browser. Downloads are also available via our JSON API, MySQL server, or FTP server. Data filtering is available in the Table Browser or via the command-line utilities.
For access to the most recent assembly of each genome, see the current genomes directory. Previous versions of certain data are available from our track archive. Data hosted in Public Hubs exists on external sites. GenArk (Genome Archive) species data can be found here. All data in the Genome Browser are freely usable for any purpose except as indicated in the README.txt files in the download directories. These data were contributed by many researchers, as listed on the Genome Browser credits page. Please acknowledge the contributor(s) of the data you use.
This assembly represents the T2T-CHM13v2.0 genome. While it may be more recent than hg38, hg38 is still the latest GRCh assembly and is better annotated by most projects.
This assembly is served entirely as a track hub, meaning no MySQL files exist.
The http://hgdownload.soe.ucsc.edu/gbdb/ location holds the assembly sequences used in alignment tracks, such as the 100-species conservation track. For example, the underlying mayZeb1.2bit sequence file for the Zebra Mbuna fish assembly, which is not yet released but is used in the hg38 Vertebrate Multiz Alignment & Conservation (100 Species) track, can be found here: http://hgdownload.soe.ucsc.edu/gbdb/mayZeb1/. These links are also displayed in a column titled "UCSC version" on the conservation track description page.
The source code for the Genome Browser, Blat, liftOver and other utilities is free for non-profit academic research and personal use. For information on commercial licensing, see the Genome Browser license and Blat license requirements. The source and executables for several of these products can be downloaded or purchased from our online store.
You can install a local mirrored copy of the Genome Browser website on your web server, eliminating the need to compile the entire source tree and providing customization and privacy options.
If you encounter difficulties with slow download speeds, try using UDT Enabled Rsync (UDR), which improves the throughput of large data transfers over long distances. The 32-bit and 64-bit versions can be downloaded here.
The utilities directory offers downloads of pre-compiled standalone binaries for:
$ rsync -aP hgdownload.soe.ucsc.edu::genome/admin/exe/linux.x86_64/ ./
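Note about the 'permission denied' error when downloading with https, curl, or wget: binaries fetched this way are not marked executable, so add the execute permission before running them.
$ wget https://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/liftOver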
$ chmod +x ./filePath/utility_name
$ ./filePath/utility_name
Example:
$ chmod +x /home/user/liftover/liftOver
See also: http://en.wikipedia.org/wiki/Chmod
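The download-then-chmod steps above can be wrapped in a small script. The following is a minimal sketch, not an official UCSC tool: `ensure_executable` and `fetch_utility` are hypothetical helper names, and the sketch assumes the requested utility exists under the linux.x86_64 directory used in the wget example.

```shell
# Sketch: download one precompiled utility and make sure it can be executed.
# Assumption: the utility name (for example liftOver) exists under the
# linux.x86_64 directory shown in the wget example above.
# Both function names are hypothetical helpers, not part of the kent tools.

# Add the execute bit only when it is missing.
ensure_executable() {
  local path="$1"
  if [ ! -f "$path" ]; then
    echo "no such file: $path" >&2
    return 1
  fi
  [ -x "$path" ] || chmod +x "$path"
}

# Download a utility into a target directory, then mark it executable.
fetch_utility() {
  local name="$1"
  local dest_dir="${2:-.}"
  wget -q -O "$dest_dir/$name" \
    "https://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/$name" || return 1
  ensure_executable "$dest_dir/$name"
}
```

With these definitions loaded, `fetch_utility liftOver /home/user/liftover` would download the binary into that directory and set its execute permission in one step.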
Please review the userApps README for information on fetching specific directories from the kent source tree or downloading userApps.src.tgz to build and install all kent utilities. Note that commercial download and installation of the Blat and In-Silico PCR software requires a license, which may be obtained from Kent Informatics.
The /gbdb fileserver offers access to all files referenced by the Genome Browser tables, with servers in North America and Europe for faster downloads. Many files in the browser, such as bigBed files, are hosted in binary format. For example, in the hg38 database, the crispr.bb and crisprDetails.tab files for the CRISPR track can be found using the following URLs:
Individual regions or whole genome annotations from binary files can be obtained using tools such as bigBedToBed, which can be downloaded as a precompiled binary for your system (see the Source and utilities downloads section). The bigBedToBed tool can also be used to obtain a specific subset of features within a given range, e.g.:
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/crispr/crispr.bb -chrom=chr21 -start=25000000 -end=30000000 stdout
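When several regions are needed, the range options can be filled in from a single region string. The sketch below assumes bigBedToBed is on your PATH; `extract_region` is a hypothetical wrapper, and the `chrom:start-end` input format is only a convenience of this sketch, not something bigBedToBed itself parses.

```shell
# Sketch: run bigBedToBed on a region given as chrom:start-end.
# Assumptions: bigBedToBed is installed and on PATH; extract_region and the
# chrom:start-end string format are illustrative, not part of the tool.
extract_region() {
  local url="$1"
  local region="$2"               # for example chr21:25000000-30000000
  local chrom="${region%%:*}"     # text before the first colon
  local range="${region#*:}"      # text after the first colon
  local start="${range%%-*}"      # text before the first hyphen
  local end="${range#*-}"         # text after the first hyphen
  # Same options as the command above; features are written to stdout.
  bigBedToBed "$url" -chrom="$chrom" -start="$start" -end="$end" stdout
}
```

For example, `extract_region http://hgdownload.soe.ucsc.edu/gbdb/hg38/crispr/crispr.bb chr21:25000000-30000000 > crispr.chr21.bed` runs the same query as the command above and saves the result to a file.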
The GenArk Hubs allow visualization of thousands of NCBI genomes not previously available on the Genome Browser. The underlying data can be accessed by clicking the clade (e.g. primates), finding your organism or assembly, and clicking the download link in the third column. For a direct link to a particular GCA or GCF assembly ID, you can model your links after this example, where the ID is split by slashes every three characters.
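The three-character splitting can be sketched in a few lines of shell. This is an illustration under assumptions: `genark_path` is a hypothetical helper, and the layout it produces (accession prefix, then the digits in groups of three, then the full accession) should be checked against a real GenArk link before use. The base URL is deliberately left to the caller.

```shell
# Sketch: turn an assembly accession into a slash-separated directory path.
# Assumption: the path is <prefix>/<3 digits>/<3 digits>/<3 digits>/<accession>;
# genark_path is a hypothetical helper, so verify its output against an actual link.
genark_path() {
  local acc="$1"                  # for example GCF_000001405.40
  local prefix="${acc%%_*}"       # GCA or GCF
  local digits="${acc#*_}"        # 000001405.40
  digits="${digits%%.*}"          # 000001405 (version suffix removed)
  local split
  # Insert a slash after every three characters, then drop the trailing slash.
  split="$(printf '%s' "$digits" | sed 's|...|&/|g; s|/$||')"
  printf '%s/%s/%s\n' "$prefix" "$split" "$acc"
}
```

Calling `genark_path GCF_000001405.40` prints `GCF/000/001/405/GCF_000001405.40`, which can then be appended to the hub download location.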
You can find more examples of downloading GenArk data in our FAQ section.
The UCSC Genome Browser supports a public MySQL server with annotation data available for filtering and querying. For more information on this service, see our MySQL server page.
The JSON API can also be used to query and download gbdb data in JSON format. Below are two examples of how to query and download data using the JSON API, respectively.
http://api.genome.ucsc.edu/getData/track?genome=hg38;track=ncbiRefSeqOther;chrom=chr21;start=25000000;end=30000000
wget -O- 'http://api.genome.ucsc.edu/getData/track?genome=hg38;track=ncbiRefSeqOther;chrom=chr21;start=25000000;end=30000000' > out.json
For more information see the JSON API help page.
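To reuse the query above for other tracks or regions, the URL can be assembled from its parts. This is a minimal sketch: `api_track_url` and `fetch_track` are hypothetical helper names, and only the endpoint and parameters shown in the examples above are used.

```shell
# Sketch: build a getData/track URL and download the JSON response.
# Assumptions: both function names are hypothetical helpers; the endpoint and
# the genome/track/chrom/start/end parameters are the ones shown above.
api_track_url() {
  local genome="$1"
  local track="$2"
  local chrom="$3"
  local start="$4"
  local end="$5"
  printf 'http://api.genome.ucsc.edu/getData/track?genome=%s;track=%s;chrom=%s;start=%s;end=%s\n' \
    "$genome" "$track" "$chrom" "$start" "$end"
}

# Fetch the JSON for one region and write it to stdout.
fetch_track() {
  wget -q -O- "$(api_track_url "$@")"
}
```

For example, `fetch_track hg38 ncbiRefSeqOther chr21 25000000 30000000 > out.json` issues the same request as the wget command above.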