This directory contains human/rat alignments made
using the May 2004 human assembly (also known
as NCBI build 35/UCSC version hg17)
vs. the June 2003 rat assembly (also known as rn3).
The subdirectory axtTight contains relatively stringent
human/rat alignments filtered so that only the best
alignment for any given region of the human genome
is used.
The alignments are in 'axt' format. Each alignment
contains three lines and is separated from the next
alignment by a blank line:
Line 1 - summarizes the alignment.
Line 2 - contains the human sequence with inserts.
Line 3 - contains the rat sequence with inserts.
The summary line contains nine blank-separated fields with the
following meanings:
1 - Alignment number. The first alignment in a file
is numbered 0, the next 1, and so forth.
2 - Human chromosome.
3 - Start in human chromosome. The first base is
numbered 1.
4 - End in human chromosome. The end base is included.
5 - Rat chromosome.
6 - Start in rat.
7 - End in rat.
8 - Rat strand. If this is '-', the rat start/end fields
are relative to the reverse-complemented rat chromosome.
9 - Blastz score. The scoring matrix blastz uses is:
          A     C     G     T
    A    91  -114   -31  -123
    C  -114   100  -125   -31
    G   -31  -125   100  -114
    T  -123   -31  -114    91
with a gap open penalty of 400 and a gap extension
penalty of 30. The minimum score for an alignment
to be kept was 3000 for the first pass and 2200 for
the second pass, which searches only the regions
between two alignments found in the first pass.
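As an illustration of the layout described above, the following Python
sketch reads axt blocks from an iterable of lines and splits each summary
line into its nine fields. The names AxtBlock, parse_axt, and
to_forward_strand are hypothetical, and the sample records are invented
for the example. It assumes blocks are separated by blank lines and that
lines starting with '#' are comments to skip. The chromosome size needed
to map '-' strand coordinates back to the forward strand is not stored in
the axt file and must be supplied from elsewhere.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class AxtBlock(NamedTuple):
    number: int        # field 1, starts at 0
    human_chrom: str   # field 2
    human_start: int   # field 3, first base is numbered 1
    human_end: int     # field 4, end base is included
    rat_chrom: str     # field 5
    rat_start: int     # field 6
    rat_end: int       # field 7
    rat_strand: str    # field 8, '+' or '-'
    score: int         # field 9, blastz score
    human_seq: str     # line 2, with '-' for inserts
    rat_seq: str       # line 3, with '-' for inserts


def parse_axt(lines: Iterable[str]) -> Iterator[AxtBlock]:
    """Yield one AxtBlock per three-line alignment."""
    stripped = (ln.strip() for ln in lines)
    # Assumption: blank lines separate blocks and '#' lines are comments.
    content = (ln for ln in stripped if ln and not ln.startswith("#"))
    for summary in content:
        f = summary.split()
        if len(f) != 9:
            raise ValueError(f"expected 9 summary fields: {summary!r}")
        human_seq = next(content, None)
        rat_seq = next(content, None)
        if human_seq is None or rat_seq is None:
            raise ValueError("file ends in the middle of an alignment")
        yield AxtBlock(int(f[0]), f[1], int(f[2]), int(f[3]),
                       f[4], int(f[5]), int(f[6]), f[7], int(f[8]),
                       human_seq, rat_seq)


def to_forward_strand(start: int, end: int, chrom_size: int) -> Tuple[int, int]:
    """Map 1-based inclusive reverse-complement coordinates to the forward strand."""
    return chrom_size - end + 1, chrom_size - start + 1


# Invented sample data, not taken from any real alignment file.
SAMPLE_AXT = """
0 chr1 11 20 chr5 101 110 + 3500
ACGTACGTAC
ACGTACGTAC

1 chr1 31 38 chr5 201 210 - 3400
ACGT--ACGT
ACGTTTACGT
"""

if __name__ == "__main__":
    for block in parse_axt(SAMPLE_AXT.splitlines()):
        print(block.number, block.human_chrom, block.human_start,
              block.human_end, block.rat_strand, block.score)
```

For one of the compressed files in this directory, the same function can
be fed from gzip.open("chr1.axt.gz", "rt") instead of the sample string.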
The alignments were done with blastz, which is available
from Webb Miller's group at Pennsylvania State University (PSU).
Each chromosome was divided into 10,010,000-base chunks with 10,000
bases of overlap. The .lav format blastz output, which does not
include the sequence, was converted to .axt with PSU's lavToAxt.
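The chunking step can be pictured with a short Python sketch. It assumes
that 10,000 bases of overlap means consecutive chunks start 10,000,000
bases apart, and it uses 0-based half-open ranges for convenience. The
function name chunk_ranges and the 25,000,000-base example length are
hypothetical, and the actual pipeline scripts may have handled chunk
boundaries differently.

```python
from typing import Iterator, Tuple


def chunk_ranges(chrom_len: int,
                 chunk_size: int = 10_010_000,
                 overlap: int = 10_000) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) ranges, 0-based half-open, covering chrom_len bases."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # distance between consecutive chunk starts
    start = 0
    while True:
        end = min(start + chunk_size, chrom_len)
        yield start, end
        if end >= chrom_len:
            break
        start += step


if __name__ == "__main__":
    # Hypothetical chromosome length, chosen only to show three chunks.
    for start, end in chunk_ranges(25_000_000):
        print(start, end)
```

Each chunk shares its last 10,000 bases with the start of the next one, so
an alignment that crosses a nominal boundary can still be found within a
single chunk.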
The axtTight alignments were processed with subsetAxt from
Jim Kent using the matrix:
          A     C     G     T
    A   100  -200  -100  -200
    C  -200   100  -200  -100
    G  -100  -200   100  -200
    T  -200  -100  -200   100
with a gap open penalty of 2000 and a gap extension
penalty of 50. The minimum score was 3400. The axtTight
subset covers 6% of the human genome while axtBest covers
40%.
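To make the axtTight scoring concrete, here is a minimal Python sketch
that scores a pair of aligned sequences with the matrix and penalties
given above and compares the result with the minimum score of 3400. It is
not the subsetAxt implementation. It assumes that a gap of n columns costs
the open penalty plus n times the extension penalty, and it handles only
A, C, G, T, and '-'. The names score_alignment and passes_tight are
hypothetical.

```python
# axtTight substitution matrix, copied from the table above.
TIGHT_MATRIX = {
    "A": {"A": 100, "C": -200, "G": -100, "T": -200},
    "C": {"A": -200, "C": 100, "G": -200, "T": -100},
    "G": {"A": -100, "C": -200, "G": 100, "T": -200},
    "T": {"A": -200, "C": -100, "G": -200, "T": 100},
}
GAP_OPEN = 2000
GAP_EXTEND = 50
MIN_SCORE = 3400


def score_alignment(human_seq: str, rat_seq: str) -> int:
    """Score two aligned sequences of equal length ('-' marks a gap)."""
    if len(human_seq) != len(rat_seq):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    in_gap = False
    for h, r in zip(human_seq.upper(), rat_seq.upper()):
        if h == "-" or r == "-":
            # Assumed convention: the first gap column pays open + extend,
            # each further column in the same gap run pays extend only.
            score -= GAP_EXTEND if in_gap else GAP_OPEN + GAP_EXTEND
            in_gap = True
        else:
            score += TIGHT_MATRIX[h][r]  # KeyError for symbols such as N
            in_gap = False
    return score


def passes_tight(human_seq: str, rat_seq: str) -> bool:
    """True if the alignment reaches the minimum score stated above."""
    return score_alignment(human_seq, rat_seq) >= MIN_SCORE


if __name__ == "__main__":
    perfect = "ACGT" * 10  # invented 40-base example
    print(score_alignment(perfect, perfect), passes_tight(perfect, perfect))
```

Under these assumptions a single two-column gap costs 2100 points, as much
as 21 matching bases earn, which is consistent with the tight subset
keeping only a small fraction of the genome.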
Name Last modified Size Description
chr1.axt.gz 2004-06-29 11:29 9.0M
chr1_random.axt.gz 2004-06-29 11:30 121K
chr2.axt.gz 2004-06-29 11:30 9.2M
chr2_random.axt.gz 2004-06-29 11:30 7.9K
chr3.axt.gz 2004-06-29 11:31 7.4M
chr3_random.axt.gz 2004-06-29 11:31 36K
chr4.axt.gz 2004-06-29 11:31 5.7M
chr4_random.axt.gz 2004-06-29 11:31 15K
chr5.axt.gz 2004-06-29 11:31 6.6M
chr5_random.axt.gz 2004-06-29 11:31 3.0K
chr6.axt.gz 2004-06-29 11:31 5.6M
chr6_hla_hap1.axt.gz 2004-06-29 11:31 2.3K
chr6_hla_hap2.axt.gz 2004-06-29 11:31 3.6K
chr6_random.axt.gz 2004-06-29 11:31 30K
chr7.axt.gz 2004-06-29 11:32 5.5M
chr7_random.axt.gz 2004-06-29 11:32 11K
chr8.axt.gz 2004-06-29 11:32 4.6M
chr8_random.axt.gz 2004-06-29 11:32 17K
chr9.axt.gz 2004-06-29 11:32 4.4M
chr9_random.axt.gz 2004-06-29 11:32 17K
chr10.axt.gz 2004-06-29 11:29 4.7M
chr10_random.axt.gz 2004-06-29 11:29 3.8K
chr11.axt.gz 2004-06-29 11:29 5.5M
chr12.axt.gz 2004-06-29 11:29 4.5M
chr12_random.axt.gz 2004-06-29 11:29 10K
chr13.axt.gz 2004-06-29 11:29 3.0M
chr13_random.axt.gz 2004-06-29 11:29 8.1K
chr14.axt.gz 2004-06-29 11:29 3.6M
chr15.axt.gz 2004-06-29 11:30 3.6M
chr15_random.axt.gz 2004-06-29 11:30 28K
chr16.axt.gz 2004-06-29 11:30 3.3M
chr16_random.axt.gz 2004-06-29 11:30 3.3K
chr17.axt.gz 2004-06-29 11:30 3.8M
chr17_random.axt.gz 2004-06-29 11:30 49K
chr18.axt.gz 2004-06-29 11:30 2.4M
chr18_random.axt.gz 2004-06-29 11:30 37
chr19.axt.gz 2004-06-29 11:30 2.0M
chr19_random.axt.gz 2004-06-29 11:30 524
chr20.axt.gz 2004-06-29 11:30 2.2M
chr21.axt.gz 2004-06-29 11:30 912K
chr22.axt.gz 2004-06-29 11:30 1.1M
chr22_random.axt.gz 2004-06-29 11:30 3.9K
chrM.axt.gz 2004-06-29 11:32 127
chrX.axt.gz 2004-06-29 11:32 6.2M
chrX_random.axt.gz 2004-06-29 11:32 33K
chrY.axt.gz 2004-06-29 11:32 268K