Bacterial artificial chromosomes (BACs) are cloning vectors that carry large inserts of genomic DNA (typically 150–300 kb) and are propagated in bacteria. Sequencing a single short read from each end of a BAC and mapping those end sequences to a reference genome yields the approximate start and stop of the full BAC insert. These BAC end placements are useful for confirming the order, orientation, and span of the reference assembly, for identifying large structural variants that disrupt concordant pair placement, and for locating a BAC containing a gene of interest for downstream laboratory work. The individual clones in all three libraries shown here can be ordered from BACPAC Resources (CHORI/BACPAC) for use at the bench; see the per-library links below.
This track container shows three CHORI (Children's Hospital Oakland Research Institute) zebrafish BAC libraries: CH1073, CH73, and CH211.
Each item is drawn as a single block spanning the inferred BAC insert (start of the upstream end to end of the downstream end). Clicking an item opens a details page showing the clone name, NCBI placement ID, insert size, concordance and uniqueness flags, assembly unit (Primary Assembly, ALT_DRER_TU_1, etc.), and an oversize flag set for placements larger than 500 kb — far longer than a typical BAC — so users can filter out likely-spurious mappings.
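The span and oversize logic described above can be illustrated with a minimal Python sketch. It assumes 0-based, half-open coordinates for each end placement; the names EndPlacement, insert_span, and is_oversize are hypothetical and are not taken from the track's build scripts.

```python
from typing import NamedTuple, Tuple

# 500 kb threshold, as stated in the track description.
OVERSIZE_THRESHOLD = 500_000


class EndPlacement(NamedTuple):
    """Mapped position of one BAC end read (0-based, half-open; an assumption)."""
    start: int
    end: int


def insert_span(end_a: EndPlacement, end_b: EndPlacement) -> Tuple[int, int]:
    """Return (start, end) of the inferred insert: the start of the upstream
    end read through the end of the downstream end read."""
    upstream, downstream = sorted((end_a, end_b), key=lambda p: p.start)
    return upstream.start, max(upstream.end, downstream.end)


def is_oversize(start: int, end: int, threshold: int = OVERSIZE_THRESHOLD) -> bool:
    """True when the inferred insert is longer than the threshold."""
    return (end - start) > threshold
```

For example, end reads at 1,000–1,500 and 180,000–180,600 give an inferred insert of 1,000–180,600, which is well under the threshold and would not be flagged.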
The clone name links out to a ZFIN search for cross-reference information on the clone. Clone names (e.g. CH1073-100A1, CH73-1A1, CH211-1A1) are indexed and can be entered directly in the Genome Browser position/search box to jump to a clone.
Three categorical filters are available in each subtrack:
The source data were produced by the NCBI Clone DB group from end sequences of the three CHORI libraries. NCBI maps each end sequence to the reference assembly and categorizes the pair as concordant (expected orientation and insert size) or discordant, and as uniquely placed or multiply placed. The full set of per-library placement reports for zebrafish is available from the NCBI FTP server at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/.
To build the UCSC tracks, the three *.GCF_000002035.6.105.unique_concordant.gff files were downloaded and converted to BED. RefSeq contig accessions in the GFFs (e.g. NC_007114.7, NW_018394540.1) were mapped to UCSC-style chromosome names (e.g. chr3, chr1_KZ114997v1_alt) using the NCBI GRCz11 assembly report. An oversize flag was set on any insert longer than 500 kb; these records are retained so researchers can inspect them but are easy to exclude via the track filter. The resulting BEDs were converted to bigBed with bedToBigBed using a name search index so clone names can be looked up from the browser position box.
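As a rough illustration of the conversion described above, the following Python sketch maps RefSeq accessions to UCSC names and turns one GFF row into a BED-style record. It is not the converter used for the track. It assumes the standard tab-separated NCBI assembly report layout (RefSeq-Accn in column 7, UCSC-style-name in column 10), that each GFF row describes a whole insert, and that the clone name is stored in a Name attribute; the function names and the Name attribute are hypothetical.

```python
from typing import Dict, Iterable, Tuple

OVERSIZE_THRESHOLD = 500_000  # 500 kb, as stated in the track description


def load_refseq_to_ucsc(report_lines: Iterable[str]) -> Dict[str, str]:
    """Build a RefSeq accession -> UCSC name map from assembly report lines."""
    mapping = {}
    for line in report_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        refseq, ucsc = cols[6], cols[9]
        if refseq != "na" and ucsc != "na":
            mapping[refseq] = ucsc
    return mapping


def gff_row_to_bed(gff_line: str, mapping: Dict[str, str]) -> Tuple:
    """Convert one GFF row to (chrom, start, end, name, score, strand, oversize).

    GFF coordinates are 1-based and inclusive; BED is 0-based and half-open,
    so only the start is shifted by one.
    """
    cols = gff_line.rstrip("\n").split("\t")
    seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
    # Parse key=value attributes; "Name" is an assumed attribute key.
    attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
    bed_start = start - 1
    oversize = (end - bed_start) > OVERSIZE_THRESHOLD
    if strand not in ("+", "-"):
        strand = "."
    return (mapping[seqid], bed_start, end, attrs.get("Name", "."), 0, strand, oversize)
```

Rows whose accession is missing from the mapping raise a KeyError in this sketch, which makes unmapped contigs visible rather than silently dropping them.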
The step-by-step track build commands (downloads, RefSeq-to-UCSC mapping, BED conversion, bigBed build) are recorded in the UCSC makeDoc for this track: src/hg/makeDb/doc/danRer11/choriCloneEnds.txt. The GFF-to-BED converter, the RefSeq-to-UCSC mapping script, and the autoSql schema live in src/hg/makeDb/scripts/choriCloneEnds/.
The data can be explored interactively in table format with the Table Browser or the Data Integrator and exported from there to a spreadsheet or to tab-separated files. From scripts, the data can be accessed through our API, with track=choriCloneEndsCH1073, track=choriCloneEndsCH73, or track=choriCloneEndsCH211.
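A script-based query might look like the following Python sketch. It assumes the public UCSC REST API endpoint at api.genome.ucsc.edu/getData/track and the danRer11 assembly name; the helper names build_track_url and fetch_track are hypothetical, and the structure of the returned JSON is not shown here because it should be checked against the API documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the UCSC REST API for track data.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(track, chrom=None, start=None, end=None, genome="danRer11"):
    """Build a getData/track URL for one subtrack, optionally limited to a region."""
    params = {"genome": genome, "track": track}
    if chrom is not None:
        params["chrom"] = chrom
    # start and end are only sent together, as a region on chrom.
    if start is not None and end is not None:
        params["start"] = start
        params["end"] = end
    return f"{API_BASE}?{urlencode(params)}"


def fetch_track(track, **region):
    """Fetch and decode the JSON response (requires network access)."""
    with urlopen(build_track_url(track, **region)) as response:
        return json.load(response)
```

Calling fetch_track("choriCloneEndsCH1073", chrom="chr1", start=0, end=10000000) would request the CH1073 placements in the first 10 Mb of chr1.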
For automated download and analysis, each library's annotation is stored in a bigBed file that can be downloaded from our download server: CH1073.bb, CH73.bb, CH211.bb. Individual regions or the whole genome annotation can be obtained using our tool bigBedToBed, which can be compiled from the source code or downloaded as a precompiled binary for your system. Instructions for downloading source code and binaries can be found here. The tool can also be used to obtain features within a given range, e.g. bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/danRer11/choriCloneEnds/CH1073.bb -chrom=chr1 -start=0 -end=10000000 stdout.
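For pipelines written in Python, the same bigBedToBed call can be wrapped with the standard subprocess module. This is a minimal sketch that assumes bigBedToBed is installed and on the PATH; the helper names and the URL template are illustrative, with the template derived from the CH1073 example above.

```python
import subprocess

# URL pattern taken from the CH1073 example; assumed to hold for the other libraries.
BB_URL = "http://hgdownload.soe.ucsc.edu/gbdb/danRer11/choriCloneEnds/{library}.bb"


def bigbed_command(library, chrom=None, start=None, end=None):
    """Build the bigBedToBed argument list, writing BED lines to stdout."""
    cmd = ["bigBedToBed", BB_URL.format(library=library)]
    if chrom is not None:
        cmd.append(f"-chrom={chrom}")
    if start is not None:
        cmd.append(f"-start={start}")
    if end is not None:
        cmd.append(f"-end={end}")
    cmd.append("stdout")
    return cmd


def fetch_bed_rows(library, **region):
    """Run bigBedToBed and return each BED line split into its tab-separated fields."""
    result = subprocess.run(
        bigbed_command(library, **region),
        capture_output=True, text=True, check=True,
    )
    return [line.split("\t") for line in result.stdout.splitlines() if line]
```

Omitting the region arguments retrieves the whole-genome annotation for that library.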
Clone placements were produced by the NCBI Clone DB group. The CHORI zebrafish BAC libraries (CH73, CH211, CH1073) were constructed by Pieter de Jong and colleagues at BACPAC Resources (CHORI/BACPAC). For background on de Jong's role in building these clone libraries, see this Undark profile.
Schneider VA, Chen HC, Clausen C, Meric PA, Zhou Z, Bouk N, Husain N, Maglott DR, Church DM. Clone DB: an integrated NCBI resource for clone-associated data. Nucleic Acids Res. 2013 Jan;41(Database issue):D1070-8. PMID: 23193260; PMC: PMC3531087