        Chaining/Netting README

This file describes how to make chain and net tracks.
These are higher level structures built on top
of blastz mouse/human alignments.  
This document is based on a README written by Jim Kent.
It specifically refers to mouse/human alignment.

The programs referred to here are currently
in source control in the directory src/hg/mouseStuff.

I. Do a little cluster run that does
      axtFilter -notQ=chrUn chrN.axt | 
      axtChain stdin humanMixedNibDir mouseMixedNibDir chrN.chain
   so that for each axt file in the axtChrom directory.   Check
   carefully that there are no errors on chr19,  as this is very
   close to running out of memory.  See the 'run1' directory for
   how this was set up.  [If chr19 fails in the run1/all directory
   then you'll need to execute the stuff in run1/19, and then
   chainMergeSort the results into run1/all/chain].

   Note this only takes 4 hours to do serially, so you could do
   it as an overnight run rather than a little cluster run.
   Note: this actually took 8 hours for rn3/hg15.
   Note since it's just on the little cluster and there are only
   24 jobs it's ok not to have the input on the node local disks.

II. In order to assign unique id's to each chain genome wide
  do the following on the file server:
      chainMergeSort run1/all/chain/*.chain > all.chain
      chainSplit chain all.chain
  These two steps will take about 20 minutes.
  If all looks good at this stage you can delete run1/chain/*.

III. Load the chains into database as so
     cd chain
     foreach i (*.chain)
	set c = $i:r
	hgLoadChain hg13 ${c}_mouseChain $i
	echo done $c
     end
  You can go on to step IV while this is proceeding.

IV. Convert the chains to nets as so, still while on the file server
      mkdir preNet
      cd chain
      foreach i (*.chain)
	  echo preNetting $i
	  chainPreNet $i ~/oo/chrom.sizes ~/mm/chrom.sizes ../preNet/$i
      end
      cd ..
   This foreach loop will take about 15 min to execute.  

      mkdir n1 
      cd preNet
      foreach i (*.chain)
          set n = $i:r.net
	  echo primary netting $i
	  chainNet $i -minSpace=1 ~/oo/chrom.sizes ~/mm/chrom.sizes ../n1/$n /dev/null
      end
      cd ..
      cat n1/*.net | netSyntenic stdin hNoClass.net
   This next step requires the database, so it needs to be done on hgwdev
      netClass hNoClass.net hg13 mm2 human.net -tNewR=$HOME/oo/bed/linSpecRep -qNewR=$HOME/mm/bed/linSpecRep

   If things look good do 
        rm -r n1 hNoClass.net
   Make a 'syntenic' subset of these with
        netFilter -syn human.net > humanSyn.net

V. Load the nets into database as so
     netFilter -minGap=10 human.net |  hgLoadNet hg13 mouseNet stdin
     netFilter -minGap=10 humanSyn.net | hgLoadNet hg13 mouseSynNet stdin

VI. Make syntenic and regular subset of axt files, chain files and 
  make gap files.
     netSplit human.net humanNet
     netSplit humanSyn.net humanSynNet
     mkdir ../axtSyn ../axt
     mkdir humanSynGap humanSynChain humanNetGap humanNetChain
     cd humanSynNet
     foreach i (*.net)
	 set c = $i:r
         netToAxt $i ../chain/$c.chain ~/oo/mixedNib ~/mm/mixedNib ../../axtSyn/$c.axt 
	 echo done ../../axtSyn/$c.axt
     end

     foreach i (*.net)
	 set c = $i:r
         netChainSubset $i ../chain/$c.chain ../humanSynChain/$c.chain -gapOut=../humanSynGap/$c.gap
	 echo done $c chain and gap
     end

     cd ..
     cd humanNet
     foreach i (*.net)
         set c = $i:r
	 netToAxt $i ../chain/$c.chain ~/oo/mixedNib ~/mm/mixedNib ../axt/$c.axt
	 echo done ../axt/$c.axt
     end

     foreach i (*net)
         set c = $i:r
	 netChainSubset $i ../chain/$c.chain ../humanNetChain/$c.chain -gapOut=../humanNetGap/$c.gap
	 echo done $c chain and gap
     end


     
VII. Do much of this again for mouse.
      #Sort chains into mouse chromosomes.
      mkdir mouse
      chainSplit -q mouse/chain all.chain

      #PreNet mouse
      cd mouse
      mkdir preNet
      cd chain
      foreach i (*.chain)
	  echo preNetting $i
	  chainPreNet $i ~/oo/chrom.sizes ~/mm/chrom.sizes ../preNet/$i
      end
      cd ..

      #Net mouse
      mkdir n1 
      cd preNet
      foreach i (*.chain)
          set n = $i:r.net
	  echo primary netting $i
	  chainNet $i -minSpace=1 ~/oo/chrom.sizes ~/mm/chrom.sizes /dev/null ../n1/$n 
      end
      cd ..

      #Add synteny info
      cat n1/*.net | netSyntenic stdin mNoClass.net

      #Move to hgwdev and add repeat and other classifications.
      netClass mNoClass.net mm2 hg12 mouse.net -tNewR=$HOME/mm/bed/linSpecRep -qNewR=$HOME/oo/bed/linSpecRep

      #clean up
      rm -r n1 mNoClass.net
      rm all.chain

      #load database with everything but the small gaps
      netFilter -minGap=10 mouse.net |  hgLoadNet mm2 humanNet stdin
      netFilter -minGap=10 -syn mouse.net | hgLoadNet mm2 humanSynNet stdin

