RepeatModeler Version 2.0.4
===========================
Using output directory = /dev/shm/rModeler.csdBLZ/RM_7866.TueAug130212002024
Search Engine = rmblast 2.13.0+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.4
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1723540319
Database = /dev/shm/rModeler.csdBLZ/GCA_004325045.1_FIOER33_v1
  - Sequences = 2915
  - Bases = 18344352
  - N50 = 9589
  - Contig Histogram:
  Size(bp)                                                          Count
  -----------------------------------------------------------------------
  56282-60231 |                                                   [ 2 ]
  52333-56281 |                                                   [ 3 ]
  48384-52332 |                                                   [ 2 ]
  44436-48384 |                                                   [ ]
  40487-44435 |                                                   [ 6 ]
  36538-40486 |                                                   [ 6 ]
  32589-36537 |                                                   [ 10 ]
  28641-32589 |                                                   [ 9 ]
  24692-28640 |                                                   [ 20 ]
  20743-24691 |*                                                  [ 35 ]
  16794-20742 |**                                                 [ 90 ]
  12846-16794 |****                                               [ 149 ]
  8897-12845  |********                                           [ 287 ]
  4948-8896   |*******************                                [ 639 ]
  1000-4948   |************************************************** [ 1657 ]

WARN: The N50 for this assembly is low ( <10,000 ). The de novo methods
      employed by RepeatModeler are intended for use with long contiguous
      sequences and may not perform well with an over-abundance of short
      contigs in the database.

Storage Throughput = excellent ( 1054.92 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 18344261 bp ( 18344261 non ambiguous )
   - Num Contigs Represented = 2915
   - Sequence extraction : 00:00:02 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: 00:10:29 (hh:mm:ss) Elapsed Time
Round Time: 00:14:27 (hh:mm:ss) Elapsed Time : 110 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:02 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:43 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       7094 repeats masked totaling 1900052 bp(s).
   - TE Masking time 00:00:10 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10002159 bp
       Num Contigs Represented = 1598
       Non ambiguous bp:
             Initial: 10002159 bp
             After Masking: 7929053 bp
             Masked: 20.73 %
 -- Input Database Coverage: 10002159 bp out of 18344352 bp ( 54.52 % )
Sampling Time: 00:00:57 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 1282401
Comparison Time: 00:18:27 (hh:mm:ss) Elapsed Time, 18422 HSPs Collected
Number of families returned by RECON: 1025
Round Time: 00:20:10 (hh:mm:ss) Elapsed Time : 18 families discovered.
  - Increasing sample size to include end piece now = 38344352

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 38344352 bp
   - Sequence extraction : 00:00:01 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:35 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       6723 repeats masked totaling 2025043 bp(s).
   - TE Masking time 00:00:12 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 8342090 bp
       Num Contigs Represented = 1322
       Non ambiguous bp:
             Initial: 8342090 bp
             After Masking: 6185851 bp
             Masked: 25.85 %
 -- Input Database Coverage: 18344249 bp out of 18344352 bp ( 100.00 % )
Sampling Time: 00:00:49 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 878475
Comparison Time: 00:13:21 (hh:mm:ss) Elapsed Time, 2588 HSPs Collected
Number of families returned by RECON: 678
Round Time: 00:14:24 (hh:mm:ss) Elapsed Time : 3 families discovered.

RepeatScout/RECON discovery complete: 131 families found
Classification Time: 00:05:37 (hh:mm:ss) Elapsed Time
Program Time: 00:54:38 (hh:mm:ss) Elapsed Time
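
The per-round figures in the log above can be cross-checked with a short script. What follows is a minimal sketch, not part of RepeatModeler: the names `summarize_rounds`, `masked_percent`, and `EXCERPT` are hypothetical, and the regular expressions assume the "Round Time" and "discovery complete" lines keep the exact wording shown in this log (other RepeatModeler versions may word them differently).

```python
import re

# Patterns assume the wording seen in this particular log (an assumption,
# not a documented RepeatModeler output format).
ROUND_RE = re.compile(
    r"Round Time:\s*(\d+):(\d{2}):(\d{2}) \(hh:mm:ss\) Elapsed Time\s*:\s*"
    r"(\d+) families discovered"
)
TOTAL_RE = re.compile(r"discovery complete:\s*(\d+) families found")


def summarize_rounds(log_text):
    """Return (rounds, reported_total) parsed from RepeatModeler log text.

    rounds is a list of dicts with the round's elapsed seconds and the
    number of families it discovered; reported_total is the final count
    printed by the tool, or None if that line is absent.
    """
    rounds = []
    for hours, minutes, seconds, families in ROUND_RE.findall(log_text):
        elapsed = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
        rounds.append({"seconds": elapsed, "families": int(families)})
    match = TOTAL_RE.search(log_text)
    reported_total = int(match.group(1)) if match else None
    return rounds, reported_total


def masked_percent(initial_bp, after_masking_bp):
    """Percentage of non-ambiguous bases removed by masking, to 2 decimals."""
    return round(100.0 * (initial_bp - after_masking_bp) / initial_bp, 2)


# Lines copied from the log above, used as sample input.
EXCERPT = """
Round Time: 00:14:27 (hh:mm:ss) Elapsed Time : 110 families discovered.
Round Time: 00:20:10 (hh:mm:ss) Elapsed Time : 18 families discovered.
Round Time: 00:14:24 (hh:mm:ss) Elapsed Time : 3 families discovered.
RepeatScout/RECON discovery complete: 131 families found
"""

if __name__ == "__main__":
    rounds, reported = summarize_rounds(EXCERPT)
    print(sum(r["families"] for r in rounds), reported)
    print(masked_percent(10002159, 7929053))  # round 2 sample
    print(masked_percent(8342090, 6185851))   # round 3 sample
```

For this run the three rounds (110 + 18 + 3) sum to the reported 131 families, and the recomputed masking percentages match the 20.73 % and 25.85 % printed for rounds 2 and 3.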