RepeatModeler Version 2.0.4
===========================
Using output directory = /data/tmp/rModeler.5LpFbB/RM_728245.FriAug151122492025
Search Engine = rmblast 2.13.0+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.4
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1755282166
Database = /data/tmp/rModeler.5LpFbB/GCA_034696485.1_ASM3469648v1
  - Sequences = 409473
  - Bases = 222290082
  - N50 = 577
  - Contig Histogram:
  Size(bp)                                                          Count
  -----------------------------------------------------------------------
  113662-121767 |                                                   [ 3 ]
  105558-113662 |                                                   [ ]
  97453-105557  |                                                   [ 1 ]
  89349-97453   |                                                   [ 2 ]
  81244-89348   |                                                   [ 1 ]
  73140-81244   |                                                   [ ]
  65035-73139   |                                                   [ 5 ]
  56931-65035   |                                                   [ 12 ]
  48826-56930   |                                                   [ 17 ]
  40722-48826   |                                                   [ 26 ]
  32617-40721   |                                                   [ 66 ]
  24513-32617   |                                                   [ 162 ]
  16408-24512   |                                                   [ 385 ]
  8304-16408    |                                                   [ 1541 ]
  200-8304      |************************************************** [ 407252 ]

WARN: The N50 for this assembly is low ( <10,000 ). The de novo methods
      employed by RepeatModeler are intended for use with long contiguous
      sequences and may not perform well with an over-abundance of short
      contigs in the database.

Storage Throughput = fair ( 563.58 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40003780 bp ( 40003780 non ambiguous )
   - Num Contigs Represented = 73822
   - Sequence extraction : 00:00:09 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: 00:14:27 (hh:mm:ss) Elapsed Time
Round Time: 00:16:47 (hh:mm:ss) Elapsed Time : 118 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:03 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:17 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       875 repeats masked totaling 110487 bp(s).
   - TE Masking time 00:00:06 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10004019 bp
       Num Contigs Represented = 18423
       Non ambiguous bp:
             Initial: 10004019 bp
             After Masking: 9846356 bp
             Masked: 1.58 %
 -- Input Database Coverage: 10004019 bp out of 222290082 bp ( 4.50 % )
Sampling Time: 00:00:28 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 169694253
Comparison Time: 01:06:43 (hh:mm:ss) Elapsed Time, 2532 HSPs Collected
Number of families returned by RECON: 959
Round Time: 01:07:49 (hh:mm:ss) Elapsed Time : 1 families discovered.

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 30000000 bp
   - Sequence extraction : 00:00:08 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:51 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       2600 repeats masked totaling 318441 bp(s).
   - TE Masking time 00:00:13 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 30000077 bp
       Num Contigs Represented = 55400
       Non ambiguous bp:
             Initial: 30000077 bp
             After Masking: 29546783 bp
             Masked: 1.51 %
 -- Input Database Coverage: 40004096 bp out of 222290082 bp ( 18.00 % )
Sampling Time: 00:01:17 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 1534663101
Comparison Time: 03:28:46 (hh:mm:ss) Elapsed Time, 20255 HSPs Collected
Number of families returned by RECON: 4783
Round Time: 03:33:18 (hh:mm:ss) Elapsed Time : 24 families discovered.

RepeatModeler Round # 4
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 90000000 bp
   - Sequence extraction : 00:00:24 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:02:32 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       9299 repeats masked totaling 1180319 bp(s).
   - TE Masking time 00:01:10 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 90000184 bp
       Num Contigs Represented = 163915
       Non ambiguous bp:
             Initial: 90000184 bp
             After Masking: 88413514 bp
             Masked: 1.76 %
 -- Input Database Coverage: 130004280 bp out of 222290082 bp ( 58.48 % )
Sampling Time: 00:04:20 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 13436604415
Comparison Time: 13:48:07 (hh:mm:ss) Elapsed Time, 173517 HSPs Collected
Number of families returned by RECON: 23139
Round Time: 14:23:29 (hh:mm:ss) Elapsed Time : 228 families discovered.
   - Increasing sample size to include end piece now = 362285986

RepeatModeler Round # 5
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 270000000 bp
   - Sequence extraction : 00:00:18 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:02:12 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       17921 repeats masked totaling 3234804 bp(s).
   - TE Masking time 00:01:37 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 92285649 bp
       Num Contigs Represented = 171777
       Non ambiguous bp:
             Initial: 92285649 bp
             After Masking: 88650164 bp
             Masked: 3.94 %
 -- Input Database Coverage: 222289929 bp out of 222290082 bp ( 100.00 % )
Sampling Time: 00:04:20 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 14756331528
Comparison Time: 14:41:51 (hh:mm:ss) Elapsed Time, 89966 HSPs Collected
Number of families returned by RECON: 22784
Round Time: 15:08:37 (hh:mm:ss) Elapsed Time : 95 families discovered.

RepeatScout/RECON discovery complete: 466 families found
Classification Time: 00:13:45 (hh:mm:ss) Elapsed Time
Program Time: 34:43:46 (hh:mm:ss) Elapsed Time