**Distributed construction of an FM-index from multiple input files**

If your data set consists of multiple files, you can construct the FM-index for each file separately, then merge the indices to obtain an index of the entire data set. This requires much less memory than constructing an index from a single file containing all of the data.

For example, suppose your data consists of four files:

    s_1_1.fastq
    s_1_2.fastq
    s_2_1.fastq
    s_2_2.fastq

We begin by constructing an index of each file individually:

    sga index s_1_1.fastq
    sga index s_1_2.fastq
    sga index s_2_1.fastq
    sga index s_2_2.fastq

Then we merge the indices in pairs until we obtain a single index:

    sga merge -p merged1 s_1_1.fastq s_1_2.fastq
    sga merge -p merged2 s_2_1.fastq s_2_2.fastq
    sga merge -p final merged1.fa merged2.fa

The final index can then be used in other steps of the pipeline, for instance to error-correct the original sequence files:

    sga correct -p final s_1_1.fastq
    sga correct -p final s_1_2.fastq
    sga correct -p final s_2_1.fastq
    sga correct -p final s_2_2.fastq
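
With more than a handful of input files, writing the merge commands by hand gets tedious. The index-then-merge procedure above can be sketched as a small dry-run shell function that only prints the commands and executes nothing. The function name `plan_merges` and the intermediate prefixes `merged_r<round>_<pair>` are hypothetical names chosen for illustration. The sketch also assumes, following the example above, that `sga merge -p PREFIX` produces a sequence file named `PREFIX.fa` that can be passed to the next merge.

```shell
#!/bin/sh
# Dry-run sketch: print the sga commands for a pairwise merge schedule.
# Nothing is executed; review the output before running it.
# plan_merges and the merged_r<round>_<pair> prefixes are hypothetical names.

plan_merges() {
    final_prefix=$1
    shift

    # Step 1: index every input file individually.
    for f in "$@"; do
        echo "sga index $f"
    done

    # Step 2: merge in pairs until a single index remains.
    # The positional parameters are used as a queue: each merged pair is
    # removed from the front and its result is appended to the back.
    round=0
    while [ "$#" -gt 1 ]; do
        round=$((round + 1))
        total=$#
        remaining=$total
        pair=0
        while [ "$remaining" -gt 1 ]; do
            pair=$((pair + 1))
            if [ "$total" -eq 2 ]; then
                prefix=$final_prefix          # last merge gets the final name
            else
                prefix="merged_r${round}_${pair}"
            fi
            echo "sga merge -p $prefix $1 $2"
            shift 2
            # Assumption (taken from the example above): output is <prefix>.fa
            set -- "$@" "$prefix.fa"
            remaining=$((remaining - 2))
        done
        # An odd file out is carried over unmerged to the next round.
        if [ "$remaining" -eq 1 ]; then
            carry=$1
            shift
            set -- "$@" "$carry"
        fi
    done
}

plan_merges final s_1_1.fastq s_1_2.fastq s_2_1.fastq s_2_2.fastq
```

For the four example files this prints four `sga index` commands followed by three `sga merge` commands, the last of which writes the `final` prefix. Because every merge within a round is independent of the others, the printed commands for one round could be run on separate machines, which is the point of the distributed approach.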