Prepared search databases

RefSeq microbial genomes

These databases are formatted for use with sourmash sbt_search and sourmash sbt_gather.

Approximately 60,000 microbial genomes (including viral and fungal) from NCBI RefSeq.

GenBank microbial genomes

These databases are formatted for use with sourmash sbt_search and sourmash sbt_gather.

Approximately 100,000 microbial genomes (including viral and fungal) from NCBI GenBank.

Details

The individual signatures for the above SBTs were calculated as follows:

sourmash compute -k 4,5 \
    -n 2000 \
    --track-abundance \
    --name-from-first \
    -o {output} \
    {input}

sourmash compute -k 21,31,51 \
    --scaled 2000 \
    --track-abundance \
    --name-from-first \
    -o {output} \
    {input}
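
In the commands above, {input} and {output} stand for a genome file and the signature file written for it. As a rough sketch of how the second command might be expanded for a set of genome files, the helper below prints one command per input without running anything. The function name sig_command and the <input>.sig output naming are assumptions made for illustration; they are not part of sourmash or of the pipeline used to build these databases.

```shell
#!/bin/sh
# Hypothetical helper (not part of sourmash): print the scaled-signature
# command from the text for one genome file.
# Assumption: the signature is written next to the input as <input>.sig.
# Note: paths are printed unquoted, so this sketch assumes no spaces in names.
sig_command() {
    input=$1
    output="${input}.sig"
    printf 'sourmash compute -k 21,31,51 --scaled 2000 --track-abundance --name-from-first -o %s %s\n' \
        "$output" "$input"
}

# Dry run: print one command per genome file passed as an argument.
for genome in "$@"; do
    sig_command "$genome"
done
```

Saved under a name of your choosing and called with a list of genome files, the script prints the commands so they can be reviewed before being piped to sh or handed to a job scheduler.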