We support different storage options for the internal SBT data.
FSStorage is the initial storage schema. It saves the internal SBT data in a hidden directory next to the SBT JSON description.
Pros: easy to create.
Cons: annoying to distribute (thousands of files). We used to create a tar file of the JSON description plus the hidden directory, which requires extraction and uses more disk space.
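The hidden-directory layout can be illustrated with a minimal sketch. This is not sourmash's implementation: the class name `DirStorage`, the `.sbt.<name>` directory naming, and the `save`/`load` methods are assumptions made for illustration only.

```python
import os
import tempfile


class DirStorage:
    """Hypothetical sketch: one file per node inside a hidden directory."""

    def __init__(self, json_path):
        # Assumed naming: /data/database.sbt.json -> /data/.sbt.database
        folder, name = os.path.split(json_path)
        base = name.split(".")[0]
        self.location = os.path.join(folder, ".sbt." + base)
        os.makedirs(self.location, exist_ok=True)

    def save(self, key, data):
        # Each node becomes its own file, hence "thousands of files".
        with open(os.path.join(self.location, key), "wb") as fh:
            fh.write(data)
        return key

    def load(self, key):
        with open(os.path.join(self.location, key), "rb") as fh:
            return fh.read()


workdir = tempfile.mkdtemp()
storage = DirStorage(os.path.join(workdir, "database.sbt.json"))
storage.save("node0", b"internal node data")
roundtrip = storage.load("node0")
```

Because every node is a separate file, copying the database elsewhere means copying the JSON description and the whole hidden directory together.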
Similar to FSStorage, but saves the internal SBT data in a single zip file.
Pros: easy to distribute (one file).
Cons: you still need to distribute and download everything (the full zip file must be available locally to use the SBT).
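As a rough sketch of the single-archive idea, the snippet below packs node data into one zip file with Python's standard `zipfile` module and reads one member back. The helper names `pack` and `load` and the file name `database.sbt.zip` are hypothetical, not part of sourmash.

```python
import os
import tempfile
import zipfile


def pack(zip_path, items):
    """Write every (key, bytes) pair into one zip archive."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        for key, data in items.items():
            zf.writestr(key, data)


def load(zip_path, key):
    """Read a single member; the whole archive must exist locally."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        return zf.read(key)


zip_path = os.path.join(tempfile.mkdtemp(), "database.sbt.zip")
pack(zip_path, {"node0": b"root", "node1": b"leaf"})
leaf = load(zip_path, "node1")
```

Reading one member does not require extracting the others, but the complete archive still has to be downloaded first.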
Uses IPFS to store internal SBT data, allowing partial database download.
Pros: easy to distribute (one file, the SBT JSON description); only the data needed for an analysis is downloaded; benefits from more people storing the data on their computers and sharing bandwidth.
Cons: needs an IPFS daemon running on the computer; takes longer to run if the data is not prefetched.
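The partial-download behaviour can be pictured as lazy loading with a local cache. The sketch below does not talk to IPFS at all: `LazyStorage` is a hypothetical class, and a plain dictionary stands in for the remote network so the example runs anywhere.

```python
class LazyStorage:
    """Hypothetical sketch: fetch a node only when it is first requested."""

    def __init__(self, fetch):
        self.fetch = fetch   # callable(key) -> bytes, stands in for a network fetch
        self.cache = {}      # data already available locally ("prefetched")
        self.fetched = []    # keys that needed a remote round trip

    def load(self, key):
        if key not in self.cache:
            self.cache[key] = self.fetch(key)
            self.fetched.append(key)
        return self.cache[key]


# A dictionary plays the role of the remote data in this sketch.
remote = {"node%d" % i: b"data %d" % i for i in range(1000)}
storage = LazyStorage(remote.__getitem__)

# An analysis that visits three nodes fetches only those three.
for key in ("node0", "node1", "node4"):
    storage.load(key)
storage.load("node0")  # already cached, so no second fetch
```

The same structure shows why a run is slower without prefetching: every cache miss costs one remote fetch.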
Meant to be a fast in-memory storage. There won't be a public Redis server providing the internal SBT data, but this storage is a good option for loading data from other sources and sharing it with other processes or servers in your private network.
Pros: shareable between processes or servers in a network; probably faster access than reading from disk.
Cons: no public server for the data (you need to convert from other sources).
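A key-value store maps naturally onto this kind of storage. The sketch below is an illustration under assumptions, not sourmash's code: `KeyValueStorage` and `FakeClient` are hypothetical names, and the fake client replaces a real Redis connection so the example runs without a server.

```python
class KeyValueStorage:
    """Hypothetical sketch: keep node data in a key-value client."""

    def __init__(self, client):
        # `client` only needs set(key, value) and get(key).
        self.client = client

    def save(self, key, data):
        self.client.set(key, data)
        return key

    def load(self, key):
        return self.client.get(key)


class FakeClient:
    """In-process stand-in so the sketch runs without a Redis server."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Returns None for a missing key.
        return self._data.get(key)


storage = KeyValueStorage(FakeClient())
storage.save("node0", b"internal node data")
loaded = storage.load("node0")
```

With the redis-py package installed and a server reachable on your network, a `redis.Redis(host=..., port=...)` client could be passed in place of `FakeClient`, since it also offers `set` and `get`.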
Converting an existing tree to use a new storage
You can convert SBTs to another storage using the sourmash storage convert command:
$ sourmash storage convert -b new_storage_type database.sbt.json
For example, to convert a tree to IPFSStorage, run:
$ sourmash storage convert -b ipfs database.sbt.json
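Conceptually, a conversion reads every node from the current storage and writes it to the new one. The sketch below shows only that idea; `DictStorage` and `convert` are hypothetical names and do not reflect how the sourmash command is implemented.

```python
class DictStorage:
    """Hypothetical in-memory storage used only for this sketch."""

    def __init__(self):
        self.data = {}

    def save(self, key, value):
        self.data[key] = value
        return key

    def load(self, key):
        return self.data[key]


def convert(keys, source, target):
    """Copy every node from `source` to `target`; return the new keys."""
    return [target.save(key, source.load(key)) for key in keys]


old, new = DictStorage(), DictStorage()
for name in ("node0", "node1"):
    old.save(name, name.encode())

new_keys = convert(["node0", "node1"], old, new)
```

In a real conversion the tree description would also be updated to point at the new storage, which this sketch leaves out.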