Fig. 1 | Microbiome

Fig. 1

From: Streaming histogram sketching for rapid microbiome analytics

Overview of our method to histosketch microbiome samples from sequence data streams. a During counting, sequence reads are collected from the data stream by n counting processes. Reads are decomposed into canonical k-mers, encoded as uint64 values and used to increment local count-min sketches. Once X reads have been received from the data stream, approximate k-mer counts from the counting processes are transmitted as histogram elements to the single sketching process. b To update the histosketch, the incoming histogram element is hashed and compared against each hash value (W) of the previous histosketch (S), updating S and W if a new minimum is encountered. To hash the incoming histogram element, uniform scaling is applied and a cumulative frequency estimate is made using a count-min sketch; we then utilise consistent weighted sampling (CWS) to generate a hash value for the updated histogram bin.
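
The two stages in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: all class and function names are hypothetical, a single process stands in for the n counting processes, uniform scaling is omitted, and the CWS step uses Ioffe's improved consistent weighted sampling as a stand-in that may differ from the variant used in the paper.

```python
import hashlib
import math
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def canonical_kmers(read, k):
    """Yield the lexicographically smaller of each k-mer and its reverse complement."""
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        yield min(kmer, kmer.translate(COMPLEMENT)[::-1])

def encode_kmer(kmer):
    """Pack a k-mer (k <= 32) into a 64-bit integer, 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | ENCODE[base]
    return value

class CountMinSketch:
    """Approximate counter; estimates never fall below the true count."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _columns(self, key):
        data = key.to_bytes(8, "little")
        for row in range(self.depth):
            salt = row.to_bytes(2, "little")  # one independent hash per row
            digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row, col in self._columns(key):
            self.table[row][col] += count

    def estimate(self, key):
        return min(self.table[row][col] for row, col in self._columns(key))

class HistoSketch:
    """Keeps, per slot, the bin S[j] with the smallest CWS hash value W[j]."""
    def __init__(self, size=8):
        self.size = size
        self.S = [None] * size
        self.W = [math.inf] * size

    def update(self, bin_id, weight):
        for j in range(self.size):
            # Assumption: seeding by (bin, slot) gives repeatable random draws.
            rng = random.Random(f"{bin_id}:{j}")
            r = rng.gammavariate(2.0, 1.0)
            c = rng.gammavariate(2.0, 1.0)
            beta = rng.random()
            t = math.floor(math.log(weight) / r + beta)
            y = math.exp(r * (t - beta))
            a = c / (y * math.exp(r))
            if a < self.W[j]:  # new minimum: replace the slot
                self.W[j], self.S[j] = a, bin_id

# Stage a: count canonical k-mers from a toy read stream.
reads = ["ACGTACGTAC", "TTGCAACGTA"]
local_counts = CountMinSketch()
seen = set()
for read in reads:
    for kmer in canonical_kmers(read, k=5):
        key = encode_kmer(kmer)
        local_counts.add(key)
        seen.add(key)

# Stage b: send each histogram element to the sketching step.
cumulative = CountMinSketch()
sketch = HistoSketch(size=8)
for key in sorted(seen):
    cumulative.add(key, local_counts.estimate(key))
    sketch.update(key, cumulative.estimate(key))
```

After the loop, `sketch.S` holds one sampled k-mer bin per slot and `sketch.W` the matching minimum hash values; two samples could then be compared by the fraction of slots on which their `S` lists agree.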
