Data Storage

StreamingFast Firehose data storage

Data Storage in Firehose

Data Storage in Detail

Data and the locations where it is stored are important facets of Firehose deployment and operation.

Key Firehose data storage topics include data stores, serialization, merged blocks files (also called 100-blocks files), and one-block files.

Data Stores

Firehose Stores are abstractions sitting on top of Object Storage.

Note: Object Storage is a data storage technique that manages data as objects, as opposed to other data storage architectures such as hierarchical file systems.

Abstraction Library

Stores utilize the Firehose dstore abstraction library to provide support for local file systems, Azure, Google Cloud, Amazon S3, and other Amazon S3 API compatible object storage solutions such as MinIO or Ceph.
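The value of a store abstraction is that calling code does not change when the backend changes. The sketch below illustrates that idea only. The `Store`, `LocalStore`, and `MemoryStore` names and their methods are hypothetical and are not the dstore API; the in-memory class merely stands in for a remote object store.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class Store(Protocol):
    """Hypothetical minimal store interface (an assumption, not the dstore API)."""

    def write_object(self, name: str, data: bytes) -> None: ...

    def read_object(self, name: str) -> bytes: ...

    def exists(self, name: str) -> bool: ...


class LocalStore:
    """Keeps each object as a file under a base directory."""

    def __init__(self, base_dir: str) -> None:
        self.base = Path(base_dir)
        self.base.mkdir(parents=True, exist_ok=True)

    def write_object(self, name: str, data: bytes) -> None:
        (self.base / name).write_bytes(data)

    def read_object(self, name: str) -> bytes:
        return (self.base / name).read_bytes()

    def exists(self, name: str) -> bool:
        return (self.base / name).is_file()


class MemoryStore:
    """Keeps objects in a dict; stands in for a remote object store."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def write_object(self, name: str, data: bytes) -> None:
        self.objects[name] = data

    def read_object(self, name: str) -> bytes:
        return self.objects[name]

    def exists(self, name: str) -> bool:
        return name in self.objects


def archive(store: Store, name: str, payload: bytes) -> bytes:
    """Write then read back; depends only on the interface, not the backend."""
    store.write_object(name, payload)
    return store.read_object(name)
```

Passing either a `LocalStore` or a `MemoryStore` to `archive` works without modifying `archive`, which is the property an abstraction layer over object storage is meant to provide.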

Production Environments

For production deployments outside of cloud providers, StreamingFast recommends Ceph as the distributed storage instead of its compatible Amazon S3 API system.

Serialization

Firehose primarily utilizes Protocol Buffers version 3 for serialization.

Merged Blocks Files

Merged Blocks in Detail

Merged blocks files are also referred to as 100-blocks files and merged bundles. These terms are used interchangeably within Firehose.

Merged blocks are binary files that use the dbin packing format to store a series of bstream block objects, serialized as protocol buffers.
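To give a feel for how a series of serialized messages can be packed into one binary file, the sketch below uses simple length-prefixed framing. It is not the actual dbin layout: the 4-byte big-endian length prefix is an assumption chosen for illustration, any file header is omitted, and the payloads are opaque bytes rather than real serialized blocks.

```python
import io
import struct
from typing import BinaryIO, Iterator, List


def pack_messages(messages: List[bytes]) -> bytes:
    """Concatenate messages, each preceded by its length as a 4-byte big-endian integer."""
    out = io.BytesIO()
    for msg in messages:
        out.write(struct.pack(">I", len(msg)))
        out.write(msg)
    return out.getvalue()


def unpack_messages(stream: BinaryIO) -> Iterator[bytes]:
    """Yield messages one at a time until the stream is exhausted."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) != 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) != size:
            raise ValueError("truncated message payload")
        yield payload
```

Framing of this kind lets a reader stream messages sequentially without loading or decoding the whole file first.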

Merged Block Creation

Firehose-enabled node components create merged blocks directly when a special flag sets them to work in catch-up mode.

Highly-available Merged Blocks

In high-availability Firehose configurations, merged blocks will be created by the Merger component. The Firehose-enabled node component will provide the Merger component with one-block files.

Block Bundles

The Merger component will also collate all of the one-block files into a single bundle of blocks.
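The collation step can be pictured as grouping single blocks by the bundle they belong to. The following is a minimal sketch, not the Merger's implementation. It assumes bundles are aligned on multiples of 100 block numbers; the `OneBlock` type, `bundle_base`, and `collate` are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple

BUNDLE_SIZE = 100  # assumption: bundles cover aligned ranges of 100 block numbers


class OneBlock(NamedTuple):
    """Hypothetical stand-in for the content of a one-block file."""

    number: int
    block_id: str
    payload: bytes


def bundle_base(block_number: int, bundle_size: int = BUNDLE_SIZE) -> int:
    """Round a block number down to the first block number of its bundle."""
    return block_number - (block_number % bundle_size)


def collate(one_blocks: List[OneBlock]) -> Dict[int, List[OneBlock]]:
    """Group blocks by bundle base, ordered by number and then ID within a bundle."""
    bundles: Dict[int, List[OneBlock]] = defaultdict(list)
    for blk in one_blocks:
        bundles[bundle_base(blk.number)].append(blk)
    for base in bundles:
        bundles[base].sort(key=lambda b: (b.number, b.block_id))
    return dict(bundles)
```

Because grouping is by block number rather than by block ID, two forked versions of the same block number land in the same bundle.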

One Hundred Blocks Files

Up to one hundred blocks can be contained within a single 100-blocks file.

A 100-blocks file can include multiple versions of a given block number, such as fork blocks, ensuring continuity through the previous block link.
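The role of the previous block link can be sketched as follows: given a set of blocks that includes forks, following each block's link back from a chosen head recovers one continuous branch. The `BlockRef` type and `canonical_segment` function are hypothetical and only illustrate the idea; they do not reflect how the bstream library resolves forks.

```python
from typing import List, NamedTuple


class BlockRef(NamedTuple):
    """Hypothetical block reference carrying a link to its parent block."""

    number: int
    block_id: str
    previous_id: str


def canonical_segment(blocks: List[BlockRef], head_id: str) -> List[BlockRef]:
    """Walk previous-block links back from head_id; return the branch oldest first."""
    by_id = {b.block_id: b for b in blocks}
    chain: List[BlockRef] = []
    current = by_id.get(head_id)
    while current is not None:
        chain.append(current)
        # Stops when the parent is not in this set of blocks.
        current = by_id.get(current.previous_id)
    chain.reverse()
    return chain
```

Blocks on a competing fork are simply never visited, since no link on the chosen branch points to them.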

Blocks Files Consumption & Use

Nearly all components in Firehose rely on or utilize 100-blocks files. The bstream library, for example, consumes 100-blocks files.

Protocol-specific decoded block objects, such as Ethereum blocks, are what circulate amongst all processes that work with executed block data in Firehose.

One Block Files

One Block Files in Detail

In high-availability configurations, one-block files are transient and ensure the Merger component gathers all visible forks from any Firehose-enabled node components.

Important: One-block files contain only one bstream.Block as a serialized protocol buffer.

One-block File Consumption & Use

One-block files are consumed by the Merger component and bundled into 100-blocks files. The 100-blocks files are then stored to dstore storage and consumed by most of the other Firehose processes.
