Data Storage
StreamingFast Firehose data storage
Data Storage in Firehose
Data Storage in Detail
Data and the locations where it is stored are important facets of Firehose deployment and operation.
Key Firehose data storage topics include data stores, serialization, merged blocks files (also called 100-blocks files), and one-block files.
Data Stores
Firehose Stores are abstractions sitting on top of Object Storage.
Note: Object Storage is a data storage technique that manages data as objects, in contrast to other data storage architectures such as hierarchical file systems.
Abstraction Library
Stores utilize the Firehose dstore abstraction library to provide support for local file systems, Azure, Google Cloud, Amazon S3, and other Amazon S3 API compatible object storage solutions such as MinIO or Ceph.
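The idea of one store interface over several backends can be sketched as follows. This is a minimal illustration in Python, not the dstore API: the names `Store`, `LocalStore`, `MemoryStore`, and `copy_object` are hypothetical, and `MemoryStore` only stands in for a remote object storage bucket.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class Store(Protocol):
    """Hypothetical minimal store interface (an assumption, not the dstore API)."""

    def write_object(self, name: str, data: bytes) -> None: ...

    def read_object(self, name: str) -> bytes: ...

    def exists(self, name: str) -> bool: ...


class LocalStore:
    """Backs the interface with a directory on the local file system."""

    def __init__(self, base_dir: str) -> None:
        self.base = Path(base_dir)
        self.base.mkdir(parents=True, exist_ok=True)

    def write_object(self, name: str, data: bytes) -> None:
        (self.base / name).write_bytes(data)

    def read_object(self, name: str) -> bytes:
        return (self.base / name).read_bytes()

    def exists(self, name: str) -> bool:
        return (self.base / name).is_file()


class MemoryStore:
    """Keeps objects in a dict; a stand-in for a remote bucket in this sketch."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def write_object(self, name: str, data: bytes) -> None:
        self.objects[name] = data

    def read_object(self, name: str) -> bytes:
        return self.objects[name]

    def exists(self, name: str) -> bool:
        return name in self.objects


def copy_object(src: Store, dst: Store, name: str) -> None:
    """Copies one object using only the interface, so any backend pair works."""
    dst.write_object(name, src.read_object(name))
```

Code written against the interface, such as `copy_object`, does not need to know which backend it is talking to, which is the property the abstraction layer is meant to provide.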
Production Environments
For production deployments outside of cloud providers, StreamingFast recommends Ceph as the distributed storage instead of its compatible Amazon S3 API system.
Serialization
Firehose primarily utilizes Protocol Buffers version 3 for serialization.
Merged Blocks Files
Merged Blocks in Detail
Merged blocks files are also referred to as 100-blocks files and merged bundles. These terms are all used interchangeably within Firehose.
Merged blocks are binary files that use the dbin packing format to store a series of bstream block objects, serialized as protocol buffers.
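Storing a series of serialized messages in one binary file requires some way to tell where each message ends. The sketch below uses a 4-byte big-endian length prefix per message. It is a simplified illustration of the general technique only: the exact dbin layout, including any file header, is not described here and is not reproduced, and `pack_messages` and `unpack_messages` are hypothetical names.

```python
import io
import struct
from typing import Iterator, List


def pack_messages(messages: List[bytes]) -> bytes:
    """Concatenates messages, each preceded by its length as a 4-byte big-endian integer."""
    out = io.BytesIO()
    for msg in messages:
        out.write(struct.pack(">I", len(msg)))
        out.write(msg)
    return out.getvalue()


def unpack_messages(data: bytes) -> Iterator[bytes]:
    """Yields the messages back in order, raising ValueError on truncated input."""
    stream = io.BytesIO(data)
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of input
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("truncated message")
        yield payload
```

In a real merged blocks file, each payload would be a serialized block message rather than arbitrary bytes.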
Merged Block Creation
To create merged blocks, Firehose uses Firehose-enabled node components that have been started with a special flag that makes them work in catch-up mode.
Highly-available Merged Blocks
In high-availability Firehose configurations, merged blocks will be created by the Merger component. The Firehose-enabled node component will provide the Merger component with one-block files.
Block Bundles
The Merger component will also collate all of the one-block files into a single bundle of blocks.
One Hundred Blocks Files
Up to one hundred blocks can be contained within a single 100-blocks file.
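One way to picture the bundling is to group block numbers by the lowest block number of their bundle. The sketch below assumes that bundles are aligned on multiples of 100, which is an assumption made for illustration and is not stated above; `bundle_base` and `group_into_bundles` are hypothetical helpers.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

BUNDLE_SIZE = 100  # a bundle covers at most this many block numbers


def bundle_base(block_num: int) -> int:
    """Returns the first block number of the bundle that would hold block_num."""
    return block_num - (block_num % BUNDLE_SIZE)


def group_into_bundles(block_nums: Iterable[int]) -> Dict[int, List[int]]:
    """Groups block numbers by bundle base, keeping each group sorted."""
    bundles: Dict[int, List[int]] = defaultdict(list)
    for num in sorted(block_nums):
        bundles[bundle_base(num)].append(num)
    return dict(bundles)
```

Under this assumption a bundle can hold fewer than one hundred blocks when some block numbers in its range are absent, which matches the "up to one hundred" wording.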
A 100-blocks file can include multiple versions of a given block number, such as fork blocks. Continuity is ensured through each block's link to its previous block.
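The role of the previous block link can be sketched as follows: given a set of blocks that includes forks, following the links backward from a chosen head selects one continuous chain. The `Block` fields and the `canonical_chain` function are hypothetical and do not reflect the actual bstream block schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Block:
    """Hypothetical block record holding only the fields this sketch needs."""
    id: str
    number: int
    previous_id: str


def canonical_chain(blocks: List[Block], head_id: str) -> List[Block]:
    """Follows previous-block links from head_id and returns the chain, oldest first."""
    by_id: Dict[str, Block] = {b.id: b for b in blocks}
    chain: List[Block] = []
    seen = set()
    current = by_id.get(head_id)
    while current is not None and current.id not in seen:
        seen.add(current.id)  # guards against malformed, cyclic links
        chain.append(current)
        current = by_id.get(current.previous_id)
    chain.reverse()
    return chain
```

Blocks on a fork that the head does not descend from are simply never visited, so two blocks with the same number can coexist in the input without ambiguity.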
Blocks Files Consumption & Use
Nearly all components in Firehose rely on or utilize 100-blocks files. For example, the bstream library consumes 100-blocks files.
Protocol-specific decoded block objects, such as Ethereum blocks, are what circulate among all processes that work with executed block data in Firehose.
One Block Files
One Block Files in Detail
In high availability configurations, one-block files are transient and ensure the Merger component gathers all visible forks from any Firehose-enabled Node components.
Important: One-block files contain only one bstream.Block as a serialized protocol buffer.
One-block File Consumption & Use
One-block files are consumed by the Merger component, which bundles them into 100-blocks files. The 100-blocks files are then stored to dstore storage and consumed by most of the other Firehose processes.