StreamingFast Firehose components

Firehose Component Family

Components by Name

The Firehose system comprises several key components: the Firehose-enabled blockchain node, the Reader, the Merger, the Relayer, and the Firehose gRPC Server.

Component Relationships

The Firehose components work in concert to deliver blockchain data from configured and instrumented nodes to consumers through the gRPC Server.
Tip: Understanding the Firehose components individually is helpful for fully comprehending the overall system and will aid with setup and operation.

Firehose-enabled Blockchain Node

Firehose-enabled Node in Detail

The Firehose-enabled Blockchain Node is a third-party blockchain node client, such as Ethereum, instrumented under StreamingFast practices to output data that will be read by the Firehose Reader component.
Note: The Reader component will consume the data produced by the Firehose-enabled Blockchain Node.
The Firehose-enabled Blockchain Node runs in tandem with the Reader component. The two components are connected either through a UNIX pipe over stdout, or by having the Reader component's process fork and execute the blockchain client itself. This is accomplished using the node-manager software included in Firehose.
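The pipe-based wiring described above can be sketched as follows. This is a minimal illustration, not the node-manager implementation: the `FIRE ` line prefix, the parsing, and the command line are assumptions made for the example.

```python
# Illustrative sketch only: how a Reader-style process might fork an
# instrumented node and filter its stdout for instrumentation lines.
# The "FIRE " prefix is an assumption for this example.
import subprocess
from typing import Iterable, Iterator

def instrumentation_lines(stdout: Iterable[str], prefix: str = "FIRE ") -> Iterator[str]:
    """Yield only instrumented data lines, skipping regular node logs."""
    for line in stdout:
        if line.startswith(prefix):
            yield line[len(prefix):].rstrip("\n")

def run_node(cmd: list[str]) -> Iterator[str]:
    """Fork the blockchain client and stream its instrumented stdout."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    assert proc.stdout is not None
    yield from instrumentation_lines(proc.stdout)
```

The same filtering function works whether the node is a sub-process of the Reader or simply piped into it over stdin.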
Blockchain nodes used in this capacity require:
  • very few features,
  • no archive mode capability,
  • no JSON-RPC service,
  • and no querying of indexed data.
The Firehose-enabled Blockchain Node is responsible for executing all transactions in an order respecting the consensus protocol of the blockchain.


Reader Component in Detail

The Reader component is responsible for extracting data from instrumented blockchain nodes.
The Reader component utilizes the StreamingFast node-manager library to run a blockchain node instance as a sub-process. Alternatively, the Reader component can consume the stdout of the process where reader-stdin is implemented.
Once the process has been started, the Reader component:
  • reads the data being generated by the node,
  • forwards the data downstream to other connected components including the Relayer, Firehose gRPC Server, etc.
  • flushes the data to Object Storage for durability, and for the Merger to pick up the data.
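The fan-out step above can be sketched as follows, assuming a simple JSON one-block file named `<number>-<id>.json` and a local directory standing in for Object Storage; the real file-naming scheme and storage layer differ.

```python
# Sketch of the Reader's fan-out: forward each block downstream and flush
# a "one-block" file for the Merger to pick up. The file-name scheme and
# the local-directory stand-in for object storage are assumptions.
import json
from pathlib import Path
from typing import Callable

def handle_block(block: dict, store: Path,
                 subscribers: list[Callable[[dict], None]]) -> Path:
    # Forward live to downstream components (Relayer, gRPC Server, ...).
    for push in subscribers:
        push(block)
    # Flush a one-block file so the Merger can later bundle it.
    name = f"{block['number']:010d}-{block['id']}.json"
    path = store / name
    path.write_text(json.dumps(block))
    return path
```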

Firehose Depends on the Reader

The data consumed by Firehose is provided by the Reader component.
Tip: The Reader component is the initial and deterministic data producer for Firehose and all of its components.

Reader Nodes

The blockchain nodes underlying and managed by Reader components can be kept minimal. They don't need archiving capabilities or any additional features.
Note: after a node has been instrumented for Firehose, it will begin emitting substantial amounts of data.

High Availability

Placing multiple Reader components side by side, and fronted by one or more Relayers, allows for highly available setups. This is a core part of the design of Firehose.
A Relayer connected to multiple Readers will deduplicate the incoming stream and push the first block downstream.
Tip: Two Reader components will even race to push the data first. The system is designed to leverage this racing Reader feature to the benefit of the end-user by producing the lowest latency possible.
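The deduplication described above can be sketched as follows, assuming blocks carry an `id` field that uniquely identifies them; the names are illustrative, not the Relayer's API.

```python
# Sketch of Relayer-style deduplication: with several Readers racing,
# forward only the first copy of each block (keyed by block id). The
# fastest Reader wins, and forked blocks (new ids) still pass through.
from typing import Callable, Iterable

def relay(incoming: Iterable[dict], push: Callable[[dict], None]) -> None:
    seen: set[str] = set()
    for block in incoming:
        if block["id"] in seen:
            continue          # duplicate from a slower Reader: drop it
        seen.add(block["id"])
        push(block)           # first arrival wins: lowest latency
```

Note that a forked block at the same height has a different id, so it is not treated as a duplicate; this is how blocks seen by only one Reader still reach downstream components.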

Data Aggregation

Firehose also aggregates any forked blocks that are seen by a single Reader component but not by any other Reader component.

Component Cooperation

Adding Reader components and dispersing them geographically results in the components racing to transfer blocks to the Relayer component. This cooperation between the Reader and Relayer components significantly increases the performance of Firehose.

Reader Nomenclature

The Reader component is sometimes referred to as the Mindreader. The nickname stems from the project's history, when the name deepmind was used for the node-instrumentation codebase.


Merger Component in Detail

The Merger component is responsible for managing and shaping data flowing out of the Reader component.

Blocks Files

The Merger component produces what are referred to as "100-blocks files." The Merger component receives "one-block" files from Reader components that are feeding the Merger.

One-block Storage

The Merger component reads the one-block object store to produce the 100-blocks files.


All forks visited by a Reader component will also be merged by the Merger component.

Merging Blocks

The merged 100-blocks files will be created each time the Merger component receives one hundred blocks of data from its associated Reader component.

Fork Data Awareness

The Merger component produces a merged file only once no additional forks can occur for that range of blocks. The StreamingFast bstream ForkableHandler provides support for fork data awareness in future merged blocks.

Merger Responsibilities

The Merger component will:
  • boot and resume where it left off if a merged-seen.gob file is available.
  • boot and start at the bundle following the last merged block found in storage if no merged-seen.gob file is available.
  • gather one-block files and assemble them into a bundle. The bundle is written when the first blocks of the next bundle are older than 25 seconds, or when it contains at least one fully-linked segment of 100 blocks.
  • keep a list of all seen (merged) blocks within the last {merger-max-fixable-fork}. A "seen" block is a block that has been merged by the current Merger component or a Reader component.
  • delete one-block files that are older than {merger-max-fixable-fork} or have already been seen (merged) and recorded in the merged-seen.gob file.
  • load blocks from storage if missing blocks or holes are encountered. The loaded blocks fill the seen-blocks cache, and the Merger component continues to the next bundle.
  • add any previously unaccounted-for one-block files to a subsequent bundle. For instance, bundle 500 might include block 429 if it was previously missed during the merging process. Note that any blocks older than {merger-max-fixable-fork} will be deleted.
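The bundle-boundary arithmetic behind the "100-blocks files" can be sketched as follows. This is an illustration of the grouping rule only: the real Merger also handles forks, the 25-second timeout, and the merged-seen.gob state, all omitted here.

```python
# Sketch of the Merger's bundling rule: one-block files are grouped into
# "100-blocks files" on boundaries of 100 (e.g. bundle 400 covers blocks
# 400-499). Fork handling and merged-seen.gob bookkeeping are omitted.
from collections import defaultdict

def bundle_base(block_number: int, bundle_size: int = 100) -> int:
    """Lowest block number of the bundle this block belongs to."""
    return (block_number // bundle_size) * bundle_size

def group_into_bundles(block_numbers: list[int]) -> dict[int, list[int]]:
    """Map each bundle's base number to the blocks it would contain."""
    bundles: dict[int, list[int]] = defaultdict(list)
    for num in sorted(block_numbers):
        bundles[bundle_base(num)].append(num)
    return dict(bundles)
```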

High Availability Merger

Only a single Merger component is required, even with multiple Reader nodes, in a highly available Firehose deployment.
Highly available systems usually connect to the Relayer component to receive real-time blocks. Merged block files are used when Relayer components can't provide the requested data or satisfy a range.
Because Relayer components hold 200 to 300 blocks in RAM, the system can sustain restarts of other components and tolerate Merger downtime.
Note: Merged blocks generally aren't read by other Firehose components in a running, live highly available system.


Relayer Component in Detail

The Relayer component is responsible for providing executed block data to other Firehose components.
The Relayer component feeds from all available Reader nodes to get a comprehensive view of all possible forks.
The Relayer "fans out", or relays, block information to the other Firehose components.

Relayer & gRPC

The Relayer component serves its block data through the streaming gRPC interface BlockStream::Blocks. This is the same interface that the Reader component exposes to the Relayer component. Read more about the BlockStream::Blocks interface in its GitHub repository.

High Availability Relayer

A Relayer component in a highly available Firehose will feed from all of the Reader nodes to gain a complete view of all possible forks.
Tip: Multiple Reader components will ensure blocks are flowing efficiently to the Relayer component and throughout Firehose.

Firehose gRPC Server

gRPC Server in Detail

The Firehose gRPC Server component is responsible for providing the extracted, formed, and collated blockchain data handled by the other Firehose components.

Historical Data

Firehose will use merged blocks from data storage directly for historical requests.

Live Data

Live blocks are received from the Relayer component.
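The handoff between historical and live data can be sketched as below: merged 100-blocks files cover the range up to the first live block, after which the Relayer's stream takes over. Function names and the block representation are illustrative, not the Firehose API.

```python
# Sketch of how a gRPC-Server-style component might satisfy a block range:
# read merged 100-blocks files for the historical portion, then switch to
# the live stream coming from the Relayer. Names are illustrative.
from typing import Callable, Iterable, Iterator

def stream_blocks(start: int,
                  read_merged: Callable[[int], list[dict]],
                  live: Iterable[dict],
                  first_live_number: int) -> Iterator[dict]:
    num = start
    # Historical phase: walk merged bundles from object storage.
    while num < first_live_number:
        base = (num // 100) * 100
        for block in read_merged(base):
            if num <= block["number"] < first_live_number:
                yield block
        num = base + 100
    # Live phase: hand over to blocks relayed in real time.
    for block in live:
        if block["number"] >= first_live_number:
            yield block
```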

Relayer & Reader Coordination

The Relayer component gets its data from one, or more, Reader components.

Serving Data

The Firehose gRPC Server component provides the data to the end consumer of Firehose through remote method calls to the server.

High Availability gRPC

Firehose can be scaled horizontally to provide a highly available system.
The network speed and data throughput between consumers and Firehose deployments will dictate the speed of data availability.
Note: The network speed and data throughput between Relayer components and Firehose gRPC Server components will impact the speed of data availability.
Firehose gRPC Server components have the ability to connect to a subset of Relayer components or all Relayers available.
When the Firehose gRPC Server component is connected to all available Relayer components, the probability that every fork is observed increases. Inbound consumer requests are fulfilled with in-memory fork data.
Block navigation can be delayed when forked data isn't completely communicated to the Firehose gRPC Server component.
Understanding how data flows through Firehose is beneficial for harnessing its full power. Additional documentation explains the Firehose data flow in greater detail.