Merger

StreamingFast Firehose Merger component

The Merger is responsible for consolidating individual block files into efficient batched bundles. It reads one-block files produced by Reader Nodes and creates merged 100-block files that serve as the primary historical data source for the Firehose pipeline.

How Merger Works

The Merger continuously polls the one-block storage for new files, accumulates them into bundles, and writes merged files to persistent storage. It also handles fork preservation and cleanup of old data.

┌─────────────────────────────────────────────────────────────────────────┐
│                           Merger Component                              │
│                                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                     One-Block Storage Poller                      │  │
│  │         (continuously polls for new one-block files)              │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                              │                                          │
│                              ▼                                          │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                         Bundler                                   │  │
│  │  • Accumulates blocks until bundle boundary (100 blocks)          │  │
│  │  • Validates sequential block ordering                            │  │
│  │  • Tracks forked blocks within current range                      │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                              │                                          │
│              ┌───────────────┼───────────────┐                          │
│              ▼               ▼               ▼                          │
│     ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                   │
│     │ Merged      │  │ Forked      │  │ One-Block   │                   │
│     │ Blocks      │  │ Blocks      │  │ Pruner      │                   │
│     │ Storage     │  │ Storage     │  │ (cleanup)   │                   │
│     └─────────────┘  └─────────────┘  └─────────────┘                   │
└─────────────────────────────────────────────────────────────────────────┘

Bundling Process

The Merger accumulates blocks in memory until it has enough to create a complete bundle:

Polling: Continuously checks one-block storage for new files
Validation: Ensures blocks arrive in sequential order, starting at bundle boundaries
Accumulation: Collects blocks until reaching the bundle size (100 blocks)
Merge & Store: Writes the accumulated blocks as a single merged file
Advance: Moves to the next bundle boundary and repeats

The Merger retains the last block from each bundle as a "bootstrap block" - this facilitates connection to downstream systems and cursor resolution.

Fork Handling

The Merger preserves all forks encountered by Reader Nodes:

Fork Detection: Identifies blocks on non-canonical branches within the current bundle range
Fork Storage: Moves forked blocks to dedicated fork storage for cursor resolution
Fork Pruning: Cleans up old forked blocks beyond the configured pruning distance

This fork preservation enables the Firehose component to resume streaming from any cursor, even if that cursor points to a block on a fork that was later abandoned.

Retry Logic

When encountering failures (storage errors, network issues), the Merger implements automatic retry:

12 retry attempts with 5-second intervals between attempts
Handles temporary storage unavailability gracefully
Logs warnings for holes in merged files that can be recovered

Bundle Size

The Merger creates bundles of exactly 100 blocks each. This fixed size provides:

Predictable file sizes for storage planning
Efficient batch access for historical queries
Consistent compression ratios

Pruning Operations

Two concurrent pruning operations run in the background:

One-Block Pruner: Removes source files after they've been merged and are beyond the pruning distance from the Last Irreversible Block (LIB)
Forked Blocks Pruner: Removes forked block files that are older than the configured pruning distance (default 50,000 blocks)

The pruning distance should be set large enough to handle chain reorganizations. The default of 50,000 blocks is conservative and suitable for most chains.

Storage Access

One-Block Storage (Input)

Access: Read (to fetch blocks) and Delete (to prune)
Volume: Temporary - files are deleted after merging
Naming: Files include block number and hash for uniqueness

Merged Blocks Storage (Output)

Access: Write (to store bundles) and Read (to detect gaps)
Volume: Permanent - the primary historical data store
Format: Compressed 100-block bundles

Forked Blocks Storage

Access: Write (to preserve forks) and Delete (to prune)
Volume: Moderate - pruned regularly, retains recent forks only
Purpose: Enables cursor resolution on forked blocks

Multiple Readers

When running multiple Reader Nodes, the Merger consolidates blocks from all sources:

Each Reader may see different forks depending on network peers
The Merger captures all forks from all Readers
Forked blocks from any Reader are preserved for cursor resolution

This architecture ensures no block data is lost, even when different Readers observe different blockchain states.

Configuration Reference

For complete configuration options and flags, see Merger CLI Reference.

PreviousReader Node NextRelayer

Last updated 22 days ago

Was this helpful?

hashtagHow Merger Works

hashtagBundling Process

hashtagFork Handling

hashtagRetry Logic

hashtagBundle Size

hashtagPruning Operations

hashtagStorage Access

hashtagOne-Block Storage (Input)

hashtagMerged Blocks Storage (Output)

hashtagForked Blocks Storage

hashtagMultiple Readers

hashtagConfiguration Reference

How Merger Works

Bundling Process

Fork Handling

Retry Logic

Bundle Size

Pruning Operations

Storage Access

One-Block Storage (Input)

Merged Blocks Storage (Output)

Forked Blocks Storage

Multiple Readers

Configuration Reference