Distributed Deployment

This guide shows how to deploy Firehose components as separate processes using shared object storage. This approach is recommended for production environments where you need scalability, high availability, and proper service isolation.

Overview

In this deployment, each component (reader-node, merger, relayer, firehose, substreams-tier1, substreams-tier2) runs as a separate process. Components communicate through shared object storage and gRPC endpoints.

┌────────────────────┐    ┌────────────────────┐    ┌────────────────────┐
│       Reader       │    │     Processing     │    │      Serving       │
│       Process      │    │     Components     │    │     Components     │
├────────────────────┤    ├────────────────────┤    ├────────────────────┤
│ ┌────────────────┐ │    │ ┌────────────────┐ │    │ ┌────────────────┐ │
│ │dummy-blockchain│ │    │ │     Merger     │ │    │ │    Firehose    │ │
│ │  (subprocess)  │ │    │ │     Relayer    │ │    │ │   Substreams   │ │
│ │     Reader     │ │    │ │                │ │    │ │                │ │
│ └────────────────┘ │    │ └────────────────┘ │    │ └────────────────┘ │
└────────────────────┘    └────────────────────┘    └────────────────────┘
           │                         │                         │
           └─────────────────────────┼─────────────────────────┘
                                     │
                            ┌─────────────────┐
                            │  Shared Object  │
                            │     Storage     │
                            │ (Cloud Storage) │
                            └─────────────────┘

While this guide shows all components running on a single machine for simplicity, in production you would typically deploy these across multiple machines with proper ingress, DNS, and service discovery.

Prerequisites

  1. Shared Object Storage: Set up cloud object storage (AWS S3, Google Cloud Storage, etc.) or an S3-compatible on-premise storage solution such as Ceph (recommended).

  2. Binaries: Install firecore and dummy-blockchain as described in the Prerequisites guide.

Storage Configuration

First, configure your shared object storage. For this example, we'll use a local filesystem path that simulates cloud storage:
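
The sketch below is one way to set this up. The STORAGE_DIR variable and directory layout are conventions reused by the commands later in this guide, not anything required by firecore. Throughout this guide, flag names and port numbers follow firehose-core defaults at the time of writing; confirm them with firecore start <app> --help for your version.

    # Shared storage simulated on the local filesystem. In production this would
    # be an object-store URL (s3://..., gs://...) instead of a file:// path.
    export STORAGE_DIR="$HOME/firehose-shared-storage"

    mkdir -p "$STORAGE_DIR/one-blocks" \
             "$STORAGE_DIR/merged-blocks" \
             "$STORAGE_DIR/forked-blocks"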

Component 1: Reader Node

The Reader manages the blockchain node and extracts block data.

The Reader runs the dummy-blockchain as a subprocess and extracts block data to shared storage. See Reader Component for details.
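
A sketch of starting the Reader as its own process, under the storage and port assumptions above. The dummy-blockchain arguments in particular depend on the version you installed, so adjust them as needed.

    # The Reader launches dummy-blockchain as a subprocess, writes one-block files
    # to shared storage, and serves live blocks over gRPC (assumed here on :10010).
    firecore start reader-node \
      --data-dir=./reader-data \
      --common-one-block-store-url="file://$STORAGE_DIR/one-blocks" \
      --reader-node-path=dummy-blockchain \
      --reader-node-arguments="start" \
      --reader-node-grpc-listen-addr=:10010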

Verify Reader Operation
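
Assuming the layout above, check that one-block files are landing in shared storage shortly after startup:

    ls "$STORAGE_DIR/one-blocks"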

Component 2: Merger

The Merger combines one-block files into merged block files for efficient storage.

The Merger processes one-block files from shared storage and creates optimized merged block files. Learn more about Merger Component.
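
A sketch following the same conventions; the Merger only needs access to the shared object stores:

    firecore start merger \
      --data-dir=./merger-data \
      --common-one-block-store-url="file://$STORAGE_DIR/one-blocks" \
      --common-merged-blocks-store-url="file://$STORAGE_DIR/merged-blocks" \
      --common-forked-blocks-store-url="file://$STORAGE_DIR/forked-blocks"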

Verify Merger Operation
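
Merged bundles should appear once enough one-block files have accumulated (by default the Merger bundles 100 blocks per file):

    # Expect files such as 0000000000.dbin.zst once a full 100-block bundle exists.
    ls "$STORAGE_DIR/merged-blocks"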

Component 3: Relayer

The Relayer provides live block streaming capabilities.

The Relayer connects to the Reader to stream live blocks and provides real-time data access. See Relayer Component for more details.
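
A sketch of starting the Relayer, pointed at the Reader's assumed gRPC address:

    firecore start relayer \
      --data-dir=./relayer-data \
      --common-one-block-store-url="file://$STORAGE_DIR/one-blocks" \
      --relayer-source=localhost:10010 \
      --relayer-grpc-listen-addr=:10014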

Verify Relayer Operation
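
With grpcurl installed, the standard gRPC health service is a quick check, assuming the component exposes it and runs without TLS locally:

    grpcurl -plaintext localhost:10014 grpc.health.v1.Health/Check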

Component 4: Firehose

The Firehose component serves historical and live block data via gRPC.
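
A sketch that reads history from the merged-blocks store and live data from the Relayer, using the same assumed addresses:

    firecore start firehose \
      --data-dir=./firehose-data \
      --common-one-block-store-url="file://$STORAGE_DIR/one-blocks" \
      --common-merged-blocks-store-url="file://$STORAGE_DIR/merged-blocks" \
      --common-live-blocks-addr=localhost:10014 \
      --firehose-grpc-listen-addr=:10015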

Verify Firehose Operation
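
If the endpoint exposes gRPC reflection, grpcurl can list its services and stream a short block range; the request fields below come from the sf.firehose.v2 protocol:

    grpcurl -plaintext localhost:10015 list

    # Stream a small historical range to confirm merged blocks are readable.
    grpcurl -plaintext \
      -d '{"start_block_num": 1, "stop_block_num": 10}' \
      localhost:10015 sf.firehose.v2.Stream/Blocks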

Component 5: Substreams Tier 1

Substreams Tier 1 serves as the entry point for Substreams requests, handling live blocks directly and delegating historical block processing to Tier 2 workers.
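
A sketch of Tier 1, with the subrequests endpoint pointing at the Tier 2 worker shown in the next section (start Tier 2 first, per the note below). The state-store URL is an assumed shared location for Substreams caches, and depending on your version you may need extra flags to allow a plaintext (non-TLS) connection to Tier 2.

    firecore start substreams-tier1 \
      --data-dir=./tier1-data \
      --common-merged-blocks-store-url="file://$STORAGE_DIR/merged-blocks" \
      --common-live-blocks-addr=localhost:10014 \
      --substreams-state-store-url="file://$STORAGE_DIR/substreams-state" \
      --substreams-tier1-grpc-listen-addr=:10016 \
      --substreams-tier1-subrequests-endpoint=localhost:10017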

Component 6: Substreams Tier 2

Substreams Tier 2 workers handle the actual block processing for historical data. Tier 1 delegates work to Tier 2 workers.

Start Tier 2 before Tier 1, as Tier 1 connects to Tier 2 workers via --substreams-tier1-subrequests-endpoint.
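
A matching Tier 2 sketch; it needs the merged-blocks store and the same state store as Tier 1:

    firecore start substreams-tier2 \
      --data-dir=./tier2-data \
      --common-merged-blocks-store-url="file://$STORAGE_DIR/merged-blocks" \
      --substreams-state-store-url="file://$STORAGE_DIR/substreams-state" \
      --substreams-tier2-grpc-listen-addr=:10017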

Verify Substreams Operation
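
With the substreams CLI installed, run any package against Tier 1. The .spkg file and module name below are placeholders for whatever package you use, and --plaintext assumes a local, non-TLS endpoint:

    substreams run -e localhost:10016 --plaintext \
      ./my-package.spkg map_my_module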

Load Balancer / API Gateway

In production, you would typically put a load balancer or API gateway in front of your services:
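
The specifics depend on your infrastructure; the usual pattern is a TLS-terminating, gRPC-aware proxy (Envoy, nginx, HAProxy, or a cloud load balancer) that maps one public hostname to each internal port used above. A hypothetical client-side check, with the example.com hostnames standing in for your own DNS names:

    # Clients reach the services through the gateway over TLS on :443.
    grpcurl firehose.example.com:443 list
    substreams run -e substreams.example.com:443 ./my-package.spkg map_my_module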

Monitoring and Health Checks

Monitor each component's health:
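
A minimal sketch that loops over the assumed local ports and calls the standard gRPC health service; not every component necessarily registers it, so adapt the list to what your deployment exposes. firehose-core components can also publish Prometheus metrics on a dedicated listen address (check firecore start --help for the metrics flags in your version).

    # Poll the gRPC health endpoint of each locally running component.
    for port in 10014 10015 10016 10017; do
      echo "--- localhost:$port"
      grpcurl -plaintext "localhost:$port" grpc.health.v1.Health/Check \
        || echo "UNHEALTHY: localhost:$port"
    done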

Production Considerations

Service Discovery

In production, components need to discover each other. Consider using:

  • Kubernetes: Service discovery via DNS

  • Consul: Service mesh with health checking

  • AWS ELB/ALB: Load balancing with health checks

Storage

Replace the local filesystem with proper cloud storage:
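
For example, the file:// URLs used throughout this guide become object-store URLs. The exact URL forms and query parameters follow firehose-core's storage library, so treat these as illustrative and verify them against its documentation:

    # AWS S3 (or an S3-compatible store such as Ceph), region as a query parameter.
    firecore start firehose \
      --common-one-block-store-url="s3://my-bucket/one-blocks?region=us-east-1" \
      --common-merged-blocks-store-url="s3://my-bucket/merged-blocks?region=us-east-1" \
      --firehose-grpc-listen-addr=:10015

    # Google Cloud Storage uses gs:// URLs instead, e.g. gs://my-bucket/merged-blocks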

High Availability

For high availability:

  1. Run multiple instances of each component

  2. Use health checks and automatic restarts

  3. Implement proper monitoring and alerting

  4. Use redundant storage with replication

Security

Secure your deployment:

  1. TLS encryption for gRPC communications

  2. Authentication and authorization

  3. Network segmentation and firewalls

  4. Secrets management for storage credentials

Scaling

Scale components based on load:

  • Reader: Usually one per blockchain network

  • Merger: Can run multiple instances for different block ranges

  • Relayer: Multiple instances for high availability

  • Firehose: Scale horizontally based on query load

  • Substreams: Scale both tiers based on processing needs

Next Steps

  • Adapt for your blockchain: Use these patterns with your target blockchain

  • Production deployment: Implement proper orchestration (Kubernetes, Docker Swarm)

  • Monitoring: Set up comprehensive monitoring and alerting

  • Performance tuning: Optimize based on your specific requirements
