# Data Storage

## Data Storage in Firehose

### Data Storage in Detail

Data and the locations where it is stored are important facets of Firehose deployment and operation.

Key Firehose data storage topics include [data stores](#data-stores), [serialization](#serialization), [merged blocks files](#merged-blocks-files), [100-blocks files](#one-hundred-blocks-files), and [one-block files](#one-block-files).

## Data Stores

Firehose Stores are abstractions sitting on top of Object Storage.

{% hint style="info" %}
**Note**: *Object Storage is a data storage technique that manages data as objects, in contrast to other data storage architectures such as hierarchical file systems.*
{% endhint %}

### Abstraction Library

Stores use the Firehose [dstore abstraction library](https://github.com/streamingfast/dstore) to support local file systems, [Azure](https://azure.microsoft.com/), [Google Cloud](https://cloud.google.com/), [Amazon S3](https://aws.amazon.com/s3/), and other Amazon S3 API-compatible object storage solutions such as [MinIO](https://min.io/) or [Ceph](https://ceph.com/en/).
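To make the idea of a store abstraction concrete, here is a minimal sketch of a storage interface whose backend is selected from a URL scheme. It is an illustration only: the `Store`, `LocalStore`, and `new_store` names are hypothetical and do not reproduce the dstore API, and only a local file system backend is sketched.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from urllib.parse import urlparse


class Store(ABC):
    """Hypothetical minimal store interface (not the dstore API)."""

    @abstractmethod
    def write_object(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def read_object(self, name: str) -> bytes: ...

    @abstractmethod
    def exists(self, name: str) -> bool: ...


class LocalStore(Store):
    """Keeps each object as a file under a base directory."""

    def __init__(self, base_dir: str) -> None:
        self.base = Path(base_dir)
        self.base.mkdir(parents=True, exist_ok=True)

    def write_object(self, name: str, data: bytes) -> None:
        (self.base / name).write_bytes(data)

    def read_object(self, name: str) -> bytes:
        return (self.base / name).read_bytes()

    def exists(self, name: str) -> bool:
        return (self.base / name).is_file()


def new_store(url: str) -> Store:
    """Pick a backend from the URL scheme; only file:// is sketched here."""
    parsed = urlparse(url)
    if parsed.scheme in ("", "file"):
        return LocalStore(parsed.path)
    raise NotImplementedError(f"backend for scheme {parsed.scheme!r} is not sketched")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = new_store("file://" + tmp)
        store.write_object("example.bin", b"payload")
        print(store.exists("example.bin"), store.read_object("example.bin"))
```

Callers written against an interface like this do not need to know which backend holds the data, which is the property that lets the same components run against local disks or cloud object storage.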

### Production Environments

For production deployments outside of cloud providers, StreamingFast recommends [Ceph](https://ceph.com/en/) as the distributed storage instead of its compatible Amazon S3 API system.

## Serialization

Firehose primarily utilizes [Protocol Buffers version 3](https://developers.google.com/protocol-buffers) for serialization.

## Merged Blocks Files

### Merged Blocks in Detail

Merged blocks files are also referred to as `100-blocks files` and merged bundles. These terms are used interchangeably within Firehose.

Merged blocks are binary files that use the [dbin](https://github.com/streamingfast/dbin) packing format to store a series of [bstream block objects](https://github.com/streamingfast/proto/blob/develop/sf/bstream/v1/bstream.proto), serialized as [protocol buffers](https://developers.google.com/protocol-buffers).
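The general idea of packing a series of serialized messages into one binary file can be sketched with simple length-prefixed framing. This is an assumption-laden illustration, not the dbin layout: the real dbin format defines its own header and encoding, and the 4-byte big-endian length prefix and the function names below are chosen only for the example.

```python
import io
import struct
from typing import Iterator, List


def pack_messages(messages: List[bytes]) -> bytes:
    """Concatenate messages, each preceded by its length as a 4-byte big-endian integer."""
    buf = io.BytesIO()
    for msg in messages:
        buf.write(struct.pack(">I", len(msg)))
        buf.write(msg)
    return buf.getvalue()


def unpack_messages(data: bytes) -> Iterator[bytes]:
    """Yield each message back by reading a length prefix, then that many bytes."""
    stream = io.BytesIO(data)
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of input
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("truncated payload")
        yield payload
```

In a real merged blocks file, each payload would be a serialized block message rather than arbitrary bytes.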

### Merged Block Creation

To create merged blocks, Firehose uses [Reader Node](/firehose/architecture/components/reader.md) components that have been started with a special flag so that they work in *catch-up* mode.

### Highly-available Merged Blocks

In [high-availability](/firehose/architecture/components/high-availability.md) Firehose configurations, merged blocks will be created by the [Merger](/firehose/architecture/components/merger.md) component. The [Reader Node](/firehose/architecture/components/reader.md) component will provide the Merger component with one-block files.

### Block Bundles

The [Merger](/firehose/architecture/components/merger.md) component will also collate all of the one-block files into a single bundle of blocks.

### One Hundred Blocks Files

Up to one hundred blocks can be contained within a single 100-blocks file.
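The bundling arithmetic can be sketched as follows. The example assumes that bundles are aligned on multiples of 100 (blocks 0 to 99, 100 to 199, and so on), which this page does not state explicitly; `BUNDLE_SIZE`, `bundle_base`, and `group_into_bundles` are hypothetical names used only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

BUNDLE_SIZE = 100  # assumption: bundles are aligned on multiples of 100


def bundle_base(block_num: int) -> int:
    """Return the first block number of the bundle that contains block_num."""
    return block_num - (block_num % BUNDLE_SIZE)


def group_into_bundles(block_nums: Iterable[int]) -> Dict[int, List[int]]:
    """Group block numbers by bundle base, keeping each group in ascending order."""
    bundles: Dict[int, List[int]] = defaultdict(list)
    for num in sorted(block_nums):
        bundles[bundle_base(num)].append(num)
    return dict(bundles)
```

For instance, under this assumption block 257 would belong to the bundle that starts at block 200.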

A 100-blocks file can include multiple versions of a given block number, such as fork blocks. Continuity between blocks is ensured through each block's link to its previous block.
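How a previous-block link lets a consumer pick one consistent chain out of a bundle that contains forks can be sketched like this. The `BlockRef` type and its field names are hypothetical stand-ins and are not the actual `bstream.Block` fields.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockRef:
    """Hypothetical block reference: an ID, a number, and the ID of the previous block."""
    id: str
    number: int
    previous_id: str


def chain_ending_at(blocks: List[BlockRef], head_id: str) -> List[BlockRef]:
    """Walk previous-block links backwards from head_id and return the chain oldest-first."""
    by_id = {b.id: b for b in blocks}
    chain: List[BlockRef] = []
    seen = set()
    current = by_id.get(head_id)
    while current is not None and current.id not in seen:
        seen.add(current.id)  # guard against malformed, cyclic input
        chain.append(current)
        current = by_id.get(current.previous_id)
    chain.reverse()
    return chain
```

Given two blocks with the same number but different IDs, only the one reachable from the chosen head through the previous-block links appears in the result.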

### Blocks Files Consumption & Use

Nearly all components in Firehose rely on 100-blocks files. The bstream library, for example, consumes 100-blocks files.

Protocol-specific decoded block objects, such as Ethereum blocks, are what circulate among all processes that work with executed block data in Firehose.

## One Block Files

### One Block Files in Detail

In [high availability](/firehose/architecture/components/high-availability.md) configurations, one-block files are transient and ensure the [Merger](/firehose/architecture/components/merger.md) component gathers all visible forks from any [Reader Node](/firehose/architecture/components/reader.md) components.

{% hint style="warning" %}
**Important**: *One-block files contain only one `bstream.Block` as a serialized protocol buffer.*
{% endhint %}

### One-block File Consumption & Use

One-block files are consumed by the `Merger` component and bundled into 100-blocks files. The resulting files are then stored to `dstore` storage and consumed by most of the other Firehose processes.

