Indexer

The MUD Indexer is an offchain indexer for onchain applications built with MUD.

Why an offchain indexer?

Reads with onchain apps can be tricky. What does it mean to be able to query the Ethereum network? Technically, given a node with a fully synced state, we can explore just about everything using the EVM, but the “exploring” would be looking at raw storage slots for accounts corresponding to smart contracts. A way around this exists by providing view functions on smart contracts: these effectively are just wrappers around raw storage and expose a more friendly API. Instead of having to figure out where the balances for an account are stored in the storage tree, we now can call a function that does the lookup via Solidity via an RPC endpoint.

The issue with view functions is that for any sophisticated application the calls needed to get the “full picture” of the state from the chain are very complex. Servicing so many view function calls also creates the need to run a set of dedicated nodes instead of relying on a third-party provider's free tier.

The MUD indexer solves this problem by listening to the MUD store events to automatically replicate the entire onchain state in a relational database. Having such a database allows clients to quickly and efficiently query the onchain data.

Installation

These environment variables need to be provided to the indexer to work:

Type	Variable	Meaning	Sample value (using `anvil` running on the host)
Needed	RPC_HTTP_URL	The URL to access the blockchain using HTTP	http://host.docker.internal:8545 (opens in a new tab) (when running in Docker)
Optional	RPC_WS_URL	The URL to access the blockchain using WebSocket
Optional	START_BLOCK	The block to start indexing from. The block in which the `World` contract was deployed is a good choice.	1
Optional	DEBUG=mud:*	Turn on debugging
Optional	STORE_ADDRESS	Only index tables from this `World`
Optional, only for SQLite	SQLITE_FILENAME	Name of database	`indexer.db`
Required for PostgreSQL	DATABASE_URL	URL for the database, of the form `postgres://<host>/<database>`

Schemaful indexing to SQLite via npx

To run the indexer directly on your computer using SQLite:

Start a World to index. An easy way to do this is to use a TypeScript template in a separate command line window.

Set RPC_HTTP_URL.

export RPC_HTTP_URL=http://127.0.0.1:8545

Run the indexer. If necessary, install it first.

npx -y -p @latticexyz/store-indexer sqlite-indexer

Note: The indexer.db is persistent if you stop and restart the indexer. If that is not the desired behavior (for example, because you restarted the blockchain itself), delete it before starting the indexer.

Schemaless indexing to PostgreSQL via npx

The schemaless indexer stores MUD table records into a single monolithic table. This allows it to index the data of all tables of all MUD Worlds on a chain efficiently, but querying for data is limited to filters on address, tableId, and the record's key. Since the record's data is stored as encoded bytes blob, it's harder to query based on it in SQL. The main purpose of this variant of the indexer is to efficiently hydrate a MUD client, which decodes the data.

Start a World to index. An easy way to do this is to use a TypeScript template in a separate command line window.

Set the environment variables.

export RPC_HTTP_URL=http://127.0.0.1:8545
export DATABASE_URL=postgres://127.0.0.1/postgres

Run the indexer. Install it first if necessary.

npx -y -p @latticexyz/store-indexer postgres-indexer

Open a separate command line. In it, specify the database and run the indexer frontend, which is responsible for serving the data to the client.
```
export DATABASE_URL=postgres://127.0.0.1/postgres
npx -y -p @latticexyz/store-indexer postgres-frontend
```

Schemaful indexing with PostgreSQL via npx

The schemaful indexer dynamically creates a PostgreSQL table every time it encounters a new MUD table. It then decodes the MUD events and stores it in the PostgreSQL table with a matching schema. This approach doesn't scale well to all tables of a chain, but it is a convenient way to index the tables of a particular MUD World and have access to powerful SQL queries on its data.

Start a World to index. An easy way to do this is to use a TypeScript template in a separate command line window.

Set the environment variables and start the indexer, installing it if necessary.

export RPC_HTTP_URL=http://127.0.0.1:8545
export DATABASE_URL=postgres://127.0.0.1/postgres
npx -y -p @latticexyz/store-indexer postgres-decoded-indexer

To verify the installation, run psql and then:

List the schemas.

\dn

Result:

                  List of schemas
                  Name                    | Owner
--------------------------------------------+-------
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | qbzzt
mud                                        | qbzzt

Connect to the schema for your world.

SET search_path TO "0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1";

Get the list of tables.

\dt

Result (When using the React template):

                                 List of relations
                  Schema                   |           Name            | Type  | Owner
--------------------------------------------+---------------------------+-------+-------
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | store__resource_ids       | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | store__store_hooks        | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | store__tables             | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | tasks                     | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__balances           | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__function_selector  | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__function_signatur  | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__init_module_addres | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__installed_modules  | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__namespace_delegat  | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__namespace_owner    | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__resource_access    | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__system_hooks       | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__system_registry    | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__systems            | table | qbzzt
0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1 | world__user_delegation_co | table | qbzzt
(16 rows)

Read the actual data.

SELECT * FROM tasks;

Result:

                                id                                 | created_at | completed_at |    description     |                            __key_bytes                             | __last_updated_block_number
-------------------------------------------------------------------+------------+--------------+--------------------+--------------------------------------------------------------------+-----------------------------
\x3100000000000000000000000000000000000000000000000000000000000000 | 1712714176 |            0 | Walk the dog       | \x3100000000000000000000000000000000000000000000000000000000000000 |                          10
\x5e9c11602057fbf149cca23095b1863f7ffa8d0d0bd9434005a344ad612488a7 | 1712714178 |            0 | Take out the trash | \x5e9c11602057fbf149cca23095b1863f7ffa8d0d0bd9434005a344ad612488a7 |                          10
\x0c9151148be227a42be8d3e3e7e61da28a532f2340b0ad9ca8bc747703ec2417 | 1712714178 |   1712714178 | Do the dishes      | \x0c9151148be227a42be8d3e3e7e61da28a532f2340b0ad9ca8bc747703ec2417 |                          10
(3 rows)

Docker

The indexer Docker image is available on github (opens in a new tab).

There are several ways to provide the environment variables to docker run:

On the command line you can specify -e <variable>=<value>. You specify this after the docker run, but before the name of the image.
You can also write all the environment variables in a file and specify it using --env-file. You specify this after the docker run, but before the name of the image.
Both Docker Compose (opens in a new tab) and Kubernetes (opens in a new tab) have their own mechanisms for starting docker containers with environment variables.

The easiest way to test the indexer is to run the template as a world in a separate command-line window.

SQLite

The command to start the indexer in SQLite mode is pnpm start:sqlite. To index an anvil instance running to the host you can use this command.

docker run \
  --platform linux/amd64 \
  -e RPC_HTTP_URL=http://host.docker.internal:8545 \
  -p 3001:3001  \
  ghcr.io/latticexyz/store-indexer:latest  \
  pnpm start:sqlite

However, this creates a docker container with a state, the SQLite database file. If we start a new container with the same image and parameters, it is going to have to go back to the start of the blockchain, which depending on how long the blockchain has been in use may be a problem. We can solve this with volumes (opens in a new tab):

Create a docker volume for the SQLite database file.
```
docker volume create sqlite-db-file
```

Run the indexer container using the volume.

docker run \
   --platform linux/amd64 \
   -e RPC_HTTP_URL=http://host.docker.internal:8545 \
   -e SQLITE_FILENAME=/dbase/indexer.db \
   -v sqlite-db-file:/dbase \
   -p 3001:3001  \
   ghcr.io/latticexyz/store-indexer:latest  \
   pnpm start:sqlite

You can stop the docker container and restart it, or start a separate container using the same database.
When you are done, you have to delete the docker containers that used it before you can delete the volume. You can use these commands:
```
docker rm `docker ps -a --filter volume=sqlite-db-file -q`
docker volume rm sqlite-db-file
```
Note: You should do this every time you restart the blockchain. Otherwise your index will include data from multiple blockchains, and make no sense.

Schemaless indexing with PostgreSQL via Docker

The command to start the indexer in schemaless PostgreSQL mode is pnpm start:postgres. This command starts both the indexer and query frontends.

The docker instance identifies itself to PostgreSQL as root. To give this user permissions on the database, enter psql and run this command.
```
CREATE ROLE root SUPERUSER LOGIN;
```
Note: This is assuming a database that is isolated from the internet and only used by trusted entities. In a production system you will use at least a password as authentication, and limit the user's authority.

Start the docker container. For example, to index an anvil instance running to the host to the database postgres on the host, use.

docker run \
  --platform linux/amd64 \
  -e RPC_HTTP_URL=http://host.docker.internal:8545 \
  -e DATABASE_URL=postgres://host.docker.internal/postgres \
  -p 3001:3001  \
  ghcr.io/latticexyz/store-indexer:latest  \
  pnpm start:postgres

If you want to create additional query frontends (for reliability and load balancing), use these commands:

HOST_PORT=3002
docker run \
  --platform linux/amd64 \
  -e DATABASE_URL=postgres://host.docker.internal/postgres \
  -p $HOST_PORT:3001  \
  ghcr.io/latticexyz/store-indexer:latest  \
  node dist/bin/postgres-frontend.js

Schemaful indexing with PostgreSQL via Docker

The command to start the indexer in schemaful PostgreSQL mode is pnpm start:postgres-decoded. This command starts both the indexer and the query frontend.

The docker instance identifies itself to PostgreSQL as root. To give this user permissions on the database, enter psql and run this command.
```
CREATE ROLE root SUPERUSER LOGIN;
```
Note: This is assuming a database that is isolated from the internet and only used by trusted entities. In a production system you will use at least a password as authentication, and limit the user's authority.

Start the docker container. For example, to index an anvil instance running to the host to the database postgres on the host, use.

docker run \
  --platform linux/amd64 \
  -e RPC_HTTP_URL=http://host.docker.internal:8545 \
  -e DATABASE_URL=postgres://host.docker.internal/postgres \
  -p 3001:3001  \
  ghcr.io/latticexyz/store-indexer:latest  \
  pnpm start:postgres-decoded

Verification

If you use either SQLite or PostgreSQL with the query frontend (using PostgreSQL only as storage), you can run this command to test the indexer.

curl 'http://localhost:3001/trpc/findAll?batch=1&input=%7B%20%220%22%3A%20%7B%22json%22%3A%20%7B%22chainId%22%3A%2031337%2C%20%22address%22%3A%20%220xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1%22%20%7D%20%7D%20%7D' | jq

The result should be nicely formatted (and long) JSON output with all the data stored in the World.

Where does this URL come from?

The URL has these parameters:

Parameter	Value	Explanation
Server	http://localhost:3001 (opens in a new tab)	By default the indexer listens on port 3001
Path	trpc/findAll	Return all entries (based on the input filter)
`batch`	`1`	A required field
`input`	`%7B%20 ... %20%7D`	See below

The input is the JSON filter that tells the server what we need. It is URL encoded (opens in a new tab), you can decode it using an online calculator (opens in a new tab).

{
  "0": {
    "json": {
      "chainId": 31337,
      "address": "0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1"
    }
  }
}

Meaning that query 0 is for everything in the World at address 0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1, on the chain with chain ID 31337.

ⓘ

The JSON here is written in multiple lines for readablility. However, when you encode the JSON for the URL it is necessary to place the entire string on one line.

Using the indexer

The source code of a MUD client has a call either to syncToRecs (opens in a new tab) or to syncToZustand (opens in a new tab), typically in setupNetwork.ts (opens in a new tab). This call initializes the synchronization between the data source (RPC or indexer) and the client's copy.

To use the indexer, specify an indexerUrl parameter with the URL.

const { components, latestBlock$, storedBlockLogs$, waitForTransaction } = await syncToRecs({
  world,
  config: mudConfig,
  address: networkConfig.worldAddress as Hex,
  publicClient,
  startBlock: BigInt(networkConfig.initialBlockNumber),
  indexerUrl: "http://127.0.0.1:3001",
});

If the client does not need all the information stored in the World, you can filter the synchronization to save on resources.

Clearing the information

If you restart the blockchain, you need to clear all the information stored by the indexer otherwise you'll have an inconsistent state.

If you are using SQLite, remove indexer.db.
If you are using schemaless PostgreSQL, you just need to remove the mud schema:
```
DROP SCHEMA "mud" CASCADE;
```
If you are using schemaful PostgreSQL, you need to also remove the schemas for the Worlds indexed.
1. Get the list of Worlds using this command:
```
SELECT DISTINCT address FROM mud.records;
```
2. Delete those schemas. For example, if you use one of the TypeScript templates, run this command:
```
DROP SCHEMA "0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1" CASCADE;
```
3. Drop the mud schema.
```
DROP SCHEMA mud CASCADE;
```

Usage examples

SQLite to view the tasks from the React template

Start the indexer.

export RPC_HTTP_URL=http://127.0.0.1:8545
npx -y -p @latticexyz/store-indexer sqlite-indexer

Get the list of tables. To do that, we read all the data and filter it locally.

curl 'http://localhost:3001/trpc/findAll?batch=1&input=%7B%220%22%3A%7B%22json%22%3A%7B%22chainId%22%3A31337%2C%22address%22%3A%220x4f4ddafbc93cf8d11a253f21ddbcf836139efdec%22%7D%7D%7D' > alldata.json
cat alldata.json | jq '.[0].result.data.json.tables[]' > tables.json
cat tables.json | jq '.namespace + "__" + .name + " " + .tableId'

Ask the indexer for only the information in a specific table (which requires you to know the tableId for it).

INDEXER_URL=http://localhost:3001/trpc
WORLD_ADDRESS=0xc14fbdb7808d9e2a37c1a45b635c8c3ff64a1cc1
TABLE_ID=0x746200000000000000000000000000005461736b730000000000000000000000
CHAIN_ID=31337
JSON='{"0":{"json":{"chainId": '$CHAIN_ID', "address": "'$WORLD_ADDRESS'", "filters": [{"tableId": "'$TABLE_ID'"}]}}}'
ENCODED_JSON=`echo "console.log(encodeURI('$JSON'))" | node`
curl $INDEXER_URL'/findAll?batch=1&input='$ENCODED_JSON > data.json
cat data.json |  jq '.[0].result.data.json.tables[0].records'

To see that the only table returned is Tasks, run this command:

cat data.json | jq '.[0].result.data.json.tables[].name'

Zustand Faucet