Skip to content

SQLite — One Store for Local, Large, and Private Data

← All posts

SQLite — One Store for Local, Large, and Private Data

2026-06-01 · FlowMCP Team · #data-formats #sqlite #resources

A tool that calls a live API is the obvious building block — but it is not always the right one. Some data rarely changes. Some data is too large to fetch on every call. Some data is private and should never leave the machine. For these, FlowMCP offers three local mechanisms. This post explains what each one is for, and then goes deep on the most powerful of them: SQLite resources.

Three ways to anchor data

MechanismWhat it isReach for it when…
PreloadAn optional preload block on a route (enabled, ttl). The runtime caches the API response locally in memory for the time-to-live.The data comes from an API but changes slowly and is expensive to refetch — a few hundred KB to a couple of MB that you don’t want to pull on every call.
Shared ListsVersioned, centrally maintained value sets, referenced by name and filtered per schema.You need small, canonical value sets shared across many schemas — EVM chain IDs, fiat codes, country codes. One source of truth instead of duplicated inline lists.
SQLite resourcesA local SQLite database exposed as an MCP resource (main.resources).The data is large, or static reference data, or private, or generated by the agent itself.

The first two are about light data. SQLite is about heavy and local data.

SQLite: turning an awkward format into queryable tables

The strongest argument for SQLite is conversion. Many datasets ship in formats that are painful to query directly — a ZIP of dozens of CSVs, a sprawling XML dump, a custom binary. Converting such a format once into SQLite turns it into ordinary tables you can query with SQL, with indexes, without re-parsing the source on every access. GTFS transit feeds are exactly this story: the gtfs-sqlite-toolkit converts a feed into a sealed SQLite database that a schema simply references.

Two modes

SQLite resources come in two modes, and the difference is the whole point:

ModeOpened asAllowedUse case
in-memorybetter-sqlite3 with readonly: trueSELECT only — the file on disk is never modifiedReference data, lookups, open data
file-basedWAL mode, writableAll SQL statements; changes persist; a .bak is taken before the first writeAgent-generated data: analysis results, collected metrics

Read vs. write: safe by design

Here is the part that matters for safety. In in-memory mode an AI can write SQL queries itself and run them — but only reads get through, and that guarantee holds on two independent layers:

  1. Driver level. The database is opened with better-sqlite3 and the readonly: true flag. The driver itself refuses any write.
  2. Module level. Before a query runs, the runtime checks that the statement begins with SELECT or WITH. Write keywords — INSERT, UPDATE, DELETE, CREATE, ALTER, DROP, REPLACE, TRUNCATE — are rejected with “Only SELECT statements are allowed in in-memory mode.”

So the AI gets full freedom to explore and query a dataset, and zero ability to mutate it. Writing is reserved for the explicit file-based mode, which only schemas marked origin: 'project' may use. The separation is deliberate: completely read-only and safe, or completely on disk and free — never an ambiguous middle.

The default methods

Every SQLite resource gets two methods injected automatically by the runtime, so the AI never has to be told the table layout in advance:

MethodWhat it does
describeTablesReturns the tables and their columns — the AI inspects the structure before querying.
runSqlRuns a query the AI composes itself. In in-memory mode: SELECT only. In file-based mode: all statements.

A schema author can add up to seven more named queries on top, but in practice describeTables + runSql already let an assistant explore an unknown database and answer questions about it on its own.

Choosing between them

  • Live, frequently changing data → a normal tool (an API call).
  • Slow-changing API data you don’t want to refetch → add preload to the route.
  • Small canonical lists shared across schemas → a shared list.
  • Large static reference data, or private data, or anything you’d otherwise re-parse every time → a SQLite resource (in-memory, read-only).
  • Data the agent produces and needs to keep → a SQLite resource (file-based, writable).

The common thread: keep schemas thin and keep data where it belongs. Preload and shared lists handle the light cases; SQLite handles the heavy and the private ones — with a read/write boundary the runtime enforces for you.


📖 Read also:

Related Posts

hackathon

Making the Connection — How FlowMCP Became a Mobility Framework

Story from DB InfraGO's 'Anschluss erreichen' hackathon, where FlowMCP's main contributor reached 3rd place with a multi-agent mobility assistant.

release

FlowMCP v4.2 — Grading as a Versioned Standard

FlowMCP Spec v4.2 delegates schema grading to its own independently versioned standard — the Grading-Spec v2.0, published as its own docs area so anyone can grade by the same rules.

release

FlowMCP v4.0 — Skills, Selections, Pipes

How deterministic structures carry LLM composition without the AI hallucinating parameters.