Data layout

Everything lives in one local SQLite file. Default path: ~/.discrawl/discrawl.db.

#What is stored

  • guild metadata
  • channels and threads in one table (Discord models threads as channels)
  • current member snapshot
  • canonical message rows
  • append-only message event records
  • FTS5 index rows
  • optional local embedding queue metadata and vectors
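The split between canonical message rows and append-only event records can be pictured with a small sketch. Everything here is an assumption made for illustration: the `message_events` table, the column names, and both helper functions are hypothetical and do not reflect discrawl's actual schema.

```python
import sqlite3


def init_sketch_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical tables: one canonical row per message, plus an event log.
    conn.executescript(
        """
        create table messages (
            id text primary key,
            guild_id text not null,
            content text not null
        );
        create table message_events (
            seq integer primary key autoincrement,
            message_id text not null,
            kind text not null,
            content text not null
        );
        """
    )


def record_event(conn, message_id, guild_id, kind, content):
    # One transaction: append the event, then upsert the canonical row.
    with conn:
        conn.execute(
            "insert into message_events (message_id, kind, content) values (?, ?, ?)",
            (message_id, kind, content),
        )
        conn.execute(
            "insert into messages (id, guild_id, content) values (?, ?, ?) "
            "on conflict(id) do update set content = excluded.content",
            (message_id, guild_id, content),
        )
```

In this sketch an edit leaves a single current row in `messages`, while `message_events` keeps every version in insertion order.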

Messages imported from Discord Desktop use the same message, attachment, mention, and FTS paths as bot-synced messages.

#DMs

Proven DMs use the synthetic guild id @me. Unclassifiable desktop-cache payloads are skipped instead of being stored as unknown synthetic data.

#Attachments

Attachment binaries are not stored in SQLite. Only attachment metadata, filenames, and (optionally) extracted text are stored.

Set sync.attachment_text = false if you want to keep attachment metadata and filenames but disable attachment body fetches for text indexing.

#Multi-guild ready

The schema supports multiple guilds, even though the common UX stays simple for the single-guild case. Threads are stored as channels because that matches the Discord model. Archived threads are part of the sync surface.

#Schema migrations

SQLite schema migrations are versioned with PRAGMA user_version. Startup fails fast when the local DB schema is newer than the binary supports, which means the binary is older than the database.
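A fail-fast check of this kind might look roughly like the sketch below. It is not discrawl's implementation: `SUPPORTED_SCHEMA_VERSION`, `check_schema_version`, and `set_schema_version` are hypothetical names, and the version number is made up. Only `PRAGMA user_version` itself is standard SQLite.

```python
import sqlite3

# Hypothetical: the highest schema version this build understands.
SUPPORTED_SCHEMA_VERSION = 3


def check_schema_version(conn: sqlite3.Connection,
                         supported: int = SUPPORTED_SCHEMA_VERSION) -> int:
    """Return the DB's user_version, or raise if it is newer than supported."""
    (version,) = conn.execute("PRAGMA user_version").fetchone()
    if version > supported:
        raise RuntimeError(
            f"database schema version {version} is newer than "
            f"supported version {supported}; upgrade the binary"
        )
    return version


def set_schema_version(conn: sqlite3.Connection, version: int) -> None:
    # PRAGMA statements do not accept bound parameters, so the integer
    # is validated with int() and formatted into the statement.
    conn.execute(f"PRAGMA user_version = {int(version)}")
```

A fresh SQLite database reports `user_version` 0, so a migration runner built on this would treat 0 as "no migrations applied yet".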

#Querying directly

Run any query you want, using read-only SQL:

discrawl sql 'select count(*) as messages from messages'
echo 'select guild_id, count(*) from messages group by guild_id' | discrawl sql -

See sql.
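Because the archive is an ordinary SQLite file, it can also be read from a script. The sketch below opens the file with SQLite's `mode=ro` URI parameter so that writes are rejected; the helper names are hypothetical, and the `messages` table and `guild_id` column are taken from the queries above.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path) -> sqlite3.Connection:
    # Build a file: URI and ask SQLite for a read-only connection.
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def messages_per_guild(db_path) -> dict:
    """Return {guild_id: message_count} for the given database file."""
    conn = open_read_only(db_path)
    try:
        rows = conn.execute(
            "select guild_id, count(*) from messages group by guild_id"
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)
```

Calling `messages_per_guild("~/.discrawl/discrawl.db")` would read the default database; DMs, if present, would appear under the `@me` key.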

#See also