Data layout
Everything lives in one local SQLite file. Default path: ~/.discrawl/discrawl.db.
#What is stored
- guild metadata
- channels and threads in one table (Discord models threads as channels)
- current member snapshot
- canonical message rows
- append-only message event records
- FTS5 index rows
- optional local embedding queue metadata and vectors
Messages imported from Discord Desktop use the same message, attachment, mention, and FTS paths as bot-synced messages.
#DMs
Proven DMs use the synthetic guild id @me. Unclassifiable desktop-cache payloads are skipped instead of being stored as unknown synthetic data.
#Attachments
Attachment binaries are not stored in SQLite. Only attachment metadata, filenames, and (optionally) extracted text.
Set sync.attachment_text = false if you want to keep attachment metadata and filenames but disable attachment body fetches for text indexing.
#Multi-guild ready
The schema is multi-guild ready even when the common UX stays single-guild simple. Threads are stored as channels because that matches the Discord model. Archived threads are part of the sync surface.
#Schema migrations
SQLite schema migrations are versioned with PRAGMA user_version. Startup fails fast when a local DB schema is newer than the supported binary - that means you have a binary older than the database.
#Querying directly
Anything you want, with read-only SQL:
discrawl sql 'select count(*) as messages from messages'
echo 'select guild_id, count(*) from messages group by guild_id' | discrawl sql -
See sql.