Configuration
discrawl init writes a complete config so most users do not hand-edit anything initially. This page documents the full shape and override rules for when you do.
#Default paths
Discrawl follows the normal per-OS storage convention instead of writing a new top-level directory in your home folder.
On Linux and other Unix desktops, Discrawl uses XDG Base Directory paths. In practice, that means:
- config: ${XDG_CONFIG_HOME:-~/.config}/discrawl/config.toml
- database: ${XDG_DATA_HOME:-~/.local/share}/discrawl/discrawl.db
- Git share repo: ${XDG_DATA_HOME:-~/.local/share}/discrawl/share
- cache: ${XDG_CACHE_HOME:-~/.cache}/discrawl/
- logs: ${XDG_STATE_HOME:-~/.local/state}/discrawl/logs/
On macOS, Discrawl uses the platform's ~/Library locations:
- config: ~/Library/Application Support/discrawl/config.toml
- database: ~/Library/Application Support/discrawl/discrawl.db
- Git share repo: ~/Library/Application Support/discrawl/share
- cache: ~/Library/Caches/discrawl/
- logs: ~/Library/Application Support/discrawl/logs/
If you set XDG_CONFIG_HOME, XDG_DATA_HOME, XDG_CACHE_HOME, or XDG_STATE_HOME, those variables choose the new default locations on any OS.
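As a rough illustration of the rules above, the default locations can be modeled as a lookup that prefers an XDG variable when it is set and otherwise falls back to the per-OS directory. This is a sketch, not Discrawl's implementation: the function names, the `platform` argument, and the use of `PurePosixPath` are assumptions made for the example.

```python
from pathlib import PurePosixPath

# kind -> (XDG variable, Linux fallback under home, macOS fallback under home)
BASES = {
    "config": ("XDG_CONFIG_HOME", ".config", "Library/Application Support"),
    "data": ("XDG_DATA_HOME", ".local/share", "Library/Application Support"),
    "cache": ("XDG_CACHE_HOME", ".cache", "Library/Caches"),
    "state": ("XDG_STATE_HOME", ".local/state", "Library/Application Support"),
}


def base_dir(kind, env, home, platform):
    """Return the discrawl directory for one kind of data (hypothetical helper)."""
    var, linux_rel, macos_rel = BASES[kind]
    if env.get(var):  # a set XDG variable wins on any OS
        return PurePosixPath(env[var]) / "discrawl"
    rel = macos_rel if platform == "darwin" else linux_rel
    return PurePosixPath(home) / rel / "discrawl"


def default_paths(env, home, platform):
    """Map each documented item to its default location."""
    return {
        "config": base_dir("config", env, home, platform) / "config.toml",
        "database": base_dir("data", env, home, platform) / "discrawl.db",
        "share": base_dir("data", env, home, platform) / "share",
        "cache": base_dir("cache", env, home, platform),
        "logs": base_dir("state", env, home, platform) / "logs",
    }


if __name__ == "__main__":
    for name, path in default_paths({}, "/home/alice", "linux").items():
        print(f"{name}: {path}")
```

Passing the environment and home directory as arguments keeps the sketch easy to check without touching the real filesystem; it does not model the legacy ~/.discrawl fallback.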
Upgrades do not move your database automatically. Existing installs keep working from their legacy locations until the new ones exist. If ~/.discrawl/config.toml exists and the new default config file does not, Discrawl keeps loading the legacy config. Runtime path fields that are missing from the config also keep using existing legacy files or directories under ~/.discrawl when the new location does not exist yet. This avoids breaking users whose desktop already sets XDG variables globally.
To migrate deliberately, copy or create the new config file first, or point Discrawl at it with --config / DISCRAWL_CONFIG. Runtime paths switch one-by-one: once the new database, cache, logs, or share path exists, that path wins over the legacy ~/.discrawl counterpart. Copy the SQLite database before creating the new database path if you want to preserve the existing archive.
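The per-path switch described above can be sketched as a three-way choice. This is only a model of the documented behavior; `choose_path` and its `exists` parameter are hypothetical names, and the only legacy path the text names explicitly is ~/.discrawl/config.toml.

```python
import os


def choose_path(new_path, legacy_path, exists=os.path.exists):
    """Pick between a new default location and its legacy ~/.discrawl counterpart."""
    if exists(new_path):
        return new_path  # the new location wins as soon as it exists
    if exists(legacy_path):
        return legacy_path  # otherwise an existing legacy path keeps being used
    return new_path  # fresh install: nothing exists yet, so use the new default


if __name__ == "__main__":
    home = os.path.expanduser("~")
    new_cfg = os.path.join(home, ".config", "discrawl", "config.toml")
    legacy_cfg = os.path.join(home, ".discrawl", "config.toml")
    print(choose_path(new_cfg, legacy_cfg))
```

Under this model, creating an empty file or directory at a new location is enough to stop the legacy one from being used, which is why the database should be copied before the new database path is created.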
#File layout
version = 1
default_guild_id = ""
guild_ids = []
db_path = "~/.local/share/discrawl/discrawl.db" # macOS: "~/Library/Application Support/discrawl/discrawl.db"
cache_dir = "~/.cache/discrawl" # macOS: "~/Library/Caches/discrawl"
log_dir = "~/.local/state/discrawl/logs" # macOS: "~/Library/Application Support/discrawl/logs"
[discord]
token_source = "env" # use "none" for Git-only read access
token_env = "DISCORD_BOT_TOKEN"
token_keyring_service = "discrawl"
token_keyring_account = "discord_bot_token"
[sync]
source = "both" # "discord" for bot-only sync, "wiretap" for desktop-cache-only import
concurrency = 16
repair_every = "6h"
full_history = true
attachment_text = true
attachment_media = false
max_attachment_bytes = 104857600
[desktop]
path = "~/.config/discord" # macOS default: "~/Library/Application Support/discord"
max_file_bytes = 67108864
full_cache = false
[search]
default_mode = "fts"
[search.embeddings]
enabled = false
provider = "openai"
model = "text-embedding-3-small"
api_key_env = "OPENAI_API_KEY"
batch_size = 64
max_input_chars = 12000
request_timeout = "2m"
vector_backend = "exact"
[share]
remote = ""
repo_path = "~/.local/share/discrawl/share" # macOS: "~/Library/Application Support/discrawl/share"
branch = "main"
auto_update = true
stale_after = "15m"
media = true
[share.filter]
public_only = false
include_channel_ids = []
exclude_channel_ids = []
[remote]
mode = "local" # use "cloud" for Worker-fronted remote archives
endpoint = ""
archive = ""
token_env = "DISCRAWL_REMOTE_TOKEN"
stale_after = ""
concurrency is auto-sized at init to min(32, max(8, GOMAXPROCS*2)).
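To make the clamp concrete, here is the same formula as a small function. The function name is hypothetical, and the GOMAXPROCS value is passed in as a plain integer rather than read from a Go runtime.

```python
def auto_concurrency(gomaxprocs):
    """Apply min(32, max(8, GOMAXPROCS*2)) to a given GOMAXPROCS value."""
    return min(32, max(8, gomaxprocs * 2))


if __name__ == "__main__":
    for procs in (2, 8, 64):
        print(procs, auto_concurrency(procs))
```

The result is never below 8 or above 32; the sample value concurrency = 16 shown above is what the formula gives for GOMAXPROCS = 8.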
#Token resolution
In order:
1. DISCORD_BOT_TOKEN, or the env var named in discord.token_env
2. OS keyring item discrawl/discord_bot_token, or the configured keyring service/account
discrawl accepts either raw token text or a value prefixed with "Bot ". Normalization is automatic.
Set discord.token_source = "keyring" if you want to require keyring lookup and skip env entirely. Set it to "none" for a Git-only reader.
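The lookup order and the token_source switches can be sketched as follows. Several details are assumptions made for illustration: the function names, the `keyring_lookup` callable standing in for the OS keyring, the fall-through from env to keyring when token_source is "env", and the choice to normalize by stripping the "Bot " prefix (the real tool may normalize in the other direction).

```python
def normalize_token(raw):
    """Return the bare token, dropping an optional leading "Bot " prefix (assumed direction)."""
    token = raw.strip()
    if token.startswith("Bot "):
        token = token[len("Bot "):].strip()
    return token


def resolve_token(token_source, env, keyring_lookup, token_env="DISCORD_BOT_TOKEN"):
    """Resolve a token in the documented order; returns None if nothing is found."""
    if token_source == "none":
        return None  # Git-only reader: no live Discord access
    if token_source != "keyring":
        value = env.get(token_env, "").strip()
        if value:
            return normalize_token(value)
    value = keyring_lookup()  # hypothetical callable standing in for the OS keyring
    return normalize_token(value) if value else None


if __name__ == "__main__":
    env = {"DISCORD_BOT_TOKEN": "Bot example-token"}
    print(resolve_token("env", env, lambda: None))
```

With token_source = "keyring" the environment is never consulted, even when the variable is set.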
#Override rules
- --config <path> beats everything
- DISCRAWL_CONFIG=<path> overrides the default config path
- discord.token_source = "none" disables live Discord access for Git-only readers
- discord.token_source = "keyring" skips env lookup
- remote.mode = "cloud" makes status --json and remote ... read the configured Worker archive without opening SQLite
- DISCRAWL_NO_AUTO_UPDATE=1 disables Git snapshot auto-update for read commands in one process
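For the config path specifically, the precedence amounts to a first-match chain. A minimal sketch, assuming a hypothetical `resolve_config_path` helper that receives the --config value, the environment, and the default path already computed:

```python
def resolve_config_path(cli_config, env, default_path):
    """Return the config path to load: --config, then DISCRAWL_CONFIG, then the default."""
    if cli_config:
        return cli_config  # --config <path> beats everything
    if env.get("DISCRAWL_CONFIG"):
        return env["DISCRAWL_CONFIG"]  # env var overrides only the default path
    return default_path


if __name__ == "__main__":
    env = {"DISCRAWL_CONFIG": "/etc/discrawl/config.toml"}
    print(resolve_config_path(None, env, "~/.config/discrawl/config.toml"))
    print(resolve_config_path("./local.toml", env, "~/.config/discrawl/config.toml"))
```

The sketch treats an empty DISCRAWL_CONFIG as unset, which is an assumption rather than documented behavior.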
#Notes
- default_guild_id is the implicit scope for sync, tail, digest, and analytics when --guild is not passed
- guild_ids is reserved for explicit multi-guild fan-out; usually you do not set this directly
- changing [search.embeddings] provider/model/input version retargets pending jobs and resets prior attempts; existing vectors for another identity remain in SQLite but are not used for semantic search
- [search.embeddings].vector_backend accepts exact or optional turbovec; turbovec requires Python plus the turbovec package and embedding dimensions divisible by 8
- changing db_path does not migrate existing data; copy the file yourself if you want to keep history
- sync.attachment_media = true makes sync behave like sync --with-media; media bytes are cached under cache_dir/media, and CDN 404/other fetch failures are recorded on attachment rows
- share.media = false makes publish/update/auto-update omit or skip restoring cached media; subscribe --no-media writes this for Git-only readers. With the default share.media = true, publish/update include cached non-DM media as gzip-compressed snapshot files, but publish does not fetch missing Discord files by itself.
- [share.filter] narrows only publish output; sync can still keep a richer local archive
- share.filter.public_only exports only channels visible to the guild @everyone role after category/channel permission overwrites; private threads are excluded
- share.filter.include_channel_ids and share.filter.exclude_channel_ids accept Discord channel ids; exclusions win, and including a forum parent also includes its allowed public threads
- filtered publishes cannot write generated README reports, and remove older generated Discrawl share READMEs before committing
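The interaction of the [share.filter] settings can be sketched as a single predicate. This is a hedged model, not the tool's code: `is_published` and its parameters are hypothetical, an empty include list is assumed to mean "no include restriction", and excluding a parent is assumed to exclude its threads as well.

```python
def is_published(channel_id, *, parent_id=None, is_public=True,
                 public_only=False, include_ids=(), exclude_ids=()):
    """Decide whether a channel or thread would appear in a filtered publish."""
    ids = {channel_id, parent_id} - {None}  # a thread also matches via its parent
    if ids & set(exclude_ids):
        return False  # exclusions win over everything else
    if public_only and not is_public:
        return False  # not visible to @everyone, or a private thread
    if include_ids and not (ids & set(include_ids)):
        return False  # an include list is set and this channel is not on it
    return True


if __name__ == "__main__":
    # A public thread under an included forum parent is kept.
    print(is_published("300", parent_id="100", include_ids=["100"]))
    # The same id on both lists is dropped because exclusions win.
    print(is_published("100", include_ids=["100"], exclude_ids=["100"]))
```

Checking exclusions first is what makes "exclusions win" hold regardless of the other settings.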