sync
Refreshes SQLite from one or both archive sources.
By default, `sync` runs both live/local sources and does not import the Git snapshot first:
- Discord bot-token sync for bot-visible guild data
- local Discord Desktop cache import for classifiable cached messages and proven DMs
Use `update` when you want to pull/import the shared Git snapshot. If you intentionally want a sync run to import the snapshot before live deltas, pass `--update=auto` (only when stale) or `--update=force` (always). `--no-update` is accepted as an explicit no-op alias for the default.
Run one explicit `--full` pass when you want a complete historical guild archive. Use plain `sync` afterward for frequent latest-message and desktop-cache refreshes.
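The backfill-then-refresh workflow described above can be sketched as a small shell wrapper. This is only an illustration: the function names and the `DISCRAWL` override variable are hypothetical and exist so the script can be dry-run; the two `discrawl` invocations are the ones listed under Usage.

```shell
#!/bin/sh
# Hypothetical override for dry runs, e.g. DISCRAWL="echo discrawl".
DISCRAWL="${DISCRAWL:-discrawl}"

# One-time pass: request a complete historical guild archive.
backfill_once() {
  $DISCRAWL sync --full
}

# Frequent pass: latest-message and desktop-cache refresh.
routine_refresh() {
  $DISCRAWL sync
}
```

Run `backfill_once` a single time, then call `routine_refresh` from whatever scheduler you already use.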
# Usage
discrawl sync
discrawl sync --update=auto
discrawl sync --update=force
discrawl sync --no-update
discrawl sync --full
discrawl sync --full --all
discrawl sync --guild 123456789012345678
discrawl sync --guilds 123,456 --concurrency 8
discrawl sync --source both # default: bot API + desktop cache
discrawl sync --source discord # bot API only; aliases: key, bot, api
discrawl sync --source wiretap # desktop cache only; aliases: desktop, cache
discrawl sync --guild 123456789012345678 --all-channels
discrawl sync --channels 111,222 --since 2026-03-01T00:00:00Z
discrawl sync --with-embeddings
# Sources
| Source | Reads from | Stores |
|---|---|---|
| `both` | Discord bot API and local Discord Desktop cache | bot-visible guild data plus classifiable cached desktop messages |
| `discord` / `key` | Discord bot API | guilds, channels, threads, members, and messages the bot can access |
| `wiretap` | local Discord Desktop cache files | classifiable cached messages; proven DMs are stored under `@me` |
# Bot sync modes
| Command | Use when | Behavior |
|---|---|---|
| `discrawl sync` | routine refresh | skips member refreshes, checks live top-level channels plus active threads, only fetches new messages for channels with a stored cursor |
| `discrawl sync --update=auto` | hybrid Git/live refresh | imports a stale Git snapshot first, then runs the routine live refresh |
| `discrawl sync --all-channels` | repair pass | broad incremental sweep across every stored channel/thread, including archived threads |
| `discrawl sync --full` | historical backfill | crawls older history until channels are complete |
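As a rough sketch, the table above can be folded into a single dispatcher that maps a use case to its command. The function name `sync_for`, the mode labels (taken loosely from the "Use when" column), and the `DISCRAWL` override are all hypothetical; only the `discrawl sync` invocations come from the table.

```shell
#!/bin/sh
# Hypothetical override for dry runs, e.g. DISCRAWL="echo discrawl".
DISCRAWL="${DISCRAWL:-discrawl}"

# Map a use-case label to the matching sync command from the table.
sync_for() {
  case "$1" in
    routine)  $DISCRAWL sync ;;
    hybrid)   $DISCRAWL sync --update=auto ;;
    repair)   $DISCRAWL sync --all-channels ;;
    backfill) $DISCRAWL sync --full ;;
    *)
      # Unknown label: report on stderr and signal a usage error.
      echo "unknown mode: $1" >&2
      return 2
      ;;
  esac
}
```

For example, `sync_for repair` runs the broad incremental sweep without starting a full historical crawl.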
# Flags
- `--source <both|discord|wiretap>`: which archive sources to read
- `--update <auto|force|none>`: whether to import the Git snapshot before live deltas
- `--full`: historical backfill (slow on large guilds)
- `--all-channels`: broader incremental sweep across every stored channel/thread
- `--latest-only`: explicit latest-only run (also the default for untargeted `sync`)
- `--all`: ignore `default_guild_id` and fan out across every discovered guild
- `--guild <id>` / `--guilds <id,id>`: target specific guilds
- `--channels <id,id>`: target specific channels (forum ids expand to threads)
- `--since <RFC3339>`: limit initial history and `--full` backfill to messages at or after this timestamp
- `--concurrency <n>`: override worker count (default auto-sized: floor 8, cap 32)
- `--skip-members`: refresh guild/channel/message data without crawling members
- `--with-embeddings`: also enqueue changed messages into `embedding_jobs`
# Notes
- `--latest-only` is the default for untargeted `sync`. Use `--all-channels` to opt out without doing a full historical crawl.
- `--since` does not mark older history as complete, so a later `sync --full` without `--since` can continue the backfill.
- Long runs emit periodic progress logs to stderr.
- Heartbeat logs (`message sync waiting`) name the oldest active channel and per-channel page activity if in-flight channels stop completing for a while.
- Every run ends with a `message sync finished` summary.
- Each channel crawl has a bounded runtime budget; pathological channels are deferred and retried on the next sync.
- Full sync member refresh is best-effort and gives up after five minutes when no caller-supplied deadline is set.
- When the archive is already complete, `sync --full` reuses backlog markers and limits steady-state refresh to live top-level channels plus active threads.
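The `--since` note above suggests a staged backfill: a bounded pass first, then a later unbounded `--full` pass that continues into older history. The following is a minimal sketch of that idea, which also keeps the stderr progress logs in a file. The function names, the `DISCRAWL` override, and the log file argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical override for dry runs, e.g. DISCRAWL="echo discrawl".
DISCRAWL="${DISCRAWL:-discrawl}"

# Stage 1: backfill only messages at or after an RFC3339 timestamp ($1).
backfill_since() {
  $DISCRAWL sync --full --since "$1"
}

# Stage 2: a later run without --since, which can continue the backfill.
backfill_rest() {
  $DISCRAWL sync --full
}

# Run any command and append its stderr to a log file ($1).
with_progress_log() {
  log_file="$1"
  shift
  "$@" 2>>"$log_file"
}
```

For example, `with_progress_log sync.log backfill_since 2026-03-01T00:00:00Z` runs the bounded stage and appends its progress output to `sync.log`; `with_progress_log sync.log backfill_rest` can follow whenever the older history is wanted.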