I’d like to propose adding app-owned backup and restore primitives to Msgvault.
The motivation is correctness: an external backup system can handle scheduling, retention, encryption, and storage, but Msgvault itself is the best place to define what a restorable backup actually contains. Without that contract, server backup scripts have to know internal details like which SQLite database is canonical, which files are content-addressed attachments, which caches are rebuildable, and how to verify restored state.
I’d propose implementing backup and restore together, with verification as part of the contract. You haven’t really backed up application state until you can prove it restores cleanly.
Proposed CLI:
msgvault backup create --output DIR [--include-tokens] [--include-config]
msgvault backup verify DIR
msgvault backup restore DIR --target DIR [--overwrite]
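To make the flag shapes concrete, here is a small sketch of the proposed command tree. It uses Python's argparse purely for illustration and is not how Msgvault's CLI is implemented. The destination names (`backup_dir`, `action`) are my own placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser for the proposed `msgvault backup ...` commands."""
    parser = argparse.ArgumentParser(prog="msgvault")
    top = parser.add_subparsers(dest="command", required=True)
    backup = top.add_parser("backup")
    actions = backup.add_subparsers(dest="action", required=True)

    # msgvault backup create --output DIR [--include-tokens] [--include-config]
    create = actions.add_parser("create")
    create.add_argument("--output", required=True, metavar="DIR")
    create.add_argument("--include-tokens", action="store_true")
    create.add_argument("--include-config", action="store_true")

    # msgvault backup verify DIR
    verify = actions.add_parser("verify")
    verify.add_argument("backup_dir", metavar="DIR")

    # msgvault backup restore DIR --target DIR [--overwrite]
    restore = actions.add_parser("restore")
    restore.add_argument("backup_dir", metavar="DIR")
    restore.add_argument("--target", required=True, metavar="DIR")
    restore.add_argument("--overwrite", action="store_true")
    return parser
```

The sensitive or destructive options (tokens, config, overwrite) are all opt-in flags that default to off, which matches the "explicit flags only" behavior proposed for them.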
Possible backup layout:
DIR/
  manifest.json
  msgvault.db
  attachments/
  config/
  tokens/
manifest.json would include:
- Msgvault version
- backup format version
- database schema/user_version
- backup timestamp
- selected options, such as whether tokens/config were included
- attachment count and expected hashes
- database message/account counts
- list of excluded derived state
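As a starting point for discussion, the manifest fields above could map onto a structure like the following. Every field name here is a placeholder I made up for illustration, and the format-version check is an assumption about how a reader might reject manifests it does not understand.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Manifest:
    # All field names are hypothetical; the real schema is open for feedback.
    msgvault_version: str
    format_version: int
    schema_user_version: int          # SQLite PRAGMA user_version of the snapshot
    created_at: str                   # backup timestamp, e.g. ISO 8601 in UTC
    include_tokens: bool
    include_config: bool
    message_count: int
    account_count: int
    attachment_hashes: List[str] = field(default_factory=list)
    excluded_derived_state: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        doc = asdict(self)
        # Store the count explicitly so a reader can sanity-check the list.
        doc["attachment_count"] = len(self.attachment_hashes)
        return json.dumps(doc, indent=2, sort_keys=True)


def load_manifest(text: str, supported_format: int = 1) -> Manifest:
    """Parse a manifest and refuse format versions newer than we understand."""
    doc = json.loads(text)
    count = doc.pop("attachment_count", None)
    manifest = Manifest(**doc)
    if manifest.format_version > supported_format:
        raise ValueError(f"unsupported backup format {manifest.format_version}")
    if count is not None and count != len(manifest.attachment_hashes):
        raise ValueError("attachment_count does not match attachment_hashes")
    return manifest
```

Keeping the backup format version separate from the Msgvault version lets the layout evolve independently of releases.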
Proposed behavior:
- backup create
  - take a consistent SQLite snapshot, likely using the existing VACUUM INTO-style approach
  - copy or hardlink content-addressed attachments into the backup directory
  - include config/tokens only behind explicit flags
  - exclude rebuildable state by default, such as analytics/cache/index data
  - write manifest.json last, after all content is present
  - fail clearly if any referenced attachment is missing or corrupt
- backup verify
  - run SQLite integrity checks on the backed-up DB
  - validate the manifest
  - verify every referenced attachment exists
  - verify attachment contents match their expected hashes
  - confirm expected table counts / schema metadata match the manifest
- backup restore
  - restore into a fresh target directory by default
  - refuse to overwrite existing state unless --overwrite is supplied
  - restore DB, attachments, and explicitly included config/token state
  - omit derived caches and rebuild them after restore if needed
  - run backup verify against the restored target before reporting success
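The three behaviors above can be sketched end to end as follows. This is a minimal Python illustration under stated assumptions, not a proposed implementation: it assumes attachments are stored flat as `attachments/<sha256 hex>`, that the database has a `messages` table, and that the caller supplies the list of referenced hashes. All function names and manifest keys are hypothetical. `VACUUM INTO` requires SQLite 3.27 or newer.

```python
import hashlib
import json
import os
import shutil
import sqlite3
from pathlib import Path


def sha256_file(path: Path) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def create_backup(db_path: Path, attachments_dir: Path, hashes: list, out_dir: Path) -> dict:
    out_dir.mkdir(parents=True)  # fails if the directory already exists
    # Autocommit mode, because VACUUM cannot run inside a transaction.
    conn = sqlite3.connect(str(db_path), isolation_level=None)
    try:
        conn.execute("VACUUM INTO ?", (str(out_dir / "msgvault.db"),))
    finally:
        conn.close()
    (out_dir / "attachments").mkdir()
    wanted = sorted(set(hashes))
    for digest in wanted:
        src, dst = attachments_dir / digest, out_dir / "attachments" / digest
        if not src.is_file():
            raise FileNotFoundError(f"referenced attachment missing: {digest}")
        if sha256_file(src) != digest:
            raise ValueError(f"attachment is corrupt: {digest}")
        try:
            os.link(src, dst)  # hardlink when both sit on one filesystem
        except OSError:
            shutil.copy2(src, dst)
    snap = sqlite3.connect(str(out_dir / "msgvault.db"))
    try:  # Read metadata from the snapshot so it matches what was backed up.
        manifest = {
            "format_version": 1,
            "schema_user_version": snap.execute("PRAGMA user_version").fetchone()[0],
            "message_count": snap.execute("SELECT COUNT(*) FROM messages").fetchone()[0],
            "attachment_hashes": wanted,
        }
    finally:
        snap.close()
    tmp = out_dir / "manifest.json.tmp"
    tmp.write_text(json.dumps(manifest, indent=2))
    os.replace(tmp, out_dir / "manifest.json")  # written last, atomically
    return manifest


def verify_dir(data_dir: Path, manifest: dict) -> list:
    """Return a list of problems; an empty list means the directory verified."""
    db = data_dir / "msgvault.db"
    if not db.is_file():
        return ["msgvault.db is missing"]
    problems = []
    conn = sqlite3.connect(str(db))
    try:
        if conn.execute("PRAGMA integrity_check").fetchall() != [("ok",)]:
            problems.append("integrity_check failed")
        if conn.execute("PRAGMA user_version").fetchone()[0] != manifest["schema_user_version"]:
            problems.append("schema user_version mismatch")
        if conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0] != manifest["message_count"]:
            problems.append("message count mismatch")
    except sqlite3.DatabaseError as exc:
        problems.append(f"database unreadable: {exc}")
    finally:
        conn.close()
    for digest in manifest["attachment_hashes"]:
        path = data_dir / "attachments" / digest
        if not path.is_file():
            problems.append(f"missing attachment {digest}")
        elif sha256_file(path) != digest:
            problems.append(f"hash mismatch {digest}")
    return problems


def restore_backup(backup_dir: Path, target_dir: Path, overwrite: bool = False) -> None:
    manifest_path = backup_dir / "manifest.json"
    if not manifest_path.is_file():
        raise ValueError("incomplete backup: manifest.json is missing")
    manifest = json.loads(manifest_path.read_text())
    if target_dir.exists() and any(target_dir.iterdir()):
        if not overwrite:
            raise FileExistsError(f"{target_dir} is not empty; overwrite not requested")
        shutil.rmtree(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(backup_dir / "msgvault.db", target_dir / "msgvault.db")
    shutil.copytree(backup_dir / "attachments", target_dir / "attachments")
    for name in ("config", "tokens"):  # present only if explicitly included
        if (backup_dir / name).is_dir():
            shutil.copytree(backup_dir / name, target_dir / name)
    problems = verify_dir(target_dir, manifest)
    if problems:
        raise RuntimeError("restore failed verification: " + "; ".join(problems))
```

Because the manifest is renamed into place only after everything else exists, a directory without manifest.json can be treated as an incomplete backup. Hardlinking is only safe if attachment files are never modified in place, which content addressing should guarantee. A real restore with --overwrite would probably stage into a temporary directory and swap it in, rather than deleting the target first as this sketch does.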
Acceptance criteria for the PR:
- automated test creates a sample Msgvault state
- backs it up
- restores it into a separate empty directory
- verifies the restored DB and attachments
- confirms derived state is either absent or rebuildable
- confirms restore refuses unsafe overwrite by default
The intent is not to make Msgvault own retention, encryption, remote storage, alerts, or scheduling. Tools like restic/kopia/borg/systemd/cron should still own that. Msgvault would own the correctness boundary: “this directory is a valid, restorable Msgvault backup.”
Would this direction fit the project? If so, I’m happy to work on the initial PR and would appreciate feedback on the proposed backup layout and CLI names before implementing.