Add first-class backup and restore support for Msgvault state #339

@danshapiro

I’d like to propose adding app-owned backup and restore primitives to Msgvault.

The motivation is correctness: an external backup system can handle scheduling, retention, encryption, and storage, but Msgvault itself is the best place to define what a restorable backup actually contains. Without that contract, server backup scripts have to know internal details like which SQLite database is canonical, which files are content-addressed attachments, which caches are rebuildable, and how to verify restored state.

I’d propose implementing backup and restore together, with verification as part of the contract. You haven’t really backed up application state until you can prove it restores cleanly.

Proposed CLI:

msgvault backup create --output DIR [--include-tokens] [--include-config]
msgvault backup verify DIR
msgvault backup restore DIR --target DIR [--overwrite]
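To make the proposed command shape concrete, here is a rough sketch of how the three subcommands and their flags could be parsed. It uses Python's `argparse` purely for illustration; it is not Msgvault's actual CLI code, and the destination names (`backup_dir`, `action`) are hypothetical.

```python
import argparse


def build_parser():
    """Parser mirroring the proposed `msgvault backup ...` subcommands."""
    parser = argparse.ArgumentParser(prog="msgvault")
    commands = parser.add_subparsers(dest="command", required=True)
    backup = commands.add_parser("backup").add_subparsers(dest="action", required=True)

    # msgvault backup create --output DIR [--include-tokens] [--include-config]
    create = backup.add_parser("create")
    create.add_argument("--output", required=True, metavar="DIR")
    create.add_argument("--include-tokens", action="store_true")
    create.add_argument("--include-config", action="store_true")

    # msgvault backup verify DIR
    verify = backup.add_parser("verify")
    verify.add_argument("backup_dir", metavar="DIR")

    # msgvault backup restore DIR --target DIR [--overwrite]
    restore = backup.add_parser("restore")
    restore.add_argument("backup_dir", metavar="DIR")
    restore.add_argument("--target", required=True, metavar="DIR")
    restore.add_argument("--overwrite", action="store_true")
    return parser
```

Tokens, config, and overwrite are all opt-in here, so the default invocation of each subcommand is the safe one.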

Possible backup layout:

DIR/
  manifest.json
  msgvault.db
  attachments/
  config/
  tokens/

manifest.json would include:

  • Msgvault version
  • backup format version
  • database schema/user_version
  • backup timestamp
  • selected options, such as whether tokens/config were included
  • attachment count and expected hashes
  • database message/account counts
  • list of excluded derived state
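As a starting point for discussion, the manifest could look something like the sketch below. All key names, the format version number, and the list of excluded state names are placeholders I made up for illustration; only the set of fields comes from the list above.

```python
import json
from datetime import datetime, timezone

BACKUP_FORMAT_VERSION = 1  # hypothetical; the proposal does not fix a number yet


def build_manifest(app_version, schema_version, attachment_hashes,
                   message_count, account_count,
                   include_tokens=False, include_config=False):
    """Assemble the proposed manifest fields as a JSON-serializable dict."""
    return {
        "msgvault_version": app_version,
        "backup_format_version": BACKUP_FORMAT_VERSION,
        "schema_user_version": schema_version,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "options": {
            "include_tokens": include_tokens,
            "include_config": include_config,
        },
        "attachments": {
            "count": len(attachment_hashes),
            "sha256": sorted(attachment_hashes),  # sorted for stable diffs
        },
        "counts": {"messages": message_count, "accounts": account_count},
        # Placeholder names for rebuildable state that is left out on purpose.
        "excluded_derived_state": ["analytics", "cache", "index"],
    }


if __name__ == "__main__":
    example = build_manifest("0.0.0-example", 1, ["ab" * 32], 10, 1)
    print(json.dumps(example, indent=2))
```

Recording the selected options and the excluded state in the manifest lets `backup verify` tell "intentionally absent" apart from "missing".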

Proposed behavior:

  • backup create

    • take a consistent SQLite snapshot, likely using the existing VACUUM INTO-style approach
    • copy or hardlink content-addressed attachments into the backup directory
    • include config/tokens only behind explicit flags
    • exclude rebuildable state by default, such as analytics/cache/index data
    • write manifest.json last, after all content is present
    • fail clearly if any referenced attachment is missing or corrupt
  • backup verify

    • run SQLite integrity checks on the backed-up DB
    • validate the manifest
    • verify every referenced attachment exists
    • verify attachment contents match their expected hashes
    • confirm expected table counts / schema metadata match the manifest
  • backup restore

    • restore into a fresh target directory by default
    • refuse to overwrite existing state unless --overwrite is supplied
    • restore DB, attachments, and explicitly included config/token state
    • omit derived caches and rebuild them after restore if needed
    • run the same checks as backup verify against the restored target before reporting success
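The create and verify steps above could be sketched roughly as follows. This is a minimal illustration under loud assumptions, not the real implementation: it assumes attachments live in a flat `attachments/` directory with each file named by its SHA-256 hex digest, and that the database has a hypothetical `attachments(content_hash)` table. Msgvault's actual schema and layout may differ.

```python
import hashlib
import json
import shutil
import sqlite3
from pathlib import Path


def sha256_file(path):
    """Hash a file in chunks so large attachments need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_create(state_dir, out_dir):
    state_dir, out_dir = Path(state_dir), Path(out_dir)
    (out_dir / "attachments").mkdir(parents=True)  # raises if it already exists

    # VACUUM INTO writes a transactionally consistent copy of the live DB.
    src = sqlite3.connect(state_dir / "msgvault.db")
    try:
        src.execute("VACUUM INTO ?", (str(out_dir / "msgvault.db"),))
    finally:
        src.close()

    # Read references from the snapshot so the file list matches the copied DB.
    snap = sqlite3.connect(out_dir / "msgvault.db")
    try:
        user_version = snap.execute("PRAGMA user_version").fetchone()[0]
        # Hypothetical table and column names.
        hashes = [r[0] for r in snap.execute(
            "SELECT DISTINCT content_hash FROM attachments")]
    finally:
        snap.close()

    for h in hashes:
        source = state_dir / "attachments" / h
        if not source.is_file() or sha256_file(source) != h:
            raise RuntimeError(f"attachment missing or corrupt: {h}")
        shutil.copy2(source, out_dir / "attachments" / h)

    manifest = {
        "backup_format_version": 1,  # placeholder
        "schema_user_version": user_version,
        "attachment_sha256": sorted(hashes),
    }
    # Written last: a directory without a manifest is an incomplete backup.
    (out_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest


def backup_verify(backup_dir):
    backup_dir = Path(backup_dir)
    manifest = json.loads((backup_dir / "manifest.json").read_text())
    db = sqlite3.connect(backup_dir / "msgvault.db")
    try:
        if db.execute("PRAGMA integrity_check").fetchone()[0] != "ok":
            raise RuntimeError("SQLite integrity check failed")
        if db.execute("PRAGMA user_version").fetchone()[0] != manifest["schema_user_version"]:
            raise RuntimeError("schema version does not match manifest")
    finally:
        db.close()
    for h in manifest["attachment_sha256"]:
        path = backup_dir / "attachments" / h
        if not path.is_file() or sha256_file(path) != h:
            raise RuntimeError(f"attachment missing or corrupt: {h}")
    return True
```

Reading the attachment list from the snapshot rather than the live database keeps the manifest consistent with the copied DB even if new messages arrive during the backup.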

Acceptance criteria for the PR:

  • an automated test creates a sample Msgvault state
  • the test backs that state up
  • the test restores the backup into a separate empty directory
  • the test verifies the restored DB and attachments
  • the test confirms derived state is either absent or rebuildable
  • the test confirms restore refuses an unsafe overwrite by default
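For the overwrite-safety criterion in particular, the restore guard and its check might look roughly like this. It is only a sketch: `backup_restore` and `check_restore_round_trip` are hypothetical names, the files are stand-ins rather than real Msgvault state, and config/token restore is left out.

```python
import shutil
import tempfile
from pathlib import Path


def backup_restore(backup_dir, target_dir, overwrite=False):
    """Copy a backup into target_dir, refusing to clobber existing state."""
    backup_dir, target_dir = Path(backup_dir), Path(target_dir)
    # The manifest is written last, so its absence means an incomplete backup.
    if not (backup_dir / "manifest.json").is_file():
        raise ValueError(f"{backup_dir} has no manifest.json")
    if target_dir.exists() and any(target_dir.iterdir()):
        if not overwrite:
            raise FileExistsError(
                f"{target_dir} is not empty; pass overwrite=True to replace it")
        shutil.rmtree(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(backup_dir / "msgvault.db", target_dir / "msgvault.db")
    shutil.copytree(backup_dir / "attachments", target_dir / "attachments")
    return target_dir


def check_restore_round_trip():
    """Acceptance-style check using placeholder files."""
    tmp = Path(tempfile.mkdtemp())
    backup = tmp / "backup"
    (backup / "attachments").mkdir(parents=True)
    (backup / "msgvault.db").write_bytes(b"placeholder db bytes")
    (backup / "attachments" / "abc123").write_bytes(b"placeholder attachment")
    (backup / "manifest.json").write_text("{}")

    target = tmp / "restored"
    backup_restore(backup, target)
    assert (target / "msgvault.db").read_bytes() == b"placeholder db bytes"
    assert (target / "attachments" / "abc123").is_file()

    # A second restore into the now-populated target must be refused.
    try:
        backup_restore(backup, target)
    except FileExistsError:
        refused = True
    else:
        refused = False
    assert refused

    # An explicit overwrite is allowed.
    backup_restore(backup, target, overwrite=True)
    assert (target / "msgvault.db").is_file()
    return True
```

A real test would build the sample state through Msgvault itself and run the full verification against the restored directory, but the refusal-by-default shape would be the same.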

The intent is not to make Msgvault own retention, encryption, remote storage, alerts, or scheduling. Tools like restic/kopia/borg/systemd/cron should still own that. Msgvault would own the correctness boundary: “this directory is a valid, restorable Msgvault backup.”

Would this direction fit the project? If so, I’m happy to work on the initial PR and would appreciate feedback on the proposed backup layout and CLI names before implementing.
