media archive meets automation

warnason 0b43c7c4dd Add mama-apply: materialize blobs to CAS, hardlink to views	2026-05-25 22:15:33 +02:00

- archive.py: CAS path layout (blobs/<2>/<2>/<hash>), view paths scoped by source_kind, hardlink helpers
- applier.py: cursor-paginated apply_pending with dry-run support
- mama-apply CLI with progress and --dry-run

Note: cross-dataset hardlinks fall back to copy (POSIX limitation), so applying from /mnt/preview to /mnt/archive currently doubles storage. To be addressed by consolidating sources into the archive dataset or by introducing --delete-source for move semantics.

alembic	Rescan idempotency + mama-dev (reset, stats)	2026-05-25 21:19:30 +02:00
src/mama	Add mama-apply: materialize blobs to CAS, hardlink to views	2026-05-25 22:15:33 +02:00
tests	Add project skeleton: config, models, CLI, BLAKE3 scanner	2026-05-25 18:30:30 +02:00
.gitignore	Initial commit: README, MIT license, gitignore	2026-05-25 18:30:30 +02:00
.python-version	Add project skeleton: config, models, CLI, BLAKE3 scanner	2026-05-25 18:30:30 +02:00
alembic.ini	Add Alembic with initial schema migration (blobs, observations)	2026-05-25 18:30:30 +02:00
LICENSE	Initial commit: README, MIT license, gitignore	2026-05-25 18:30:30 +02:00
mama.toml.example	Add project skeleton: config, models, CLI, BLAKE3 scanner	2026-05-25 18:30:30 +02:00
pyproject.toml	Rescan idempotency + mama-dev (reset, stats)	2026-05-25 21:19:30 +02:00
README.md	Initial commit: README, MIT license, gitignore	2026-05-25 18:30:30 +02:00
uv.lock	Add project skeleton: config, models, CLI, BLAKE3 scanner	2026-05-25 18:30:30 +02:00

README.md

mama

Media Archive Meets Automation — a self-hosted system for ingesting, deduplicating, and organizing personal media (photos, videos, music, documents) on top of ZFS.

⚠️ Project Status

Pre-alpha. Not ready for use by anyone but the author.

Database schema, CLI surface, on-disk layout, and HTTP API are unstable and will change without migration paths.
Most features described below are planned, not implemented.
Documentation lags behind code.

Do not point mama at irreplaceable data. Keep independent backups of anything mama touches.

Concept

mama indexes files placed in configured scan folders, stores file contents in a content-addressed blob store, and exposes them through hardlinked filesystem views consumable by specialized viewers (Immich for photos and video, Navidrome for music, Paperless-ngx for documents).
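A minimal sketch of how a blob path and a hardlinked view could fit together. It is not the project's archive.py. The blobs/<2>/<2>/<hash> layout is taken from the latest commit message; reading the two <2> segments as the first and second pair of hex characters is an assumption. The names cas_path and link_or_copy are hypothetical. The copy fallback mirrors the cross-dataset behaviour noted in that commit, since hardlinks cannot span filesystems.

```python
import errno
import os
import shutil
from pathlib import Path


def cas_path(archive_root: Path, digest_hex: str) -> Path:
    """Build blobs/<2>/<2>/<hash> under archive_root (assumed fan-out scheme)."""
    if len(digest_hex) < 4:
        raise ValueError("digest too short for a two-level fan-out")
    return archive_root / "blobs" / digest_hex[:2] / digest_hex[2:4] / digest_hex


def link_or_copy(blob: Path, view: Path) -> str:
    """Hardlink blob into a view path; copy instead when crossing filesystems."""
    view.parent.mkdir(parents=True, exist_ok=True)
    try:
        os.link(blob, view)
        return "hardlink"
    except OSError as exc:
        # EXDEV means source and target live on different filesystems/datasets.
        if exc.errno != errno.EXDEV:
            raise
        shutil.copy2(blob, view)
        return "copy"
```

With this shape, a caller can log the returned string to see how often the storage-doubling copy path is taken.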

Identical content is stored only once. Per-file context — original path, source device, scan timestamp, embedded metadata (EXIF, ID3, sidecar files) — is preserved as observations linked to the underlying blob, so duplicates contribute information instead of clutter.
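The blob/observation relationship can be illustrated with a small in-memory model. This is only a sketch: the project keeps this data in PostgreSQL, and the class and field names here (Observation, Blob, Index, observe) are hypothetical and do not reflect the actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Observation:
    """One sighting of some content: where and when it was found."""
    original_path: str
    source_device: str
    scanned_at: datetime
    metadata: dict = field(default_factory=dict)  # e.g. EXIF or ID3 tags


@dataclass
class Blob:
    """Unique content, identified by its digest."""
    digest: str
    size: int
    observations: list[Observation] = field(default_factory=list)


class Index:
    """In-memory stand-in for the blobs and observations tables."""

    def __init__(self) -> None:
        self.blobs: dict[str, Blob] = {}

    def observe(self, digest: str, size: int, obs: Observation) -> Blob:
        # Same digest means same blob; the duplicate only adds an observation.
        blob = self.blobs.setdefault(digest, Blob(digest, size))
        blob.observations.append(obs)
        return blob
```

Scanning the same photo from a phone and from a backup drive would then yield one Blob carrying two Observation entries, each with its own path and metadata.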

Workflow

mama-scan PATH — index files into the database (no copies, no moves)
mama-apply — materialize approved observations into the archive (blob into CAS, hardlink into view)
mama-web — browse, merge duplicates, filter, export, delete
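The scan step above could look roughly like the following read-only walk. This is an illustration, not the project's scanner: hashlib.blake2b from the standard library stands in for BLAKE3 (which needs a third-party package), and the function names hash_file and scan are hypothetical. Nothing is copied or moved; files are only read.

```python
import hashlib
import os
from pathlib import Path
from typing import Iterator


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos never sit fully in memory."""
    h = hashlib.blake2b(digest_size=32)  # stand-in for BLAKE3
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def scan(root: Path) -> Iterator[tuple[Path, int, str]]:
    """Yield (path, size, digest) for every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = Path(dirpath) / name
            # Skip symlinks and special files; only real file content is indexed.
            if path.is_symlink() or not path.is_file():
                continue
            yield path, path.stat().st_size, hash_file(path)
```

Each yielded tuple corresponds to one observation; rows sharing a digest point at the same blob.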

Tech Stack

Python 3.13, FastAPI, SQLAlchemy 2.x (async), Alembic
PostgreSQL 16 (JSONB for embedded metadata)
Vue 3, Vite
ZFS (single archive dataset, snapshots, NFS export), Caddy
ExifTool, BLAKE3, ffmpeg, Pillow
Docker Compose for companion viewers (Immich, Navidrome, Paperless-ngx)

Disclaimer

mama is provided as-is for personal use. The author assumes no responsibility for data loss, corruption, mis-deduplication, accidental deletion, or any other adverse outcome arising from its use. Use at your own risk and only on data you can afford to lose.

License

MIT