media archive meets automation

warnason a84b32d6b0 Rescan idempotency + mama-dev (reset, stats)	2026-05-25 21:19:30 +02:00
- observations now keyed by (path, mtime) for idempotent rescans
- new index ix_observations_path_mtime
- mama-dev reset: truncate all data, schema kept
- mama-dev stats: overview, breakdowns by source_kind/status/hostname/MIME, largest blobs
alembic	Rescan idempotency + mama-dev (reset, stats)	2026-05-25 21:19:30 +02:00
src/mama	Rescan idempotency + mama-dev (reset, stats)	2026-05-25 21:19:30 +02:00
tests	Add project skeleton: config, models, CLI, BLAKE3 scanner	2026-05-25 18:30:30 +02:00
.gitignore	Initial commit: README, MIT license, gitignore	2026-05-25 18:30:30 +02:00
.python-version	Add project skeleton: config, models, CLI, BLAKE3 scanner	2026-05-25 18:30:30 +02:00
alembic.ini	Add Alembic with initial schema migration (blobs, observations)	2026-05-25 18:30:30 +02:00
LICENSE	Initial commit: README, MIT license, gitignore	2026-05-25 18:30:30 +02:00
mama.toml.example	Add project skeleton: config, models, CLI, BLAKE3 scanner	2026-05-25 18:30:30 +02:00
pyproject.toml	Rescan idempotency + mama-dev (reset, stats)	2026-05-25 21:19:30 +02:00
README.md	Initial commit: README, MIT license, gitignore	2026-05-25 18:30:30 +02:00
uv.lock	Add project skeleton: config, models, CLI, BLAKE3 scanner	2026-05-25 18:30:30 +02:00

README.md

mama

Media Archive Meets Automation — a self-hosted system for ingesting, deduplicating, and organizing personal media (photos, videos, music, documents) on top of ZFS.

⚠️ Project Status

Pre-alpha. Not ready for use by anyone but the author.

Database schema, CLI surface, on-disk layout, and HTTP API are unstable and will change without migration paths.
Most features described below are planned, not implemented.
Documentation lags behind code.

Do not point mama at irreplaceable data. Keep independent backups of anything mama touches.

Concept

mama indexes files placed in configured scan folders, stores file contents in a content-addressed blob store, and exposes them through hardlinked filesystem views consumable by specialized viewers (Immich for photos and video, Navidrome for music, Paperless-ngx for documents).

Identical content is stored only once. Per-file context — original path, source device, scan timestamp, embedded metadata (EXIF, ID3, sidecar files) — is preserved as observations linked to the underlying blob, so duplicates contribute information instead of clutter.
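The relationship between blobs and observations can be sketched with an in-memory index. This is a minimal sketch, not mama's schema: the `Index` class and its field names are hypothetical, and BLAKE2b from the standard library stands in for BLAKE3 so the example has no third-party dependencies. Keying observations by (path, mtime) follows the latest commit message.

```python
import hashlib
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


def content_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash file contents in chunks (BLAKE2b here, standing in for BLAKE3)."""
    h = hashlib.blake2b(digest_size=32)
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


@dataclass
class Index:
    # digest -> size in bytes; one entry per distinct content
    blobs: dict[str, int] = field(default_factory=dict)
    # (path, mtime_ns) -> digest; one entry per observed file state
    observations: dict[tuple[str, int], str] = field(default_factory=dict)

    def scan(self, path: Path) -> None:
        st = path.stat()
        key = (str(path), st.st_mtime_ns)
        if key in self.observations:
            return  # same path at the same mtime: rescanning is a no-op
        digest = content_digest(path)
        self.blobs.setdefault(digest, st.st_size)
        self.observations[key] = digest


def demo_scan() -> tuple[int, int]:
    """Scan three files (two with identical bytes) twice; return the counts."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.jpg").write_bytes(b"same bytes")
        (root / "copy_of_a.jpg").write_bytes(b"same bytes")
        (root / "b.jpg").write_bytes(b"other bytes")
        index = Index()
        for _ in range(2):  # second pass shows that rescans add nothing
            for p in sorted(root.iterdir()):
                index.scan(p)
        return len(index.blobs), len(index.observations)
```

Here `demo_scan()` yields two blobs and three observations: the duplicate file adds an observation but no new blob, and the second pass changes nothing.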

Workflow

mama-scan PATH — index files into the database (no copies, no moves)
mama-apply — materialize approved observations into the archive (blob into CAS, hardlink into view)
mama-web — browse, merge duplicates, filter, export, delete
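The apply step (blob into CAS, hardlink into view) could look roughly like the following. It is a sketch under assumptions: the two-character shard directory, the `apply_file` helper, and the directory names are invented for illustration, and BLAKE2b again stands in for BLAKE3. Hardlinks require the CAS and the view to live on the same filesystem, which is consistent with the single archive dataset listed in the tech stack.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


def apply_file(src: Path, cas_root: Path, view_root: Path, view_name: str) -> Path:
    """Copy src into the CAS (if new) and hardlink it into a view; return the blob path."""
    # Reads the whole file at once for brevity; a real tool would hash in chunks.
    digest = hashlib.blake2b(src.read_bytes(), digest_size=32).hexdigest()
    blob = cas_root / digest[:2] / digest  # hypothetical sharded layout
    if not blob.exists():
        blob.parent.mkdir(parents=True, exist_ok=True)
        tmp = blob.parent / (blob.name + ".tmp")
        shutil.copyfile(src, tmp)
        os.replace(tmp, blob)  # atomic rename, so a partial copy is never visible
    link = view_root / view_name
    link.parent.mkdir(parents=True, exist_ok=True)
    if not link.exists():
        os.link(blob, link)  # second directory entry for the same inode
    return blob


def demo_apply() -> bool:
    """Apply one file and check that the view entry and the blob share an inode."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "inbox" / "IMG_0001.jpg"
        src.parent.mkdir()
        src.write_bytes(b"pixels")
        view = root / "views" / "photos"
        blob = apply_file(src, root / "cas", view, "2026/IMG_0001.jpg")
        link = view / "2026" / "IMG_0001.jpg"
        return os.path.samefile(blob, link) and blob.stat().st_nlink == 2
```

Because the view entry is a hardlink rather than a copy, the viewer sees an ordinary file while the content occupies disk space only once.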

Tech Stack

Python 3.13, FastAPI, SQLAlchemy 2.x (async), Alembic
PostgreSQL 16 (JSONB for embedded metadata)
Vue 3, Vite
ZFS (single archive dataset, snapshots, NFS export), Caddy
ExifTool, BLAKE3, ffmpeg, Pillow
Docker Compose for companion viewers (Immich, Navidrome, Paperless-ngx)

Disclaimer

mama is provided as-is for personal use. The author assumes no responsibility for data loss, corruption, mis-deduplication, accidental deletion, or any other adverse outcome arising from its use. Use at your own risk and only on data you can afford to lose.

License

MIT