paperless-ngx + paperless-gpt stack on ps1raf (quadlets + scanner integration)

raf 9c0114aec1 Add paperless-ai (clusterzx) wired to local ollama (llama3.2) Second AI companion alongside paperless-gpt, using the host's local ollama (127.0.0.1:11434 via host.containers.internal) instead of the remote .219. Docs: host ollama 0.0.0.0 drop-in + model pull, scripted /setup, env example. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>		2026-06-21 19:38:08 +02:00
.env.example	initial paperless-ngx + paperless-gpt stack scaffolding (loop task 87)	2026-04-18 21:10:21 +02:00
.gitignore	Add paperless-ai (clusterzx) wired to local ollama (llama3.2)	2026-06-21 19:38:08 +02:00
paperless-ai.env.example	Add paperless-ai (clusterzx) wired to local ollama (llama3.2)	2026-06-21 19:38:08 +02:00
README.md	Add paperless-ai (clusterzx) wired to local ollama (llama3.2)	2026-06-21 19:38:08 +02:00
scan-to-consume.sh	initial paperless-ngx + paperless-gpt stack scaffolding (loop task 87)	2026-04-18 21:10:21 +02:00

README.md

paperless-ngx + paperless-gpt on ps1raf

Document management stack with AI-assisted metadata (titles, tags, correspondents) via ollama + gpt-oss:120b running on 192.168.45.219. Scanner output from the scannerserver service auto-ingests via the shared consume directory.

Web UI: http://ps1raf:8018/ (admin/admin on first start — change this immediately)
paperless-gpt UI: http://ps1raf:8019/ (only useful after API token is set)

Architecture

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ paperless-   │────▶│ paperless-db │     │ paperless-   │
│  webserver   │     │  (postgres)  │     │  gotenberg   │◀─┐
│  :8018       │◀────│              │     │              │  │
└──────┬───────┘     └──────────────┘     └──────────────┘  │
       │                                                     │
       │◀──── redis broker ──────────┐                       │
       │                             │                       │
       ▼                             ▼                       │
┌──────────────┐             ┌──────────────┐                │
│ paperless-   │             │ paperless-   │                │
│  redis       │             │  tika        │────────────────┘
└──────────────┘             └──────────────┘
       ▲
       │  API token
┌──────┴───────┐     HTTP   ┌──────────────────────┐
│ paperless-   │───────────▶│ ollama 192.168.45.219│
│  gpt :8019   │            │ gpt-oss:120b         │
└──────────────┘            └──────────────────────┘

Inputs:
  /home/raf/paperless/consume              ← drop files here manually
  /home/raf/scannerserver/output           ← auto-fed by scannerserver

Layout

/home/raf/paperless/
├── README.md
├── .env.example         # template; copy to .env before starting
├── consume/             # drop zone (bind-mounted as /usr/src/paperless/consume)
│                        # scanner output symlinked in as subdir
├── data/                # search index, classifier model, application state
├── media/               # stored documents (originals, archive versions, thumbnails)
├── export/              # UI exports
├── pgdata/              # postgres data
├── redisdata/           # redis persistence
├── gpt-prompts/         # paperless-gpt prompt templates (editable via its UI)
└── (no Containerfile — all images come from upstream registries)

Quadlets live in ~/.config/containers/systemd/:

paperless.network — shared network
paperless-redis.container
paperless-db.container
paperless-gotenberg.container
paperless-tika.container
paperless-webserver.container
paperless-gpt.container — starts disabled; needs an API token from the webserver first
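
Before the first start it can help to confirm that every unit listed above is actually in place. The following is a minimal sketch; the function name check_quadlets is illustrative, and the directory defaults to the path given above.

```shell
#!/bin/sh
# Report every expected quadlet file that is missing from the given directory.
# Returns 0 if all files exist, 1 otherwise.
check_quadlets() {
    dir="${1:-$HOME/.config/containers/systemd}"
    missing=0
    for unit in paperless.network paperless-redis.container paperless-db.container \
                paperless-gotenberg.container paperless-tika.container \
                paperless-webserver.container paperless-gpt.container; do
        if [ ! -f "$dir/$unit" ]; then
            echo "missing: $dir/$unit" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A possible use is `check_quadlets && systemctl --user daemon-reload`, so the reload only happens when nothing is missing.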

First-start flow

# 1) Prepare .env
cp /home/raf/paperless/.env.example /home/raf/paperless/.env
python3 -c 'import secrets; print(secrets.token_urlsafe(48))'  # for PAPERLESS_SECRET_KEY
openssl rand -hex 16                                            # for POSTGRES_PASSWORD
vi /home/raf/paperless/.env                                     # paste both in

# 2) Start core stack
systemctl --user daemon-reload
systemctl --user start paperless-redis paperless-db paperless-gotenberg paperless-tika
sleep 5
systemctl --user start paperless-webserver

# 3) Watch the first-run migrations + admin creation
journalctl --user -u paperless-webserver -f
# wait until you see "Listening at: http://0.0.0.0:8000"

# 4) Log in at http://ps1raf:8018/ as admin/admin and IMMEDIATELY change the password.
#    Then Profile → "API Auth Tokens" → create → copy the token.

# 5) Paste the token into PAPERLESS_API_TOKEN in /home/raf/paperless/.env

# 6) Start the GPT sidecar
systemctl --user start paperless-gpt
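
Steps 1 and 5 above edit .env by hand. If a non-interactive variant is preferred, a small helper along these lines could set the values instead. This is a sketch, not part of the repo: set_env_var is a hypothetical name, and it assumes .env uses plain KEY=VALUE lines.

```shell
#!/bin/sh
# Set KEY=VALUE in an env file. Any existing assignment of KEY is dropped
# and the new assignment is appended at the end of the file.
set_env_var() {
    file="$1"; key="$2"; value="$3"
    tmp="$(mktemp)" || return 1
    # grep exits non-zero when no line survives the filter; that is not an error here.
    grep -v "^${key}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"    # overwrite in place so the file keeps its permissions
    rm -f "$tmp"
}

# Example use (commented out so that sourcing this file changes nothing):
#   env_file=/home/raf/paperless/.env
#   set_env_var "$env_file" PAPERLESS_SECRET_KEY \
#       "$(python3 -c 'import secrets; print(secrets.token_urlsafe(48))')"
#   set_env_var "$env_file" POSTGRES_PASSWORD "$(openssl rand -hex 16)"
#   set_env_var "$env_file" PAPERLESS_API_TOKEN "<TOKEN>"
```

Appending rather than editing in place avoids having to escape the value for sed, at the cost of moving the key to the end of the file.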

Scanner integration

scannerserver writes to /home/raf/scannerserver/output/. A symlink under paperless's consume directory picks these up:

ln -sfn /home/raf/scannerserver/output /home/raf/paperless/consume/from-scanner

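PAPERLESS_CONSUMER_RECURSIVE=true makes paperless walk the subdirectory. Scans appear in paperless within ~5 seconds of being finalized. Because the consume directory is bind-mounted into the container, the link only resolves there if /home/raf/scannerserver/output is also mounted at the same path inside the container.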
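
If scans stop showing up, a quick first check is whether the symlink still exists and points at a real directory. A minimal sketch, with check_scanner_link as an illustrative name and the link path from above as the default:

```shell
#!/bin/sh
# Verify that the given path is a symlink and that its target is a directory.
check_scanner_link() {
    link="${1:-/home/raf/paperless/consume/from-scanner}"
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi
    # [ -d ] follows the link, so this fails when the target is missing.
    if [ ! -d "$link" ]; then
        echo "dangling symlink: $link -> $(readlink "$link")" >&2
        return 1
    fi
    echo "ok: $link -> $(readlink "$link")"
}
```

This checks the host's view only. The container's view can be inspected with podman exec and a test on the in-container path; the container name to use depends on the quadlet and is not shown here.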

Backup

For a proper backup, run paperless's documented document_exporter (writing into export/), then tar the export directory together with .env. Do NOT rely on copying media/ and data/ alone: the postgres DB is authoritative for metadata.
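
The export-then-tar procedure could be scripted roughly as follows. Treat it as a sketch under assumptions: the container name paperless-webserver in the default EXPORT_CMD is a guess (quadlet containers are named systemd-<unit> unless ContainerName= is set), and the backups/ output directory is hypothetical.

```shell
#!/bin/sh
# Run the paperless exporter, then archive export/ and .env into one tarball.
# Arguments: stack directory, output directory for the archive.
# EXPORT_CMD may be overridden; the default container name is an assumption.
backup_paperless() {
    stack_dir="${1:-/home/raf/paperless}"
    out_dir="${2:-$stack_dir/backups}"
    export_cmd="${EXPORT_CMD:-podman exec paperless-webserver document_exporter ../export}"
    archive="$out_dir/paperless-$(date +%Y%m%d-%H%M%S).tar.gz"

    mkdir -p "$out_dir" || return 1
    $export_cmd || return 1     # intentionally unquoted: split into command + args
    tar -czf "$archive" -C "$stack_dir" export .env || return 1
    echo "$archive"             # print the archive path for the caller
}
```

The function prints the path of the archive it wrote, so a caller can copy it off-host, for example with `scp "$(backup_paperless)" otherhost:`.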

AI model — note

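gpt-oss:120b on ollama is ~60 GB, and response latency scales with prompt size. For paperless-gpt's lightweight title/tag generation it is overkill but works fine. If latency becomes an issue, set LLM_MODEL to a smaller model (gpt-oss:20b, qwen2.5:7b, etc.) and restart paperless-gpt so it picks up the env change. Note that gpt-oss:120b is a mixture-of-experts model, so a dense model such as llama3.3:70b is smaller on disk but not necessarily faster.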
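
To compare candidate models before changing LLM_MODEL, one option is to time a single non-streaming call against ollama's /api/generate endpoint, whose response reports total_duration in nanoseconds. The sketch below assumes the remote ollama listens on the default port 11434; the function names and the prompt are illustrative.

```shell
#!/bin/sh
# Read an ollama /api/generate JSON response on stdin and print its
# total_duration (reported in nanoseconds) as seconds with one decimal.
ollama_seconds() {
    grep -o '"total_duration": *[0-9]*' | head -n 1 | cut -d: -f2 |
        awk '{ printf "%.1f\n", $1 / 1000000000 }'
}

# Time one generation for the given model.
# Arguments: model name, optional base URL (port 11434 is assumed).
time_model() {
    model="$1"
    base="${2:-http://192.168.45.219:11434}"
    curl -s "$base/api/generate" \
        -d "{\"model\":\"$model\",\"prompt\":\"Suggest a title for an electricity bill.\",\"stream\":false}" |
        ollama_seconds
}
```

Running `time_model gpt-oss:120b` twice separates the cold-load cost from the warm latency; the second number is the one that matters for routine tagging.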

paperless-ai (clusterzx) — local-ollama AI tagging

A second AI companion (distinct from paperless-gpt) that auto-suggests titles, tags, correspondents and document types using the local ollama on this host (not the remote 192.168.45.219 that paperless-gpt uses).

Quadlet: ~/.config/containers/systemd/paperless-ai.container
UI: https://ps1raf.tn.ps1.at:8473/ (TLS via nginx) → loopback 127.0.0.1:8020:3000
App login: admin / adminadmin (created during setup; separate from paperless)
Env: paperless-ai.env (copy from paperless-ai.env.example, not committed)
Data: ai-data/ (persisted config + db, not committed)
LLM: llama3.2:latest on the host ollama, reached via host.containers.internal:11434

Host ollama prerequisites

The host ollama (ollama.service, user ollama, /usr/share/ollama) must:

Listen beyond loopback so containers can reach it — drop-in /etc/systemd/system/ollama.service.d/listen.conf sets OLLAMA_HOST=0.0.0.0:11434.
Have the models present (the store was empty initially): OLLAMA_HOST=http://127.0.0.1:11434 ollama pull llama3.2:latest, and likewise for nomic-embed-text:latest. Note: the raf shell sets OLLAMA_HOST to the remote .219 host, so override it to 127.0.0.1 when pulling; otherwise the models are pulled on the remote host instead of into the LOCAL store.
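
The two prerequisites above might be applied like this. It is a sketch: write_ollama_dropin is a hypothetical helper, and its directory argument exists only so the function can be tried against a scratch directory before touching /etc.

```shell
#!/bin/sh
# Write the listen drop-in described above into the given directory.
write_ollama_dropin() {
    dir="${1:-/etc/systemd/system/ollama.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/listen.conf" <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
EOF
}

# On the host, as root (commented out so that sourcing this file changes nothing):
#   write_ollama_dropin
#   systemctl daemon-reload && systemctl restart ollama
# Then pull the models into the LOCAL store:
#   OLLAMA_HOST=http://127.0.0.1:11434 ollama pull llama3.2:latest
#   OLLAMA_HOST=http://127.0.0.1:11434 ollama pull nomic-embed-text:latest
```

Binding to 0.0.0.0 also exposes ollama on the host's other network interfaces, so a firewall rule restricting port 11434 may be worth considering.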

Scripted setup (skip the web wizard)

The /setup wizard persists ai-data/.env and marks PAPERLESS_AI_INITIAL_SETUP=yes. It can be driven non-interactively (the handler appends /api itself, so pass the paperless URL WITHOUT it):

curl -s -X POST http://127.0.0.1:8020/setup -H 'Content-Type: application/json' -d '{
  "paperlessUrl":"http://paperless-webserver:8000","paperlessToken":"<TOKEN>",
  "paperlessUsername":"admin","aiProvider":"ollama",
  "ollamaUrl":"http://host.containers.internal:11434","ollamaModel":"llama3.2:latest",
  "scanInterval":"*/30 * * * *","activateTagging":true,"activateCorrespondents":true,
  "activateDocumentType":true,"activateTitle":true,"activateCustomFields":false,
  "username":"admin","password":"adminadmin"}'

First inference is slow (cold model load on CPU, ~90s); warm calls take a few seconds.
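
Whether the scripted setup took effect can be checked by looking for the marker the wizard writes. A minimal sketch, assuming ai-data/ lives under /home/raf/paperless and that the value may or may not be quoted in the file; setup_done is an illustrative name.

```shell
#!/bin/sh
# Succeed if the given env file records a completed initial setup.
# The default path is an assumption about where ai-data/ lives.
setup_done() {
    grep -Eq "^PAPERLESS_AI_INITIAL_SETUP=[\"']?yes[\"']?\$" \
        "${1:-/home/raf/paperless/ai-data/.env}"
}
```

For example, `setup_done || echo "setup not completed yet"` can guard a script that would otherwise re-post the setup payload.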