
Self-Hosted Analytics Backup: Plausible, Matomo, Umami

When you run a self-hosted analytics stack, you are the backup plan. There is no ops team, no managed database snapshot policy, no ticket you can open when the disk fills up and ClickHouse stops writing. The SLA is whatever you build into your own cron. Self-hosted analytics backup is the operational layer that almost nobody sets up before they need it — and the people who skip it find out at the worst possible moment.

This guide covers what to back up, the exact commands per tool, automating it with a simple shell pattern, pushing offsite with rclone or restic, and how to actually restore — which is the part most backup guides quietly omit. If you’re running Plausible, Matomo, or Umami on a VPS (Hetzner, DigitalOcean, your own bare metal), this is the ops checklist you want before anything goes wrong.

What actually needs backing up

The assumption that “the database” covers everything is wrong for self-hosted analytics, because each tool spreads its state across multiple datastores. Get the mapping right first.

| Tool | Event / analytics data | Config / users / goals | Secrets / env |
|---|---|---|---|
| Plausible CE | ClickHouse: plausible_events_db | PostgreSQL: plausible_db | .env (SECRET_KEY_BASE, TOTP_VAULT_KEY) |
| Matomo CE | MySQL/MariaDB: matomo (all tables) | Same database: config in matomo_option / config.ini.php | config/config.ini.php, plugin files |
| Umami | PostgreSQL or MySQL: umami | Same database: sites, teams, users, events | .env (DATABASE_URL, APP_SECRET) |

Plausible is the only one with split datastores. Miss ClickHouse and you restore an empty analytics shell with user accounts but zero historical data.

Beyond the databases, three more things belong in your backup scope:

  • Your .env file. Contains the encryption keys that protect session tokens and (for Plausible) TOTP secrets. Losing SECRET_KEY_BASE invalidates all sessions; losing TOTP_VAULT_KEY bricks two-factor for every user. Back it up somewhere separate from the database dump — and ideally encrypted at rest.
  • Your reverse-proxy config. Whether that’s a Caddyfile, nginx vhost, or Traefik labels. The database restores perfectly; then you spend an hour rebuilding the routing from memory. Copy the config into the backup alongside the dumps.
  • Custom plugin files and themes (Matomo only). If you’ve installed Matomo plugins or custom report templates that live on disk, a database dump won’t capture them.

Plausible CE backup

Plausible CE runs as a Docker Compose stack: the app plus two databases. In the official community-edition compose file the database services are plausible_events_db (ClickHouse) and plausible_db (PostgreSQL). The compose file doesn’t pin container_name, so Docker derives the actual container name, something like plausible-plausible_events_db-1. The examples below pass the service name to docker exec as a placeholder. docker exec only accepts a container name or ID, so either substitute the real name from docker ps or run the same command as docker compose exec <service> from your compose directory.
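
One way to sidestep the container-name guesswork is a small wrapper that always goes through docker compose exec. This is a sketch rather than anything Plausible ships: the function name svc_exec and the COMPOSE_DIR default of /opt/plausible are assumptions, so adjust both to your install.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command in a Compose service by service name,
# so the derived container name never matters.
COMPOSE_DIR="${COMPOSE_DIR:-/opt/plausible}"   # assumed install path

svc_exec() {
  local service="$1"
  shift
  # --project-directory makes the call independent of the caller's cwd.
  # -T disables pseudo-TTY allocation, which cron jobs and pipes need.
  docker compose --project-directory "$COMPOSE_DIR" exec -T "$service" "$@"
}

# Example:
#   svc_exec plausible_db psql -U postgres -l
```

Any docker exec <name> command in this guide can be rewritten as svc_exec <service> with the same trailing arguments.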

Backing up ClickHouse

ClickHouse stores all your pageview and custom event data in plausible_events_db. The two primary tables are events_v2 (every event row) and sessions_v2 (session-aggregated data, VersionedCollapsingMergeTree). For a complete backup you want both, but events_v2 is the source of truth — sessions_v2 is derived from the event stream during ingestion.

Two approaches, depending on your setup:

Option A — clickhouse-backup (recommended for large installs)

clickhouse-backup by Altinity is the standard tool for ClickHouse backup/restore. It supports both local and remote storage (S3, GCS, Azure). Install it on the host or run it in a sidecar container:

# Install clickhouse-backup (released as a .tar.gz, Linux amd64)
# Check github.com/Altinity/clickhouse-backup/releases for the current version
CHBK_VER=2.6.0
wget -qO /tmp/clickhouse-backup.tar.gz \
  "https://github.com/Altinity/clickhouse-backup/releases/download/v${CHBK_VER}/clickhouse-backup-linux-amd64.tar.gz"
tar -xzf /tmp/clickhouse-backup.tar.gz -C /tmp
# extract the binary onto your PATH (path inside the archive can vary by version)
find /tmp -name clickhouse-backup -type f -exec install {} /usr/local/bin/clickhouse-backup \;

# Create a config (minimal example; adjust host/port/auth to match your CH container)
mkdir -p /etc/clickhouse-backup
cat > /etc/clickhouse-backup/config.yml <<'EOF'
clickhouse:
  host: 127.0.0.1
  port: 9000
  username: default
  password: ""
  disk_mapping:
    default: /var/lib/clickhouse
general:
  remote_storage: none    # change to s3/gcs/azure for offsite
  backups_to_keep_local: 7
EOF

# Take a backup (creates /var/lib/clickhouse/backup/BACKUP_NAME/)
clickhouse-backup create plausible-$(date +%Y%m%d)

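If ClickHouse runs inside Docker and the data directory is bind-mounted to the host (check your docker-compose.yml volumes section), point the disk mapping at the host path. If it’s a named Docker volume, the backup needs to run where that volume is mounted. The command below assumes the clickhouse-backup binary is available inside the ClickHouse container; the stock ClickHouse image does not include it, so either build it into a custom image or run clickhouse-backup as a sidecar container that shares the volume: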

# Run backup inside the ClickHouse container
docker exec plausible_events_db clickhouse-backup create plausible-$(date +%Y%m%d)

Option B — clickhouse-client dump (simpler, no extra tooling)

For smaller installs or when you don’t want to install clickhouse-backup, export directly with clickhouse-client. ClickHouse supports Native, Parquet, and CSV output formats, among others. Use Native for faithful round-trips:

BACKUP_DIR="/opt/backups/plausible/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Dump events_v2 in ClickHouse Native format (binary, compact)
docker exec plausible_events_db clickhouse-client \
  --query="SELECT * FROM plausible_events_db.events_v2 FORMAT Native" \
  > "$BACKUP_DIR/events_v2.native"

# sessions_v2 — smaller, but useful
docker exec plausible_events_db clickhouse-client \
  --query="SELECT * FROM plausible_events_db.sessions_v2 FINAL FORMAT Native" \
  > "$BACKUP_DIR/sessions_v2.native"

gzip "$BACKUP_DIR/"*.native

The FINAL modifier on sessions_v2 matters: sessions_v2 is a VersionedCollapsingMergeTree table, which means it stores sign-versioned rows that collapse on merge. Without FINAL, the dump may contain uncollapsed duplicates that cause double-counting on restore. See the ClickHouse VersionedCollapsingMergeTree docs for the collapsing semantics.

Backing up PostgreSQL

The Postgres instance holds users, sites, API keys, goals, and all of Plausible’s application state. It’s typically much smaller than ClickHouse. Standard pg_dump works:

BACKUP_DIR="/opt/backups/plausible/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

docker exec plausible_db pg_dump \
  -U postgres plausible_db \
  --format=custom \
  --compress=9 \
  > "$BACKUP_DIR/plausible_postgres.dump"

--format=custom produces a compressed binary archive that pg_restore can use selectively (restore only one table, restore in parallel, etc.). It’s smaller than plain SQL and more flexible on restore. The database name inside the container is usually plausible_db — confirm with docker exec plausible_db psql -U postgres -l.
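
A quick sanity check can catch an empty or truncated-at-birth dump before it gets shipped offsite. The sketch below relies on the fact that custom-format archives begin with the five-byte magic string PGDMP; the function name is_pg_custom_dump is made up for illustration, and the check says nothing about whether the rest of the archive is intact.

```shell
#!/usr/bin/env bash
# Cheap sanity check for a pg_dump --format=custom archive.
# Returns 0 only if the file is non-empty and starts with "PGDMP".
is_pg_custom_dump() {
  local file="$1"
  [ -s "$file" ] || return 1                 # missing or zero bytes
  [ "$(head -c 5 "$file")" = "PGDMP" ]
}

# Example:
#   is_pg_custom_dump "$BACKUP_DIR/plausible_postgres.dump" || echo "dump looks wrong" >&2
#   # Fuller check, where pg_restore is available: list the archive's table of contents
#   pg_restore --list "$BACKUP_DIR/plausible_postgres.dump" | head
```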

Back up the .env file

cp /opt/plausible/.env "$BACKUP_DIR/plausible.env.bak"
# Store this separately from the database dumps — ideally encrypted
# (see the rclone/restic section for encryption at rest)

Matomo backup

Matomo keeps its state in a single MySQL or MariaDB database. Raw visit data, goal conversions, aggregated reports, plugin config, and user accounts all live in one schema. That simplifies backup: one mysqldump covers everything.

The size caveat: on active sites, matomo_log_visit (one row per visit), matomo_log_link_visit_action (one row per action), and matomo_log_conversion can grow to tens of gigabytes within a year. A full dump of a three-year-old instance with moderate traffic can easily exceed 50 GB. Factor that into your backup storage budget before you’re surprised.
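
To see where the gigabytes actually sit before budgeting storage, you can ask information_schema for the largest tables. A minimal sketch: the helper name matomo_table_sizes_sql is hypothetical, matomo_db is the placeholder database name used throughout this guide, and the reported sizes are the server's estimates, not exact on-disk figures.

```shell
#!/usr/bin/env bash
# Print a query listing the ten largest tables (data + indexes, in MB)
# for the given database. The name is interpolated as-is, so pass a trusted value.
matomo_table_sizes_sql() {
  local db="$1"
  printf "SELECT table_name, ROUND((data_length + index_length) / 1024 / 1024) AS size_mb FROM information_schema.tables WHERE table_schema = '%s' ORDER BY size_mb DESC LIMIT 10;" "$db"
}

# Example:
#   mysql --user=matomo_user -p -e "$(matomo_table_sizes_sql matomo_db)"
```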

Full dump with mysqldump

BACKUP_DIR="/opt/backups/matomo/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Replace 'matomo_user' and 'matomo_db' with your actual credentials
mysqldump \
  --user=matomo_user \
  --password=YOUR_PASSWORD \
  --single-transaction \
  --routines \
  --triggers \
  --hex-blob \
  matomo_db \
  | gzip > "$BACKUP_DIR/matomo_full.sql.gz"

The --single-transaction flag takes a consistent snapshot using InnoDB’s MVCC without locking tables — critical for a production database that continues writing during the backup. Without it, you risk a snapshot mid-transaction on the matomo_log_* tables, which can produce an inconsistent state.
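
Passing --password on the command line exposes it in the process list and shell history. A common alternative is a client option file readable only by its owner, passed with --defaults-extra-file. The sketch below assumes a hypothetical helper name (write_mysql_optfile) and file path, and it does not escape passwords that contain double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Write a MySQL/MariaDB client option file with owner-only permissions,
# so the password never appears on a command line.
write_mysql_optfile() {
  local file="$1" user="$2" pass="$3"
  (
    umask 077   # new files get mode 0600; the subshell keeps the umask change local
    printf '[client]\nuser=%s\npassword="%s"\n' "$user" "$pass" > "$file"
  )
  chmod 600 "$file"   # tighten a pre-existing file as well
}

# Example (--defaults-extra-file must be the first option given):
#   write_mysql_optfile /root/.matomo-backup.cnf matomo_user 'YOUR_PASSWORD'
#   mysqldump --defaults-extra-file=/root/.matomo-backup.cnf \
#     --single-transaction --routines --triggers --hex-blob matomo_db \
#     | gzip > "$BACKUP_DIR/matomo_full.sql.gz"
```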

If Matomo runs in Docker, the command runs inside the container:

docker exec matomo_db mysqldump \
  --user=matomo_user \
  --password=YOUR_PASSWORD \
  --single-transaction \
  matomo_db \
  | gzip > "$BACKUP_DIR/matomo_full.sql.gz"

Partial dump: skip raw logs, keep aggregated reports

If the full dump is too large or too slow, Matomo’s data has a two-tier structure you can exploit. The raw log tables (matomo_log_visit, matomo_log_link_visit_action, matomo_log_conversion, matomo_log_conversion_item) are the source of truth, but Matomo aggregates them into pre-computed report tables (matomo_archive_*) whenever archiving runs, typically from the core:archive cron job. If you’re willing to accept that a restore won’t let you re-run historical raw queries, you can back up only the archive tables plus the config/user tables and skip the raw logs entirely.
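
The partial dump described above can be sketched with mysqldump's --ignore-table option. Treat this as an illustration under assumptions: the helper name ignore_table_args is made up, the table list is just the four raw log tables named above with the default matomo_ prefix, and matomo_db is a placeholder database name.

```shell
#!/usr/bin/env bash
# Raw log tables to leave out of the data dump (default "matomo_" prefix assumed).
RAW_LOG_TABLES="matomo_log_visit matomo_log_link_visit_action matomo_log_conversion matomo_log_conversion_item"

# Print one --ignore-table=DB.TABLE argument per raw log table.
ignore_table_args() {
  local db="$1" t
  for t in $RAW_LOG_TABLES; do
    printf '%s%s.%s\n' '--ignore-table=' "$db" "$t"
  done
}

# Example:
#   # 1) Schema + data for everything except the raw log tables
#   mysqldump --single-transaction $(ignore_table_args matomo_db) matomo_db \
#     | gzip > "$BACKUP_DIR/matomo_partial.sql.gz"
#   # 2) Schema only for the skipped tables, so a restore recreates them empty
#   mysqldump --no-data matomo_db $RAW_LOG_TABLES \
#     | gzip > "$BACKUP_DIR/matomo_rawlog_schema.sql.gz"
```

The second, schema-only dump matters on restore: without it the raw log tables would not exist at all, and Matomo expects them to be present even when empty.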

Back up config.ini.php and plugins

# Back up the Matomo config (contains DB credentials, salt, plugin settings)
cp /var/www/matomo/config/config.ini.php "$BACKUP_DIR/config.ini.php.bak"

# Back up any custom or third-party plugins not in the default distribution
tar -czf "$BACKUP_DIR/plugins-custom.tar.gz" \
  /var/www/matomo/plugins/YourCustomPlugin/ \
  /var/www/matomo/plugins/AnotherThirdParty/

The [General].salt value in config.ini.php is what Matomo uses to build the Config ID (the per-installation part of the cookieless visitor identifier). If you restore the database without the matching salt, cookieless visitors will generate different Config IDs post-restore, which inflates unique visitor counts for any period that spans the restoration boundary. Back up the config file and restore it alongside the database.

mariabackup for zero-downtime hot backup

For large Matomo databases on MariaDB where mysqldump takes too long or the compressed output is unwieldy, mariabackup (MariaDB’s fork of Percona XtraBackup) creates a physical hot copy of the data directory. It copies data files at disk speed instead of generating SQL row by row, which makes it significantly faster for large datasets. The trade-off: the backup directory is larger (raw InnoDB pages vs compressed SQL) and requires a --prepare step before it’s usable.

# mariabackup — requires installation (apt install mariadb-backup)
mariabackup --backup \
  --user=root \
  --password=YOUR_ROOT_PASSWORD \
  --target-dir="$BACKUP_DIR/mariabackup"

# Then prepare it (applies transaction log, makes backup consistent)
mariabackup --prepare \
  --target-dir="$BACKUP_DIR/mariabackup"

Umami backup

Umami’s data model is simpler than Matomo’s: everything lives in a single PostgreSQL (or MySQL) database. Sessions, events, websites, users — all in one schema. One pg_dump is all you need.

The default Umami Docker Compose stack in the Umami install docs names the Postgres service db (Docker derives the container as umami-db-1) and creates a database called umami. Verify the container name with docker ps before scripting.

BACKUP_DIR="/opt/backups/umami/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# PostgreSQL
docker exec umami-db-1 pg_dump \
  -U umami umami \
  --format=custom \
  --compress=9 \
  > "$BACKUP_DIR/umami_db.dump"

# If you run Umami with MySQL instead of PostgreSQL:
# docker exec umami-db-1 mysqldump \
#   --user=umami \
#   --password=YOUR_PASSWORD \
#   --single-transaction \
#   umami | gzip > "$BACKUP_DIR/umami_db.sql.gz"

The APP_SECRET environment variable in Umami’s .env is used to sign session tokens. Back it up alongside the database dump for the same reason as Plausible’s SECRET_KEY_BASE: restoring the database without the matching secret invalidates all active sessions.

cp /opt/umami/.env "$BACKUP_DIR/umami.env.bak"

Automating it: cron + shell script pattern

Running backup commands by hand is not a backup strategy. It’s something you do twice before forgetting. The right pattern is a shell script that you can call from cron (or a systemd timer), with log output and a simple rotation policy.

Here’s a minimal template for Plausible CE — adapt the commands and paths for Matomo or Umami:

#!/usr/bin/env bash
# /opt/scripts/backup-plausible.sh
set -euo pipefail

BACKUP_ROOT="/opt/backups/plausible"
DATE=$(date +%Y%m%d-%H%M%S)
BACKUP_DIR="${BACKUP_ROOT}/${DATE}"
LOG="${BACKUP_ROOT}/backup.log"
KEEP_DAYS=14  # local retention
FAILED=0      # flipped to 1 when any dump step fails

mkdir -p "$BACKUP_DIR"
echo "[$(date)] Starting backup" | tee -a "$LOG"

# --- ClickHouse ---
docker exec plausible_events_db clickhouse-client \
  --query="SELECT * FROM plausible_events_db.events_v2 FORMAT Native" \
  | gzip > "${BACKUP_DIR}/events_v2.native.gz" \
  && echo "[$(date)] ClickHouse events_v2: OK" | tee -a "$LOG" \
  || { echo "[$(date)] ClickHouse events_v2: FAILED" | tee -a "$LOG"; FAILED=1; }

docker exec plausible_events_db clickhouse-client \
  --query="SELECT * FROM plausible_events_db.sessions_v2 FINAL FORMAT Native" \
  | gzip > "${BACKUP_DIR}/sessions_v2.native.gz" \
  && echo "[$(date)] ClickHouse sessions_v2: OK" | tee -a "$LOG" \
  || { echo "[$(date)] ClickHouse sessions_v2: FAILED" | tee -a "$LOG"; FAILED=1; }

# --- PostgreSQL ---
docker exec plausible_db pg_dump \
  -U postgres plausible_db \
  --format=custom \
  --compress=9 \
  > "${BACKUP_DIR}/plausible_postgres.dump" \
  && echo "[$(date)] PostgreSQL: OK" | tee -a "$LOG" \
  || { echo "[$(date)] PostgreSQL: FAILED" | tee -a "$LOG"; FAILED=1; }

# --- .env ---
cp /opt/plausible/.env "${BACKUP_DIR}/plausible.env.bak"

# --- Rotate old local backups, but only after a fully successful run ---
# (otherwise two weeks of silent failures would rotate away every good backup)
if [ "$FAILED" -ne 0 ]; then
  echo "[$(date)] Backup finished with errors, rotation skipped: ${BACKUP_DIR}" | tee -a "$LOG"
  exit 1
fi
# -mindepth 1 keeps find from ever matching BACKUP_ROOT itself
find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d -mtime "+${KEEP_DAYS}" -exec rm -rf {} \; 2>/dev/null || true
echo "[$(date)] Backup complete: ${BACKUP_DIR}" | tee -a "$LOG"

Then make the script executable and schedule it:

# Make it executable
chmod +x /opt/scripts/backup-plausible.sh

# Add to root's crontab (runs every day at 02:30)
crontab -e
# Add this line:
# 30 2 * * * /opt/scripts/backup-plausible.sh

A few things worth noting about this pattern:

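  • set -euo pipefail aborts the script on any unhandled error or unset variable, and makes a pipeline fail when any stage fails rather than only the last one. Without pipefail, a failed docker exec piped into gzip still counts as success (gzip happily compresses empty input), so the script writes an empty .gz file and logs OK.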
  • The explicit || echo "FAILED" on each database command means the log contains the actual failure location when something goes wrong at 2:30 AM.
  • KEEP_DAYS=14 gives you a two-week local history. Adjust based on your disk capacity. For a Hetzner CX22 with 40 GB root disk, 7 days is more realistic if your analytics database is large.
  • This script writes dumps locally. The offsite step (next section) is separate.
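
Since a full disk is the failure this whole guide starts from, it can be worth refusing to run a backup when the target filesystem is nearly full. A minimal sketch, assuming a hypothetical helper named has_free_space and a threshold you pick yourself (the 5 GB in the example is arbitrary):

```shell
#!/usr/bin/env bash
# Succeeds if the filesystem holding DIR has at least MIN_KB kilobytes available.
has_free_space() {
  local dir="$1" min_kb="$2" avail_kb
  # df -P prints one fixed-format line per filesystem; with -k, column 4
  # of the second line is the available space in kilobytes.
  avail_kb="$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')"
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$min_kb" ]
}

# Example, near the top of the backup script:
#   has_free_space /opt/backups $((5 * 1024 * 1024)) \
#     || { echo "[$(date)] Low disk space, backup skipped" | tee -a "$LOG"; exit 1; }
```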

Offsite copy with rclone or restic

Local backups protect against botched upgrades and accidental data deletion. They do not protect against the disk dying, the VPS provider having an incident, or someone with root access (including yourself) running rm -rf in the wrong directory. Offsite means a physically separate location: an S3-compatible bucket, Backblaze B2, or a Hetzner Storage Box.

Two good tools for this, with different trade-offs:

| Tool | Best for | Encryption | Deduplication |
|---|---|---|---|
| rclone | Simple sync of files/directories to remote storage | Via rclone crypt remote overlay | No (copies full files) |
| restic | Versioned, deduplicated, encrypted backup repository | Built-in (AES-256-CTR + Poly1305-AES) | Yes (content-defined chunking) |

rclone is simpler to set up; restic is better when backup data has significant overlap across runs (common with partial SQL dumps).

rclone to Backblaze B2 or Hetzner Storage Box

# Install rclone
curl https://rclone.org/install.sh | sudo bash

# Configure a remote (interactive wizard)
rclone config

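# Once configured, sync your local backup directory to the remote.
# 'b2remote' is whatever you named the remote during config.
# Note: sync mirrors deletions, so backups rotated out locally are removed from
# the remote too. Use 'rclone copy' if the remote should keep a longer history.
rclone sync /opt/backups/plausible/ b2remote:your-bucket-name/plausible/ \
  --progress \
  --log-file=/opt/backups/rclone.log

# Add the sync command to the end of your backup script, after the local dump completes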

Hetzner Storage Boxes support SFTP, which rclone handles natively. Backblaze B2 is a cheap S3-compatible option — roughly $6/TB/month storage (about $0.006/GB) with low egress fees. For a typical self-hosted analytics backup under 5 GB, storage runs a couple of cents a month.

restic to any S3-compatible backend

# Install restic
apt install restic  # or: brew install restic

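# Initialize a repository (run once; creates the repository structure)
export RESTIC_REPOSITORY="s3:https://s3.eu-central-003.backblazeb2.com/your-bucket"
export RESTIC_PASSWORD="a-strong-passphrase-stored-safely"
# restic's s3: backend reads the AWS variable names, even against B2's S3-compatible API.
# (B2_ACCOUNT_ID / B2_ACCOUNT_KEY only apply to the native b2: backend.)
export AWS_ACCESS_KEY_ID="your-b2-key-id"
export AWS_SECRET_ACCESS_KEY="your-b2-app-key"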

restic init

# Backup the local dump directory into the repository
restic backup /opt/backups/plausible/

# List snapshots
restic snapshots

# Prune: keep 7 daily, 4 weekly, 3 monthly snapshots
restic forget --prune \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 3

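restic’s deduplication is valuable for analytics backups because consecutive daily dumps of the same database share a large percentage of content. restic chunks the data into content-defined segments and only uploads new chunks, which keeps the repository size manageable even at daily frequency. Deduplication works on the bytes restic actually sees, so it is most effective on uncompressed dumps; heavily compressed output (gzip, pg_dump --compress=9) tends to change throughout after small edits and deduplicates poorly.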

Store the RESTIC_PASSWORD somewhere safe and separate from the data it protects: a password manager, a separately stored encrypted note, or a separate secrets store. If you lose the password, the repository is unrecoverable, because restic’s encryption has no master key escrow.

Restoring — the part people skip

A backup you’ve never restored from is an assumption, not a guarantee. The file exists. Whether the restore actually works is a different question. Run through this at least once on a test instance — not when the production database is already gone.
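
Between full restore drills, a cheap automated check can at least catch missing, empty, or corrupt files. The sketch below is not a substitute for an actual restore; the function name verify_backup_dir is hypothetical, and the file names in the example are the ones the backup script in this guide produces.

```shell
#!/usr/bin/env bash
# Check that a backup directory contains every expected file, that none is
# empty, and that each .gz file passes gzip's integrity test.
# Prints one line per problem and returns non-zero if anything is wrong.
verify_backup_dir() {
  local dir="$1"
  shift
  local rc=0 name path
  for name in "$@"; do
    path="$dir/$name"
    if [ ! -s "$path" ]; then
      echo "MISSING_OR_EMPTY $name"
      rc=1
    else
      case "$name" in
        *.gz) gzip -t "$path" 2>/dev/null || { echo "CORRUPT $name"; rc=1; } ;;
      esac
    fi
  done
  return "$rc"
}

# Example:
#   verify_backup_dir /opt/backups/plausible/20240601-023012 \
#     events_v2.native.gz sessions_v2.native.gz plausible_postgres.dump plausible.env.bak
```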

Restoring Plausible CE

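# Stop the Plausible app first (don't restore into a running application).
# Stop only the app service: the database containers must stay up for the restore.
cd /opt/plausible && docker compose stop plausible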

# --- Restore PostgreSQL ---
# Drop and recreate the database (do this inside the pg container)
docker exec -it plausible_db psql -U postgres -c "DROP DATABASE IF EXISTS plausible_db;"
docker exec -it plausible_db psql -U postgres -c "CREATE DATABASE plausible_db;"

# Restore from the custom-format dump. The dump was taken without --create, so it
# restores schema + data into an EXISTING empty database — which is why we DROP/CREATE first.
cat /opt/backups/plausible/20240601-023012/plausible_postgres.dump \
  | docker exec -i plausible_db pg_restore \
    -U postgres \
    -d plausible_db \
    --no-owner \
    --role=postgres

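# --- Restore ClickHouse ---
# If using clickhouse-backup:
docker exec plausible_events_db clickhouse-backup restore plausible-20240601

# If using the Native format dumps: insert into existing, empty tables
# (inserting into tables that still hold data would duplicate rows).
zcat /opt/backups/plausible/20240601-023012/events_v2.native.gz \
  | docker exec -i plausible_events_db clickhouse-client \
    --query="INSERT INTO plausible_events_db.events_v2 FORMAT Native"

# sessions_v2 is filled during ingestion, so restore it from its own dump too
zcat /opt/backups/plausible/20240601-023012/sessions_v2.native.gz \
  | docker exec -i plausible_events_db clickhouse-client \
    --query="INSERT INTO plausible_events_db.sessions_v2 FORMAT Native"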

# --- Start Plausible ---
docker compose up -d
docker compose logs -f plausible

Restoring Matomo

# Create an empty database to restore into
mysql -u root -p -e "DROP DATABASE IF EXISTS matomo_restore; CREATE DATABASE matomo_restore CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"

# Restore from gzipped mysqldump
zcat /opt/backups/matomo/20240601-023012/matomo_full.sql.gz \
  | mysql -u root -p matomo_restore

# Restore config.ini.php first, if you backed it up separately
cp /opt/backups/matomo/20240601-023012/config.ini.php.bak \
   /var/www/matomo/config/config.ini.php

# Then point Matomo at the restored database by editing [database] dbname in config.ini.php
# (doing this before the copy above would just get overwritten):
#   sed -i -E 's/^[[:space:]]*dbname[[:space:]]*=.*/dbname = "matomo_restore"/' /var/www/matomo/config/config.ini.php
# (or rename the restored DB back to the original name and leave config untouched)

Restoring Umami

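# Stop only the app service (assumed to be named umami, as in the default compose file);
# the database container must stay up for the restore
cd /opt/umami && docker compose stop umami

# Drop and recreate the public schema so the restore starts from an empty database
docker exec -it umami-db-1 psql -U umami -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"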

# Restore
cat /opt/backups/umami/20240601-023012/umami_db.dump \
  | docker exec -i umami-db-1 pg_restore \
    -U umami \
    -d umami \
    --no-owner

docker compose up -d

When NOT to self-manage backups

Setting up a backup pipeline takes an hour, but maintaining it — watching the cron logs, checking that the offsite sync actually ran, testing restores periodically — is ongoing. There are setups where the full approach above is overkill.

  • Low-traffic sites where the data is low-stakes. If you’re tracking a personal blog with 500 visits/month and the worst case of losing the data is losing a few months of dashboards, a weekly VPS snapshot (Hetzner charges roughly €0.0119/GB/month for snapshots) covers your actual risk. Set the snapshot, move on.
  • When your VPS provider’s snapshot feature is genuinely sufficient. Hetzner’s VPS snapshots are full disk images. For Plausible or Umami, a daily snapshot covers both the database and the app configuration in one operation. The limitation: snapshots are coarse (full disk), you can’t restore a single table, and they’re stored in the same datacenter. For most personal projects this is fine.
  • When you’d rather pay for managed hosting. Plausible Cloud, Matomo’s hosted tier, or Umami Cloud handle backups as part of the service. If the self-hosting ops overhead isn’t worth it to you, that’s a legitimate trade. The full cost comparison is in the TCO calculator.
  • When your analytics data has no retention requirement. Some teams self-host purely for data sovereignty and GDPR: they don’t want personal data leaving their infrastructure, but they also have a GDPR data minimisation policy that deletes raw visit data after 90 days anyway. In that scenario, a 90-day local dump of modest size with a simple cron is the entire backup strategy — no offsite needed because the data has no long-term value.

The install recipes on this site cover the stack decisions upstream of backups: Plausible CE on Hetzner, Matomo on Hetzner, and Umami on Vercel. Once you know which stack you’re running, the backup approach above applies directly. For the SQL layer, the SQL lab has queries for inspecting your ClickHouse and PostgreSQL data volumes before sizing your backup storage.

FAQ

How often should I back up a self-hosted analytics instance?

For most self-hosted setups, daily is the right default. Analytics data is append-only (new events only), which means the gap between backups is the data you’d lose in a failure. Daily backups mean your worst case is losing 24 hours of data. If that’s acceptable, and for most sites it is, a daily cron run (02:30 in the example crontab line) covers it. For production setups where data loss is more consequential, consider dumping ClickHouse and PostgreSQL every few hours in addition to the daily full backup.

Where should I store the backup files?

Two locations: local (fast restore, same server) and offsite (disaster recovery, different infrastructure). Local means a separate directory on the same VPS with a rotation policy (14 days is a good default). Offsite means an S3-compatible bucket or storage service in a different region: Backblaze B2, Hetzner Storage Box, or Cloudflare R2. The two-location pattern means a single failure — disk failure, accidental deletion, VPS provider outage — doesn’t eliminate both copies.

Should I encrypt the backup files?

Yes, for anything going offsite, especially GDPR-regulated analytics data. Privacy-focused tools like Plausible and Umami hash IP addresses, but hashed identifiers combined with session data can still be considered personal data under GDPR in some interpretations, and storing plaintext database dumps on a third-party cloud service means transferring that data to a third party. Use restic (built-in AES-256 encryption) or rclone crypt for remote storage. For local backups, disk-level encryption (LUKS) on the VPS is reasonable but not strictly required if the VPS provider has physical access controls.

Is a VPS snapshot the same as a database backup?

It covers the same data but is not the same thing. A VPS snapshot is a full disk image: it captures everything at once, including the OS, application files, and all databases. The advantages: zero configuration, a single point-in-time image, fast to set up. Keep in mind that a snapshot taken while the server is running is only crash-consistent, so the databases recover from it as they would after a sudden power loss. The limitations: snapshots are coarse (you can’t restore a single table), they’re stored in the same datacenter as your VPS (one provider incident = both gone), and restore requires spinning up the entire instance rather than piping a dump file into a running database. For low-traffic personal sites, a daily VPS snapshot is entirely reasonable. For sites where you’d want to restore just the analytics database while keeping the rest of the server running, a database-level dump is necessary.

Plausible CE has two databases. Do I really need to back up ClickHouse?

Yes. The PostgreSQL database contains users, sites, and goals — the application configuration. ClickHouse contains the actual analytics data: every pageview, every custom event, every revenue entry. Restoring only PostgreSQL gives you a working Plausible installation with no historical data. Restoring only ClickHouse gives you data you can’t access because the user accounts and site configs are missing. Both are required for a functional restore. ClickHouse backups are typically larger than PostgreSQL for any site with more than a few months of events.

What are the GDPR implications for storing analytics backup data?

Analytics backup data is subject to the same GDPR rules as the live data. If your retention policy says “delete raw event data after 12 months,” that policy applies to backups too — a backup containing 3-year-old events you’d have deleted from the live database is a GDPR liability. Two things help: first, use your tool’s built-in retention controls where they exist (in Matomo, Administration → Privacy → Anonymize data has “Regularly delete old visitor logs” and “delete old reports” toggles). Plausible and Umami don’t expose a built-in raw-log retention toggle, so prune on a schedule — a dated DELETE/ALTER TABLE … DELETE against old rows run from cron. Second, apply the same rotation policy to backups — if you delete raw events after 12 months in production, your backup retention should not exceed 12 months either.

Can I back up only part of the Plausible ClickHouse data to save space?

Yes — query by date range. The events_v2 table has a timestamp column you can filter on. Backing up the last 90 days of events is a valid strategy if you don’t need to restore historical data beyond that window. Just be explicit about what you’re trading off: any failure followed by a restore will show a blank dashboard for everything older than your backup window. Match your backup window to your actual recovery requirements, not to your disk budget.
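
A date-windowed export might look like the sketch below. The helper name events_window_query is hypothetical, and the 90-day window in the example is only the figure used above; the database, table, and timestamp column names are the ones this guide already relies on.

```shell
#!/usr/bin/env bash
# Build a ClickHouse export query covering the last N days of events_v2.
events_window_query() {
  local days="$1"
  case "$days" in
    ''|*[!0-9]*) echo "days must be a whole number" >&2; return 1 ;;
  esac
  printf 'SELECT * FROM plausible_events_db.events_v2 WHERE timestamp >= now() - INTERVAL %s DAY FORMAT Native' "$days"
}

# Example:
#   docker exec plausible_events_db clickhouse-client \
#     --query="$(events_window_query 90)" \
#     | gzip > "$BACKUP_DIR/events_v2_last90d.native.gz"
```

Validating the argument before it reaches the query keeps a typo in a cron line from producing a malformed statement and an empty dump.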

