How to Archive a Website with Visual Backups

Set up automated visual website backups using the screenshot API. Capture every page as PNG and PDF, store in S3 or Google Drive, and maintain a timestamped archive.

Why visual backups matter

HTML source and database dumps capture data — but not how the page looked. After a hack, a botched deploy, or a legal dispute, you need timestamped visual evidence of exactly what visitors saw. A visual website backup is a screenshot (or PDF) of every page, stored in your cloud storage, captured automatically on a schedule.

Architecture overview

  1. Discover pages — use the website crawl API to find every URL on your domain, or maintain a URL list from your sitemap.
  2. Capture at intervals — schedule daily, weekly, or monthly batch captures via cron or a scheduler.
  3. Store with date structure — deliver to S3 or Google Drive with date-based paths so you can browse by date.
  4. Retain and rotate — set lifecycle rules on your storage bucket to archive or delete old captures.
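For step 1, if you maintain the URL list from your sitemap, the extraction can be sketched as follows. This is a minimal example, not the crawl API: `extractSitemapUrls` and the sample sitemap are hypothetical names, and the regex approach assumes a plain `<urlset>` sitemap (sitemap index files and CDATA sections are not handled).

```javascript
// Sketch: turn sitemap XML into the one-URL-per-line list a batch job can read.
// Assumes a plain <urlset> sitemap; sitemap index files and CDATA are not handled.
function extractSitemapUrls(xml, domain) {
  const urls = new Set(); // a Set drops duplicate entries
  // Each page URL in a sitemap sits inside a <loc> element.
  for (const match of xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)) {
    const raw = match[1].replace(/&amp;/g, "&"); // undo the most common XML escape
    let parsed;
    try {
      parsed = new URL(raw);
    } catch {
      continue; // skip entries that are not absolute URLs
    }
    // Keep only pages on the domain being archived (including subdomains).
    if (parsed.hostname === domain || parsed.hostname.endsWith("." + domain)) {
      urls.add(parsed.href);
    }
  }
  return [...urls].sort();
}

// Hypothetical sample input for illustration.
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
  <url><loc>https://other.test/page</loc></url>
</urlset>`;

const urlList = extractSitemapUrls(sampleSitemap, "example.com").join("\n");
console.log(urlList);
```

Writing `urlList` to a text file (for example with `fs.writeFileSync`) gives you a list you can regenerate before each scheduled capture, so new pages are picked up automatically.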

Setting up the schedule

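Use a cron job, GitHub Action, or Zapier schedule trigger to call the batch API daily. A crontab entry must fit on a single line, so with cron it is simplest to put the request in a small script and schedule the script:

#!/bin/sh
# visual-backup.sh (example path below); API_KEY must be set in this script's environment
curl -X POST "https://api.screenshotcenter.com/api/batch/create" \
  -H "X-API-KEY: $API_KEY" \
  -F "urls=@/data/sitemap-urls.txt" \
  -F 'options={"full_page":true,"pdf":true,"apps":[{"app":"s3","bucket":"website-backups","path":"{yyyy}/{mm}/{dd}/{domain}/{id}"}]}'

# Crontab entry: daily at 06:00 in the cron daemon's time zone (UTC if the server clock is UTC)
0 6 * * * /usr/local/bin/visual-backup.sh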

Date-organized storage

The {yyyy}/{mm}/{dd} path template creates a folder structure like:

website-backups/
  2026/02/19/example.com/abc123.png
  2026/02/19/example.com/abc123.pdf
  2026/02/20/example.com/def456.png
  ...

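This makes it easy to compare yesterday's version with today's: open the two date folders side by side. Because each file is named after its capture ID, which changes on every run, pair the files by page URL rather than by file name.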
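To locate the folders for two consecutive days, you can compute the prefixes yourself. The sketch below only mirrors the placeholder pattern for illustration; the capture service does the real expansion. `expandPathTemplate` and `datePrefixes` are hypothetical helpers, and the use of UTC dates is an assumption.

```javascript
// Illustration only: how the placeholders in the path template map to a
// concrete storage key. Dates are read in UTC, which is an assumption here.
function expandPathTemplate(template, { date, domain, id }) {
  const values = new Map([
    ["yyyy", String(date.getUTCFullYear())],
    ["mm", String(date.getUTCMonth() + 1).padStart(2, "0")],
    ["dd", String(date.getUTCDate()).padStart(2, "0")],
    ["domain", domain],
    ["id", id],
  ]);
  // Replace known {name} placeholders; leave unknown or unset ones untouched.
  return template.replace(/\{(\w+)\}/g, (whole, name) => values.get(name) ?? whole);
}

// Prefixes for the previous day and the given day, in that order,
// useful for listing both date folders side by side.
function datePrefixes(day) {
  const previous = new Date(day.getTime() - 24 * 60 * 60 * 1000);
  return [previous, day].map((d) =>
    expandPathTemplate("{yyyy}/{mm}/{dd}/", { date: d })
  );
}

const exampleKey = expandPathTemplate("{yyyy}/{mm}/{dd}/{domain}/{id}", {
  date: new Date(Date.UTC(2026, 1, 19)), // months are zero-based: 1 = February
  domain: "example.com",
  id: "abc123",
});
console.log(exampleKey);
console.log(datePrefixes(new Date(Date.UTC(2026, 1, 20))));
```

Listing the objects under each prefix (with your storage provider's own tooling) then gives you the two sets of captures to compare.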

Adding timestamps to screenshots

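For compliance and legal purposes, you may want a visible timestamp on each screenshot. Use the script_inline parameter to inject JavaScript that adds a timestamp overlay before capture. The script runs in the browser that renders the page, so the overlay shows the capture time, and toISOString() always reports it in UTC: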

// Injected via script_inline parameter
const ts = document.createElement('div');
ts.textContent = new Date().toISOString();
ts.style.cssText = 'position:fixed;bottom:8px;right:8px;background:#000;color:#fff;padding:4px 8px;font-size:12px;z-index:99999;border-radius:4px;';
document.body.appendChild(ts);
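The overlay script contains single quotes, which makes it awkward to paste into a single-quoted shell argument. One way around that is to build the options JSON in code. The sketch below assumes that script_inline is passed as a string inside the same options object as full_page and pdf; that placement is an assumption, so check the API reference before relying on it. `buildCaptureOptions` is a hypothetical helper.

```javascript
// Assumption: script_inline travels as a string inside the same options JSON
// that carries full_page and pdf. Verify the placement in the API reference.
const overlayScript = `
  const ts = document.createElement('div');
  ts.textContent = new Date().toISOString();
  ts.style.cssText = 'position:fixed;bottom:8px;right:8px;background:#000;color:#fff;padding:4px 8px;font-size:12px;z-index:99999;border-radius:4px;';
  document.body.appendChild(ts);
`;

function buildCaptureOptions(script) {
  // Collapse the script to one line. This is safe here because every
  // statement ends with a semicolon.
  const oneLine = script
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean)
    .join(" ");
  // JSON.stringify takes care of escaping quotes inside the script.
  return JSON.stringify({ full_page: true, pdf: true, script_inline: oneLine });
}

const optionsJson = buildCaptureOptions(overlayScript);
console.log(optionsJson);
```

The resulting string can be sent as the options form field from any HTTP client, which avoids shell quoting problems entirely.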

PDF + PNG dual capture

For archival-grade backups, capture both PNG (visual reference) and PDF (print-ready, text-searchable) in the same batch. The PDF preserves selectable text and links, while the PNG provides exact pixel rendering.

Monitoring and alerting

After each batch completes, check the results CSV for failures. Send a Slack notification via the Slack integration if any URLs failed, so your team can investigate before the next scheduled run.
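The failure check can be sketched as below. The layout of the results CSV is not described here, so the example assumes a hypothetical format: a header row with url, status and error columns, and "finished" as the success value. Adjust the column names and the success value to match the real file. The small parser handles quoted fields but not line breaks inside a field.

```javascript
// Split one CSV line into fields, honouring double-quoted fields.
function parseCsvLine(line) {
  const fields = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { current += '"'; i++; } // escaped quote
      else if (ch === '"') inQuotes = false;
      else current += ch;
    } else if (ch === '"') inQuotes = true;
    else if (ch === ",") { fields.push(current); current = ""; }
    else current += ch;
  }
  fields.push(current);
  return fields;
}

// Return every row whose status differs from the assumed success value.
function findFailures(csvText, successStatus = "finished") {
  const lines = csvText.split(/\r?\n/).filter((l) => l.trim() !== "");
  if (lines.length === 0) return [];
  const header = parseCsvLine(lines[0]).map((h) => h.trim().toLowerCase());
  const urlIdx = header.indexOf("url");
  const statusIdx = header.indexOf("status");
  const errorIdx = header.indexOf("error");
  if (urlIdx === -1 || statusIdx === -1) {
    throw new Error("expected url and status columns");
  }
  return lines.slice(1).map(parseCsvLine)
    .filter((row) => row[statusIdx] !== successStatus)
    .map((row) => ({
      url: row[urlIdx],
      status: row[statusIdx],
      error: errorIdx === -1 ? "" : row[errorIdx] || "",
    }));
}

// Build a short notification text, or null when nothing failed.
function summarize(failures) {
  if (failures.length === 0) return null;
  const items = failures.map(
    (f) => `- ${f.url} (${f.status}${f.error ? ": " + f.error : ""})`
  );
  return `Visual backup: ${failures.length} URL(s) failed\n${items.join("\n")}`;
}

// Hypothetical sample data for illustration.
const sampleCsv = [
  "url,status,error",
  "https://example.com/,finished,",
  '"https://example.com/search?q=a,b",error,"Timeout after 30s"',
].join("\n");

console.log(summarize(findFailures(sampleCsv)));
```

Returning null for a clean run keeps the channel quiet: only send the notification when `summarize` produces a message.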

Next steps

See compliance screenshots for legal archival patterns, and S3 integration for detailed storage configuration.