
Use Case

Crawl and screenshot an entire website with one API call

Run a website crawl screenshot job with a single API call. Provide a domain and start URL — the crawler discovers pages automatically, captures each one, and returns screenshots when done. No sitemap or URL list needed.

  • 1000 max URLs per crawl
  • 80+ countries available
  • Automatic link discovery

How it works

From domain to screenshots in four steps

The crawl API discovers pages by following links within your domain, then screenshots each one. You define the start URL and options once — discovery and capture run automatically.

1

Set domain and start URL

Provide the domain you want to crawl (e.g. example.com) and the starting URL where discovery begins. The crawler will follow links within the domain to discover pages automatically.

2

POST to /crawl/create

Submit a JSON body with url, domain, and max_urls. Optionally set country, size, delay, browser, and other capture options. The API returns a crawl ID immediately.

3

Poll for status

Call /crawl/info with the crawl ID to track progress. The response includes status (processing, finished, cancelled, error), total_discovered, processed, and failed counts.

4

Get screenshots via /crawl/info

When status is "finished", the /crawl/info response includes a screenshots array with each capture — URL, image URL, and metadata. Download or process them as needed.

API Example

Create a crawl and poll for completion

Submit a JSON body to /crawl/create with url, domain, and max_urls. Poll /crawl/info for status and screenshots when status is "finished".

# 1. Create the crawl
curl -X POST "https://api.screenshotcenter.com/api/v1/crawl/create?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "domain": "example.com", "max_urls": 100}'
# => { "data": { "id": 12345, "status": "processing", ... } }

# 2. Poll for status and screenshots
curl "https://api.screenshotcenter.com/api/v1/crawl/info?key=YOUR_API_KEY&id=12345"
# => { "data": { "status": "finished", "processed": 98, "screenshots": [...] } }
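The two curl calls above can be wrapped in a small create-and-poll script. Below is a minimal Python sketch under the endpoint shapes shown in the example; the `post` and `get` HTTP helpers are injected (for instance, thin wrappers around the `requests` library), so the polling logic itself makes no network assumptions:

```python
import time

# Base URL taken from the curl example above.
API = "https://api.screenshotcenter.com/api/v1"

def create_crawl(post, key, url, domain, max_urls):
    """POST /crawl/create and return the new crawl's ID."""
    body = {"url": url, "domain": domain, "max_urls": max_urls}
    return post(f"{API}/crawl/create?key={key}", body)["data"]["id"]

def wait_for_crawl(get, key, crawl_id, interval=5.0, sleep=time.sleep):
    """Poll /crawl/info until the crawl leaves the 'processing' state,
    then return the final info payload (status, counts, screenshots)."""
    while True:
        info = get(f"{API}/crawl/info?key={key}&id={crawl_id}")["data"]
        if info["status"] != "processing":
            return info
        sleep(interval)
```

With `requests`, the helpers could be `post = lambda u, b: requests.post(u, json=b).json()` and `get = lambda u: requests.get(u).json()`.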

Use Cases

What teams use crawl for

Crawl turns manual site audits into a single API call. No sitemap export, no URL list to maintain — just a domain and start URL. Here are the most common crawl workflows teams automate.

🔄 Site migration QA

Crawl your staging or new site before launch and compare screenshots against production. Catch broken links, missing assets, and layout regressions across every discovered page — no sitemap required.

  • Pre-launch visual audit of a migrated site
  • Compare staging vs production page by page
  • Verify redirect chains and internal links

🔍 SEO audits

Discover and screenshot every indexed page on a domain for SEO analysis. Identify thin content, duplicate layouts, and rendering issues. Export screenshots for client reports or internal audits.

  • Full-site crawl for technical SEO review
  • Detect pages with broken or inconsistent rendering
  • Build a visual inventory of all crawlable pages

🖼️ Visual regression testing

Run periodic crawls to establish a visual baseline and detect unintended changes. Compare screenshots over time to catch CSS breaks, layout shifts, or content changes before users notice.

  • Automated visual diffing across the full site
  • Catch layout regressions after deployments
  • Track design consistency across templates

📊 Competitor monitoring

Crawl competitor sites to track their page structure, design updates, and content changes over time. Screenshot key pages on a schedule for competitive intelligence and market research.

  • Monitor competitor homepage and key landing pages
  • Track design and content changes over time
  • Build a visual archive for competitive analysis

⚖️ Compliance archiving

Crawl policy pages, terms of service, and regulatory disclosures across a domain. Build a timestamped visual archive for legal, compliance, and audit purposes without manual page-by-page capture.

  • Archive all policy and legal pages on a schedule
  • Document published disclosures for regulatory filings
  • Timestamped evidence for contract and compliance audits

📂 Content inventory

Generate a visual inventory of every page on a domain. Use crawl discovery to find pages that might not be in your sitemap — orphaned pages, old URLs, or dynamically generated content.

  • Full content audit with screenshot evidence
  • Discover orphaned or forgotten pages
  • Build a visual sitemap for large or complex sites

Crawl Parameters

Key parameters for crawl creation

Required fields are url, domain, and max_urls. All standard screenshot options apply to every page in the crawl.

url (required): Starting URL for the crawl. Must be a full URL (e.g. https://example.com).
domain (required): Domain being crawled. Links outside this domain are not followed.
max_urls (required): Maximum number of pages to screenshot. Between 1 and 1000.
country (optional): ISO country code for browser location. Defaults to "us".
size (optional): "screen" (visible viewport) or "page" (full scrollable page).
screen_width (optional): Browser viewport width in pixels. Default: 1024.
delay (optional): Seconds to wait after page load before capturing.
browser (optional): Browser type: "chromium", "firefox", or "webkit".
hide_ads (optional): Strip ads and cookie banners from all captures.
format (optional): Output format: "png", "jpeg", "webp", or "pdf".
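Putting several of these together, a /crawl/create request body might look like the following (all values are illustrative):

```json
{
  "url": "https://example.com",
  "domain": "example.com",
  "max_urls": 250,
  "country": "de",
  "size": "page",
  "screen_width": 1440,
  "delay": 2,
  "browser": "firefox",
  "hide_ads": true,
  "format": "webp"
}
```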

Start your first crawl today

Get 500 free captures to test the crawl workflow. No credit card required. Provide a domain and have your first crawl running in minutes.

Frequently asked questions

What is website crawl?

Website crawl is an API that discovers and screenshots pages on a domain automatically. You provide a starting URL and domain, and the crawler follows links within that domain to find pages, then captures a screenshot of each one — up to the max_urls limit you set.

How does discovery work?

The crawler starts at your provided URL and follows links (the href attributes of a and area elements) that point to pages on the same domain. It respects robots.txt and skips duplicate URLs. Pages are discovered in breadth-first order, and each discovered page is queued for screenshot capture.
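The breadth-first discovery described above can be sketched as a standard queue-based traversal. This is an illustration of the behaviour, not the crawler's actual code; `fetch_links` stands in for a hypothetical function that returns the same-domain links found on a page:

```python
from collections import deque

def discover(start_url, max_urls, fetch_links):
    """Breadth-first link discovery: visit pages level by level,
    skip already-seen URLs, stop once max_urls pages are queued."""
    seen = {start_url}
    queue = deque([start_url])
    discovered = []
    while queue and len(discovered) < max_urls:
        page = queue.popleft()
        discovered.append(page)  # this page is queued for capture
        for link in fetch_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return discovered
```

Because traversal is breadth-first, pages closest to the start URL are captured first, which is why max_urls tends to cover a site's most prominent pages.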

What's the max_urls limit?

You can request between 1 and 1000 URLs per crawl. This is the maximum number of pages that will be screenshotted. The crawler may discover more URLs than max_urls; it stops capturing once the limit is reached. For sites larger than the limit, run separate crawls with different start URLs.

Can I cancel a crawl?

Yes. Call POST /crawl/cancel with the crawl ID to stop the crawl. The crawler will stop discovering new pages and processing queued jobs. Already completed screenshots remain available — you can still fetch them via /crawl/info.
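A cancel-then-collect flow can be sketched as below. Passing the ID as a query parameter mirrors the /crawl/info call shown earlier, but the exact request shape for /crawl/cancel is an assumption; the `post` and `get` HTTP helpers are injected as in the polling sketch:

```python
# Base URL taken from the curl example earlier on this page.
API = "https://api.screenshotcenter.com/api/v1"

def cancel_and_collect(post, get, key, crawl_id):
    """Cancel a running crawl, then fetch the screenshots that
    completed before the cancel took effect."""
    post(f"{API}/crawl/cancel?key={key}&id={crawl_id}")
    info = get(f"{API}/crawl/info?key={key}&id={crawl_id}")["data"]
    return info["screenshots"]
```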

Are completed screenshots kept on cancel?

Yes. When you cancel a crawl, any screenshots that were successfully captured before the cancel are preserved. The /crawl/info response includes the screenshots array with all completed captures, so you can download or process them even after cancellation.

What domains can I crawl?

You can crawl any publicly accessible domain. The crawler respects robots.txt and follows links only within the domain you specify. Subdomains are treated as separate domains — to crawl blog.example.com and www.example.com, you would run separate crawls for each.
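The exact-host scoping described above (subdomains treated as separate domains) can be illustrated with a small check. This is a sketch of the behaviour, not the crawler's actual implementation:

```python
from urllib.parse import urlparse

def in_crawl_scope(link, crawl_domain):
    """True only when the link's host matches the crawl domain exactly;
    other subdomains of the same site do not match."""
    return urlparse(link).hostname == crawl_domain

# www.example.com and blog.example.com are separate crawl scopes:
in_crawl_scope("https://www.example.com/pricing", "www.example.com")  # True
in_crawl_scope("https://blog.example.com/post", "www.example.com")    # False
```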