Will this scraper get me banned?

It is designed to be polite: it applies per-host rate limiting with a delay between requests and backs off with jitter when requests fail. Whether you avoid a ban still depends on a sane `[concurrency]` setting and the target's tolerance, so start conservatively and respect the site's robots.txt and terms of service.
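As a rough illustration of per-host rate limiting, the sketch below tracks the earliest time each host may be contacted again. It uses only the standard library; the `HostLimiter` type, its `reserve` method, and the 500 ms delay in the usage note are hypothetical names and values chosen for this example, not what the generated scraper necessarily contains.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-host limiter: enforces a minimum gap between
/// requests to the same host, independently for each host.
struct HostLimiter {
    min_delay: Duration,
    next_allowed: HashMap<String, Instant>,
}

impl HostLimiter {
    fn new(min_delay: Duration) -> Self {
        Self { min_delay, next_allowed: HashMap::new() }
    }

    /// Returns how long the caller should wait before contacting `host`
    /// and reserves the following time slot for that host.
    fn reserve(&mut self, host: &str, now: Instant) -> Duration {
        // The slot is either "right now" or the time already reserved
        // by an earlier request to the same host, whichever is later.
        let slot = match self.next_allowed.get(host) {
            Some(&t) if t > now => t,
            _ => now,
        };
        self.next_allowed
            .insert(host.to_string(), slot + self.min_delay);
        slot.duration_since(now)
    }
}
```

With `HostLimiter::new(Duration::from_millis(500))`, three back-to-back calls for the same host yield waits of 0 ms, 500 ms, and 1000 ms, while a different host starts again at 0 ms. In an async scraper the returned duration would typically be passed to a sleep call before the request is sent.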

Why use Rust instead of Python for scraping?

Rust gives you speed, low memory use, and a single binary you can drop on a server with no separate runtime to install. It is most worthwhile when the job runs continuously or at scale; for a one-off script, Python is often simpler.

Does it handle flaky or failing requests?

Yes. It uses reqwest with a retry policy and exponential backoff plus jitter on transient errors, so temporary failures are retried sensibly rather than crashing the run or hammering the origin with immediate retries.
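The delay calculation behind exponential backoff with jitter can be sketched with the standard library alone. The function names, the "full jitter" strategy, and the list of retryable status codes below are assumptions made for illustration; the generated code may choose different values or rely on a retry crate instead.

```rust
use std::time::Duration;

/// Upper bound on the wait before retry number `attempt` (0-based):
/// base * 2^attempt, capped at `max`.
fn backoff_ceiling(base: Duration, max: Duration, attempt: u32) -> Duration {
    // checked_pow and saturating_mul keep large attempt counts from overflowing.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.saturating_mul(factor).min(max)
}

/// "Full jitter": scale the ceiling by a random fraction so that many
/// clients retrying at once do not all hit the origin at the same moment.
/// `random_unit` is a caller-supplied value in [0.0, 1.0]; a real scraper
/// would draw it from a random number generator.
fn jittered_delay(ceiling: Duration, random_unit: f64) -> Duration {
    ceiling.mul_f64(random_unit.clamp(0.0, 1.0))
}

/// Hypothetical policy: retry only statuses that are usually temporary.
fn is_transient(status: u16) -> bool {
    matches!(status, 408 | 429 | 500 | 502 | 503 | 504)
}
```

With a 100 ms base and a 10 s cap, the ceiling grows 100 ms, 200 ms, 400 ms, 800 ms and so on until it reaches the cap, and each actual wait is a random fraction of that ceiling.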

Will the CSS selectors work out of the box?

Not necessarily. The scraper crate depends on the exact HTML structure of `[site]`, so you will likely need to inspect the live page and adjust selectors. The generated code gives you the structure; the selectors need verification.

Claude/ChatGPT Prompt to Build a Concurrent Web Scraper in Rust | AI Prompt Library

What this prompt does

This prompt builds a concurrent, well-behaved web scraper in Rust and returns working code rather than pseudocode. It frames the model as a senior Rust engineer and asks for six things: a tokio async runtime with a bounded worker pool, reqwest with retry and exponential backoff plus jitter, HTML parsing via the scraper crate into typed structs, per-host rate limiting with a polite delay, a clap-based CLI, and graceful shutdown on Ctrl+C that drains in-flight requests.
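To make the idea of a bounded worker pool concrete, here is a minimal sketch that uses OS threads from the standard library instead of tokio, so that it compiles without dependencies. The `run_bounded` function and its peak-tracking counter are hypothetical and exist only to show that no more than `concurrency` jobs run at once; an async version would more likely limit in-flight tasks with a semaphore or a buffered stream.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Runs `job` over every URL using at most `concurrency` workers and
/// returns (results, peak number of jobs that were in flight at once).
/// Results arrive in completion order, not input order.
fn run_bounded<F>(urls: Vec<String>, concurrency: usize, job: F) -> (Vec<String>, usize)
where
    F: Fn(&str) -> String + Sync,
{
    let queue = Mutex::new(VecDeque::from(urls));
    let results = Mutex::new(Vec::new());
    let in_flight = AtomicUsize::new(0);
    let peak = AtomicUsize::new(0);

    // Scoped threads may borrow the locals above; all of them are joined
    // before `thread::scope` returns.
    thread::scope(|s| {
        for _ in 0..concurrency.max(1) {
            s.spawn(|| loop {
                // Take the next URL; the lock is released before the job runs.
                let next = queue.lock().unwrap().pop_front();
                let Some(url) = next else { break };

                let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                let out = job(&url);
                in_flight.fetch_sub(1, Ordering::SeqCst);

                results.lock().unwrap().push(out);
            });
        }
    });

    (results.into_inner().unwrap(), peak.into_inner())
}
```

Because the number of workers is fixed up front, the cap holds no matter how many URLs are queued, which is the property the [concurrency] variable is meant to guarantee.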

Four variables shape the scraper. [site] is the target. [target_data] defines the fields parsed into typed structs. [concurrency] caps the bounded worker pool so you don't overwhelm the origin or your own machine. [output_format] sets how results are emitted, such as newline-delimited JSON. The core discipline is staying polite to the target rather than launching an accidental denial-of-service attack: tuning rate limiting and backoff together is what keeps you from getting IP-banned mid-run. Parsing into typed structs rather than loose strings also means a malformed page surfaces as an explicit parse error at run time instead of silently corrupting your output downstream.
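The benefit of typed structs can be shown with a small sketch. It assumes a hypothetical [target_data] of a product title and price, and it starts from text that has already been extracted from the page; the `Product` struct, `ParseError` enum, and price format are illustrative assumptions, and selector-based extraction is left out.

```rust
#[derive(Debug, PartialEq)]
struct Product {
    title: String,
    price_cents: u64,
}

#[derive(Debug, PartialEq)]
enum ParseError {
    MissingField(&'static str),
    BadPrice(String),
}

/// Converts raw text taken from the page into a typed record, or reports
/// exactly which field was missing or malformed.
fn parse_product(title: Option<&str>, price: Option<&str>) -> Result<Product, ParseError> {
    let title = title
        .map(str::trim)
        .filter(|t| !t.is_empty())
        .ok_or(ParseError::MissingField("title"))?;
    let raw = price.map(str::trim).ok_or(ParseError::MissingField("price"))?;
    let price_cents =
        parse_price_cents(raw).ok_or_else(|| ParseError::BadPrice(raw.to_string()))?;
    Ok(Product { title: title.to_string(), price_cents })
}

/// Parses strings such as "$12.50", "12.5", or "12" into cents.
/// Anything else (for example "N/A") is rejected rather than guessed at.
fn parse_price_cents(raw: &str) -> Option<u64> {
    let digits = raw.strip_prefix('$').unwrap_or(raw);
    if !digits.chars().all(|c| c.is_ascii_digit() || c == '.') {
        return None;
    }
    let (whole, frac) = digits.split_once('.').unwrap_or((digits, "0"));
    if frac.is_empty() || frac.len() > 2 {
        return None;
    }
    let whole: u64 = whole.parse().ok()?;
    let frac_value: u64 = frac.parse().ok()?;
    // A single fractional digit means tenths: "12.5" is 12 dollars 50 cents.
    let frac_cents = if frac.len() == 1 { frac_value * 10 } else { frac_value };
    whole.checked_mul(100)?.checked_add(frac_cents)
}
```

A page whose price reads "N/A" produces `Err(ParseError::BadPrice(..))`, which the caller can log or count, rather than a record with an empty or zero price slipping into the output.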

When to use it

You need a fast, memory-light scraper that ships as a single binary
You want concurrency capped so you don't hammer the origin
You need retry with backoff and jitter for flaky endpoints
You want extracted data parsed into typed structs, not loose strings
You need graceful shutdown that drains in-flight requests on Ctrl+C
You want a CLI to control seed URL, concurrency, and output path

Example output

You get a full Cargo.toml and src/main.rs. The code sets up a tokio runtime with a bounded worker pool capped at your concurrency limit, reqwest with retry and exponential backoff plus jitter on transient errors, scraper-crate parsing of your target data into typed structs, per-host rate limiting with a polite delay, a clap CLI for seed URL, concurrency, and output path, and graceful Ctrl+C shutdown that drains in-flight requests before exiting.
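The "drain in-flight requests" behaviour can be sketched in a deliberately simplified, single-threaded form: a shared flag is checked only between requests, so a request that has already started always finishes. The `run_until_shutdown` function and `RunSummary` struct are hypothetical, and in the generated scraper the flag would presumably be set by a Ctrl+C signal handler rather than by the fetch closure used here to simulate one.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// What happened during a run that may have been interrupted.
struct RunSummary {
    completed: Vec<String>,
    skipped: usize,
}

/// Fetches URLs in order until the list is exhausted or `shutdown` is set.
/// The flag is consulted only before starting a request, never during one,
/// so in-flight work is drained rather than abandoned.
fn run_until_shutdown<F>(urls: &[&str], shutdown: &AtomicBool, mut fetch: F) -> RunSummary
where
    F: FnMut(&str) -> String,
{
    let mut completed = Vec::new();
    for (i, &url) in urls.iter().enumerate() {
        if shutdown.load(Ordering::SeqCst) {
            // Stop taking new work; report how much was left undone.
            return RunSummary { completed, skipped: urls.len() - i };
        }
        completed.push(fetch(url));
    }
    RunSummary { completed, skipped: 0 }
}
```

If the flag is raised while the second of four URLs is being fetched, that second result is still recorded and the remaining two are reported as skipped. A concurrent version applies the same rule per worker and then waits for all workers to finish before closing the output.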

Pro tips

Tune [concurrency] and the backoff together; per-host limits plus exponential backoff with jitter are your main protection against getting IP-banned
Set [concurrency] conservatively at first (e.g. 8 concurrent requests) and raise it only after confirming the target tolerates it
Make [target_data] specific so the typed structs and selectors match the page's real structure
Respect the target's robots.txt and terms; this prompt builds a polite scraper, but the legal and ethical call is yours
Pick [output_format] to match downstream needs; newline-delimited JSON streams well into other tools
Verify the CSS selectors against the live page, since they depend on the exact HTML structure of [site]
Test the Ctrl+C path under load to confirm in-flight requests actually drain before exit, since a botched shutdown can leave a truncated or incomplete output file
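For the newline-delimited JSON tip above, the sketch below writes one record per line using only the standard library. The `url` and `title` fields and both function names are assumptions made for this example, and the hand-written escaper stands in for a JSON serialization crate that a real project would more likely use.

```rust
use std::io::{self, Write};

/// Escapes a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be written as \u00XX.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Writes one record as a single JSON line, then flushes the writer.
/// The line is assembled in memory first and handed over in one call.
fn write_record<W: Write>(out: &mut W, url: &str, title: &str) -> io::Result<()> {
    let line = format!(
        "{{\"url\":\"{}\",\"title\":\"{}\"}}\n",
        json_escape(url),
        json_escape(title)
    );
    out.write_all(line.as_bytes())?;
    out.flush()
}
```

Because every record is self-contained on its own line, downstream tools can process the file as a stream, and a run that stops early still leaves the earlier lines readable.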


Engr Mejba Ahmed
