Chapter 1

Features

This chapter covers the core features and capabilities of Web Scraper.

  • CSS-like selectors — Extract elements by tag, .class, or #id
  • Automatic pagination — Follow next-page links across multiple pages
  • Rate limiting — Configurable delay between requests (be a polite scraper)
  • Proxy rotation — Round-robin through a list of proxy servers
  • Retry with backoff — Exponential backoff on 429/5xx errors
  • Multi-format export — Save to JSON, CSV, or SQLite
  • Deduplication — Content hashing prevents duplicate entries
  • Config file support — Define scrape jobs in JSON for repeatability
  • Metadata tracking — Each result tagged with source URL, page number, timestamp
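Two items in the list above, retry with backoff and deduplication, can be illustrated with a minimal Python sketch that uses only the standard library. This is not the tool's actual implementation: the function names (backoff_delays, should_retry, content_key, dedupe), the 1-second base delay, the 30-second cap, the choice of SHA-256, and the whitespace normalization before hashing are all assumptions made for the example.

```python
import hashlib


def backoff_delays(retries, base=1.0, cap=30.0):
    """Return the wait in seconds before each retry: base * 2**attempt, capped.

    The base and cap defaults are illustrative assumptions.
    """
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def should_retry(status):
    """Retry on HTTP 429 (Too Many Requests) and on any 5xx server error."""
    return status == 429 or 500 <= status <= 599


def content_key(text):
    """Hash whitespace-normalized text (normalization is an assumption)."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(items):
    """Keep the first occurrence of each distinct content hash, in order."""
    seen = set()
    unique = []
    for item in items:
        key = content_key(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


if __name__ == "__main__":
    print(backoff_delays(4))
    print(dedupe(["Widget  A", "Widget A", "Widget B"]))
```

In this sketch the delay doubles after every failed attempt until it reaches the cap, and two entries count as duplicates when their normalized text hashes to the same value. A real scraper would often add random jitter to the delays so that parallel jobs do not retry in lockstep.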

Requirements

  • Python 3.10+
  • No external dependencies (stdlib only)

Chapter 2

Quick Start

Follow this guide to get Web Scraper up and running in your environment.

bash
# Scrape all h2 elements from a page
python src/web_scraper.py --url "https://example.com" --selector "h2"

# Scrape product cards with pagination
python src/web_scraper.py --url "https://example.com/shop" --selector ".product" \
    --follow --next-selector "a.next" --max-pages 5

# Export to CSV
python src/web_scraper.py --url "https://example.com" --selector ".item" \
    --format csv --output items.csv

# Use a config file for complex jobs
python src/web_scraper.py --config examples/scrape_config.json
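
The pagination options used above (--follow, --next-selector, --max-pages) suggest a loop that keeps following next-page links until none remains or the page limit is reached. A rough Python sketch of such a loop follows. The crawl function, the fetch callback, and the in-memory FAKE_SITE are hypothetical stand-ins so the example runs without network access; how Web Scraper implements pagination internally is not shown in this preview.

```python
def crawl(start_url, fetch, max_pages=5):
    """Follow next-page links from start_url, visiting at most max_pages pages.

    fetch(url) is a caller-supplied function (hypothetical) that returns
    (items, next_url), with next_url set to None on the last page.
    """
    results = []
    visited = set()  # guards against next-page links that loop back
    url = start_url
    page = 1
    while url is not None and page <= max_pages and url not in visited:
        visited.add(url)
        items, next_url = fetch(url)
        for item in items:
            # Tag each result with its source URL and page number.
            results.append({"url": url, "page": page, "item": item})
        url = next_url
        page += 1
    return results


# A three-page fake site used in place of real HTTP requests.
FAKE_SITE = {
    "https://example.com/shop": (["A", "B"], "https://example.com/shop?page=2"),
    "https://example.com/shop?page=2": (["C"], "https://example.com/shop?page=3"),
    "https://example.com/shop?page=3": (["D"], None),
}

if __name__ == "__main__":
    for row in crawl("https://example.com/shop", FAKE_SITE.__getitem__, max_pages=2):
        print(row)
```

With max_pages=2 the sketch stops after the second page even though a third one exists. The timestamp mentioned under metadata tracking is left out here to keep the output deterministic.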

Selector Syntax

Selector      Matches
div           All <div> elements
.product      Elements with class="product"
#main         Element with id="main"
a.link        <a> elements with class="link"
span.price    <span> elements with class="price"
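
To make the selector forms above concrete, here is a small sketch of how tag, .class, #id, and tag.class selectors could be matched with Python's standard-library html.parser. It is an illustration under stated assumptions, not Web Scraper's own code: parse_selector, SelectorMatcher, select, and the SAMPLE markup are hypothetical names introduced for this example.

```python
from html.parser import HTMLParser

# Elements without a closing tag; this sketch does not capture them.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def parse_selector(selector):
    """Split 'tag', '.class', '#id', or 'tag.class' into (tag, class, id)."""
    tag, cls, elem_id = selector.strip(), None, None
    if "#" in tag:
        tag, elem_id = tag.split("#", 1)
    elif "." in tag:
        tag, cls = tag.split(".", 1)
    return (tag.lower() or None, cls, elem_id)


class SelectorMatcher(HTMLParser):
    """Collect the whitespace-normalized text of each matching element."""

    def __init__(self, selector):
        super().__init__()
        self.sel_tag, self.sel_class, self.sel_id = parse_selector(selector)
        self.matches = []
        self._capture_tag = None   # tag name of the element being captured
        self._capture_depth = 0    # open elements with that same tag name
        self._capture_parts = []   # text fragments seen inside the element

    def _is_match(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.sel_tag and tag != self.sel_tag:
            return False
        if self.sel_class and self.sel_class not in classes:
            return False
        if self.sel_id and attr.get("id") != self.sel_id:
            return False
        return True

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._capture_tag is not None:
            # Already inside a match: only track nesting of the same tag name.
            if tag == self._capture_tag:
                self._capture_depth += 1
            return
        if self._is_match(tag, attrs):
            self._capture_tag = tag
            self._capture_depth = 1
            self._capture_parts = []

    def handle_endtag(self, tag):
        if self._capture_tag is None or tag != self._capture_tag:
            return
        self._capture_depth -= 1
        if self._capture_depth == 0:
            text = " ".join("".join(self._capture_parts).split())
            self.matches.append(text)
            self._capture_tag = None

    def handle_data(self, data):
        if self._capture_tag is not None:
            self._capture_parts.append(data)


def select(html, selector):
    """Return the text of every element in html that matches selector."""
    matcher = SelectorMatcher(selector)
    matcher.feed(html)
    matcher.close()
    return matcher.matches


SAMPLE = """
<div id="main">
  <h2>Shop</h2>
  <div class="product featured"><a class="link" href="/a">Widget A</a> <span class="price">$5</span></div>
  <div class="product"><a class="link" href="/b">Widget B</a> <span class="price">$7</span></div>
</div>
"""

if __name__ == "__main__":
    print(select(SAMPLE, ".product"))
    print(select(SAMPLE, "span.price"))
```

The sketch keeps things simple: a match nested inside another match is folded into the outer element's text rather than reported separately, and void elements such as img are skipped. Combinators and attribute selectors are out of scope, in line with the "CSS-like" subset listed in the table.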
Chapter 3
🔒 Available in full product

Configuration Reference

Chapter 4
🔒 Available in full product

FAQ

Web Scraper v1.0.0 — Free Preview