What is Web Scraping? — Definition + Examples

Full definition

Web scraping covers everything from a five-line Python script that fetches one page to a distributed system pulling petabytes a day. The basic loop is: send an HTTP request, parse the HTML response, extract the data you care about, and store it. The hard parts are doing it at scale without getting blocked, handling JavaScript-rendered content, dealing with CAPTCHAs, and respecting the destination site's terms of service (ToS) and robots.txt.
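The four steps of the basic loop, plus a robots.txt check, can be sketched with nothing but the Python standard library. This is a minimal illustration rather than a production scraper: the `USER_AGENT` string, the choice to extract links, and the helper names (`is_allowed`, `fetch`, `extract_links`, `store`) are all assumptions made for the example, and it does nothing about blocking, JavaScript rendering, or CAPTCHAs.

```python
import csv
from html.parser import HTMLParser
from urllib import robotparser
from urllib.parse import urljoin
from urllib.request import Request, urlopen

USER_AGENT = "example-scraper/0.1"  # hypothetical identifier; use your own


class LinkExtractor(HTMLParser):
    """Collect (text, absolute_url) pairs from every <a href=...> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None  # set while we are inside an <a> tag
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def is_allowed(robots_txt, url, user_agent=USER_AGENT):
    """Check a URL against robots.txt rules supplied as text."""
    rules = robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


def fetch(url):
    """Step 1: send the HTTP request and return the decoded body."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html, base_url):
    """Steps 2 and 3: parse the HTML and extract the link data."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def store(rows, out):
    """Step 4: write rows as CSV to any writable file-like object."""
    writer = csv.writer(out)
    writer.writerow(["text", "url"])
    writer.writerows(rows)
```

A typical run would fetch the site's robots.txt with `fetch`, pass the text and the target URL to `is_allowed`, and only then call `fetch` on the page, feed the result to `extract_links`, and hand the rows to `store` with a file opened using `newline=""`. Passing robots.txt in as text keeps the rule check separate from the network call, so it can be exercised without touching a live site.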

Common tooling: Python with `requests` plus BeautifulSoup, or Scrapy, for static pages; Playwright or Puppeteer for JavaScript-rendered pages; ScrapingBee, Bright Data Web Unlocker, or Oxylabs Web Scraper API for managed solutions that handle proxies and CAPTCHAs.

Legal: scraping publicly accessible data is generally legal in most jurisdictions (US: hiQ Labs v. LinkedIn; the EU has broadly similar precedent), but a site's ToS may still forbid it contractually. Don't scrape login-walled content you don't have permission to access. Don't scrape personal data without a GDPR-compliant legal basis. When in doubt, talk to a lawyer.
