Full definition
Web scraping covers everything from a 5-line Python script that fetches one page to a distributed system pulling petabytes a day. The basic loop is: send an HTTP request, parse the HTML response, extract the data you care about, and store it. The hard parts are doing it at scale without getting blocked, handling JavaScript-rendered content, dealing with CAPTCHAs, and respecting the target site's terms of service (ToS) and robots.txt.
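The four steps of the basic loop can be sketched with nothing but the Python standard library. This is a minimal illustration, not a production scraper: the class and function names (`LinkExtractor`, `fetch`, `extract_links`, `store_csv`), the `example-scraper/0.1` User-Agent string, and the sample HTML are all assumptions made up for this example, and "the data you care about" is taken to be link targets and their text.

```python
import csv
import io
import urllib.request
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs from <a> tags that carry an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def fetch(url, timeout=10.0):
    """Step 1: send the HTTP request and return the decoded body."""
    # Hypothetical User-Agent; identify your own scraper honestly.
    req = urllib.request.Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def extract_links(html):
    """Steps 2 and 3: parse the HTML and extract the links."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def store_csv(rows, out):
    """Step 4: write the extracted rows to a CSV file-like object."""
    writer = csv.writer(out)
    writer.writerow(["href", "text"])
    writer.writerows(rows)


# Canned HTML stands in for fetch(url) so the sketch runs offline.
SAMPLE_HTML = (
    '<ul><li><a href="/a">First</a></li>'
    '<li><a href="/b">Second <b>item</b></a></li></ul>'
)

if __name__ == "__main__":
    buf = io.StringIO()
    store_csv(extract_links(SAMPLE_HTML), buf)
    print(buf.getvalue())
```

Against a live site, `extract_links(fetch(url))` would replace the canned HTML. In practice a dedicated parser such as BeautifulSoup is usually more forgiving of malformed markup than `html.parser` used directly.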
Common tooling: Python with `requests` plus BeautifulSoup, or Scrapy, for static pages; Playwright or Puppeteer for JavaScript-rendered pages; ScrapingBee, Bright Data Web Unlocker, or Oxylabs Web Scraper API for managed services that handle proxies and CAPTCHAs.
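Whichever tool sends the requests, the robots.txt check mentioned earlier can run before any page is fetched. The sketch below uses the standard library's `urllib.robotparser`; the robots.txt content, the `example-scraper` user agent, the `example.com` URLs, and the `build_parser` helper are hypothetical and exist only for illustration. The rules are parsed from a string so the example runs offline.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt):
    """Return a RobotFileParser loaded from robots.txt text."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp


rp = build_parser(ROBOTS_TXT)
agent = "example-scraper"  # hypothetical user agent name

for url in ("https://example.com/public/page", "https://example.com/private/page"):
    verdict = "allowed" if rp.can_fetch(agent, url) else "disallowed"
    print(url, verdict)

# Seconds to wait between requests, or None if the file sets no Crawl-delay.
print("crawl delay:", rp.crawl_delay(agent))
```

For a real site, `rp.set_url("https://example.com/robots.txt")` followed by `rp.read()` would load the live file instead of the inlined string. Note that robots.txt is advisory: passing this check says nothing about what the site's ToS permits.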
Legal: scraping publicly accessible data is generally legal in most jurisdictions (US: hiQ Labs v. LinkedIn, where the Ninth Circuit held that scraping public pages likely does not violate the Computer Fraud and Abuse Act; EU: similar precedent), but a site's ToS may still forbid it contractually. Don't scrape login-walled content you don't have permission to access. Don't scrape personal data without a GDPR-compliant legal basis. When in doubt, talk to a lawyer.