Natasha Lekh crawlee.dev

Crawlee is a web scraping & browser automation library for Node.js

Here’s Natasha Lekh from Apify describing the project:

This project really is a culmination of 4 years of work trying to make the best library for web scraping in production. Web scraping is a very dynamic environment and what works today might not work tomorrow, so we at Apify had to go through a lot of trial and error to figure out the most reliable and convenient ways of crawling the web and scraping data. We hope that we finally cracked it and that now many developers will enjoy working with our new library and it will make their scrapers more reliable and time to production faster.

I like how it starts with simple HTTP-based scraping, but can switch to browser-based automation when a site relies on JavaScript rendering. I don’t love the built-in proxy rotation features. Not because they’re bad, per se, but because they make spammers’ lives easier…
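To make the "HTTP first, browser when needed" idea concrete, here is a rough sketch of that decision in plain TypeScript. It deliberately does not use Crawlee's API: `looksJsRendered`, `scrape`, the `Fetcher` type, and the 200-character threshold are all hypothetical names and values invented for illustration, and the "almost no visible text" heuristic is an assumption, not how Crawlee decides anything.

```typescript
// Hypothetical sketch, NOT Crawlee's API: every name below is made up for illustration.

// A fetcher takes a URL and resolves to the page's HTML.
// In practice one would be a plain HTTP client, the other a headless browser.
type Fetcher = (url: string) => Promise<string>;

// Assumed heuristic: if the static HTML has almost no visible text once
// scripts, styles, and tags are stripped, the page is probably rendered client-side.
function looksJsRendered(html: string, minTextLength = 200): boolean {
  const text = html
    .replace(/<script[\s\S]*?<\/script>/gi, '')
    .replace(/<style[\s\S]*?<\/style>/gi, '')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
  return text.length < minTextLength;
}

// Try the cheap HTTP fetch first; fall back to the browser only when needed.
async function scrape(
  url: string,
  httpFetch: Fetcher,
  browserFetch: Fetcher,
): Promise<{ html: string; via: 'http' | 'browser' }> {
  const html = await httpFetch(url);
  if (!looksJsRendered(html)) {
    return { html, via: 'http' };
  }
  return { html: await browserFetch(url), via: 'browser' };
}
```

The point of the design is cost: a plain HTTP request is far cheaper than launching a browser, so the browser is only paid for when the static response is effectively an empty shell.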

