this post was submitted on 16 May 2024
515 points (97.1% liked)

Technology

[–] [email protected] 1 points 6 months ago* (last edited 6 months ago) (1 children)

Scraping at scale is actually cheaper than buying API access. It's a massive, growing market: try googling "web scraping service" and you'll find hundreds of services that provide an API to scrape any public web page, bypass the blocks for you, and render all of the JavaScript.

[–] [email protected] 1 points 6 months ago (1 children)

Scraping is nice for static content, no doubt. But I wonder at what point it becomes easier to request the changes to a developing thread via the API than to request the whole page, with all its nested content, over and over just to find the new answers in there.

[–] [email protected] 1 points 6 months ago

Following a developing thread is a very tiny use case, I'd imagine, and even then you can just scrape the backend API that the public page itself uses and get the same results as from the private API.
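
For illustration, a minimal Python sketch of that idea: poll the JSON endpoint the public page's own frontend calls, and keep only the comments not seen before. The URL, the `id` field, and the function names are all hypothetical; a real site's backend will have its own paths and response shape, so treat this as a sketch under those assumptions rather than a working scraper for any particular site.

```python
import json
import urllib.request

# Hypothetical endpoint: stands in for whatever JSON URL the public page loads.
THREAD_URL = "https://example.com/api/thread/123/comments"


def fetch_comments(url):
    """Fetch and decode the JSON comment list (assumed to be a list of dicts)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def new_comments(comments, seen_ids):
    """Return comments whose (assumed) "id" is not in seen_ids; mark them seen."""
    fresh = [c for c in comments if c["id"] not in seen_ids]
    seen_ids.update(c["id"] for c in fresh)
    return fresh


def poll_once(url, seen_ids, fetch=fetch_comments):
    """One polling step: fetch the thread and return only unseen comments."""
    return new_comments(fetch(url), seen_ids)
```

Calling `poll_once(THREAD_URL, seen)` on a timer with a shared `seen = set()` would surface only the new replies each time. It still downloads the full comment list on every poll, so it does not remove the overhead raised above; it only shows that the diffing itself is cheap.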