this post was submitted on 23 Mar 2025
Technology
Eh, I’m fine with the illegal harvesting of data. It forces the courts to revisit the question of what copyright really is, and hopefully it erodes the stranglehold that copyright has on modern society.
Let the companies fight each other over whether it’s okay to pirate every video on YouTube. I’m waiting.
AI scrapers illegally harvesting data are destroying smaller and open source projects. Copyright law is not the only victim.
https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/
That article is overblown. People need to configure their websites to be more robust against traffic spikes, news at 11.
Disrespecting robots.txt is bad netiquette, but honestly this sort of gentleman's agreement is always prone to cheating. At the end of the day, when you put something on the net for people to access, you have to assume anyone (or anything) can try to access it.
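For reference, honoring robots.txt costs a crawler very little. Here is a minimal sketch using Python's standard `urllib.robotparser`. The bot names, the example.org URLs, and the rules themselves are made up for illustration, and a real crawler would fetch the file from the site instead of hard-coding it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch);
# a real crawler would download it from https://<host>/robots.txt.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text that is already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# ExampleBot is banned from the whole site by its own rule group.
blocked_bot = parser.can_fetch("ExampleBot", "https://example.org/repo")

# Any other agent falls back to the "*" group: /private/ is off limits.
other_public = parser.can_fetch("OtherBot", "https://example.org/repo")
other_private = parser.can_fetch("OtherBot", "https://example.org/private/x")

# Seconds a polite crawler should wait between requests, if the site asks.
delay = parser.crawl_delay("OtherBot")
```

Nothing here is enforced on the server side. The crawler has to choose to call `can_fetch` and wait out `crawl_delay`, which is exactly why this stays a gentleman's agreement.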
You think Red Hat & friends are all just bad sysadmins? SourceHut maybe...
I think there's a bit of both: poorly optimized/antiquated sites and a gigantic spike in unexpected and persistent bot traffic. The typical mitigations do not work anymore.
Not every site is, and not every site should have to be, optimized for hundreds of thousands of requests or more every day. Just because they can be doesn't mean that it's worth the time, effort, or cost.