this post was submitted on 04 Mar 2025
609 points (99.0% liked)

Technology


This is a most excellent place for technology news and articles.


[–] [email protected] 29 points 1 day ago (1 children)

Need to start spoofing user agent strings again.

[–] [email protected] 5 points 6 hours ago

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 3.0)

[–] [email protected] 56 points 1 day ago (2 children)

It is obvious that Cloudflare is being influenced to enforce browser monopolies. Imagine if Cloudflare had existed in 2003 and blocked non-Internet Explorer browsers. If you use Cloudflare to "protect" your site, you are discriminating against browser choice and are as bad as Microsoft in 1998.

[–] sugar_in_your_tea 1 points 8 hours ago

Agreed. I use Cloudflare for domain hosting because they're cheap, but I have never liked their protections.

[–] [email protected] 7 points 1 day ago

If you use Cloudflare to "protect" your site, you are discriminating against browser choice and are as bad as Microsoft in 1998.

😕

[–] [email protected] 22 points 1 day ago (1 children)

What doesn't work with Lynx is a wrong website.

[–] sugar_in_your_tea 1 points 8 hours ago* (last edited 8 hours ago)

Agree for static content like news and blogs. Disagree for dynamic content like games and social media. And the latter is mostly for scale (having server-side templating is expensive for rapidly changing content).

Then again, there's a case for snapshotting social media pages every so often for things like crawlers and CLI browsers.

[–] [email protected] 13 points 1 day ago (1 children)

I was planning on moving away from Cloudflare to European providers anyway, so this just adds fuel to the fire.

I'm considering using BunnyDNS for DNS management, not using a CDN at all, and using Scaleway for serverless functions.

[–] admin 2 points 23 hours ago (1 children)

It may be against the ToS, but I've used GitHub as a CDN for free in the past... It might work for you.

I never felt it was wrong, it was around the time of the Microsoft acquisition.

[–] [email protected] 2 points 12 hours ago

I appreciate the suggestion, but GitHub is also an American company. I've been moving my git repositories to Codeberg.

My sites don't get enough traffic to warrant a CDN really, but if necessary, BunnyCDN looks like it can fit the bill. Plus, my static sites are in Scaleway object storage.

[–] [email protected] 23 points 1 day ago* (last edited 1 day ago)

So make user-agent sniffing useless by all being Chrome?

Funnily enough, some webpages work better if you block WebGL and set the user agent to Lynx or Dillo.

[–] [email protected] 9 points 1 day ago

Should change my user agent to sod off

[–] [email protected] 209 points 2 days ago (8 children)

Disgusting and unsurprising.

Most web admins do not care. I've lost count of how many sites make me jump through CAPTCHAs or outright block me in private browsing or on a VPN. Most of these sites have no sensitive information, or already know exactly who I am because I am already authenticating with my username and password. It's not something the actual site admins even think about. They click the button, say "it works on my machine!" and will happily blame any user whose client is not dead-center average.

Enter username, but first pass this CAPTCHA.

Enter password, but first pass this second CAPTCHA.

Here's another CAPTCHA because lol why not?

Some sites even have their RSS feed behind Cloudflare. And guess what that means? It means you can't fucking load it in a typical RSS reader. Good job!

The web is broken. JavaScript was a mistake. Return to ~~monke~~ gopher.

Fuck Cloudflare.

[–] [email protected] 107 points 2 days ago* (last edited 2 days ago) (14 children)

I get why you're frustrated and you have every right to be. I'm going to preface what I'm going to say next by saying I work in this industry. I'm not at Cloudflare but I am at a company that provides bot protection. I analyze and block bots for a living. Again, your frustrations are warranted.

  • Even if a site doesn't have sensitive information, it likely serves a captcha because of the number of bots making scraping-related requests. The volume of these requests can effectively DDoS the site. If they're selling something, it can disrupt sales. So they lose money on sales and eat the load costs.

  • With more and more username and password leaks, credential stuffing is becoming a bigger issue than most people realize. There aren't really good ways of telling you apart from someone who has somehow stolen your credentials. Bots are increasingly sophisticated: we see bots using aged sessions, which is more in line with human behavior. Most of the companies implementing captcha on login segments do so to try to protect your data and financials.

  • The rise in unique, privacy-based browsers is great, and it's also hard to keep up with. It's been more than six months, but I've fingerprinted Pale Moon and, if I recall correctly, it has just enough red flags to make it hard to tell a human from a poorly configured bot.

Ok, enough apologetics. This is a cat and mouse game that the rest of us are being dragged into. Sometimes I feel like this is a made up problem. Ultimately, I think this type of thing should be legislated. And before the bot bros jump in and say it's their right to scrape and take data: it's not. Terms of use are plainly stated by these sites. They consider it stealing.

Thank you for coming to my TEDx Talk on bots.

Edit: I just want to say that allowing any user agent with "Pale Moon" or "Goanna" isn't the answer. It's trivially easy to spoof a user agent which is why I worked on fingerprinting it. Changing Pale Moon's user agent to Firefox is likely to cause you problems too. The fork they are using has different fingerprints than an up to date Firefox browser.
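To illustrate why a user-agent allowlist is weak, here is a minimal Python sketch. It is only an illustration, not how any bot-protection product actually works: the URL, the user-agent string, and the `naive_allowlist` helper are all hypothetical. It builds a request without sending it and shows that any client can claim to be Pale Moon with one header.

```python
from urllib.request import Request

# Hypothetical values for illustration only; neither the URL nor the exact
# user-agent string comes from this thread.
URL = "https://example.com/"
CLAIMED_UA = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) "
    "Goanna/6.7 Firefox/102.0 PaleMoon/33.0"
)


def build_request(url: str, user_agent: str) -> Request:
    """Build (but do not send) a request that claims an arbitrary user agent."""
    return Request(url, headers={"User-Agent": user_agent})


def naive_allowlist(user_agent: str) -> bool:
    """Hypothetical substring check of the kind the comment argues against."""
    return "PaleMoon" in user_agent or "Goanna" in user_agent


req = build_request(URL, CLAIMED_UA)
# urllib stores header names capitalized as "User-agent".
claimed = req.get_header("User-agent")
print(claimed)
print(naive_allowlist(claimed))
```

Any script can set that header, so a check like `naive_allowlist` says nothing about which browser engine is actually on the other end. That gap is what fingerprinting tries to close.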

[–] [email protected] 7 points 1 day ago (1 children)
[–] [email protected] 3 points 1 day ago (1 children)

Thanks for reading and commenting!

[–] [email protected] 4 points 1 day ago* (last edited 1 day ago) (3 children)

During my first (shitty) job as a dev outta school, they had me writing scrapers. I was actually able to subvert it pretty easily using this package that doesn't appear to be maintained anymore https://github.com/VeNoMouS/cloudscraper

Was pretty surprised to learn that, at the time, they were only checking if JS was enabled, especially since CF is the gold standard for this sort of stuff. I'm sure this has changed?

[–] [email protected] 35 points 1 day ago (1 children)

Dude, thank you for this context. I was already aware of these considerations but just wanted to thank you for sharing this with everyone. It's participation like this that makes the internet a better place. 🍻

[–] [email protected] 8 points 1 day ago (1 children)

But captchas have now proven useless, since bots are better at solving them now than humans?

[–] [email protected] 4 points 1 day ago (1 children)

Welcome to bot detection. It's a cat and mouse game, an ever changing battle where each side makes moves and counter moves. You can see this with the creation of captcha-less challenges.

But saying captchas are useless because bots can pass them is a bit like saying your antivirus is useless because certain malware and ransomware can bypass it.

[–] [email protected] 1 points 12 hours ago* (last edited 12 hours ago) (1 children)

But they are better than humans at solving them.

[–] [email protected] 1 points 8 hours ago

How are you measuring this? On my end, when I look at the metrics I have available to me, the volume of bot requests that are passing captcha does not exceed that of humans. We continually review false positives and false negatives to make sure we aren't impacting humans while still making it hard for bots.

[–] [email protected] 1 points 1 day ago

Terms of use are plainly stated by these sites. They consider it stealing.

I consider it more trespassing than stealing myself.

[–] [email protected] 8 points 1 day ago

Ever been down the gemini rabbit hole? It's not perfect, but quite interesting.

[–] [email protected] 99 points 2 days ago (10 children)

These bastards haven’t MITMed half the internet for nothing. This isn’t the first time they've abused that either.

I hate that I once fell for it too when I just started out hosting stuff and put it behind their proxy.

[–] turnip 55 points 2 days ago* (last edited 2 days ago) (2 children)

I can't use my browser without it being created by a tech giant, can't use my new computer without having my software UEFI-signed by Microsoft, and AI will soon need me to have my GPU licensed and registered.

The world is heading to crap.

[–] [email protected] 1 points 39 minutes ago* (last edited 26 minutes ago)

You can always build a PC and not have to deal with that UEFI signing stuff as you're expected to provide your own OS still, that option hasn't been eliminated yet.

Also, AMD cards are more friendly to Linux users than Nvidia cards are, even with the existence of NVK for the latter. NVK only supports Turing and newer cards; Maxwell, Pascal, and Volta are too old for it. Since Nouveau is broken on Maxwell and newer by firmware signing, once those cards lose support in the proprietary drivers you'll be SOL in the near future for the 900-series, 10-series, and the Titan V, unless NVK gets backported to them somehow. Kepler and older are still supported by Nouveau. Meanwhile, over at AMD, Mesa actively supports Radeon cards going back to GCN1.

Basically, if you still have an R9 Fury or an RX 580 sitting around, for example, those cards will still be actively supported by Mesa open drivers for the foreseeable future, meanwhile your GTX 980Ti or 1080Ti, at least currently, are fully at the mercy of Nvidia's closed drivers.

[–] [email protected] 59 points 2 days ago (2 children)

On LibreWolf, I get blocked. It's a Firefox fork and it still happens. I had to set up a Firefox user-agent plugin.

[–] [email protected] 60 points 2 days ago

Lol... You gonna browse how daddy told you or you won't get to browse

[–] [email protected] 50 points 2 days ago (2 children)

I would be very interested to know how they plan to resolve these issues with "Ladybird." Using a new engine will likely clash with the FALSE "security measures" of many websites and harm the browsing experience. It’s often said that users should demand respect for web standards, but in the meantime, as usability declines, users will gradually drift away. Firefox learned this lesson the hard way.
