this post was submitted on 18 Oct 2023

Selfhosted

My latest Google search replacement recently made a decision that basically forces me to turn off my ad blocker in order to click results. I was wondering whether there is any self-hosted solution that is fairly easy to deploy on TrueNAS SCALE, or whether it is even worth doing. Bonus points if it's federated somehow. I'll deal with bad results if it needs time to grow as a project.

I also want to add that what little self-hosting I've done so far has felt like cutting out a festering cancer, and it feels so good to be in control of my online life again. Thanks so much for the guidance since the Rexxit. Finding out that you could easily self-host a Reddit replacement with other people was what got me going into this to begin with.

top 24 comments
[–] [email protected] 27 points 1 year ago (4 children)
[–] [email protected] 19 points 1 year ago (3 children)

It looks like a few people are recommending this, so just a quick note in case people are unaware:

If you want to avoid being tracked, this is not a good solution. Searxng is a meta search engine, meaning it is effectively a proxy: you search on Searxng, it searches multiple sites and sends all the results back to you. If you use a public instance, you may be protected from the actual search engine*, because many people will use the same instance, and your queries will be mixed in with all of them. If you self host, however, all the searches will be your own - there is then no difference between using Searxng and just going to the site yourself.

*The caveat with using the public instances is while you may be protected from the upstream engine, you have to trust the admins - nothing stops them from tracking you themselves (or passing your data on).

Despite the claims in their docs, I would not consider this a privacy tool. If you are just looking for a good search engine, this may work, and it gives you flexibility and power to tune it yourself. But it's probably not going to do anything good for your privacy, above and beyond what you can get from other meta search engines like Startpage and DuckDuckGo, or other "private" search engines like Brave.
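
To make the proxy point concrete, here is a minimal sketch in Python of what a meta search engine does. It is not SearXNG's actual code: `UpstreamEngine`, `MetaSearch`, the IP addresses, and the reciprocal-rank scoring are all made up for illustration. The upstream engines are in-memory stand-ins that record what they can observe.

```python
from collections import defaultdict


class UpstreamEngine:
    """Hypothetical stand-in for a real search engine; records what it can observe."""

    def __init__(self, name, index):
        self.name = name
        self.index = index  # maps a query string to an ordered list of result URLs
        self.log = []       # (client_ip, query) pairs, as seen by this engine

    def search(self, client_ip, query):
        self.log.append((client_ip, query))
        return self.index.get(query, [])


class MetaSearch:
    """Hypothetical meta search instance that fans a query out to several engines."""

    def __init__(self, instance_ip, engines):
        self.instance_ip = instance_ip
        self.engines = engines

    def search(self, user_ip, query):
        # user_ip is deliberately never forwarded: upstream engines only see
        # the instance's address, whoever the user behind it is.
        scores = defaultdict(float)
        for engine in self.engines:
            results = engine.search(self.instance_ip, query)
            for rank, url in enumerate(results):
                # Assumed scoring: a URL ranked highly by several engines wins.
                scores[url] += 1.0 / (rank + 1)
        # Best score first; ties broken alphabetically for a stable order.
        return sorted(scores, key=lambda url: (-scores[url], url))
```

Every query reaches the upstream engine tagged with the instance's IP. On a busy public instance that IP is shared by many users; on a self-hosted instance it is just yours, so the upstream log reads the same as if you had searched directly.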

[–] [email protected] 4 points 1 year ago (2 children)

OP isn't asking for a secure search engine though, they're asking for one without ads that they can control themselves. Also, while searxng and other meta search engines won't necessarily protect you from data harvesting, they will protect you from tracking cookies and the absolute trash mountain of fake results (imo especially noticeable with Google search).

[–] [email protected] 3 points 1 year ago (1 children)

google's results got so bad recently I had to turn it off in my searxng instance

[–] [email protected] 2 points 1 year ago

Use Yandex.ru, if you are looking for free access to the content in English. https://www.reddit.com/r/Piracy/comments/nd7w7s/lpt_if_you_cant_find_a_torrent_via_google_because/

[–] [email protected] 2 points 1 year ago

They are explicitly trying to move away from Google, and are looking for a new option because their current solution is forcing them to turn off ad-blocking. Sounds to me like they are looking for a private option. Plus, given the forum in which we are having the discussion (Lemmy), even if OP is not specifically concerned with privacy, it seems likely other users are.

As for cookies, searxng can't do any more than your browser (possibly with extensions) can do, and relying on your browser here is a much better solution, because it protects you on all sites, rather than just on your chosen search engine.

"Trash mountain" results is a whole separate issue - you can certainly tune the results to your liking. But literally the second sentence of their GitHub headline is touting no tracking or profiling, so it seems worth bringing attention to the limitations, and that's all I'm trying to do here.

[–] Sethayy 1 points 1 year ago (1 children)

I'm not an expert, but one could funnel all web traffic through a VPN if needed, right? That would possibly gain even more obscurity, and shift the trust to a company rather than a small user.

(Whether that's an upgrade in privacy or not is relative.)

[–] [email protected] 1 points 1 year ago

You mean between their instance and the final search engines? Or between them and a public instance of searxng?

In either case, I'm not sure it buys you anything in terms of privacy you wouldn't get by using the VPN and going directly to the search engines.

[–] [email protected] 1 points 1 year ago (1 children)

You're partially right about self hosting, but it still strips out the user tracking scripts and only provides the pure results, and you can make SearXNG route to Tor.

[–] [email protected] 2 points 1 year ago

I noted in another comment that SearXNG can't do anything about the trackers that your browser can't do, and solving this at the browser level is a much better solution, because it protects you everywhere, rather than just on the search engine.

Routing over Tor is similar. Yes, you can route the search from your SearXNG instance to Google (or whatever upstream engine) over Tor, and hide your identity from Google. But then you click a link, and your IP connects to the IP of whatever site the results link to, and your ISP sees that. Knowing where you land can tell your ISP a lot about what you searched for. And the site you connected to knows your IP, so they get even more information - they know every action you took on the site, and everything you viewed. If you want to protect all of that, you should just use Tor on your computer, and protect every connection.

This is the same argument for using Signal vs WhatsApp - yes, in WhatsApp the conversation may be E2E encrypted, but the metadata about who you're chatting with, for how long, etc is all still very valuable to Meta.

To reiterate/clarify what I've said elsewhere, I'm not making the case that people shouldn't use SearXNG at all, only that their privacy claims are overstated, and if your goal is privacy, all the levels of security you would apply to SearXNG should be applied at your device level: Use a browser/extension to block trackers, use Tor to protect all your traffic, etc.

[–] [email protected] 5 points 1 year ago

Seconding this. I use it and it works fairly well.

[–] [email protected] 1 points 1 year ago

I'm really happy with my searxng instance

[–] sorghum 1 points 1 year ago

I had no idea that was what that was. Learn something new every day.

[–] [email protected] 13 points 1 year ago* (last edited 1 year ago)

Like others, I use searxng.

But you can also try whoogle and librex

[–] [email protected] 11 points 1 year ago (1 children)

I am using searxng

You can customize a lot and the results are good imo 🌞

[–] sorghum 3 points 1 year ago (2 children)

Now I need to figure out how to make it public/accessible so I can use it when I'm out and about.

[–] [email protected] 8 points 1 year ago

Easiest way if it's only for yourself is using tailscale

[–] RvTV95XBeo 4 points 1 year ago

Set up a VPN. Safest / best way to do it

[–] [email protected] 7 points 1 year ago
[–] [email protected] 6 points 1 year ago* (last edited 1 year ago)

Search engines take a LOT of work to run, which is why there are so few of them. You can self-host a search engine that indexes one site, but not one that indexes the entire internet lol. The closest you'll find is SearXNG, as others mentioned. It's not a search engine itself though; it just uses other search engines.
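
For a sense of scale, the one-site case can be sketched in a few lines of Python. This is a toy inverted index, assuming the pages have already been fetched into a dict of URL to text; `tokenize`, `build_index`, and `search` are hypothetical names, not part of any existing project.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query, sorted by URL."""
    words = tokenize(query)
    if not words:
        return []
    # .get() avoids creating empty entries for words that were never indexed.
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

With `pages = {"/a": "Self hosted search", "/b": "search engines"}`, `search(build_index(pages), "hosted search")` returns `["/a"]`. The hard part of a web-scale engine is everything this leaves out: crawling, ranking, spam filtering, and storing an index of billions of pages.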

[–] quizno50 4 points 1 year ago (1 children)
[–] [email protected] 3 points 1 year ago

Yes, Yacy is what you want, OP (https://yacy.net). It's rather pathetic that people are still trying to be a parasite, but wanting to do so anonymously. Roll up your sleeves and commit your resources to making community search engines work. You have the control.

[–] m_randall 2 points 1 year ago

Huh…so there’s currently no open source search engine out there? I see a few crawlers, and some UIs the crawlers can use but no one project consolidating the two.

[–] [email protected] 2 points 1 year ago

Instead of a 'normal' search engine, you could take a look at a GPT-like replacement. Maybe there is one that also protects your privacy, and it can certainly be used to find what normal search engines could find.