this post was submitted on 26 Jul 2024
105 points (92.7% liked)

Fediverse

28205 readers
506 users here now

A community to talk about the Fediverse and all it's related services using ActivityPub (Mastodon, Lemmy, KBin, etc).

If you wanted to get help with moderating your own community then head over to [email protected]!

Rules

Learn more at these websites: Join The Fediverse Wiki, Fediverse.info, Wikipedia Page, The Federation Info (Stats), FediDB (Stats), Sub Rehab (Reddit Migration), Search Lemmy

founded 1 year ago
MODERATORS
 

All the posts about Reddit blocking everyone except Google and Brave got me thinking: What if SearNGX was federated? I.E. when data is retrieved via a providers API, that data is then federated to all other instances.

It would spread the API load out amongst instances, removing the API bottlenecks that come from search providers.

It would allow for more anonymous search, since users could cycle between instances and get the same results.

Geographic bias would be a thing of the past.

Other than ActivityPub overhead and storage, which could be reduced by federating text-only content, I fail to see any downside.

Thoughts?

you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 5 points 3 months ago

Sure. SearX is a meta-search engine. It does (only) queries to other search engines to get results. YaCy on the other hand is itself a search engine. It has the data available and doesn't do queries to other engines. In theory you could combine the two concepts. Have a software that does both. But that requires some clever thinking. The returned (Google) ranking only applies to the exact search term. And it's questionable if you can store it and do anything useful with it except for when some other user searches for the exact same thing. And also the returned teaser texts are very short and tailored to the search query. So maybe also useless. It'd be hard.

One thing you could do is crawl the results that users actually click on. And I think YaCy already does that. AFAIK they had an browser add-on or a proxy or something to intercept visited pages (and hence search results).