this post was submitted on 07 Aug 2023
55 points (77.8% liked)

Ask Lemmy

It's excruciatingly obnoxious to have to rely on third party sources for what should be a first-party feature.

Like, I select all and then search a query. "Oh no, nobody on your server used a third party service to find it, so you won't see it here."

Like, how short-sighted is that, really? If I search for a string in the 'all' servers, I should have a list of 'all' the servers containing that string.

It's a really simple concept. Not sure why this post even has to be made, but I'm wondering if there's something I can do to make these 'features' more intuitive.

[–] [email protected] 25 points 1 year ago (3 children)

totally understand the frustration, and i’m not going to try and invalidate it!

… however, it’s definitely not a problem with a simple solution

since anyone can start an instance, when you search “all”, where should it search? i don’t mean generally like “all the instances”, i mean where specifically? things like lemmy.world, lemmy.ml, kbin.social, etc are obvious… but what about lemmy.mydomainforfriends.social (not real but let’s pretend someone created their own little instance for friends there!)?

let’s say you say yes that should be searched, okay… how does your instance know it’s there? does it tell all other instances that it exists at some point? where does IT get that list from? (the current solution to this is that your instance starts to “know about” an instance after someone interacts with it, but this has the problem you’ve described)

let’s say that instance shouldn’t be searched… now, what are the rules (automatic, i’d assume; not with human intervention) that would allow an instance to be added to some big list somewhere? also, where is that list? now we’re back at problem 1: how do you store a federated list of servers?

the problem gets even harder when you consider mastodon, pixelfed, peertube, etc… all these services interact: should all include them? only certain things in them?

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (1 children)

While it has problems of its own, instances could pool and share that knowledge. The first time an instance talks to a different instance it could just ask "hey, what other instances are you aware of?". The main issue there is just instances obsessively sending exponentially growing lists of instances back and forth.

But no, that is the main bane of federated social media, discoverability without a center of truth
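
A minimal sketch of that "what other instances are you aware of?" exchange, assuming each instance keeps a plain set of hostnames. The function name `merge_known_instances` and the `max_new` cap are hypothetical and not part of Lemmy or ActivityPub; the point is only that deduplicating and capping each exchange keeps the lists from ballooning.

```python
from __future__ import annotations


def merge_known_instances(
    known: set[str], advertised: list[str], max_new: int = 100
) -> list[str]:
    """Merge a peer's advertised hostnames into our own set.

    Only hostnames we have not seen before are added, and at most
    `max_new` are accepted per exchange, so a single peer cannot
    grow our list without bound. Returns the hostnames added.
    """
    added: list[str] = []
    for host in advertised:
        host = host.strip().lower()
        if not host or host in known:
            continue  # skip blanks and anything we already know
        if len(added) >= max_new:
            break  # per-exchange cap reached
        known.add(host)
        added.append(host)
    return added


known = {"lemmy.world"}
new_hosts = merge_known_instances(known, ["lemmy.world", "lemmy.ml", "kbin.social"])
```

Only the newly added hosts would need to be passed on to other peers, which avoids re-sending the whole list every time.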

[–] [email protected] 1 points 1 year ago

yup! 100% agree! federation is kind of a new thing and we have some issues to work out that’s for sure!

heck, i could even see some kind of federated search service: activitypub instances could submit their content for indexing, and each individual instance could choose an existing federated fediverse search or run its own… importantly, there would need to be choice for each individual instance with no centralised repository

[–] [email protected] -1 points 1 year ago (1 children)

So many options, doing none seems lazy. I can source all kinds of lists for my pihole to block traffic. I can put a lot of repos in my yum.conf. It’s not like this should be reliant on any one single source of truth. There could certainly be an open source list maintained. I’m surprised this is considered such a difficult problem with so many smart folks involved; I’m obviously really ignorant of how this stuff works. I just don’t get how a problem that seems to have been solved across a litany of technical products using shared sources in defederated environments is such an exotic hurdle here.

[–] [email protected] 1 points 1 year ago (1 children)

okay so now you have a decentralised list with 1000 servers on it. does your instance… make 1000 requests when you search?

[–] [email protected] 0 points 1 year ago* (last edited 1 year ago)

Lists can be cached and updated. Even if posts from all doesn’t include all active content it would be very manageable to have queries include communities across instances based on names and other fields. All this shit is already solved problems.
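
A rough sketch of the caching idea, so that a search does not turn into one request per instance every time. Everything here is an assumption for illustration: the class name `CommunityIndexCache`, the `fetch` callable (a stand-in for however a remote community list would actually be retrieved), and the one-hour default TTL. It is not how Lemmy implements search.

```python
import time


class CommunityIndexCache:
    """Cache each instance's community names and search the cached copies."""

    def __init__(self, fetch, ttl_seconds=3600.0, clock=time.monotonic):
        self._fetch = fetch          # hypothetical: host -> iterable of names
        self._ttl = ttl_seconds      # how long a cached list stays fresh
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # host -> (fetched_at, [names])

    def communities(self, host):
        """Return the cached list for `host`, refetching only when stale."""
        now = self._clock()
        entry = self._entries.get(host)
        if entry is None or now - entry[0] >= self._ttl:
            entry = (now, list(self._fetch(host)))
            self._entries[host] = entry
        return entry[1]

    def search(self, hosts, query):
        """Case-insensitive substring match over cached community names."""
        q = query.lower()
        return [
            (host, name)
            for host in hosts
            for name in self.communities(host)
            if q in name.lower()
        ]
```

With 1000 known instances, the first search would still trigger up to 1000 fetches, but later searches within the TTL are answered from memory. The trade-off is that results can be up to one TTL out of date.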

[–] [email protected] -1 points 1 year ago (1 children)

since anyone can start an instance, when you search “all”, where should it search?

Easy! It should search all the servers your server is federated with! Servers should contain a list of their community names that can be easily and quickly queried by other servers.

[–] [email protected] 6 points 1 year ago (2 children)

Federation isn't opt-in though. It would be VERY easy to spin up a bunch of instances with millions or billions of fake communities and use them to DDOS a server's search function.

Searching current active subscriptions helps mitigate that vector a little.

[–] Benj1B 4 points 1 year ago

I would suggest that instances should have settings that allow them to decide whether to "advertise" a community list, with configurable settings like "all", "most active", "top X", or even a manually maintained list, depending on the admin's and instance's preferences.

Then your home instance, when searching, should have its own settings to decide what results it's going to ping other servers for. Big/popular/high confidence instances can have an open all/all relationship, while you might query only the top 10 communities from unknown or new instances to handle the scenario you describe.

Federation can be binary yes/no but there should be room to add more logic around enabling search on communities from your instance and controlling the search results from other instances. I don't think the two are mutually exclusive, unless I fundamentally misunderstand how federation works!
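
To make the two halves of that suggestion concrete, here is a small sketch under stated assumptions. The names `Community`, `advertised_communities` and `accepted_communities`, the mode strings, and the idea of ranking by an `active_users` count are all hypothetical; none of this is an existing Lemmy setting.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Community:
    name: str
    active_users: int


def advertised_communities(
    communities: list[Community],
    mode: str = "all",
    top_n: int = 10,
    manual: tuple[str, ...] | list[str] = (),
) -> list[Community]:
    """Remote side: decide which communities to advertise, most active first."""
    ranked = sorted(communities, key=lambda c: c.active_users, reverse=True)
    if mode == "all":
        return ranked
    if mode == "top":
        return ranked[:top_n]
    if mode == "manual":
        allowed = set(manual)
        return [c for c in ranked if c.name in allowed]
    raise ValueError(f"unknown advertise mode: {mode!r}")


def accepted_communities(
    advertised: list[Community], trusted: bool, untrusted_cap: int = 10
) -> list[Community]:
    """Home side: take everything from trusted peers, only a few from others."""
    return list(advertised) if trusted else list(advertised)[:untrusted_cap]
```

The remote admin controls what is offered and the home admin controls what is accepted, so neither setting depends on the other being configured sensibly.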

[–] [email protected] 0 points 1 year ago (1 children)

I... don't think you know what ddossing means but okay.

Would it really be very easy? Especially considering that once instances find you're doing that, they just block you? Would it be worth people's time?

Is there any way around this, perhaps querying a global repository of federated instances and sorting them by popularity?

In all honesty, you don't have a point. If you did, third-party services already wouldn't offer this. Seeing as they can, it's clearly possible.

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago)

Sorry, you're right that I wasn't being precise with my terminology. It's not a DDOS, but it could be used to slow down targeted features, take up some HTTP connections, inflate the target's DB, and waste CPU cycles, so it shares some characteristics of one.

In general, you want to be very very careful of implementing features that allow untrusted parties to supply potentially unbounded resources to your server.

And yeah, it would be trivial to write a set of scripts that pretend to be a lemmy instance and supply an endless number of fake communities to the target server. The nice thing about this attack vector is that it's also not bound by the normal rate limiting since it's the target server making the requests. There are definitely a bunch of ways lemmy could mitigate such an attack, but the current approach of "list communities current users are subscribed to" seems like a decent first approach.
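
As one illustration of bounding what an untrusted server can hand you, the sketch below stops reading after a fixed number of entries and drops oversized names. The function name `take_bounded` and both limits are made up for the example, and it is only one of several possible mitigations, not what Lemmy does.

```python
from itertools import islice


def take_bounded(remote_names, max_items=1000, max_name_len=255):
    """Consume at most `max_items` entries from an untrusted iterable.

    `islice` stops pulling from the source once the limit is hit, so even
    an endless stream of fake communities costs a bounded amount of work.
    Entries that are not non-empty strings within the length limit are
    discarded rather than stored.
    """
    kept = []
    for name in islice(remote_names, max_items):
        if isinstance(name, str) and 0 < len(name) <= max_name_len:
            kept.append(name)
    return kept
```

Because the limit is enforced by the server doing the fetching, it still applies when that server initiated the request and its own inbound rate limiting never comes into play.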