this post was submitted on 04 Jul 2023
84 points (97.7% liked)

No Stupid Questions


The tech giants make enough money that they could keep on growing forever, from my understanding.

But the fediverse? Sure, the main instances that get enough funding are going to be okay, but what about the single-user instances 10 years from now, when there's a lot more content to download? Won't they go bankrupt just by trying to annex the big instances?

And I have the impression that the lemmy giants are going to change over time: does that mean that 50 years from now, the posts I'm posting here today might get lost in time because the instances that annexed them will have shut down by then?

I probably misunderstand how the fediverse works, but my worry is that the small instances won't be able to hold an ever-growing amount of data forever.

I spoke in absolutes for the sake of readability, but I'm as in-the-dark as can be.

top 32 comments
[–] [email protected] 63 points 1 year ago (2 children)

No, after a sufficient amount of time has passed, we would run out of useable matter and energy in the universe. This theorized end-state of heat death puts a finite cap on the size of the Fediverse.

Constrained to Earth, it'd probably be fine. Though I do see it splintering eventually, with sub-communities existing independently from the main organism.

[–] [email protected] 7 points 1 year ago (1 children)

But would it work with spherical servers in vacuum?

[–] [email protected] 2 points 1 year ago (1 children)

Time to invent the Dyson Server!

[–] [email protected] 1 points 1 year ago

We already got one: the dyson.com server. /jk

[–] [email protected] 4 points 1 year ago

Is this amount of time clearly stated or defined? Indefinite does not mean infinite.

[–] [email protected] 42 points 1 year ago (1 children)

Mostly serious answer: the current implementation is not going to scale effectively with growth. The software implementation is still rough around the edges, and the ActivityPub protocol probably needs more knobs to handle bulk data synchronization. Within the service, moderation is a serious challenge with many unanswered questions.

Likewise, the back end software implementation is monolithic, meaning it's one software stack that does everything from sign-in to subscriptions to synchronization and scheduling. Housekeeping and garbage collection probably aren't that tight, either. This is mostly speculation from watching things over the last couple of weeks' growth.

I believe the data store is based on Postgres RDBMS, which while being robust and scalable is fussy and needs tuning when turning over large amounts of highly unique data.

None of this is an indictment on the devs! Rather the opposite, because the software IS chugging along while experiencing tremendous growth.

I expect over time the back end will devolve into microservices that communicate over a highly scalable or stream-based messaging bus. Larger instances could probably also benefit from static caching and CDN techniques to keep pages loading quickly even while the back end thrashes.

The structure of the ecosystem needs to strike a balance between fewer large instances and many, many small instances. In the first scenario, the scaling limit is in the monolithic stack, which introduces I/O bottlenecks and serialization delays (even if massively threaded). In the latter scenario, message state and synchronous distribution become challenging because a full mesh of federations could scale faster than network state tables have room to support. Some middle tier might be needed, and I have no idea what that might even look like.
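To put a rough number on the full-mesh concern: if every instance federated directly with every other one, the number of instance pairs would grow quadratically. A minimal Python sketch of that arithmetic (the function name and the instance counts are made up for illustration, not measurements of the real network):

```python
def full_mesh_links(instances: int) -> int:
    """Number of distinct instance pairs in a full mesh: n * (n - 1) / 2."""
    if instances < 0:
        raise ValueError("instances must be non-negative")
    return instances * (instances - 1) // 2


# Illustrative instance counts only (assumption, not real data).
for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} instances -> {full_mesh_links(n):>12,} pairwise federations")
```

In practice instances only talk to peers they share subscriptions with, so this is an upper bound rather than a prediction.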

So to answer your question, can it scale indefinitely? Probably not, because we hit scaling limits pretty quickly on a number of dimensions. Nevertheless, smart people are starting to hang out here, and I expect they will take an interest in how it all works. Improvement is inevitable, and I think the early roadblocks will be overcome easily enough.

[–] [email protected] 12 points 1 year ago (1 children)

There's nothing wrong with a monolith. Microservices are not inherently more scalable. Their advantage is around scaling teams. If anything, a monolith can be more performant, as in-process calls are much faster than network calls.

[–] [email protected] 2 points 1 year ago

There can be better efficiencies by disaggregating the full stack into microservices and making IPC calls among scalable workers versus strictly service-per-server models which, yes, incur scaling issues from network iowait. Modern network operating systems do this, which allows heavier loaded processes more access to resources while lesser loaded processes are deferred.

[–] [email protected] 27 points 1 year ago (1 children)

The world produces 15Mt of beans every year. The average shitpost with beans has 700g of beans in it. This means Lemmy can scale to around 21 billion shitposts/year. We have some margin.

#shitpost

[–] [email protected] 7 points 1 year ago (1 children)

This math checks out. I ran it through the bean calculator using OpenBeanAI. 32.33% of the simulations show these numbers.

[–] [email protected] 2 points 1 year ago (1 children)
[–] [email protected] 1 points 1 year ago
[–] [email protected] 14 points 1 year ago

Each instance only needs to hold the data from communities its users are subscribed to. And images live on their host instances anyway. No instance needs to hold the entirety of Lemmy. :)

[–] [email protected] 11 points 1 year ago (1 children)

Smaller instances don't grab everything from every other server. They only grab data from other servers when their users are subscribed to specific communities. I also suspect they don't grab all historical data automatically (though I don't know how much they do grab by default).

Right now there's no migration tool for when instances shut down, but it should be technically possible; someone just needs to implement it.
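A toy model of the "only subscribed communities" point, as a hedged Python sketch. The `Instance` class, its fields, and the community names are hypothetical and only illustrate the idea; this is not how Lemmy's code is organized:

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical small instance that stores only what its users follow."""
    name: str
    # community name -> set of local users subscribed to it
    subscriptions: dict[str, set[str]] = field(default_factory=dict)
    # (community, post) pairs actually kept on disk
    stored: list[tuple[str, str]] = field(default_factory=list)

    def subscribe(self, user: str, community: str) -> None:
        self.subscriptions.setdefault(community, set()).add(user)

    def receive(self, community: str, post: str) -> bool:
        """Keep the post only if at least one local user follows the community."""
        if self.subscriptions.get(community):
            self.stored.append((community, post))
            return True
        return False


small = Instance("single-user.example")
small.subscribe("alice", "nostupidquestions")
small.receive("nostupidquestions", "Can the fediverse scale?")  # stored
small.receive("beans", "Bean post")                             # ignored
print(len(small.stored))
```

Under this assumption, a single-user instance's storage grows with what that one user follows, not with the size of the whole network.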

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago) (1 children)

It doesn't get everything. At least it's not what kbin is doing, and I expect lemmy is the same. How it works for kbin is, once you subscribe you will start getting info about ANY change. A like, comment or anything like that.

So, someone likes a comment, anywhere on any instance. We'll get sent that like. But we don't have the comment. So, we fetch the comment. But, wait this one was a comment on a previous comment. So we'll fetch that too. All the way to the post. Also any users involved (the liker, and commenters back to original poster) will also be fetched. The hierarchy to the comment that was liked is then built, and then the like itself is applied.

That's why you will start to see a lot of old posts, but not all of them. It's just going to slowly build up over time as people interact and of course you'll get anything new.
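The backfill described above can be sketched roughly as follows. This is a simplified Python illustration, not kbin's actual code: `remote` is a plain dict standing in for HTTP fetches from the origin instance, the object shape (`id`, `parent`) is made up, and fetching the users involved is left out. It also stops walking once it reaches an object the local store already has.

```python
def backfill_for_like(local_store: dict, remote: dict, liked_id: str) -> list[str]:
    """Fetch the liked object plus any missing ancestors, then apply the like.

    Returns the ids that had to be fetched, root-most first.
    """
    missing = []
    current = liked_id
    # Walk up the reply chain until we reach something we already have
    # (or the top-level post, whose parent is None).
    while current is not None and current not in local_store:
        obj = remote[current]  # stand-in for a network fetch (assumption)
        missing.append(obj)
        current = obj["parent"]

    fetched = []
    # Store parents before children so the hierarchy is always valid.
    for obj in reversed(missing):
        local_store[obj["id"]] = {**obj, "likes": 0}
        fetched.append(obj["id"])

    # Only now can the like itself be applied.
    target = local_store[liked_id]
    target["likes"] = target.get("likes", 0) + 1
    return fetched


remote = {
    "post1": {"id": "post1", "parent": None},
    "c1": {"id": "c1", "parent": "post1"},
    "c2": {"id": "c2", "parent": "c1"},
}
local: dict = {}
print(backfill_for_like(local, remote, "c2"))  # fetches post1, c1, c2
print(backfill_for_like(local, remote, "c2"))  # nothing left to fetch
```

The second call shows why old threads fill in gradually: once the chain is stored, later interactions only add the new activity.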

Even with this I'm getting a LOT of content right now delivered.

EDIT: Not sure why I posted without finishing typing before.

[–] [email protected] 1 points 1 year ago

I see, that clears up a lot, thank you! I just hope that Lemmy is, as you suspect, doing the same as kbin.

[–] [email protected] 10 points 1 year ago (2 children)

I guess my answer is, who cares? Don't treat a social media account as some immortal time capsule of your life. Keep a photo album, write some diary entries, but don't rely on any form of social media to be the historical record of your existence. If it's important, keep it somewhere you can ensure its preservation.

I'm pretty sure the world will continue long after we've forgotten beans and not pooping for X days.

[–] [email protected] 8 points 1 year ago (1 children)

I needed to be reminded of this, thanks.

Still, Reddit is probably the biggest and most accessible source of information in the world, written out of passion by people, experts, professors, neckbeards... trolls... uni students, researchers,

and I wish Lemmy could also become the archive that Reddit is, but if information has a high likelihood to get lost with time, why bother? It should then really only be treated as a very temporary social media which is... okay, I guess.

[–] [email protected] 1 points 1 year ago

Everything is temporary. Nothing is permanent. Embrace it and live in the now.

[–] [email protected] 6 points 1 year ago (1 children)

I think people need to be reminded of two big things when it comes to Lemmy:

  1. It is impermanent. Not intentionally, I'm sure most instances will try to keep all the posts for as long as possible. But we're just hosting this stuff on independent servers (also known as "somebody else's computer") and we can't rely on them to stay online forever.

  2. Lemmy is NOT PRIVATE. You cannot delete your posts, and this is by design. You can edit them, but there's an edit history, and even if there wasn't, it would be impossible to ensure that the old versions of your posts aren't stored on some random, rarely used instance. There is no big man in charge like Mark Zuckerberg that you can sue to delete your data. If you want to use Lemmy privately, DON'T POST YOUR PERSONAL INFO. Don't post things that can be used to identify you. This is a public forum. Treat it like one. If you don't like that, go somewhere else.

Sorry, #2 is kinda off topic, but I see a lot of confusion about what Lemmy is and isn't.

[–] [email protected] 1 points 1 year ago

Thanks, I'm also definitely confused about what Lemmy is and isn't. This clears up a lot.

[–] [email protected] 8 points 1 year ago

I’ve been wondering about that too. Specifically how long can old posts persist, and how long are instances compelled to host older content especially from other servers.

[–] [email protected] 7 points 1 year ago

The Fediverse needs to encourage different instances. It’s the only way it can work. It has the technical framework to do it and for it to be transparent to the end user, but I feel like it’s not there yet.

For example, I think users should be strongly encouraged to choose regional instances instead of lemmy.world (I know, I know, ironic coming from me). It should be the default and require the user to go out of their way to select a different instance. It should also be concisely explained that your instance doesn’t matter and that you can see any other federated instance. Yes, this is not always true, but it doesn’t matter to someone just joining. Let them get here first and then they’ll naturally learn about the intricacies. Don’t scare them away at the gates.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (2 children)

I imagine the devs aren’t worried about this yet.

Long-term, I imagine that archiving and culling old data would maybe make sense.

Maybe one day there will be a Lemmy-only equivalent of archive.org.

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago) (1 children)

you can trigger https://web.archive.org/ to "Save Page Now"

It says below the input URL box "Capture a web page as it appears now for use as a trusted citation in the future."

[–] [email protected] 3 points 1 year ago

archive.org also has an extension that automatically scrapes webpages that haven't been downloaded in 90/60/30/7days/24hrs

[–] [email protected] 2 points 1 year ago

Mastodon already has something to purge posts older than X days. And that's with relatively small snippets of text, generally. For any kind of long-term running on small nodes, it'll be essential to automate removal of posts to avoid blowing up the host's disk space. There's also the possibility of 'purge after X period of time with no activity', which would make sense for something like this, where things turn into long discussions more often than they do with a toot/tweet.
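The two purge rules mentioned above (older than X days, or no activity for X days) could look something like this. It's a hedged Python sketch with hypothetical names and thresholds, not Mastodon's or Lemmy's actual retention code:

```python
from datetime import datetime, timedelta
from typing import Optional


def should_purge(
    created: datetime,
    last_activity: datetime,
    now: datetime,
    max_age: Optional[timedelta] = None,
    max_idle: Optional[timedelta] = None,
) -> bool:
    """Return True if a post falls outside the configured retention rules.

    max_age:  purge anything created longer ago than this.
    max_idle: purge anything with no activity for longer than this.
    A rule left as None is not applied.
    """
    if max_age is not None and now - created > max_age:
        return True
    if max_idle is not None and now - last_activity > max_idle:
        return True
    return False


now = datetime(2023, 7, 4)
created = datetime(2023, 1, 1)        # an old thread...
last_activity = datetime(2023, 7, 1)  # ...that is still being discussed

print(should_purge(created, last_activity, now, max_age=timedelta(days=90)))
print(should_purge(created, last_activity, now, max_idle=timedelta(days=30)))
```

With the age rule the still-active thread is purged; with the inactivity rule it is kept, which is the behaviour that suits long-running discussions.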

[–] [email protected] 4 points 1 year ago (1 children)

It doesn't even work right: lemmy.ca and beehaw.org are not synchronized even though they federate with each other, and it's certainly the same with other instances.

[–] [email protected] 1 points 1 year ago

Yes, I've noticed that on Kbin.social, there are fewer visible comments than on Lemmy.world. It looks like quite a few comments aren't being federated properly.

[–] [email protected] 1 points 1 year ago (1 children)

Why would an instance annex another instance? I can't imagine the pain in trying to merge two databases like that, then handling the changing of the address.

If anything, I expect a single or a collection of larger instances could choose to defederate from the larger federation.

[–] [email protected] 1 points 1 year ago (1 children)

My bad. By annexing, I meant downloading posts from other instances.

[–] [email protected] 1 points 1 year ago

It is basically like when a person goes to a webpage. If anything, it is more likely for a group of smaller instances to take down a big instance hosting tons of communities.
