this post was submitted on 10 Nov 2024
84 points (98.8% liked)

Selfhosted


If you think this post would be better suited in a different community, please let me know.


Topics could include (this list is not intended to be exhaustive; if you think something is relevant, please don't hesitate to share it):

  • Moderation
  • Handling of illegal content
  • Server structure (system requirements, configs, layouts, etc.)
  • Community transparency/communication
  • Server maintenance (updates, scaling, etc.)

Cross-posts

  1. https://sh.itjust.works/post/27913098
[–] [email protected] 39 points 1 week ago (6 children)

We require applications, and most applications we get are extremely low effort and we don't approve them. If you have open registrations you'll be doing a lot of moderation for spam.

Run the software that scans images for CSAM. It's not perfect but it's something. If your instance freely hosts whatever without any oversight, word will spread and all of a sudden you're hosting all sorts of bad stuff. It's not technically illegal if you don't know about it, but I personally don't want anything to do with that.

[–] [email protected] 15 points 1 week ago (1 children)

I will add that if you have open registrations you will be a target for spam and trolls, and if you don't take quick action then some other instances are likely to defederate from your instance.

This depends on the instance: some will have a low tolerance and defederate pretty quickly, some will defederate temporarily until the spammers or trolls move to a different instance, and some won't care. But you likely won't know it's happened unless you notice you aren't getting content from that instance anymore.

One other thing is that if you're going to run an instance and aren't already on Matrix, make an account. It's how instance admins tend to keep in contact with each other.

[–] Kalcifer 9 points 1 week ago* (last edited 1 week ago)

[...] if you’re going to run an instance and aren’t already on Matrix, make an account. It’s how instance admins tend to keep in contact with each other.

This is good advice.

[–] Kalcifer 12 points 1 week ago (1 children)

Run the software that scans images for CSAM.

Which software is that?

[–] [email protected] 16 points 1 week ago (2 children)

It's called Lemmy-Safety or Fedi-Safety depending on where you look.

One thing to note, I wasn't able to get it running on a VPS because it requires some sort of GPU.

[–] Kalcifer 10 points 1 week ago (2 children)

One thing to note, I wasn’t able to get it running on a VPS because it requires some sort of GPU.

This is good to know. I know that you can get a VPS with a GPU, but they're usually rather pricey. I wonder if there's one where the GPUs are shared, and you only get billed by how much the GPU is used. So if there is an image upload, the GPU would kick on to check it, you get billed for that GPU time, then it turns off and waits for the next image upload.

[–] [email protected] 5 points 1 week ago (2 children)

I don't think there are services like that, since usually this means deploying and destroying an instance, which takes a few minutes (if you just turn off the instance you still get billed).
Probably the best option would be to have a snapshot, which costs way less than the actual instance, and create from it each day or so to run on the images since it was last destroyed.

This is kind of what I do with my media collection, I process it on my main machine with a GPU, and then just serve it from a low-power one with Jellyfin.

[–] Kalcifer 2 points 1 week ago* (last edited 1 week ago) (1 children)

create from it each day or so to run on the images since it was last destroyed.

Unfortunately, for this use case, the GPU needs to be accessible in real time; there is a 10-second window when an image is posted for it to be processed [1].

References

  1. "I just developed and deployed the first real-time protection for lemmy against CSAM!". @[email protected]. [email protected]. Divisions by zero. Published: 2023-09-20T08:38:09Z. Accessed: 2024-11-12T01:28Z. https://lemmy.dbzer0.com/post/4500908.
    • §"For lemmy admins:"

      [...]

      • fedi-safety must run on a system with GPU. The reason for this is that lemmy provides just a 10-seconds grace period for each upload before it times out the upload regardless of the results. [1]

      [...]

[–] [email protected] 3 points 1 week ago* (last edited 1 week ago) (1 children)

You can actually run it in async mode without pictrs-safety and just have it scan your newly uploaded images directly from storage. This way it doesn't prevent the upload; it just deletes the images afterwards.
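
A rough sketch of what such an asynchronous pass could look like, for illustration only. This is not fedi-safety's actual code: `scan_new_images`, the `is_flagged` callback and the flat storage directory are all hypothetical stand-ins. It walks a storage directory, checks only files modified since the previous pass, and deletes the ones the classifier flags.

```python
from pathlib import Path
from typing import Callable


def scan_new_images(storage: Path, since: float,
                    is_flagged: Callable[[Path], bool]) -> list[Path]:
    """Delete files under `storage` modified after `since` that get flagged.

    Returns the deleted paths. `is_flagged` is a hypothetical stand-in for
    whatever GPU-backed classifier actually performs the check.
    """
    deleted = []
    # sorted() materializes the listing first, so deleting while looping is safe.
    for path in sorted(storage.rglob("*")):
        if not path.is_file() or path.stat().st_mtime <= since:
            continue  # skip directories and files covered by an earlier pass
        if is_flagged(path):
            path.unlink()  # the upload already succeeded; remove it after the fact
            deleted.append(path)
    return deleted
```

Because the scan happens after the upload has completed, nothing here has to finish inside the 10-second upload window; the trade-off is that a flagged image exists on disk until the next pass.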

[–] Kalcifer 1 points 6 days ago (1 children)

You're referring to using only fedi-safety instead of pictrs-safety, as was mentioned in §"For other fediverse software admins", here, right?

[–] Kalcifer 1 points 1 week ago

Probably the best option would be to have a snapshot

Could you point me towards some documentation so that I can look into exactly what you mean by this? I'm not sure I understand the exact procedure that you are describing.

[–] [email protected] 1 points 1 week ago (1 children)

The software is set up in such a way that you can run it on your PC if you have a local GPU. It only needs like 2 GB of VRAM.

[–] Kalcifer 1 points 1 week ago* (last edited 1 week ago) (1 children)

That is a cool feature, but that would mean that all of the web traffic would get returned to my local network (assuming that the server is set up on a remote VPS), which I really don't want to have happen. There is also the added downtime potential caused by the added point of failure of the GPU being hosted in a much more volatile environment (i.e. not, for example, a tier 3 data center).

[–] [email protected] 2 points 1 week ago (3 children)

Not all web traffic, just the images to check. With any decent bandwidth, it shouldn't be an issue for most. It's also set up in such a way as to not cause downtime if the checker goes down.

[–] Kalcifer 1 points 1 week ago (1 children)

With any decent bandwidth, it shouldn’t be an issue for most.

It's not only the bandwidth; I just fundamentally don't relish the idea of public traffic being directed to my local network.

[–] [email protected] 2 points 1 week ago (1 children)

You don't get public traffic redirected. It's not how it works

[–] Kalcifer 1 points 6 days ago* (last edited 6 days ago) (1 children)

Yeah, that was poor wording on my part — what I mean to say is that there would be unvetted data flowing into my local network and being processed on a local machine. It may be overparanoia, but that feels like a privacy risk.

[–] [email protected] 1 points 6 days ago (1 children)

I don't see how it's a privacy risk since you're not exposing your IP or anything. Likewise the images are already uploaded to your servers, so there's no extra privacy risk for the uploader.

[–] Kalcifer 1 points 4 days ago (1 children)

"Security risk" is probably a better term. That being said, a security risk can also infer a privacy risk.

[–] [email protected] 1 points 4 days ago (1 children)

Why would it be a security risk?

[–] Kalcifer 1 points 3 days ago (1 children)

For clarity, I'm not claiming that it would, with any degree of certainty, lead to incurred damage, but the ability to upload unvetted content carries some degree of risk. For there to be no risk, fedi-safety/pictrs-safety would have to be guaranteed to be absolutely 100% free of any possible exploit, as well as the underlying OS (and maybe even the underlying hardware), which seems like an impossible claim to make, but perhaps I'm missing something important.

[–] [email protected] 1 points 3 days ago* (last edited 3 days ago) (1 children)

You mean an exploit payload embedded in an image, pwning a system that parses that image through Python's PIL? While nothing is ever 100% safe, you're more likely to be struck by lightning than to see this come to pass, and at that point you're at more security risk from using the internet at all.

[–] Kalcifer 1 points 12 minutes ago

I will preface by saying that I am not casting doubt on your claim, I'm simply curious: What is the rationale behind why it would be so unlikely for such an exploit to occur? What rationale causes you to be so confident?

[–] Kalcifer 1 points 1 week ago (1 children)

It's also set up in such a way as to not cause downtime if the checker goes down.

Oh? Would the fallback be that it simply doesn't do a check? Or perhaps it could disable image uploads if the checker is down? Something else? Presumably, this would be configurable.

[–] [email protected] 2 points 1 week ago

It stops doing checks. IIRC you can configure it, yes.
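
The two fallbacks asked about above (skip the check, or refuse uploads) can be sketched as a single flag. This is a minimal illustration, not pictrs-safety's real configuration or API: `allow_upload`, `checker` and `fail_open` are hypothetical names, and the checker is assumed to raise `ConnectionError` or `TimeoutError` when it is unreachable or too slow.

```python
from typing import Callable


def allow_upload(image: bytes, checker: Callable[[bytes], bool],
                 fail_open: bool = True) -> bool:
    """Return True if the upload should go through.

    `checker` returns True when the image is flagged and raises on outage
    or timeout. With fail_open=True an unavailable checker means the check
    is skipped; with fail_open=False uploads are refused instead.
    """
    try:
        return not checker(image)
    except (ConnectionError, TimeoutError):
        return fail_open


def checker_down(image: bytes) -> bool:
    # Simulates the GPU worker being offline.
    raise ConnectionError("scanner unreachable")


print(allow_upload(b"img", checker_down))                   # True: check skipped
print(allow_upload(b"img", checker_down, fail_open=False))  # False: upload refused
```

Failing open keeps the instance usable when the scanner is offline, at the cost of unscanned uploads during the outage.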

[–] Kalcifer 1 points 1 week ago

Not all web traffic, just the images to check.

Ah, yeah, my bad; this was a lack of clarity on my part. I meant all image traffic.

[–] [email protected] 3 points 1 week ago

https://github.com/db0/fedi-safety and the companion app https://github.com/db0/pictrs-safety which can be installed as part of your lemmy deployment in the docker-compose (or with a var in your ansible)

[–] Kalcifer 6 points 1 week ago

If your instance freely hosts whatever without any oversight, word will spread and all of a sudden you’re hosting all sorts of bad stuff. It’s not technically illegal if you don’t know about it, but I personally don’t want anything to do with that.

Yeah, this is my primary concern. I'm hoping that there are established best practices for handling the majority of this sort of unwanted content.

[–] Kalcifer 4 points 1 week ago (2 children)

If you have open registrations you’ll be doing a lot of moderation for spam.

Perhaps Captchas are sufficient?

[–] [email protected] 4 points 1 week ago

I just checked and we have that turned on, too.

We don't get a lot of applications. A couple per week, maybe.

[–] [email protected] 4 points 1 week ago* (last edited 1 week ago) (1 children)

The spam is not from bots, it's people being paid to spam. Captchas absolutely need to be turned on or else you get bots as well, but they don't stop the spam.

[–] Kalcifer 4 points 1 week ago (1 children)

The spam is not from bots, it's people being paid to spam.

Do you know any specific/official organizations that do this, and/or examples where it's occurred on Lemmy?

[–] [email protected] 3 points 1 week ago* (last edited 1 week ago) (1 children)

It's pretty random outside the Russian misinformation sites (which I haven't seen in a while, but they probably got better at hiding).

It's hard to give you a link because mods or admins remove the posts or ban the accounts pretty quickly most of the time. But there is a new spam account at least every day (I can think of at least two today. Edit: 4). They come in waves, so sometimes there are a whole bunch.

That's probably another thing you need to know. I'm on Lemmy.nz, you're on sh.itjust.works. If some new spam account signs up on Lemmy.world and posts to lemm.ee, then if it's removed by an admin on your instance it is only removed for people on your instance. Everyone else still sees it, as your instance is not hosting either the community or the user, so it can't federate out anything to deal with it. The lemm.ee instance could remove the post or comment with the spam in a way that federates out to other instances, but can't ban the user except on their own instance. Only the Lemmy.world instance can ban the user in a way that federates out to other instances. This is something you'll get a better understanding of over time.
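
Those rules can be condensed into two small functions. This is only a simplified model of the behaviour described in this comment, not Lemmy's actual federation code; `removal_scope` and `ban_scope` are hypothetical names.

```python
def removal_scope(acting_instance: str, community_host: str) -> str:
    """Where a post/comment removal takes effect.

    Only the instance hosting the community can federate the removal out.
    """
    return "federated" if acting_instance == community_host else "local only"


def ban_scope(acting_instance: str, user_home: str) -> str:
    """Where a user ban takes effect.

    Only the user's home instance can federate the ban out.
    """
    return "federated" if acting_instance == user_home else "local only"


# Spam account registered on lemmy.world, posting in a lemm.ee community:
print(removal_scope("lemmy.nz", "lemm.ee"))     # local only
print(removal_scope("lemm.ee", "lemm.ee"))      # federated
print(ban_scope("lemm.ee", "lemmy.world"))      # local only
print(ban_scope("lemmy.world", "lemmy.world"))  # federated
```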

Lemmy.world has a lot of help so they don't have issues, but often the spam will come from obscure instances while the admin is asleep and there is no backup, so every other instance has to remove the spam for their own instance. Then you have to work out how to mitigate that for your own instance when you are asleep. Most admins are pretty understanding that this is a hobby and don't expect everyone to be immediately available, but if you have open registrations then you are likely to be targeted more and need a better plan.

[–] Kalcifer 2 points 1 week ago* (last edited 1 week ago) (1 children)

If some new spam account signs up on Lemmy.world and posts to lemm.ee, then if it's removed by an admin on your instance it is only removed for people on your instance. Everyone else still sees it, as your instance is not hosting either the community or the user, so it can't federate out anything to deal with it. The lemm.ee instance could remove the post or comment with the spam in a way that federates out to other instances, but can't ban the user except on their own instance. Only the Lemmy.world instance can ban the user in a way that federates out to other instances.

This makes me think that we should maintain a community-curated blocklist in, for example, a Git repository. It could be a list of usernames, and/or a list of instances that are known to be spam, that gets updated as new accounts and instances are discovered. Then any instance owner can simply pull the most current version of the blocklist (this could even be done automatically). Once the originating instance blocks the malicious account, it can be removed from the list. This also gives those who have been blocked a centralized method to appeal the block (e.g. open an issue to create an appeal).

I would honestly have expected something like this to already exist. I think it's partly the purpose of Fediseer, but I'm not completely sure.

[–] [email protected] 2 points 1 week ago (5 children)

This makes me think that we should maintain a community-curated blocklist in, for example, a Git repository.

There would be a few problems I can think of with this approach. The first one is who controls it? Whoever that is, you haven't solved the issue, because instead of only the instance with the user being able to federate the ban, now only the maintainer of the Git repo can update the ban list.

If you have many people able to update the repo, then the issue becomes a question of how do you trust all these people to never, ever, ever get it wrong? If you ban a user and opt to remove all their content (which you should, with spam), then if you are automating this you end up with the issue of if anyone screws up then how do you get someone's account unbanned on all those instances? How do you get all their content restored, which is a separate thing and Lemmy currently provides no good way to do this. How do you ensure there are no malicious people with control of the repo but also have enough instances involved to make it worthwhile?

There is a chat room where instance admins share details of spam accounts, and it's about the best we have for Lemmy at the moment (it works quite well, really, because everyone can be instantly notified but also make their own decisions about who to ban or if something is spam or allowed on their instance - because it's pretty common that things are not black and white).

I would honestly have expected something like this to already exist. I think it’s partly the purpose of Fediseer, but I’m not completely sure.

Fediseer has a similar purpose but it's a little different. So far we have been talking about spam accounts set up on various instances, and the time it takes for those mods and admins to remove the spam. But what happens if instead of someone setting up a spam account on an existing instance, they instead create their own instance purely for spamming other instances?

Fediseer provides a web of trust. An instance receives a guarantee from another instance. That instance then guarantees another instance. It creates a web of trust starting from some known good instances. Then if you wish you can choose to have your lemmy instance only federate with instances that have been guaranteed by another instance. Spam instances can't guarantee each other, because they need an instance that is already part of the web to guarantee them, and instances won't do that because they risk their own place in the web if they falsely guarantee another instance (say, if one instance keeps guaranteeing new instances that turn out to be spam, they will quickly lose their own guarantee).

Fediseer actually goes further than this, allowing instances to endorse or censure other instances and you can set up your instance to only federate with instances that haven't been censured or defederate from instances that others have censured for specific reasons (e.g. "hate speech", "racism", etc).

It's quite a cool tool but doesn't help the original discussion issue of spam accounts being set up on legitimate instances.
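
The web-of-trust idea can be modelled as graph reachability. The sketch below is a simplified illustration and not Fediseer's implementation or API: `guaranteed_instances` and the example hostnames are made up. An instance counts as guaranteed if a chain of guarantees leads back to a known good root.

```python
from collections import deque


def guaranteed_instances(roots: set[str],
                         guarantees: dict[str, set[str]]) -> set[str]:
    """Instances reachable from the known-good roots via guarantee edges.

    `guarantees` maps a guarantor to the instances it vouches for.
    """
    trusted = set(roots)
    queue = deque(roots)
    while queue:
        guarantor = queue.popleft()
        for instance in guarantees.get(guarantor, set()):
            if instance not in trusted:
                trusted.add(instance)
                queue.append(instance)
    return trusted


web = {
    "root.example": {"a.example"},
    "a.example": {"b.example"},
    # Two spam instances vouching for each other, with no link to the web:
    "spam1.example": {"spam2.example"},
    "spam2.example": {"spam1.example"},
}
print(sorted(guaranteed_instances({"root.example"}, web)))
# ['a.example', 'b.example', 'root.example']
```

The two spam instances never become trusted because no edge leads to them from inside the web, which is the property described above.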

[–] Kalcifer 2 points 1 week ago

how do you trust all these people to never, ever, ever get it wrong?

The naively simple idea was that the banned user could open an appeal to get their name removed from the blocklist. Also, keep in mind that the community's trust in the blocklist is predicated on the blocklist being accurate.

[–] Kalcifer 2 points 1 week ago (1 children)

If you ban a user [...], then if you are automating this you end up with the issue of if anyone screws up then how do you get someone’s account unbanned on all those instances?

The idea would be that if they are automatically banned, then the removal of the user from the list would then cause them to be automatically unbanned. That being said, you did also state:

If you ban a user and opt to remove all their content (which you should, with spam)

How do you get all their content restored

To which I say that I hadn't considered that the content would be deleted 😜. I was assuming that the user would only be blocked, but their content would still be physically on the server — it would just be effectively invisible.
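
The automatic ban/unban idea boils down to a set difference between the published list and the bans this instance previously applied from it. A minimal sketch, assuming the blocklist is just a set of `user@instance` strings; `plan_sync` and the example accounts are hypothetical, and no real Lemmy API is called.

```python
def plan_sync(published: set[str], applied: set[str]) -> tuple[set[str], set[str]]:
    """Compare the published blocklist with bans applied from it earlier.

    Returns (to_ban, to_unban). Only bans that came from the list are in
    `applied`, so manual local bans are never lifted by a sync.
    """
    to_ban = published - applied
    to_unban = applied - published
    return to_ban, to_unban


published = {"spam1@a.example", "spam2@b.example"}
applied = {"spam2@b.example", "appealed@c.example"}
to_ban, to_unban = plan_sync(published, applied)
print(to_ban)    # {'spam1@a.example'}
print(to_unban)  # {'appealed@c.example'}
```

Pulling from several lists would just mean taking the union of their entries before calling `plan_sync`.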

[–] [email protected] 2 points 1 week ago (1 children)

Technically it is still there. However, when a user is banned, you can also choose to remove their content. You could choose not to, but then what's the point in automatically banning a spam account if you have to go and remove the spam posts yourself.

If you choose to remove them all, and you accidentally hit a real user, you'll remove all their posts and comments. Lemmy doesn't provide an easy way to restore the content. And although there are automated solutions, you come to the next problem of knowing which posts to restore. Many posts were removed by mods of communities, many were removed by the user themselves. You don't want to restore those items, instead you need to remember which you removed and restore only those ones - this is different functionality to Lemmy's option to remove all their content.

This actually exists in some form, there is an AutoMod that keeps a log of removed content for banned users and allows a restore of that content. So it's a solved problem, just would need a similar solution to be built for a ban list.
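
The "remember which ones you removed" part can be sketched as a small log keyed by user. This is an illustration of the idea only; it is not the AutoMod's code, and `RemovalLog` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class RemovalLog:
    """Remembers which items a ban removed, so only those get restored."""
    removed_by_ban: dict[str, set[int]] = field(default_factory=dict)

    def ban(self, user: str, visible_item_ids: set[int]) -> None:
        # Record only items still visible when the ban happened; items already
        # removed by community mods or by the user are not in this set.
        self.removed_by_ban[user] = set(visible_item_ids)

    def unban(self, user: str) -> set[int]:
        # Return the ids to restore; anything removed earlier stays removed.
        return self.removed_by_ban.pop(user, set())


log = RemovalLog()
# The user has items 1-4; item 3 was already removed by a mod before the ban.
log.ban("alice@a.example", visible_item_ids={1, 2, 4})
print(sorted(log.unban("alice@a.example")))  # [1, 2, 4]
```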

One thing you'll learn quickly is that Lemmy is version 0 for a reason.

[–] Kalcifer 2 points 1 week ago

One thing you’ll learn quickly is that Lemmy is version 0 for a reason.

Fair warning 😆

[–] Kalcifer 2 points 1 week ago (1 children)

Fediseer provides a web of trust. An instance receives a guarantee from another instance. That instance then guarantees another instance. It creates a web of trust starting from some known good instances. Then if you wish you can choose to have your lemmy instance only federate with instances that have been guaranteed by another instance. Spam instances can’t guarantee each other, because they need an instance that is already part of the web to guarantee them, and instances won’t do that because they risk their own place in the web if they falsely guarantee another instance (say, if one instance keeps guaranteeing new instances that turn out to be spam, they will quickly lose their own guarantee).

How would one get a new instance approved by Fediseer?

[–] [email protected] 2 points 1 week ago* (last edited 1 week ago)

First, don't stress over it. Most instances are not strict on only federating with guaranteed instances. Most do not auto-sync with Fediseer at all, and the ones that do are more likely to only be syncing censures (when other instances are reporting the instance as problematic).

To get guaranteed on Fediseer, you need another instance to guarantee you. If you start your instance, hang out in the spam defense chat, and are generally sensible with your instance, then you'll find someone willing to do it no problem. Guarantees are not a huge risk to an instance since they can also be revoked at any time. If someone guarantees you and then you start being a dick, they can just remove your guarantee. So it's not a big decision; people will be happy to guarantee someone who seems reasonable.

[–] Kalcifer 2 points 1 week ago (1 children)

The first one is who controls it?

Ideally, nobody. Anyone could make their own blocklist, and one could choose to pull from any of them.

[–] [email protected] 2 points 1 week ago (1 children)

I would like functionality similar to this. One problem with a big list is that different instances have different ideas over what is acceptable. I'd love to "subscribe" to, say, Lemmy.world's bans and then anyone they ban would get banned on my instance as well. Of course this makes a bigger mess to clean up when someone gets banned by mistake.

[–] Kalcifer 2 points 1 week ago

One problem with a big list is that different instances have different ideas over what is acceptable.

Yeah, that would be where being able to choose from any number of lists, or to freely create one comes in handy.

[–] Kalcifer 2 points 1 week ago

There is a chat room where instance admins share details of spam accounts, and it’s about the best we have for Lemmy at the moment (it works quite well, really, because everyone can be instantly notified but also make their own decisions about who to ban or if something is spam or allowed on their instance - because it’s pretty common that things are not black and white).

Yeah, I think I'm more on the side of this now. The chat is a decent and workable solution. It's definitely a lot more hands-on/manual, but I think it's a solid middle-ground solution for the time being.

[–] Kalcifer 2 points 1 week ago* (last edited 1 week ago) (2 children)

We require applications

Is this functionality built into the Lemmy software?


Addendum (2024-11-11T00:32Z):

Ah, yeah, it looks like it is configurable in the admin panel [1].

References

  1. Lemmy Documentation. join-lemmy.org. Accessed: 2024-11-11T00:35Z. https://join-lemmy.org/docs/users/01-getting-started.html#registration.
    • "2. Getting Started". §"Registration".

      Question/Answer: Instance admins can set an arbitrary question which needs to be answered in order to create an account. This is often used to prevent spam bots from signing up. After submitting the form, you will need to wait for some time until the answer is approved manually before you can login.

[–] [email protected] 4 points 1 week ago

Yeah, it's just something like "Tell us why you want to join this instance". If the answer is "to promote my content" or "qq", for example, they don't get approved.

It's done by the Lemmy software.

[–] [email protected] 1 points 1 week ago (1 children)

I would just turn off media uploads entirely. It's not worth the risk or disk space.

[–] Kalcifer 3 points 1 week ago* (last edited 1 week ago)

I would just turn off media uploads entirely.

Do you mean also disabling thumbnails? IIUC, pict-rs handles all thumbnail generation [1]. The reason I point this out is that simply disabling image uploads won't itself stop the generation of thumbnails [2]. There's also the question of storing/caching images that come from federated servers.

References

  1. Lemmy Documentation. Accessed: 2024-11-11T01:59Z. https://join-lemmy.org/docs/administration/administration.html.
    • "9. Administration". §"Lemmy Components". §"Pict-rs".

      Pict-rs is a service which does image processing. It handles user-uploaded images as well as downloading thumbnails for external images.

  2. "I just developed and deployed the first real-time protection for lemmy against CSAM!". @[email protected]. Published: 2023-09-20T01:38:09-07:00. Accessed: 2024-11-11T02:16Z. https://lemmy.dbzer0.com/post/4500908.
    • ¶1

      [...] if the content is a link to an external site, lemmy still caches the thumbnail and stores it in the local pict-rs [...].