this post was submitted on 21 Dec 2023
273 points (97.6% liked)

Fediverse

A community to talk about the Fediverse and all its related services using ActivityPub (Mastodon, Lemmy, KBin, etc).


Hey everyone,

This isn't an announcement, I just wanted people's thoughts on this.

I think everyone knows searching the fediverse can be better. Googling doesn't work too well, etc. So I wanted to do my part and help out.

Indexing all posts, etc. is quite a lot to handle, so I wanted to start small and just focus on video search. I've started indexing videos from PeerTube and other video websites. (Even YouTube, but this could be removed to just focus on independent sites.)

I know PeerTube has their own search engine for videos. I will be reaching out to them. Compared to theirs, I'm planning for my site to have other video sources and be easier to use.

So that leads to feedback from you guys.

  • What do you think about indexing videos posted on the fediverse and other independent platforms?
  • Are there similar services?
  • Am I just wasting my time?
[–] [email protected] 33 points 8 months ago (3 children)

I found FediSearch, and also this post basically saying that a fediverse search engine would just be used as a tool by trolls.

[–] [email protected] 11 points 8 months ago

It's worth noting that since FediSearch, Mastodon has actually natively implemented opt-in search on posts.

[–] [email protected] 9 points 8 months ago (2 children)

That’s a good point. But those people can be banned? I guess Reddit handles this by moderation and archiving old posts.

[–] [email protected] 4 points 8 months ago

Yes, but moderation teams on the fediverse are very small, and by the nature of it, trolls can make hundreds of accounts on different servers, all trolling, that would each need to be individually sought out and banned.

It is a game of cat & 100 mice.

[–] [email protected] 1 points 8 months ago (1 children)

People will take the harassment off site, especially if they are dedicated enough, or use it to scrape for potential personal info to publicly release.

[–] [email protected] 12 points 8 months ago (1 children)

How is that different from Reddit? If trolls want to search and scrape and find information on people, they're going to. You can't put your information on the open Internet and not appreciate there's always a danger of that.

[–] [email protected] -2 points 8 months ago

There is a higher effort barrier if the trolls have to do all the scraping and sorting themselves than if they can just pop a term that is a right-wing lightning rod into search and get a list of targets.

[–] [email protected] 4 points 8 months ago (1 children)

That post wasn't claiming that a search engine would only be used by trolls; it was explaining that they shut down their project because a chunk of the fediverse thinks that and complain about any search engine projects. Discoverability is one of the network's biggest challenges and a search engine could really help with that.

[–] [email protected] 4 points 8 months ago (1 children)

Yes, not only used by trolls, but it would be a tool that could be leveraged by trolls. And I think the fediverse makes it easier to establish instances for marginalized groups, but it also has more admins who just don't want trolls, because nobody here is making $ off them like the corporate socials are. If someone adds search that tries to vacuum up everyone's posts in the fediverse and make them easily sortable/targetable without instance admins' permission, then that isn't cool. If someone is running a general instance that covers nothing a troll could latch onto and wants the instance catalogued and searchable, then that's fine by me. I don't think bots should be doing that to the fediverse as a whole without admin permission, though.

[–] [email protected] 2 points 8 months ago (1 children)

I don't think an admin's permission has anything to do with it. If you post publicly on the fediverse, your posts are public. You should have the option to opt out of any indexing (just like you do for the rest of the open web). But saying it's OK for you to read this post if it happens to come across your feed, but you shouldn't be allowed to find it via a search, is ridiculous. Users get to make the choice with each post whether it's public or not, but they don't get to control how people consume those public posts.

[–] [email protected] 1 points 8 months ago (1 children)

Reading a post and having a bot thrashing a server indexing everything are two different things. If a user used the site like that, they would be throttled and, if they did it again afterwards, banned. It is also one thing to read/interact with a site as that adds value to the site as a whole. A bot that just mass-hits links, cataloguing everything, is just a strain on the server that an admin needs to support, with no upside for the instance, since it's a bot ingesting content and no real interaction actually took place.

[–] [email protected] 1 points 8 months ago (1 children)

"and having a bot thrashing a server indexing everything"

This is a completely separate argument and one that we already have mechanisms for. Servers can use status codes and headers to warn about rate limits and block offenders.
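To make the rate-limit point concrete, here is a rough sketch of how a polite indexer might react to those signals. It assumes the server answers with 429 or 503 and an optional `Retry-After` header given in seconds; `retry_delay` and the 60-second default are hypothetical names and values picked for illustration, not anything a particular fediverse server prescribes.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str],
                default: float = 60.0) -> Optional[float]:
    """Return how many seconds to wait before retrying, or None if no back-off is needed."""
    # 429 (Too Many Requests) and 503 (Service Unavailable) are the usual
    # "slow down" signals; any other status is treated as "carry on".
    if status not in (429, 503):
        return None
    # Assumption: the header mapping uses the canonical "Retry-After" spelling.
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        return max(0.0, float(value))
    except ValueError:
        # Retry-After may also be an HTTP date; this sketch does not parse
        # dates and simply falls back to the default delay.
        return default
```

A crawler would sleep for the returned number of seconds before touching that instance again, and a server that still gets hammered can escalate to blocking.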

"It is also one thing to read/interact with a site as that adds value to the site as a whole"

A search index adds value as well; that's why this keeps coming up. And, again, there are existing mechanisms to handle this. A robots.txt file can indicate you don't want to be crawled, and offenders can be IP blocked.
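As a small illustration of the robots.txt mechanism, this sketch uses Python's standard `urllib.robotparser` to check whether a crawler is allowed to fetch a URL. The bot name `ExampleVideoIndexer`, the host `example.social`, and the rules themselves are made up for the example; a real crawler would download the instance's actual robots.txt first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt an instance admin might publish: one named
# indexer is blocked entirely, everyone else is only kept out of /private/.
EXAMPLE_ROBOTS = """\
User-agent: ExampleVideoIndexer
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, `may_crawl(EXAMPLE_ROBOTS, "ExampleVideoIndexer", "https://example.social/videos/1")` is `False`, while the same URL is allowed for a bot that is not named in the file. Compliance is voluntary, which is why IP blocking remains the fallback for crawlers that ignore it.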

[–] [email protected] 1 points 8 months ago

Shouldn't a dedicated search engine use/index ActivityPub instead of the HTML interface?

If so, instances can simply defederate from search engine instances. So the point you are trying to make still holds.