this post was submitted on 17 Jun 2023
126 points (99.2% liked)

Lemmy.World Announcements

This Community is intended for posts about the Lemmy.world server by the admins.

Follow us for server news 🐘

Outages 🔥

https://status.lemmy.world

For support with issues at Lemmy.world, go to the Lemmy.world Support community.

founded 1 year ago

One of the side effects of the Reddit meltdown is that many search results became unavailable because communities went private. It would be great if we could fill the void with Lemmy content instead.

top 19 comments
[–] [email protected] 34 points 1 year ago

Yes it does!

[–] [email protected] 24 points 1 year ago (1 children)

I heard that Reddit has a dedicated CDN each for Microsoft and Google scraping. That's why they work so well for searching Reddit posts. It will probably take some effort to feed data as well from the fediverse.

On that note, perhaps we should have some per-community as well as per-post scrape/noscrape toggle. Might be difficult to get buy-in from all parties.

[–] [email protected] 6 points 1 year ago

Whether a community gets to opt out of being scraped depends on the scraper respecting robots.txt and/or the meta tag of the page.

Not all do, particularly the ones scraping for SEO purposes, so instances might need to add IP bans for scrapers that refuse to respect restrictions in those places.
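
As a rough illustration of the robots.txt side of this, here is a minimal sketch of how a well-behaved scraper would check whether it may fetch a page. The robots.txt content, the `lemmy.example` host, the community and post paths, and the `BadSeoBot` user agent are all made up for the example and are not taken from any real instance. It only shows the voluntary check; a scraper that ignores robots.txt would never run it, which is where IP bans come in.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, NOT the file served by any real Lemmy instance.
ROBOTS_TXT = """\
User-agent: *
Disallow: /c/private_community/
Disallow: /post/12345

User-agent: BadSeoBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if robots.txt allows this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# An opted-out community is off limits to every compliant crawler.
print(may_fetch(parser, "Googlebot", "https://lemmy.example/c/private_community/"))
# Other communities remain crawlable.
print(may_fetch(parser, "Googlebot", "https://lemmy.example/c/f1"))
# A crawler that has been singled out is disallowed everywhere.
print(may_fetch(parser, "BadSeoBot", "https://lemmy.example/c/f1"))
```

A per-post toggle would more likely use the page's robots meta tag than a long list of `Disallow` lines, but the principle is the same: it only works for scrapers that choose to honour it.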

[–] [email protected] 20 points 1 year ago

I've just tried a quick test using some popular queries and it looks as though communities are indexed but individual posts aren't? I agree, it would be nice to replace Reddit in this regard.

Maybe the above is only a temporary measure to help maintain server load?

[–] [email protected] 14 points 1 year ago (1 children)

One way to check is to do a search. e.g. for lemmy.world, google site:https://lemmy.world/
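
If you want to repeat that check for several instances, a small sketch like this builds the search URL for a `site:` query. The helper name `site_search_url` is just for illustration, and it assumes the usual `q` query parameter on Google's search page.

```python
from urllib.parse import urlencode


def site_search_url(domain: str, terms: str = "") -> str:
    """Build a Google search URL restricted to one domain via the site: operator."""
    query = f"site:{domain} {terms}".strip()
    return "https://www.google.com/search?" + urlencode({"q": query})


# Everything Google has indexed for one instance.
print(site_search_url("lemmy.world"))
# The same instance, narrowed to a topic.
print(site_search_url("lemmy.world", "F1"))
```

Opening the printed URL in a browser shows how many pages from that instance are indexed.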

[–] [email protected] 3 points 1 year ago

Already 906 results, with minimal SEO!

[–] [email protected] 13 points 1 year ago

Some Google searches already give me Lemmy posts, so it seems to work. I think indexing Lemmy posts takes more time, as I couldn't yet find my 'blog article' about hosting Lemmy on a Raspberry Pi, or the community where it was posted, through Google. But I was able to find older communities on Feddit.nl, so most posts probably can't be found yet simply because they are too new.

[–] [email protected] 11 points 1 year ago

At the same time, I see people making “news articles” using people’s Reddit posts. More people making money on our content.

[–] [email protected] 7 points 1 year ago* (last edited 1 year ago) (1 children)

I think Duck Duck Go has the ability to search.

[–] [email protected] 5 points 1 year ago (1 children)

I have searched for "F1 lemmy" in Google multiple times already, and the top results were links to the F1 Lemmy community.

It works.

[–] [email protected] 2 points 1 year ago (2 children)

Which instance are you reading for F1?

[–] [email protected] 2 points 1 year ago

Oh I want to know this too! Can't wait to get a bustling f1 community around here. I'm in [email protected] and @lemmy.ml

[–] [email protected] 3 points 1 year ago

Probably, but you're not seeing it because Lemmy is not large enough to have the same traction as Reddit.
