this post was submitted on 19 May 2025
112 points (99.1% liked)

Europe

News and information from Europe 🇪🇺

(Current banner: La Mancha, Spain. Feel free to post submissions for banner images.)

Rules (2024-08-30)

  1. This is an English-language community. Comments should be in English. Posts can link to non-English news sources when providing a full-text translation in the post description. Automated translations are fine, as long as they don't overly distort the content.
  2. No links to misinformation or commercial advertising. When you post outdated/historic articles, add the year of publication to the post title. Infographics must include a source and a year of creation; if possible, also provide a link to the source.
  3. Be kind to each other, and argue in good faith. Don't post direct insults or disrespectful and condescending comments. Don't troll or incite hatred. Don't look for novel argumentation strategies at Wikipedia's List of fallacies.
  4. No bigotry, sexism, racism, antisemitism, islamophobia, dehumanization of minorities, or glorification of National Socialism. We follow German law; don't question the statehood of Israel.
  5. Be the signal, not the noise: Strive to post insightful comments. Add "/s" when you're being sarcastic (and don't use it to break rule no. 3).
  6. If you link to paywalled information, please also provide a link to a freely available archived version. Alternatively, try to find a different source.
  7. Light-hearted content, memes, and posts about your European everyday belong in [email protected]. (They're cool, you should subscribe there too!)
  8. Don't evade bans. If we notice ban evasion, that will result in a permanent ban for all the accounts we can associate with you.
  9. No posts linking to speculative reporting about ongoing events with unclear backgrounds. Please wait at least 12 hours. (E.g., do not post breathless reporting on an ongoing terror attack.)
  10. Always provide context with posts: Don't post uncontextualized images or videos, and don't start discussions without giving some context first.

(This list may get expanded as necessary.)

Posts that link to the following sources will be removed

Unless they're the only sources, please also avoid The Sun, Daily Mail, any "thinktank" type organization, and non-Lemmy social media. Don't link to Twitter directly; use xcancel.com instead. For Reddit, use old.reddit.com.

(Lists may get expanded as necessary.)

Ban lengths, etc.

We will use some leeway to decide whether to remove a comment.

If need be, there are also bans: 3 days for lighter offenses, 7 or 14 days for bigger offenses, and permanent bans for people who don't show any willingness to participate productively. If we think the ban reason is obvious, we may not specifically write to you.

If you want to protest a removal or ban, feel free to write privately to the primary mod account @[email protected]

founded 10 months ago

After 2.5 years of intensive research and programming, the entire OpenWebSearch.eu project team is excited to grant access to the pilot of the first-ever federated pan-European Open Web Index (OWI).

From June onward, commercial and scientific development teams of any size as well as interested individuals are welcome to access and make use of almost a petabyte (and growing) of open web data under a general research license or – upon request – under a designated commercial license as well.

Given that the European Commission has launched the InvestAI initiative to mobilize €200 billion of investment in artificial intelligence, the Open Web Index comes with perfect timing.

The OpenWebSearch.eu consortium is calling on early adopters to pioneer innovative projects in vertical web search, argumentative search, LLM applications including RAG, and more.

“The OWI symbolizes a first step towards true European digital sovereignty and is a fundamental step in paving the way for a comprehensive open European AI landscape,” says Community Manager Ursula Gmelch, adding:

“Our goal behind this initial pilot phase is to onboard a range of projects from diverse domains to get early feedback in. We look forward to users confirming the quality and value in current functionalities and/or helping us pivot in such ways that real market demands can be met and further expanded upon.”

An official kick-off event will be hosted on 6 June from 10 am to 12 noon CEST via Zoom.

Registration for the event is open at the following link:

https://cscfi.zoom.us/meeting/register/eATIpDQ5TZidh4Jzkim6FQ#/registration

[...]

[email protected] 8 points 12 hours ago

This part of the FAQ makes the project interesting:

Services like Google’s Search Console allow website operators to optimize their search page for Google – thus Google crowdsources the robust parsing without making this information available to third parties.

A new search engine is at a disadvantage without that data. Website operators don't bother maintaining their information with a search engine nobody has heard of. Hopefully OWS becomes popular enough that operators use it, e.g. to indicate when their site needs a recrawl or which parts of their site should be indexed.
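
For context, operators can already publish some of these hints in a crawler-neutral way through robots.txt and a sitemap, without any per-engine console. Here is a rough Python sketch of how a crawler might read them. Everything in it is made up for illustration (the example.org URLs, the "ExampleBot" name, the file contents), and nothing here is taken from the OWS docs, so I don't know how their crawler actually handles this.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: says which parts of the site may be crawled.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /

Sitemap: https://example.org/sitemap.xml
"""

# Hypothetical sitemap: <lastmod> hints at when a page changed,
# which a crawler can use to decide whether a recrawl is worthwhile.
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.org/articles/1</loc>
    <lastmod>2025-05-19</lastmod>
  </url>
</urlset>
"""

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def build_robots_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def last_modified_by_url(sitemap_xml: str) -> dict[str, str]:
    """Map each <loc> in a sitemap to its <lastmod> value, if present."""
    root = ET.fromstring(sitemap_xml)
    result = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if loc and lastmod:
            result[loc.strip()] = lastmod.strip()
    return result


robots = build_robots_parser(ROBOTS_TXT)
allowed_article = robots.can_fetch("ExampleBot", "https://example.org/articles/1")
allowed_private = robots.can_fetch("ExampleBot", "https://example.org/private/notes")
sitemaps = robots.site_maps()
lastmods = last_modified_by_url(SITEMAP_XML)

print(allowed_article, allowed_private, sitemaps, lastmods)
```

These files only cover the "what to index" and "what changed" hints. The feedback loop a search console provides (crawl errors, parsing problems, explicit recrawl requests) has no equivalent in them, and that is the gap the FAQ is talking about.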