Lobste.rs


RSS Feed of lobste.rs

founded 1 month ago
Demystifying secure NFS (blogsystem5.substack.com)
submitted 1 day ago by [email protected] to c/[email protected]
 
 

Abstract: Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding Benchmark (MTEB). MTEB spans 8 embedding tasks covering a total of 58 datasets and 112 languages. Through the benchmarking of 33 models on MTEB, we establish the most comprehensive benchmark of text embeddings to date. We find that no particular text embedding method dominates across all tasks. This suggests that the field has yet to converge on a universal text embedding method and scale it up sufficiently to provide state-of-the-art results on all embedding tasks. MTEB comes with open-source code and a public leaderboard at this https URL.


Abstract: The concept of “type” has been used without a consistent, precise definition in discussions about programming languages for 60 years. In this essay I explore various concepts lurking behind distinct uses of this word, highlighting two traditions in which the word came into use largely independently: engineering traditions on the one hand, and those of symbolic logic on the other. These traditions are founded on differing attitudes to the nature and purpose of abstraction, but their distinct uses of “type” have never been explicitly unified. One result is that discourse across these traditions often finds itself at cross purposes, such as overapplying one sense of “type” where another is appropriate, and occasionally proceeding to draw wrong conclusions. I illustrate this with examples from well-known and justly well-regarded literature, and argue that ongoing developments in both the theory and practice of programming make now a good time to resolve these problems.
