this post was submitted on 18 Jul 2023
Machine Learning
This looks amazing, if true. The paper claims state-of-the-art results on literally every metric. Even in their ablation study, the model outperforms all the others.
I'm a bit suspicious that they don't extend their perplexity numbers to the 13B model or provide the hyperparameters, even though they reference that model in the text and in their scaling table.
Code will be released in a week: https://github.com/microsoft/unilm/tree/master/retnet
Unofficial implementation: https://github.com/Jamie-Stirling/RetNet