Machine Learning - Learning/Language Models

10 readers

1 users here now

Discussion of models, thier use, setup and options.

Please include models used with your outputs, workflows optional.

We follow Lemmy’s code of conduct.

Communities

Useful links

founded 1 year ago

MODERATORS

[email protected]

1

3

“Large Language Models (in 2023)” (Talk by Hyung Won Chung, OpenAI, at Seoul National University) (www.youtube.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

2

1

LLM Finetuning Risks (llm-tuning-safety.github.io)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

3

2

Phi 1.5 and the Shift Towards Smaller Models with Curated Data: A Closer Look (medium.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

4

2

Stanford Online -Statistical Learning (www.youtube.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

5

3

Stanford University Lecture Collection | Convolutional Neural Networks (www.youtube.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

6

3

Stanford CS229: Machine Learning (www.youtube.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

7

2

MIT 6.S191: Introduction to Deep Learning (www.youtube.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

8

3

Carnegie Mellon University Deep Learning (11785 Fall 2022 Lectures) (www.youtube.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

9

3

Applied Machine Learning (Cornell Tech CS 5787, Fall 2020) (www.youtube.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

10

2

DeepMind x UCL | Reinforcement Learning Course 2018 (www.youtube.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

11

4

Chinchilla’s Death (espadrine.github.io)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

12

1

Is AI lying to us? These researchers built an LLM lie detector of sorts to find out (www.zdnet.com)

submitted 10 months ago by [email protected] to c/[email protected]

1 comments fedilink

13

3

Meet Mistral 7B, Mistral’s first LLM that beats Llama 2 (dataconomy.com)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

14

4

Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation (arxiv.org)

submitted 10 months ago by [email protected] to c/[email protected]

0 comments fedilink

Abstract

We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e.g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA.jl, AMDGPU.jl). We built upon our previous work that is based on the OpenAI Codex, which is a descendant of GPT-3, to generate similar kernels with simple prompts via GitHub Copilot. Our goal is to compare the accuracy of Llama-2 and our original GPT-3 baseline by using a similar metric. Llama-2 has a simplified model that shows competitive or even superior accuracy. We also report on the differences between these foundational large language models as generative AI continues to redefine human-computer interactions. Overall, Copilot generates codes that are more reliable but less optimized, whereas codes generated by Llama-2 are less reliable but more optimized when correct.

15

1

The Secret Ingredient of ChatGPT Is Human Advice (archive.ph)

submitted 11 months ago by [email protected] to c/[email protected]

1 comments fedilink

Original (pay-walled): https://www.nytimes.com/2023/09/25/technology/chatgpt-rlhf-human-tutors.html

16

4

DALL·E 3 (openai.com)

submitted 11 months ago by [email protected] to c/[email protected]

0 comments fedilink

17

2

Efficient Fine-Tuning for Llama-v2-7b on a Single GPU (www.youtube.com)

submitted 11 months ago by [email protected] to c/[email protected]

0 comments fedilink

18

3

Introducing Refact Code LLM: 1.6B State-of-the-Art LLM for Code that Reaches 32% HumanEval (refact.ai)

submitted 11 months ago by [email protected] to c/[email protected]

0 comments fedilink

19

2

Meta Is Developing a New, More Powerful AI System as Technology Race Escalates (archive.ph)

submitted 11 months ago by [email protected] to c/[email protected]

1 comments fedilink

Original (pay-walled): https://www.wsj.com/tech/ai/meta-is-developing-a-new-more-powerful-ai-system-as-technology-race-escalates-decf9451

20

2

Beating GPT-4 on HumanEval with a Fine-Tuned CodeLlama-34B (www.phind.com)

submitted 11 months ago by [email protected] to c/[email protected]

0 comments fedilink

21

1

Introducing Code Llama, a state-of-the-art large language model for coding (ai.meta.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

22

3

Seamless Communication - Meta AI (ai.meta.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

23

4

GPT-4 Can’t Reason (medium.com)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]

4 comments fedilink

Corresponding arXiv preprint: https://arxiv.org/abs/2308.03762

24

5

Meta’s Next AI Attack on OpenAI: Free Code-Generating Software (www.theinformation.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

25

5

Google’s Search AI Is Absolutely Horrible at Geography (futurism.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink