this post was submitted on 11 Jun 2023
2 points (100.0% liked)

Machine Learning - Theory | Research

Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks

Author(s) Tiedong Liu and Bryan Kian Hsiang Low, National University of Singapore

Word Count 6500+

Estimated Read Time 15-20 minutes

Source Code A GitHub repo is provided to access their model, dataset, and script for dataset generation: https://github.com/liutiedong/goat

Summary The authors introduce Goat, a fine-tuned LLaMA model that achieves state-of-the-art performance on a range of arithmetic tasks from the BIG-bench benchmark. In particular, the zero-shot Goat-7B matches or exceeds the accuracy of the few-shot PaLM-540B.

They show that supervised fine-tuning alone, without any special techniques, enables LLaMA to generate correct answers for large-number addition and subtraction. This is attributed to LLaMA's consistent tokenization of numbers, which splits every number into individual digit tokens.
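
To see what that tokenization looks like in practice, here is a minimal sketch using the Hugging Face transformers tokenizer. The model path is a placeholder and the snippet assumes you have the LLaMA tokenizer files locally; it is illustrative, not part of the authors' pipeline.

```python
# Minimal sketch: inspect how the LLaMA tokenizer splits numbers.
# Assumes the Hugging Face `transformers` library and a local copy of the
# LLaMA tokenizer; the path below is a placeholder.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-7b")

for text in ["1234 + 5678", "98765 - 4321"]:
    tokens = tokenizer.tokenize(text)
    print(text, "->", tokens)
    # LLaMA's SentencePiece tokenizer splits each number into individual
    # digit tokens, so every number maps to a consistent token sequence.
```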

For large-number multiplication and division, they propose a decomposition method based on task learnability: tasks the model cannot learn end-to-end are broken down into a series of learnable subtasks using basic arithmetic principles, and the model is fine-tuned to generate these intermediate steps.
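
To give a flavour of what such a decomposition can look like (this is an illustrative sketch, not the authors' exact recipe), the snippet below expands a multi-digit multiplication into partial products by place value plus step-by-step additions, the kind of intermediate supervision the model can be trained to emit:

```python
# Sketch (not the paper's exact recipe) of decomposing multi-digit
# multiplication into learnable subtasks: split one operand by place value,
# compute the partial products, then add them one step at a time.
def decompose_multiplication(a: int, b: int) -> str:
    # Split `b` into place-value components, e.g. 4429 -> [4000, 400, 20, 9].
    parts = [int(d) * 10 ** i for i, d in enumerate(reversed(str(b))) if d != "0"]
    parts.reverse()  # largest place value first

    steps = [f"{a} * {b} = {a} * ({' + '.join(str(p) for p in parts)})"]
    partials = [a * p for p in parts]
    steps.append(
        " + ".join(f"{a} * {p}" for p in parts)
        + " = "
        + " + ".join(str(q) for q in partials)
    )

    # Accumulate the partial products one addition at a time.
    total = partials[0]
    for q in partials[1:]:
        steps.append(f"{total} + {q} = {total + q}")
        total += q
    return "\n".join(steps)

print(decompose_multiplication(397, 4429))  # ends with ... = 1758313
```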

Goat-7B was trained using the LoRA technique on a modest 24GB GPU, making it easily reproducible. Limitations around extrapolation and interpretability of the proposed method are also discussed.
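
For readers who want to reproduce this kind of setup, here is a hedged sketch of attaching LoRA adapters to LLaMA-7B with the Hugging Face peft library. The hyperparameters and target modules are illustrative assumptions rather than the authors' exact configuration; see the linked repo for that.

```python
# Sketch of attaching LoRA adapters to LLaMA-7B with Hugging Face `peft`.
# Hyperparameters and target modules are illustrative assumptions, not the
# authors' exact settings; the model path is a placeholder.
from transformers import LlamaForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model = LlamaForCausalLM.from_pretrained("path/to/llama-7b")

lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor for the update
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type=TaskType.CAUSAL_LM,
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```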

The code, dataset, and model are released to facilitate research in instruction tuning and mathematical reasoning in language models.

Applicability Evaluation The research demonstrates how LLaMA's consistent tokenization facilitates arithmetic tasks and shows that intermediate supervision, via the decomposition method, can help solve more complex problems. These findings could be useful for building applications on top of large language models that require mathematical reasoning or multistep computation.

Specifically, the proposed instruction tuning pipeline can potentially be integrated with other instruction-tuned LMs to enhance their arithmetic reasoning for solving math word problems.

However, the fine-tuned model's limited extrapolation capability and the lack of an optimal decomposition method remain challenges that need to be addressed for real-world use.

top 2 comments
[–] ott 2 points 1 year ago (1 children)

I'm not an AI researcher or anything, but it seems like a "waste" to use the neural network itself to perform arithmetic. Instead, it would be far more efficient to somehow let the model use a normal calculator when it needs it.

[–] [email protected] 1 points 1 year ago

We could make an even more power-hungry blockchain!

I'd hope no one uses it for math, though it does speak to some of its reliability in planning.