Welcome to Issue #17 of One Minute AI, your daily AI news companion. This issue covers extended-context versions of Llama-3 released by Gradient and Abacus.AI.
Gradient releases the Llama-3 8B Gradient Instruct 1048k model
Gradient has released its Llama-3 8B Gradient Instruct 1048k model, which extends Llama-3 8B's context length from 8K tokens to over 1048K.
The model was trained with the EasyContext Blockwise RingAttention library, which makes training on contexts of up to 1048K tokens scalable and efficient, on Crusoe Energy's high-performance L40S cluster.
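If you want to try the model yourself, a minimal sketch along these lines should work with the Hugging Face transformers library, assuming the repo id gradientai/Llama-3-8B-Instruct-Gradient-1048k (check the model card for the exact id) and enough GPU memory for the context you feed it:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo id; verify against the model card.
model_id = "gradientai/Llama-3-8B-Instruct-Gradient-1048k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # long contexts need plenty of memory
    device_map="auto",
)

# A long-document prompt; the extended model accepts inputs far beyond
# the original 8K-token window, memory permitting.
prompt = "Summarize the following report:\n" + "..."  # your long document here
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
))
```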
Abacus.AI presents its longer-necked variant of Llama-3 70B
Abacus.AI has released an extended version of Llama-3 70B with an effective context length of approximately 128k.
The training methodology combines PoSE (Positional Skip-wise Training) with dynamic-NTK interpolation; the model was trained on roughly 1B tokens across eight H100 GPUs with DeepSpeed ZeRO Stage 3.
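Dynamic-NTK interpolation rescales the rotary (RoPE) base frequency as the input grows past the pretraining window, so position encodings are interpolated rather than extrapolated. Here is a minimal sketch of that rescaling rule, following the variant implemented in Hugging Face transformers' dynamic NTK rotary embedding; the scaling factor and sequence length below are illustrative, and PoSE itself is a separate fine-tuning recipe not shown here:

```python
def dynamic_ntk_base(base: float, dim: int, seq_len: int,
                     orig_max_pos: int, factor: float) -> float:
    """Return the rescaled RoPE base frequency for a given sequence length."""
    if seq_len <= orig_max_pos:
        return base  # within the pretraining window: no change
    scale = (factor * seq_len / orig_max_pos) - (factor - 1)
    return base * scale ** (dim / (dim - 2))

# Example with Llama-3's defaults (rope base 500000, head dim 128,
# 8192-token pretraining window); factor=4.0 is an illustrative value.
print(dynamic_ntk_base(500_000.0, 128, 131_072, 8_192, factor=4.0))
```

The key property is that the base only changes once the input exceeds the original window, which is why this variant is called "dynamic": short inputs keep the exact pretraining position encodings.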
Want to help?
If you liked this issue, help spread the word and share One Minute AI with your peers and community.
You can also share feedback, as well as news from the AI world that you’d like to see featured, by joining our chat on Substack.