AMD improves open source model with less training data


AMD has released its first open-source language model with one billion parameters. The model builds on the existing open-source OLMo architecture but uses significantly less training data.

While based on the same open-source architecture, AMD's OLMo differs from the original in key aspects. According to AMD, the model was trained with less than half of the training tokens used in the original OLMo. Still, it achieves comparable performance.

The three-stage development of AMD's OLMo 1B model shows its evolution from the base language model through chat optimization to the final alignment with human preferences. Each phase uses specific datasets to enhance the model's capabilities. | Image: AMD

AMD's version of OLMo went through a three-stage training process. In the first phase, the base model was trained with 1.3 trillion tokens across 16 server nodes, each equipped with four AMD Instinct MI250 GPUs.

The second phase involved two-step supervised fine-tuning with various datasets to improve capabilities in areas like science, programming, and mathematics. The third phase consisted of human preference alignment based on the UltraFeedback dataset.
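The UltraFeedback dataset used in the third phase consists of prompts paired with "chosen" and "rejected" responses, which Direct Preference Optimization (DPO) uses to align the model without training a separate reward model. As a rough illustration of the idea (this is the standard DPO loss formulation, not AMD's actual training code), the loss for a single preference pair can be sketched as:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    logp_* are the policy model's total log-probabilities of the chosen
    and rejected responses; ref_logp_* are those of the frozen reference
    model. beta controls how far the policy may drift from the reference.
    """
    # Implicit reward margins: how much more the policy prefers each
    # response than the reference model does.
    chosen_margin = logp_chosen - ref_logp_chosen
    rejected_margin = logp_rejected - ref_logp_rejected
    logits = beta * (chosen_margin - rejected_margin)
    # Negative log-sigmoid: small when the chosen response is favored.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Example log-probabilities (hypothetical values): the policy favors
# the chosen response more strongly than the reference does, so the
# loss falls below -log(0.5).
loss = dpo_loss(-12.0, -15.0, -13.0, -14.0, beta=0.1)
```

Minimizing this loss pushes the model to assign relatively higher probability to preferred responses while the reference term keeps it anchored to the fine-tuned starting point.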


Strong performance against competitors

According to AMD, the final OLMo model outperforms other open-source chat models in several benchmarks by an average of 2.6 percent.

The performance comparison of different LLM models shows notable improvements by AMD OLMo 1B, with gains of up to 6.36 percent in certain benchmarks. | Image: AMD

The two-phase supervised fine-tuning showed notable improvements: accuracy on MMLU increased by 5.09 percent, while GSM8k scores improved by 15.32 percent.

AMD says a key feature of OLMo is its compatibility with various hardware platforms. Beyond data center use, the model can run on laptops with AMD's Ryzen AI processors and integrated Neural Processing Units (NPUs).

The model, training data, and code are available on Hugging Face.

AMD's major AI investment push

The release of OLMo is part of AMD's broader AI strategy. The company reported in July that it invested over $125 million in a dozen AI companies over the past twelve months. Recently, AMD acquired Finnish AI company Silo AI for $665 million and open-source AI startup Nod.ai.


At the same time, AMD is advancing its specialized AI hardware. With the Instinct MI355X accelerator, announced for 2025, the company aims to compete directly with Nvidia.
