In collaboration with the University of Turku's research group TurkuNLP and the EU-funded HPLT project, Silo AI is releasing Viking 33B, a new, larger multilingual model covering all Nordic languages. Built with a customized, fully open-source training framework for LUMI, Viking 33B demonstrates a novel approach to training LLMs for low-resource languages while highlighting the capabilities of AMD compute platforms.
Following the release of Poro 34B, the final model in the Poro language model family, and the earlier releases of Viking 7B and Viking 13B in the Viking language model family, Silo AI and TurkuNLP from the University of Turku are now releasing the full 33-billion-parameter version of Viking.
Viking 33B focuses on low-resource languages without compromising English performance. It is trained on a dataset of 2 trillion tokens that includes material in Danish, Finnish, Norwegian, Icelandic, Swedish and several programming languages. Viking models are designed to handle text in English and the Nordic languages, and to perform basic translation between English and the Nordic languages. The Viking model family consists of base models that can be fine-tuned and instruction-tuned for a variety of tasks.
The model family is built on an updated, more modern architecture: Viking 33B uses an architecture similar to Llama 2, with flash attention, rotary embeddings, grouped query attention and support for a 4k sequence length.
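To make the architecture concrete, the sketch below shows how these features map onto a Llama-style configuration in Hugging Face transformers. The specific widths and depths are illustrative assumptions for a model of roughly this size, not the published Viking 33B hyperparameters.

```python
from transformers import LlamaConfig

# A minimal sketch of a Llama-2-style configuration with the features named
# above. All numeric values are illustrative assumptions, not the published
# Viking 33B hyperparameters.
config = LlamaConfig(
    hidden_size=7168,               # assumed width for a ~33B model
    num_hidden_layers=56,           # assumed depth
    num_attention_heads=56,         # query heads
    num_key_value_heads=8,          # grouped query attention: fewer KV heads than query heads
    max_position_embeddings=4096,   # 4k sequence length
    rope_theta=10000.0,             # rotary position embeddings
)
# Flash attention is selected at load time in transformers, e.g.
# AutoModelForCausalLM.from_pretrained(..., attn_implementation="flash_attention_2")
```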
TurkuNLP and Silo AI's collaboration focuses on developing linguistically high-performing models that are considerate of local values and cultures, thereby bringing more diversity to the sphere of LLMs. European technological competitiveness depends on strong digital infrastructure and a well-functioning innovation ecosystem. The initiative by Silo AI and TurkuNLP seeks to contribute to this by democratizing access to LLMs, accelerating the integration of the technology into product and service offerings.
Silo AI and TurkuNLP are committed to publishing checkpoints throughout the training process to provide transparency. The models of both the Viking and Poro families are fully open source and freely available under the Apache 2.0 License, making them practical tools for both commercial and research use.
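As an illustration of how such research checkpoints can be used, the sketch below loads an intermediate revision from the Hugging Face Hub. The repository id and revision tag here are assumptions; the official model card lists the identifiers actually published.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LumiOpen/Viking-33B"  # assumed repository id; check the official model card
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    revision="1000B",    # hypothetical tag for a checkpoint after 1000B training tokens
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # shard across available GPUs (requires accelerate)
)
```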
Viking is trained on LUMI, the most powerful supercomputer in Europe. Silo AI and TurkuNLP have previously demonstrated training on AMD hardware at scale, with scaling experiments validating theoretical throughput predictions on up to 4096 MI250X GPUs simultaneously. With the Viking family, they are training models using a new customized open-source training framework that utilizes up to 1024 MI250X GPUs simultaneously, proving the ability to train LLMs at scale.
Viking 33B:
Below is a summary of key features of the Viking model family covering English, Finnish, Swedish, Norwegian, Danish, Icelandic and code:
- Research Checkpoints: Silo AI and TurkuNLP are committed to publishing checkpoints to provide transparency throughout the model training process.
- Model architecture: Viking 33B uses an architecture similar to Llama 2, with flash attention, rotary embeddings, grouped query attention and support for a 4k sequence length.
- Model size: 33B parameters
- Multilingual capabilities: The model is designed to process English and the Nordic languages, and is proficient in a variety of programming languages. Additionally, it can perform basic translation between English and the Nordic languages (see the prompting sketch after this list).
- Dataset: The model is trained on a dataset of 2 trillion tokens spanning Danish, English, Finnish, Icelandic, Norwegian, Swedish and a variety of programming languages.
- Open source: The model is freely available under the Apache 2.0 License, making it suitable for both commercial and research use.
- Training hardware: The model is trained using up to 1024 AMD MI250X GPUs on the LUMI supercomputer in Finland.
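Because Viking models are base models rather than instruction-tuned ones, capabilities such as translation are best elicited with few-shot prompting. The sketch below, reusing the `model` and `tokenizer` loaded in the earlier sketch, shows one way to do this; the example translation pairs are our own illustration.

```python
# Few-shot prompt: a base model continues the pattern rather than
# following an instruction. Example pairs are illustrative.
prompt = (
    "English: Good morning!\nSwedish: God morgon!\n"
    "English: Where is the library?\nSwedish: Var ligger biblioteket?\n"
    "English: The weather is nice today.\nSwedish:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
# Print only the newly generated continuation.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```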
Considerations for Use
The intended audience for Viking research checkpoints is academic and industry research. These checkpoints are not suitable for deployment in a production use case without further training, fine-tuning and testing.
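As one example of the further training this implies, the sketch below adapts a checkpoint with parameter-efficient LoRA fine-tuning via Hugging Face peft. LoRA is our choice of technique for illustration, not a method specified by this release; the repository id, dataset name and hyperparameters are all assumptions.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# A minimal LoRA fine-tuning sketch for adapting a base checkpoint to a
# downstream task. Repository id, dataset and hyperparameters are
# illustrative assumptions, not a recommended recipe.
model_name = "LumiOpen/Viking-33B"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Attach low-rank adapters so only a small fraction of weights is trained.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

dataset = load_dataset("your-org/your-finetuning-data", split="train")  # hypothetical dataset
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True, max_length=1024),
                      batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="viking-33b-lora", per_device_train_batch_size=1,
                           gradient_accumulation_steps=16, num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```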
Acknowledgments
We wish to thank the operators of the LUMI/EuroHPC supercomputer for computational resources and technical support, including AMD, HPE and CSC – the IT Center for Science, Finland. TurkuNLP researchers have received funding from the European Union’s Horizon Europe research and innovation programme High Performance Language Technologies (HPLT) under grant agreement No 101070350.