Together with the University of Turku’s research group TurkuNLP and HPLT, Silo AI, Europe’s largest private AI lab, is releasing Viking 13B, a new, larger multilingual model natively trained in all Nordic languages. The release showcases the ability to scale LLM training to thousands of nodes with a new customized open source training framework for LUMI, and it provides further evidence for the novel approach to training LLMs for low-resource languages.
Following the completion of the language models Poro 34B and Viking 7B, Silo AI and TurkuNLP of the University of Turku are now releasing the full 13 billion parameter version of Viking.
The Viking model family covers the Nordic languages (Danish, Finnish, Icelandic, Norwegian and Swedish) as well as English and programming languages, focusing on low-resource languages without compromising English. The model family comes with an updated, more modern architecture and in a variety of model sizes, of which this is the second.
TurkuNLP and Silo AI's collaboration focuses on developing models that excel in linguistic performance and inclusivity while respecting local values and cultures. This effort aims to democratize access to LLMs, thus enhancing Europe's digital infrastructure and innovation ecosystem. By doing so, the initiative seeks to accelerate the adoption of LLM-driven products and applications.
Viking is trained on LUMI, the most powerful supercomputer in Europe. Silo AI and TurkuNLP have demonstrated training on AMD hardware at scale, with scaling experiments and theoretical throughput predictions utilizing up to 4096 MI250X GPUs simultaneously. With this new Viking family, they are training models using a new customized open source training framework, with simultaneous utilization of up to 1024 MI250X GPUs, demonstrating the ability to train LLMs at scale.
Viking 13B: A modern architecture with more languages
Below is a summary of key features of the Viking model family covering English, Finnish, Swedish, Norwegian, Danish, Icelandic and code. For transparency with respect to model architecture, data and other technical information, please refer to the official model card (Viking 7B, Viking 13B, Viking 33B).
- Research Checkpoints: Silo AI and TurkuNLP are committed to publishing checkpoints throughout the training process, providing transparency on the model training process.
- Model architecture: Viking 13B uses an architecture similar to Llama 2, with flash attention, rotary embeddings and grouped query attention, and supports a 4k sequence length.
- Model sizes: 13B parameters
- Multilingual capabilities: The model is designed to process English and the Nordic languages, and is proficient in a variety of programming languages. Additionally, it can perform basic translation between English and the Nordic languages.
- Dataset: The model is trained on a dataset of 2 trillion tokens covering Danish, English, Finnish, Icelandic, Norwegian, Swedish and a variety of programming languages.
- Open source: The model is freely available under the Apache 2.0 License, which permits both commercial and research use.
- Training hardware: The model is trained using up to 1024 AMD MI250X GPUs on the LUMI supercomputer in Finland.
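To put the figures in the list above in perspective, the following is a rough back-of-envelope sketch, not an official calculation from Silo AI or TurkuNLP. It takes the 13B parameter count and the 2 trillion token dataset from the list, assumes that "4k" means 4096 tokens, and uses the common rule of thumb that training a dense transformer costs about 6 floating-point operations per parameter per token. The function names are illustrative only.

```python
# Back-of-envelope figures derived from the feature list.
# These are rough estimates under stated assumptions, not reported numbers.

PARAMS = 13e9      # 13B parameters (from the feature list)
TOKENS = 2e12      # 2 trillion training tokens (from the feature list)
SEQ_LEN = 4096     # assumption: "4k sequence length" means 4096 tokens


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Ratio of training tokens to model parameters."""
    return tokens / params


def approx_training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb estimate: about 6 FLOPs per parameter per token.

    This is a widely used approximation for dense transformers; it ignores
    attention cost and hardware utilization.
    """
    return 6.0 * params * tokens


def num_full_sequences(tokens: float, seq_len: int) -> float:
    """Number of full-length sequences needed to cover the token budget."""
    return tokens / seq_len


if __name__ == "__main__":
    print(f"tokens per parameter:  {tokens_per_parameter(TOKENS, PARAMS):.1f}")
    print(f"approx training FLOPs: {approx_training_flops(PARAMS, TOKENS):.3e}")
    print(f"full 4096-token seqs:  {num_full_sequences(TOKENS, SEQ_LEN):,.0f}")
```

Under these assumptions the model sees roughly 154 tokens per parameter and training takes on the order of 1.56e23 floating-point operations. Actual cost depends on details the list does not state, such as achieved hardware utilization.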
Considerations for Use
The intended audience for Viking research checkpoints is academic and industry research. These checkpoints are not suitable for deployment in a production use case without further training, fine-tuning and testing.
Acknowledgments
We wish to thank the operators of the LUMI/EuroHPC supercomputer for computational resources and technical support, including AMD, HPE and CSC – the IT Center for Science, Finland. TurkuNLP researchers have received funding from the European Union’s Horizon Europe research and innovation programme High Performance Language Technologies (HPLT) under grant agreement No 101070350.