In collaboration with the University of Turku's research group TurkuNLP and the EU-funded HPLT project, Silo AI is releasing Viking 33B, a new, larger multilingual model covering all Nordic languages. Built with a customized, fully open-source training framework for LUMI, Viking 33B demonstrates a novel approach to training LLMs for low-resource languages while highlighting the capabilities of AMD compute platforms.
Following the release of Poro 34B, the final model in the Poro language model family, and the earlier releases of Viking 7B and Viking 13B in the Viking language model family, Silo AI and TurkuNLP from the University of Turku are now releasing the full 33-billion-parameter version of Viking.
Viking 33B focuses on low-resource languages without compromising English performance. It is trained on a dataset of 2 trillion tokens that includes material in Danish, Finnish, Norwegian, Icelandic, Swedish and several programming languages. Viking models are designed to handle text in English and the Nordic languages, and to perform basic translation between English and the Nordic languages. The Viking model family consists of base models that can be fine-tuned and instruction-tuned for a variety of tasks.
The model family is built on an updated, more modern architecture: Viking 33B uses an architecture similar to Llama 2, with flash attention, rotary embeddings, grouped query attention and support for a 4k sequence length.
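To make the architecture concrete, the sketch below shows how these features map onto a Llama-style configuration in Hugging Face transformers. The specific widths and depths are illustrative assumptions for a model of roughly this size, not the published Viking 33B hyperparameters.

```python
from transformers import LlamaConfig

# A minimal sketch of a Llama-2-style configuration with the features named
# above. All numeric values are illustrative assumptions, not the published
# Viking 33B hyperparameters.
config = LlamaConfig(
    hidden_size=7168,               # assumed width for a ~33B model
    num_hidden_layers=56,           # assumed depth
    num_attention_heads=56,         # query heads
    num_key_value_heads=8,          # grouped query attention: fewer KV heads than query heads
    max_position_embeddings=4096,   # 4k sequence length
    rope_theta=10000.0,             # rotary position embeddings
)
# Flash attention is selected at load time in transformers, e.g.
# AutoModelForCausalLM.from_pretrained(..., attn_implementation="flash_attention_2")
```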
TurkuNLP and Silo AI's collaboration focuses on developing linguistically high-performing models that are considerate of local values and cultures, thereby bringing more diversity to the sphere of LLMs. European technological competitiveness depends on strong digital infrastructure and a well-functioning innovation ecosystem. The initiative by Silo AI and TurkuNLP seeks to contribute to this by democratizing access to LLMs, accelerating the integration of the technology into product and service offerings.
Silo AI and TurkuNLP are committed to publishing checkpoints throughout the training process to provide transparency. The models of both the Viking and Poro families are fully open source and freely available under the Apache 2.0 License, making them practical tools for both commercial and research use.
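As an illustration of how such research checkpoints can be used, the sketch below loads an intermediate revision from the Hugging Face Hub. The repository id and revision tag here are assumptions; the official model card lists the identifiers actually published.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LumiOpen/Viking-33B"  # assumed repository id; check the official model card
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    revision="1000B",    # hypothetical tag for a checkpoint after 1000B training tokens
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # shard across available GPUs (requires accelerate)
)
```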
Viking is trained on LUMI, the most powerful supercomputer in Europe. Silo AI and TurkuNLP have previously demonstrated training on AMD hardware at scale, with scaling experiments validating theoretical throughput predictions on up to 4096 MI250X GPUs simultaneously. With the Viking family, they are training models using a new customized open-source training framework that utilizes up to 1024 MI250X GPUs simultaneously, proving the ability to train LLMs at scale.
Viking 33B:
Below is a summary of key features of the Viking model family covering English, Finnish, Swedish, Norwegian, Danish, Icelandic and code:
- Research Checkpoints: Silo AI and TurkuNLP are committed to publishing checkpoints to provide transparency throughout the model training process.
- Model architecture: Viking 33B uses an architecture similar to Llama 2, with flash attention, rotary embeddings, grouped query attention and support for a 4k sequence length.
- Model size: 33B parameters
- Multilingual capabilities: The model is designed to process English and the Nordic languages, and is proficient in a variety of programming languages. Additionally, it can perform basic translation between English and the Nordic languages (see the prompting sketch after this list).
- Dataset: The model is trained on a dataset of 2 trillion tokens spanning Danish, English, Finnish, Icelandic, Norwegian, Swedish and a variety of programming languages.
- Open source: The model is freely available under the Apache 2.0 License, making it suitable for both commercial and research use.
- Training hardware: The model is trained using up to 1024 AMD MI250X GPUs on the LUMI supercomputer in Finland.
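Because Viking models are base models rather than instruction-tuned ones, capabilities such as translation are best elicited with few-shot prompting. The sketch below, reusing the `model` and `tokenizer` loaded in the earlier sketch, shows one way to do this; the example translation pairs are our own illustration.

```python
# Few-shot prompt: a base model continues the pattern rather than
# following an instruction. Example pairs are illustrative.
prompt = (
    "English: Good morning!\nSwedish: God morgon!\n"
    "English: Where is the library?\nSwedish: Var ligger biblioteket?\n"
    "English: The weather is nice today.\nSwedish:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
# Print only the newly generated continuation.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```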
Considerations for Use
The intended audience for Viking research checkpoints is academic and industry research. These checkpoints are not suitable for deployment in a production use case without further training, fine-tuning and testing.
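As one example of the further training this implies, the sketch below adapts a checkpoint with parameter-efficient LoRA fine-tuning via Hugging Face peft. LoRA is our choice of technique for illustration, not a method specified by this release; the repository id, dataset name and hyperparameters are all assumptions.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# A minimal LoRA fine-tuning sketch for adapting a base checkpoint to a
# downstream task. Repository id, dataset and hyperparameters are
# illustrative assumptions, not a recommended recipe.
model_name = "LumiOpen/Viking-33B"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Attach low-rank adapters so only a small fraction of weights is trained.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

dataset = load_dataset("your-org/your-finetuning-data", split="train")  # hypothetical dataset
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True, max_length=1024),
                      batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="viking-33b-lora", per_device_train_batch_size=1,
                           gradient_accumulation_steps=16, num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```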
Acknowledgments
We wish to thank the operators of the LUMI/EuroHPC supercomputer for computational resources and technical support, including AMD, HPE and CSC – the IT Center for Science, Finland. TurkuNLP researchers have received funding from the European Union’s Horizon Europe research and innovation programme High Performance Language Technologies (HPLT) under grant agreement No 101070350.