
TurkuNLP ranks #1 in global benchmarks

  • August 18, 2018

Finland’s TurkuNLP group has reached the highest aggregate ranking in the global natural language parsing Shared Task 2018, beating 25 other top universities and company research groups in performance. Filip Ginter, the head of TurkuNLP, is an AI scientist in Silo.AI’s NLP team.

Read Filip Ginter’s description of the Universal Dependencies Shared Task 2018:


Automatic syntactic parsing is one of the major tasks (and challenges) of natural language processing. The objective is to split running text into words and sentences and, for every sentence, build its syntactic tree with a full morphological analysis of every word. The tree might look like this:

And, for the word “finds”, you would expect to be told that the base form is “find” and the morphological features are Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin. Text pre-processed using such a parser is a great source of features for downstream applications, as well as for search-and-extract tasks and linguistic research. That is why we care about having good parsers for as many languages as possible.
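As a small illustration of how a downstream application might consume such an analysis, the feature string above can be split into attribute-value pairs. This is only a minimal sketch; the function name `parse_feats` is hypothetical and not part of any parser's API.

```python
def parse_feats(feats: str) -> dict:
    """Split a 'Key=Value|Key=Value' feature string into a dictionary."""
    # An underscore conventionally marks an empty feature set in
    # Universal Dependencies data; treat it (and '') as "no features".
    if feats in ("", "_"):
        return {}
    # Split on '|' to get the pairs, then on the first '=' of each pair.
    return dict(pair.split("=", 1) for pair in feats.split("|"))


feats = parse_feats("Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin")
print(feats["Tense"])   # Pres
print(feats["Person"])  # 3
```

Keeping the features in a dictionary makes it easy to use individual attributes, such as tense or number, as input features for another model.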

Nowadays, these parsers are, for the most part, machine-learned systems that need training data. Universal Dependencies (www.universaldependencies.org) is a large initiative to gather such training data (called treebanks). It currently covers 71 languages, with treebanks ranging from just a few dozen sentences for some languages to well over a million words for others. Universal Dependencies is one of the most visible data initiatives in natural language processing, and, happily, Turku has been at its core since day one.

To drive the development and testing of syntactic parsers, as well as to dampen the “developed on English, tested on English, works on nothing else” problem, the Universal Dependencies initiative organized a Shared Task in both 2017 and 2018. A shared task is essentially a competition where everybody receives the exact same data to train their parsers on and competes to achieve the best accuracy across the just over 50 languages that have a sufficient amount of data to test (not necessarily train) a parser. Participating in the shared task is a great test of a team’s parsing methodology, as well as its technical skills, as training dozens of parsers on a tight schedule is no small feat. The shared tasks tend to attract well-known groups in the field, and their results constitute the state of the art in parsing.
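To make “accuracy across languages” concrete, here is a minimal sketch of one common parsing measure, the labeled attachment score (LAS): the share of words that receive both the correct head and the correct dependency relation. The sketch assumes that the predicted and gold analyses have identical tokenization, which a real evaluation of end-to-end parsers cannot assume, and it weights every language equally when averaging. All names and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

# Each word is represented as (head_index, dependency_relation).
Word = Tuple[int, str]


def las(gold: List[Word], pred: List[Word]) -> float:
    """Labeled attachment score: fraction of words with correct head and relation."""
    if len(gold) != len(pred):
        raise ValueError("this sketch assumes identical tokenization")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return correct / len(gold)


def macro_average(scores: Dict[str, float]) -> float:
    """Average per-language scores so that every language counts equally."""
    return sum(scores.values()) / len(scores)


# Toy data, invented for illustration only.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "obl")]  # last relation is wrong
scores = {"lang_a": las(gold, pred), "lang_b": 1.0}
print(round(macro_average(scores), 3))  # 0.833
```

Averaging per language, rather than pooling all words, keeps a language with a very large test set from dominating the overall result.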

In 2018, the TurkuNLP team did well, ranking 1st, 2nd and 2nd of 26 teams on the three primary metrics of the task. The field included teams from the universities of Stanford, Prague and Uppsala, the traditional parsing research heavyweights. The technical work was led by Jenna Kanerva, with contributions from Filip Ginter, Akseli Leino and Niko Miekka.

http://universaldependencies.org/conll18/results.html

The parser was built from a combination of existing tools, in part used as is and in part retrained in new ways. In particular, we relied on the Stanford parser https://github.com/tdozat/Parser-v2 and the OpenNMT neural machine translation system http://opennmt.net/, which we creatively bent to lemmatise words. Languages such as Breton and Thai posed a special challenge: no training data was available for them, so knowledge had to be transferred from other languages.
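One way to picture lemmatisation as a translation problem is to treat the characters of the inflected word as the source “sentence” and the characters of the base form as the target. The sketch below only prepares such character-level training pairs. The exact input format used in the shared task system is not described here, so the layout (characters followed by morphological tags) and the function name are illustrative assumptions, and the code calls no OpenNMT API.

```python
from typing import List, Tuple


def to_translation_pair(word: str, lemma: str, feats: str = "") -> Tuple[str, str]:
    """Turn a (word, lemma) pair into space-separated token sequences."""
    source_tokens: List[str] = list(word)
    # Assumption: morphological tags are appended as extra source tokens so
    # that a model could tell ambiguous forms apart; '' or '_' means no tags.
    if feats and feats != "_":
        source_tokens.extend(feats.split("|"))
    target_tokens = list(lemma)
    return " ".join(source_tokens), " ".join(target_tokens)


src, tgt = to_translation_pair("finds", "find", "Number=Sing|Person=3")
print(src)  # f i n d s Number=Sing Person=3
print(tgt)  # f i n d
```

Pairs in this shape could be written to parallel source and target files, one example per line, which is the kind of input a sequence-to-sequence translation system is typically trained on.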

An important practical outcome of the shared task is the Turku Neural Parser Pipeline https://turkunlp.github.io/Turku-neural-parser-pipeline/, which distributes the parser with its trained models for over 50 languages under an open license. The paper describing the pipeline will be published with other systems in the shared task session at EMNLP’18 in Brussels, in November.

Author

  • Filip Ginter, PhD

Topics

AI for Business, Natural Language Processing, Research, Working at Silo AI
