silo.ai

Citizens of Silopolis: Luiza Sayfullina

  • March 23, 2020

Luiza Sayfullina is a senior machine learning expert with 7+ years of experience in various machine learning projects. She holds a PhD in neural networks and natural language processing (NLP) from Aalto University (2019), and has a deep understanding of NLP for the English and Finnish languages. In her work, Luiza helps companies find and implement AI solutions ranging from text classification and clustering, information extraction, content generation and summarization to speech-to-text applications.

AI solutions that deal with unstructured textual data

Luiza works at Silo.AI, the largest AI lab in the Nordics, as an NLP AI Scientist. In her work, she’s in charge of developing solutions that deal with textual data. She uses both general machine learning approaches and language-specific ones. These include text preprocessing, understanding grammatical structure through parsing, named entity recognition, neural networks, decision trees and other text processing algorithms used in the AI solutions we build for our clients.

Luiza’s current work isn’t so far from her prior career in academia: “Sometimes I joke that little changed in my move from academia to Silo.AI. In fact, I use the same programming language (Python) and deep learning libraries (PyTorch or TensorFlow), read scientific papers and involve myself in deep thinking about how to approach the problem at hand,” Luiza says.

However, client work brings her the balance she needs, as she gets to apply her knowledge to real-world problems: “Theory and practice are more in balance now. I both read papers and work on concrete cases. I enjoy writing more code than I used to, and aim to improve my code standards.”

Most knowledge we have at the moment is expressed in natural language, either in text or in audio

In NLP projects, the aim is to make unstructured textual data useful by putting it into a structured format. Once the data is more structured, certain types of events or information can be extracted from it. Unstructured data can include media, news articles, user reviews, customer feedback, reports, invoices, PDFs and other documentation. Typically NLP projects focus on finding quantity information in vast amounts of unstructured data.

“In one client case, we’ve been evaluating the sustainability of a company based on sustainability reports. It is crucial to find key numbers that indicate which activities led to a decrease in emissions, and by how much the emissions were reduced. There’s a lot of this kind of data available, but reading the texts manually and putting the information together takes time,” Luiza says.
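To make the idea of pulling key numbers out of report text more concrete, here is a minimal sketch in Python. It is not the method used in the client case. It assumes a simple, hypothetical regular expression that looks for a reduction verb followed by a number and a unit within the same sentence. The function name, the verb list, the unit list and the sample text are all made up for illustration.

```python
import re

# Hypothetical pattern (an assumption for this sketch): a reduction verb,
# then, later in the same sentence, a number followed by a unit.
REDUCTION_PATTERN = re.compile(
    r"\b(?:reduc\w*|decreas\w*|lower\w*|cut)\b"  # reduction verb
    r".*?"                                        # anything in between (lazy)
    r"(?P<value>\d+(?:\.\d+)?)\s*"                # the number, e.g. 12.5
    r"(?P<unit>%|percent|tonnes|tons)",           # a small, illustrative unit list
    re.IGNORECASE,
)


def extract_reductions(text):
    """Return (value, unit) pairs for sentences that mention a reduction.

    Sentences are split naively on ., ! or ? followed by whitespace,
    and at most one match is taken per sentence.
    """
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        match = REDUCTION_PATTERN.search(sentence)
        if match:
            results.append((float(match.group("value")), match.group("unit").lower()))
    return results


# Made-up sample text, not taken from any real report.
sample = (
    "In 2019 the company reduced its CO2 emissions by 12.5 % through renewable energy. "
    "Revenue grew to 40 million euros. "
    "Switching the fleet to electric vehicles cut emissions by 300 tonnes."
)
reductions = extract_reductions(sample)
print(reductions)
```

A pattern like this only catches the simplest phrasings. In practice, parsing and named entity recognition would be needed to link each number to the activity that caused the reduction.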

For Luiza, NLP provides a new way of creating value: these days the value of many products comes from smart processing of unstructured data and producing the most relevant information that is worth reading. 

“I believe that NLP can bring much value by making information more accessible to various groups of people, regardless of the language they speak or the difficulty of text they can handle. Machine translation has achieved tremendous success in making all this possible,” Luiza explains.

If you don’t have labeled data, additional rules can help

For the majority of AI tasks, the data needs to be annotated so that the machine can “learn” what’s in the data. Annotating is the process of adding labels to the data, in other words, explaining what is in the piece of data. It isn’t that common to have annotated data ready, which is why we at Silo.AI have created annotation tools to speed up the process.

In one example, we were categorizing risks in financial reports into four different categories. This kind of classification task required labeled data, so we needed to provide annotation tools before we could build the categorization tool. Usually the client is the best annotator, as the process requires domain knowledge. However, in some cases annotation can be outsourced.

In Luiza’s opinion, you shouldn’t be afraid to support the algorithm with additional rules or logic: “It’s good to check if an existing solution can adapt to your data. Sometimes, though, these trained solutions might not have encountered certain domain-specific samples, and there might not be enough labeled data to fine-tune the solution. In such cases it doesn’t hurt to improve the accuracy by incorporating additional logic.”
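One way to picture this combination of a trained model and additional rules is the small Python sketch below. It is only an illustration under stated assumptions: the keyword rules, the category names, the confidence threshold and the `dummy_model` stand-in are all hypothetical, and the article does not describe how rules were actually combined with models in client projects.

```python
from typing import Callable, Optional, Tuple

# Hypothetical keyword rules (keyword, category); the category names are
# invented for this sketch and are not the ones used in the client project.
RULES = [
    ("lawsuit", "legal risk"),
    ("currency", "market risk"),
]


def rule_label(text: str) -> Optional[str]:
    """Return the category of the first matching keyword rule, if any."""
    lowered = text.lower()
    for keyword, category in RULES:
        if keyword in lowered:
            return category
    return None


def classify(text: str,
             model: Callable[[str], Tuple[str, float]],
             threshold: float = 0.7) -> str:
    """Trust the model when it is confident; otherwise fall back to the rules."""
    label, confidence = model(text)
    if confidence >= threshold:
        return label
    # Low confidence: use a rule if one fires, else keep the model's guess.
    return rule_label(text) or label


def dummy_model(text: str) -> Tuple[str, float]:
    """Stand-in for a trained classifier, returning (label, confidence)."""
    if "loan" in text.lower():
        return ("credit risk", 0.9)
    return ("operational risk", 0.4)


print(classify("Default rates on loan portfolios may rise.", dummy_model))
print(classify("A pending lawsuit could result in fines.", dummy_model))
print(classify("Supply chain delays may continue.", dummy_model))
```

Here the rules act only as a fallback for low-confidence predictions. Other designs are possible, such as letting a rule always override the model.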

Responsibility and freedom in projects

Luiza enjoys her responsibility in each project. So far, every project has been different. During her two years at Silo.AI, Luiza has worked on creative content generation, clause recommendation in contracts, unemployment predictions and sentiment analysis on financial reports. She has also worked on a Finnish language Speech2Text project.

At Silo.AI, we have project owners, but no project managers. In her work, Luiza is responsible for teaching clients and presenting material in an easy-to-understand way. She’s constantly learning about client interaction and communication. AI Scientists also get a lot of autonomy: they are free to choose their ways of working and the approaches to try.

“I love the creativity in problem solving, involving freedom in exploring the ways to solve the problem. I’m used to iterating on different approaches,” Luiza says.

“In one case, I work with environmental data and climate change, assessing how companies act towards reducing emissions. I need to get familiar with the topic and its terminology, and then try to ask the client good questions. This domain knowledge helps me to see possible opportunities and to assess the ongoing project critically,” Luiza says.

The meaning behind the lines 

In NLP, the scientific community took a large step forward towards understanding the semantic relations between words with the invention of vector representations for words.

“In NLP, we try to teach the model the meanings of the phrases. The machine learning model is like a foreigner who sometimes fails to understand the full meaning behind the sentences. This often happens with complicated logic, jokes or sarcasm. When we can approximate meaning, we are able to solve a variety of problems including answering simple questions, finding similar relevant content and parsing the grammatical structure of the text. Inferring the exact meaning behind the lines is not something that we can delegate yet to the algorithms, which lack the reasoning component,” Luiza explains.
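As a small illustration of how vector representations can approximate meaning and help find similar content, the Python sketch below compares words by the cosine similarity of their vectors. The three-dimensional vectors are invented for this example. Real word embeddings are learned from large text corpora and have hundreds of dimensions.

```python
import math

# Toy vectors made up for illustration; they are not real embeddings.
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to the given word's."""
    candidates = (other for other in vectors if other != word)
    return max(candidates, key=lambda other: cosine_similarity(vectors[word], vectors[other]))


print(most_similar("king", TOY_VECTORS))
```

With these toy values, "king" ends up closer to "queen" than to "apple", which mirrors how semantically related words tend to sit near each other in a learned vector space.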

Luiza appreciates language as data, as it’s intuitively understandable by humans:
“When it comes to data analysis, working with language is valuable because the text that we analyze can be intuitively understood by humans too. Compared to vector data, where you have a sequence of numbers, you don’t always get an intuitive understanding.”

“Studying languages has also been my priority and a fun pastime throughout my life. The beauty of working with text is the power of its expression that makes this field challenging,” she concludes.

Music and sports give energy

Luiza most enjoys reading books and doing sports, like cycling or badminton. She also enjoys playing guitar and singing from time to time. Recently she’s been combining her research on artificial intelligence with studying psychology and gaining a deeper understanding of the human mind at the University of Helsinki. She enjoys learning this different but still academic perspective.

Favorite Silo.AI value?

“While my favorite value is Keep Learning, this year I would like to better combine our two values, Keep Learning and Build Bonds, by exchanging more knowledge with my colleagues from both the tech and operational teams. Understanding the tech side as well as the business side is vital while growing as an AI Scientist.”

Resources & Libraries from Luiza

  • When coming up with alternative ways to solve a specific type of task, one can check how state-of-the-art algorithms rank in terms of performance on the website https://nlpprogress.com/. Sometimes performance is crucial and 1% of accuracy matters; in that case, state-of-the-art algorithms should be considered.
  • My favourite NLP book, which is still in progress, is Speech and Language Processing, written by Dan Jurafsky and James Martin. Since the book is in progress, it is possible to contribute by sending comments and suggestions on the existing chapters, which I am planning to do.
  • I also recommend the neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization developed by the Turku NLP group (https://turkunlp.org/Turku-neural-parser-pipeline/docker.html), which supports more than 50 languages!
  • I use the spaCy library on a daily basis for text preprocessing and quick prototyping. For running pre-trained models such as BERT and GPT-2, I use the Hugging Face libraries with PyTorch.
  • I also recommend the DeepPavlov library, which contains a variety of pre-trained SOTA models for English and Russian.

Would you like to join Silo.AI as Luiza’s colleague?

We are especially looking for NLP experts for our offices in Helsinki and Turku to solve real-life cases using the latest NLP techniques.

Author

  • Pauliina Alanen

Topics

Natural Language Processing, Working at Silo AI
