These days it is possible to learn pretty much anything online. Kaggle is a platform that lets people interested in machine learning learn together. Many Silo.AI experts are familiar with this global online community of over one million data scientists. Two of us in particular, AI Engineer Joni Juvonen and AI Scientist Mikko Tukiainen, can’t seem to get enough of exploring Kaggle’s computer vision competitions, which they tackle with their two-person team rähmä.ai.
Currently, Joni and Mikko are fighting for a spot in the top 10 of the Deepfake Detection Challenge, where the winners (typically the top 5 participants) will get a stunning $1 million. The challenge to identify deepfake content has been put together by companies such as Amazon Web Services, Facebook and Microsoft, together with academics. Let’s hear from Joni and Mikko what makes Kaggle such a special way for them to stay up to date on the latest computer vision technologies and learn together.
What is Kaggle?
Kaggle is an online community of 1,000,000+ registered data scientists. At the heart of the platform are the data science challenges, which often have a machine learning twist and come with some kind of prize, such as kudos, swag or money. Anyone can enroll in the competitions to test their skills. This makes Kaggle an amazing way to see where you are as a data scientist and learn from the community as you go from one challenge to the next.
To me Kaggle is, most of all, the world’s largest community for data scientists and machine learning practitioners. On Kaggle, users can publish their own datasets, write and share code, and use Kaggle’s cloud-based Jupyter Notebooks to build models. All of this is available for free and can help you as a data scientist to learn faster. In a quickly advancing field such as artificial intelligence and machine learning, it’s crucial to have a community where you can boost your skills on a regular basis and try out new models on fresh real-world datasets.
What kind of challenges have you participated in?
I started Kaggling two years ago, in 2018. Since then I’ve participated in nine computer vision competitions. My most successful one so far was Histopathologic Cancer Detection, where I scored in the top 3%. My best position in a serious contest was in the top 6%, when I teamed up with Mikko and Antti Karlsson to detect steel defects for the steel and mining company Severstal. My previous competition was organized by Uber competitor Lyft to improve 3D object detection for self-driving cars. I also wrote about my experience with a LIDAR U-Net model.
I got interested in Kaggle when Joni drew me in. Since then, I’ve joined four competitions, all together with Joni. All of them have had a computer vision aspect in common, and they have been tremendously helpful in getting experience with various kinds of industry problems.
Kaggle is an important hobby for both of you. What makes tweaking your models in the online community so fun?
The problems presented in the competitions are topical and often come from real-life needs, such as the ongoing Deepfake Detection Challenge (rähmä.ai is in the top 15 at the moment). Kaggle has made it easy to participate and to stay motivated by offering different rewards throughout the challenge. For example, you can earn praise by sharing your findings with other Kagglers in the community. Kaggle veterans often openly share their knowledge, so each competition can be an oasis for learning the newest and best tricks in the field.
The competitions offer an excuse to learn new skills and to try them in practice with real-world data. I think learning and puzzles are fun, but what quickly got me addicted was the competition aspect. Seeing your solution get a high score on the competition’s leaderboard is rewarding. As the challenges usually last a few months, it’s a race of problem-solving and continuous improvements to keep up and finish with a top score.
Many others would like to develop their data skills. How do you get started with Kaggle?
It’s really easy to get started: you simply sign in with your Google account, fill in some personal information and then you’re ready to go. The scripts are written in Kaggle kernels that are similar to Jupyter Notebook, a code sharing and documentation tool we use at work too. Kaggle has a weekly GPU computation quota so that (small) models can also be trained on-site. Other competitors on Kaggle very eagerly share starter kernels along with baseline models, so no one will need to start developing their solutions from scratch.
Pick an ongoing competition that interests you and get familiar with the problem by exploring the forum and baseline notebooks that others have shared. Then, you may copy, lightly modify, and submit one of the public notebooks to get yourself on the competition’s leaderboard.
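As a rough illustration of what a very first submission might look like, here is a minimal sketch that writes a constant-prediction baseline as CSV text. It is not taken from any specific competition: the function name `build_baseline_submission`, the column names `id` and `label`, and the sample ids are all hypothetical, and each competition defines its own required submission format.

```python
import csv
import io


def build_baseline_submission(sample_ids, constant=0.5):
    """Return CSV text that predicts the same score for every sample.

    The column names "id" and "label" are hypothetical placeholders;
    check the competition's sample submission for the real ones.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "label"])  # header row (assumed names)
    for sample_id in sample_ids:
        writer.writerow([sample_id, constant])  # same score for every row
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample ids, used only to show the output shape.
    print(build_baseline_submission(["video_001", "video_002"]))
```

A constant baseline like this will not score well, but it gives you a reference point on the leaderboard before you start modifying a public notebook.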
What have you learnt on Kaggle that is useful in your job as an AI Scientist?
I’ve learned a lot about teamwork and about sharing and searching for ideas. When it comes to computer vision, I’ve been able to test many new CV techniques that I look forward to using in a client project too. As always with AI projects, I’ve also had to become much more mindful of data preparation.
The field of deep learning and computer vision is advancing rapidly, and many of the methods I learned in school are no longer something you would use in practice. For me, Kaggle competitions offer a fun way of keeping up and getting familiar with new advances. To score high on any challenge, you have to learn all the new tricks that give you an advantage.
Kaggle forums and notebooks are usually the first place I see a new method in action. After a competition ends, the winning teams usually share their solutions, findings, or even code, and I have learned a great deal by just reading them.
The best thing I got from Kaggle, however, is the hands-on practice. Last year, I gained experience from Kaggle competitions in detecting metastases from tissue images, classifying diabetic retinopathy and cellular images, identifying pneumothorax from chest x-rays, detecting different defect types from steel plates, locating 3D objects from a self-driving car’s LIDAR sensor data, finding vehicles’ position and translation from images, and identifying deepfake videos. Many of these lessons I’ve been able to bring into my client work at Silo.AI too.
Follow team rähmä.ai’s current competition for detecting deepfakes. Joni and Mikko look forward to open sourcing their model and to sharing more about how they built it after the competition is over – subscribe to our newsletter to stay tuned.
We’re looking for more curious AI experts like Joni and Mikko! Get in touch with our recruiter firstname.lastname@example.org to discuss our current projects and check out our open positions.