We have all heard that “data is the new oil”. However, just like its petroleum predecessor, data is of no use until it is processed. One processing step that is often required for unstructured data (e.g. text, images, audio and video files) is data annotation. Annotation is done manually, can require highly trained domain experts (e.g. engineers or medical doctors), and is one of the major hurdles on the way to democratizing AI because of the time and expense involved.
What if we could enlist AI’s help in dealing with data labelling? This is the solution proposed by the field of “active learning”: the machine learning model itself requests labels for the data that it deems most useful for its training.
This webinar will start off with Scaleway’s Machine Learning Engineer Olga Petrova presenting the theory behind active learning. We will then hear from Kairntech, a French startup working on an AI-powered NLP (Natural Language Processing) platform. We will learn how Kairntech is harnessing active learning and automatic pre-labelling to offer a superior user experience and substantial savings to its clients who need to annotate data for NLP tasks. Finally, we will discuss the infrastructure requirements for making all of this happen in a cost-efficient and scalable way in the cloud.