Open sourcing machine learning research for natural language processing (NLP)
Two years ago, Zalando Research launched with a clear purpose: to ensure that Zalando Tech is at the forefront of research in data science, machine learning, natural language processing and artificial intelligence.
Until now, our researchers’ work has been used mainly within Zalando. We are therefore very excited to announce the release of Flair, our state-of-the-art natural language processing (NLP) library. Flair is released under the MIT license and will continue as an actively maintained open source project under Zalando leadership.
Zalando Research Team
Flair is our cutting-edge framework for natural language processing (NLP): a framework that gives a computer the ability to understand, tag and classify written text. It is useful when you want to understand the meaning of email messages, customer responses, website comments, or any other text that users submit and that you want to classify or otherwise process automatically.
The library is implemented in Python on top of the popular PyTorch deep learning framework. It packages pre-trained models for NLP tasks, including named entity recognition (NER) to detect things like person or location names in text and part-of-speech tagging to detect syntactic word types like verbs and nouns. It allows you to easily apply our pre-trained models to your text, or train your own sequence labeling or text classification models.
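To give a feel for what applying a pre-trained model looks like, here is a minimal sketch. It assumes Flair is installed (for example via pip install flair) and that the pre-trained English NER model can be downloaded under the name "ner". The helper name tag_entities is our own choice for this illustration and is not part of the library.

```python
def tag_entities(text, model_name="ner"):
    """Return `text` annotated with predicted named-entity tags.

    `tag_entities` is a hypothetical helper for this post, not a Flair API.
    """
    # Lazy imports: Flair and PyTorch are heavy, so load them only when needed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load a pre-trained tagger; the model weights are downloaded on first use.
    tagger = SequenceTagger.load(model_name)

    # Wrap the raw text in a Sentence object, which splits it into tokens.
    sentence = Sentence(text)

    # Run the model; predictions are stored in place on the sentence.
    tagger.predict(sentence)

    # Return the sentence as a string with the predicted tags attached.
    return sentence.to_tagged_string()
```

Calling print(tag_entities("George Washington went to Washington .")) should print the sentence with entity tags attached to the detected names. The exact output format depends on the installed Flair version.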
For instance, we can train Flair to recognize fashion concepts such as brands, colors or seasons in text, or to classify whole text documents into one or more categories. Example results are shown below:
Due to its versatility, Flair is already part of several in-production systems at Zalando, as machine learning has become a natural part of our engineering toolbox.
You can find the documentation and source code of Flair on GitHub.
This is an important milestone for the open source and research teams at Zalando. Research that matures into production tooling and is then made available to the wider tech ecosystem as open source is a sign of a healthy, cutting-edge engineering culture at Zalando.
Comparison with the state-of-the-art
Flair outperforms the previous best methods on a wide range of NLP tasks. Evaluation on industry-standard datasets shows substantial improvements:
We invite you to start using Flair. There is already extensive documentation available on how to use the framework, so you can quickly get up and running and experiment with the models included, or train your own if you wish.
There is a growing community around Flair already, contributing new features and support for other languages.