Thomas Perrais


Tell us about your position and your main responsibilities at Proxem.

My name is Thomas Perrais and I am a Text Scientist in the research and development team at Proxem. My main focus is developing and integrating new Machine Learning algorithms into our software in order to facilitate exploration and knowledge acquisition in new fields.


Why do we need these Natural Language Processing libraries?

At Proxem, we take a hybrid approach to our projects, combining artificial intelligence algorithms with a powerful semantic engine. We use cutting-edge Natural Language Processing algorithms (word and document embedding methods, recurrent neural networks, attention models, and others) to provide leads and guide users in their data exploration.

These models involve high-dimensional matrix calculations and complex mathematical expressions, which we wanted to bring together in an optimized set of libraries. The project was born from this need to be able to add new artificial intelligence building blocks to our software.


Why did you develop these libraries when similar ones already exist in C# and Python?

Our software is mainly developed in C#, and we ran into limitations when trying to integrate Machine Learning models in production. In Python, libraries such as NumPy, TensorFlow and Theano have greatly facilitated the use and development of such models and have brought together a large community of Machine Learning experts. However, there is no real equivalent in the .NET world. Starting from this observation, we drew inspiration from those libraries to develop NumNet and TheaNet, hoping to gather a community of C# developers around Machine Learning questions.


Why did Proxem decide to release them as open source?

By making our libraries open source, we had a twofold objective. First, we wanted to share the fruit of several months of R&D with the C# community so that everyone could easily integrate Machine Learning models into their projects. With TheaNet, developing and training a classifier takes only a few lines of code, and we provide many off-the-shelf models that are just waiting to be used.

Our second objective was to improve our libraries through the ideas and suggestions of other users. Artificial Intelligence is a fast-growing, fast-changing field, and we hope to build a large community around the project to help it evolve and progress.


What’s next for open source at Proxem? Do you plan to open-source anything else?

Our company has always been research-oriented, and we wish to keep contributing, not only by publishing scientific papers but also by open-sourcing some of our tools. For instance, we are currently working with French research centers and companies on a project to create an open-source collaborative platform (PCU), which will lead us to provide new sentiment mining and topic modelling tools to the community.

