Bringing models into production

In data science, software quality is often what prevents models from reaching production. Missing automated data pipelines (including a way to make the results available to the outside world), poor code quality, and too little attention to non-functional requirements (like performance) are showstoppers for applied data science.

How can you successfully bring data science models into production?

Software Quality in Data Science

Some time ago I wrote a blog post about production-ready data science. In case you haven’t read it, the main points were:

No automated data pipelines (including how to make the results available to the outside world);
Bad, or not good enough, code;
Not enough attention to non-functional requirements (like performance).

At the end of the post, I concluded that software quality was a big, unaddressed issue that prevented models from reaching production.

After I wrote the post, I started wondering whether I had left out other factors.

Taking Data Science Theory into Practice

Then I realized that most data scientists I encounter in my daily practice learned data science from university, training courses (online or not), books, etc.

All these resources teach data science, with varying degrees of quality. However, what is necessary to successfully apply data science in production is… Applied Data Science.

Applied Data Science

For me, applied data science means everything I said about software in the previous post, plus:

Knowing the cost of false positives and false negatives;
Knowing how you can monitor your models when they run in production.

Knowing the cost of false positives and false negatives

By the first, I mean the following: let’s assume a company has a smart meter that disaggregates energy consumption down to the appliance level. Their business proposition is to let you know if some of your appliances are operating inefficiently.

A data scientist might have two models:

Model A finds 99% of the inefficient appliances, but mislabels 10% of the efficient appliances as inefficient;
Model B finds only 80% of the inefficient appliances, but mislabels only 2% of the efficient ones.

Which model would you choose? From a non-applied data science perspective, metrics such as recall or balanced accuracy would indicate that model A is better.

However, if you don’t know the cost of mislabeling efficient appliances, you cannot make a decision. If 10% of your customer base loses trust in your model, there’s a chance they won’t ever take you seriously again. If, on the other hand, 20% never find out that they have an inefficient appliance at home, that might not hurt the relationship as much.
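To make the trade-off concrete, here is a minimal sketch that turns the two models into an expected cost per appliance. The recall and mislabeling rates come from the example above; everything else is an assumption made up for illustration: the share of appliances that really are inefficient (`prevalence`), the cost of one false alarm (`cost_fp`), the cost of one missed appliance (`cost_fn`), and the names `Model`, `expected_cost_per_appliance` and `cheapest`. In practice these numbers have to come from the business, not from the data scientist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    recall: float               # share of inefficient appliances that are found
    false_positive_rate: float  # share of efficient appliances mislabeled


# Rates taken from the example in the text.
MODEL_A = Model("A", recall=0.99, false_positive_rate=0.10)
MODEL_B = Model("B", recall=0.80, false_positive_rate=0.02)


def expected_cost_per_appliance(model, prevalence, cost_fp, cost_fn):
    """Average misclassification cost for one appliance.

    prevalence: assumed share of appliances that really are inefficient.
    cost_fp, cost_fn: assumed business cost of one false positive / negative.
    """
    false_negatives = prevalence * (1.0 - model.recall)
    false_positives = (1.0 - prevalence) * model.false_positive_rate
    return false_negatives * cost_fn + false_positives * cost_fp


def cheapest(prevalence, cost_fp, cost_fn):
    """Return the model with the lowest expected cost under the given assumptions."""
    return min(
        (MODEL_A, MODEL_B),
        key=lambda m: expected_cost_per_appliance(m, prevalence, cost_fp, cost_fn),
    )


if __name__ == "__main__":
    # Hypothetical scenarios: (prevalence, cost of a false alarm, cost of a miss).
    scenarios = {
        "false alarms are expensive": (0.2, 5.0, 1.0),
        "missed appliances are expensive": (0.2, 1.0, 10.0),
    }
    for label, (prevalence, cost_fp, cost_fn) in scenarios.items():
        winner = cheapest(prevalence, cost_fp, cost_fn)
        print(f"{label}: model {winner.name}")
```

Under these made-up numbers the preferred model flips from B to A as soon as a missed appliance becomes much more expensive than a false alarm, which is exactly why the costs need to be known before choosing.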

Monitoring models in production

The second point, monitoring, is about knowing when the model is not performing as desired once it is in production. You don’t want to find out at the end of the month (or the quarter) that you lost money: you want to know as soon as you start bleeding, and act on it.
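As a rough sketch of what "knowing as soon as you start bleeding" could look like, the class below keeps a rolling window of verified predictions and raises a flag once the recent error rate crosses a threshold. It assumes that you get feedback on whether a prediction was right (for example, a customer confirming or rejecting an "inefficient appliance" alert). The class name `RollingErrorMonitor`, the window size and the threshold are hypothetical choices, not a recommendation.

```python
from collections import deque


class RollingErrorMonitor:
    """Track the error rate over the most recent verified predictions."""

    def __init__(self, window_size, max_error_rate, min_samples):
        # deque(maxlen=...) silently drops the oldest outcome when full.
        self._outcomes = deque(maxlen=window_size)
        self.max_error_rate = max_error_rate
        self.min_samples = min_samples

    def error_rate(self):
        """Share of wrong predictions in the current window (0.0 if empty)."""
        if not self._outcomes:
            return 0.0
        return sum(self._outcomes) / len(self._outcomes)

    def is_degraded(self):
        """True once enough samples are in and the error rate is too high."""
        enough_data = len(self._outcomes) >= self.min_samples
        return enough_data and self.error_rate() > self.max_error_rate

    def record(self, prediction_was_wrong):
        """Store one verified prediction; return True if an alert should fire."""
        self._outcomes.append(bool(prediction_was_wrong))
        return self.is_degraded()


if __name__ == "__main__":
    monitor = RollingErrorMonitor(window_size=10, max_error_rate=0.2, min_samples=5)
    # Hypothetical feedback stream: True means the prediction turned out wrong.
    feedback = [False, False, False, False, False, True, True]
    for step, was_wrong in enumerate(feedback, start=1):
        if monitor.record(was_wrong):
            print(f"step {step}: error rate {monitor.error_rate():.2f}, alert")
```

The `min_samples` guard is there so that a single early mistake does not trigger an alert; in a real setup the alert would go to whoever can act on it, rather than to standard output.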

If your data scientists aren’t trained to think in these terms, it’s going to be hard to just release the model into the production environment!

As in my previous post, now comes the pitch (again): we can actively train your data scientists, either on the job or through our classroom offering, to become applied data scientists! Get in touch!

A big thanks to Ivo Everts for listening to me while I was ranting about these topics!