At a conference, Danny Lange, Director of Machine Learning at Uber, made his opinion clear: Machine Learning must be taken to every corner of the company. He previously led the Machine Learning team at Amazon, a company that has taken Machine Learning to all its areas to do interesting things such as predicting the demand for its products, setting their prices, making personalized recommendations, optimizing distribution routes, improving computer vision or detecting fraud. Last year Amazon went a step further and created a cloud platform to bring Machine Learning capabilities to all companies.
It’s worth watching the video if you want to know what the leading data companies are doing right now. These companies have been called “Data Driven”, i.e. they make decisions based on data, and in many cases those decisions are automated (for example, the price of a click on a Google ad is calculated with real-time global auctions based on instantaneous supply and demand). These are companies that were mainly born on the Internet, such as Google, Amazon, Facebook or Uber itself.
One of the main obstacles we see when companies adopt Machine Learning is that they do not know how to start using it and often do not understand the advantages it offers. The second point is cleared up when we show examples and explain the technology as didactically as possible. The first, how to start integrating it into the company, is more complicated because it means putting your feet on the ground and starting to work with an innovative technology. Innovation has its risks, but we are convinced that Machine Learning is here to stay and that it will transform society as much as the mobile phone has.
The following is an initial guide on how to start working with Machine Learning in your company. Our point of view is very close to Danny Lange’s and is based on the following principles.
1. Start with something simple
There are companies, even big ones, that do not predict, for example, which customers are going to cancel their services (churn). These companies focus their efforts on winning more new customers than they lose, without realizing that they already have enough data to predict who will leave for the competition. The paradigm shift is of great value: retaining a customer costs much less than acquiring a new one. Predicting churn is certainly a good way to start. The objective of these initial projects is to achieve quick wins that help the business understand the possibilities open to it and prompt the technology teams to start evaluating how to integrate Machine Learning into their systems.
At CleverData, projects of this type take three weeks on average.
2. Start with supervised Machine Learning
Supervised Machine Learning makes it easy to make predictions using historical data. The word “supervised” has nothing to do with a human “reviewing” the predictive algorithm; it simply names one of the possible Machine Learning techniques, in which the algorithm learns from past examples whose outcome is already known. With supervised Machine Learning you can:
- Predict demand (how much product to buy next week)
- Predict customer churn (which customers will leave for the competition next month)
- Detect fraud (which purchases or transactions are fraudulent)
- Predict cancellations (of hotel reservations, restaurant tables)
- Prevent payment defaults (predict whether a customer will stop paying)
The main advantages of supervised Machine Learning over other techniques are that it is easier to understand, answers specific questions (such as those in the list above) and has powerful methods for evaluating the quality of algorithms before they are put into production. There is no doubt about it: it is the best technique for getting started in the company.
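To make the idea of evaluating an algorithm before production concrete, here is a minimal sketch in Python that uses only the standard library. Everything in it is an assumption made for illustration: the customer records are invented toy data, the two features (tenure in months and support calls) are hypothetical, and the one-rule "decision stump" is a deliberately simple stand-in for the algorithms a real platform would offer. The model learns from the first eight historical records and is then scored on four records it has never seen.

```python
from typing import List, Tuple

# ((tenure_months, support_calls_last_quarter), churned) -- invented toy data
Record = Tuple[Tuple[float, float], int]
Model = Tuple[int, float, bool]  # (feature index, threshold, ">= threshold means churn")

HISTORY: List[Record] = [
    ((24, 0), 0), ((3, 5), 1), ((36, 1), 0), ((5, 4), 1),
    ((18, 2), 0), ((2, 6), 1), ((30, 0), 0), ((6, 3), 1),
    # The last four records are held out and never used for training.
    ((12, 1), 0), ((4, 5), 1), ((28, 2), 0), ((7, 4), 1),
]


def predict(model: Model, x: Tuple[float, float]) -> int:
    """Return 1 (will churn) or 0 (will stay) for one customer."""
    feature, threshold, above_means_churn = model
    return int((x[feature] >= threshold) == above_means_churn)


def accuracy(model: Model, records: List[Record]) -> float:
    """Fraction of records whose known outcome the model predicts correctly."""
    return sum(predict(model, x) == y for x, y in records) / len(records)


def train_stump(records: List[Record]) -> Model:
    """Try every feature/threshold/direction and keep the rule that
    classifies the most training records correctly."""
    best_model = None
    best_score = -1.0
    for feature in range(len(records[0][0])):
        for threshold in sorted({x[feature] for x, _ in records}):
            for above_means_churn in (True, False):
                candidate = (feature, threshold, above_means_churn)
                score = accuracy(candidate, records)
                if score > best_score:
                    best_model, best_score = candidate, score
    return best_model


train, test = HISTORY[:8], HISTORY[8:]
model = train_stump(train)                # learn from past, labelled examples
holdout_accuracy = accuracy(model, test)  # evaluate on unseen customers
print(f"Learned rule: {model}")
print(f"Hold-out accuracy: {holdout_accuracy:.2f}")
```

On this toy data the learned rule fits the training records perfectly yet misclassifies one of the four held-out customers, which is the kind of gap a hold-out evaluation is meant to reveal before anything reaches production.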
3. Don’t start with Big Data
Working with Big Data is very expensive, and many companies still do not have adequate infrastructure to store huge amounts of information. Because of its volume, Big Data takes hours to process. But you do not need that much information to use Machine Learning. In our experience, companies today have more than enough data to build very high-quality predictive algorithms. It is more important to have good data than a lot of data.
We often meet companies that are caught in a data storage whirlpool. Big Data is in fashion (although this fashion is fading) and they collect as much data as possible: “then we’ll see what we do with it”. We think this approach is wrong. Companies already have enough data today to carry out very interesting projects that add value to their business, without having to wait for large infrastructures that store huge amounts of data.
An example: the data we work with at CleverData in projects that introduce Machine Learning into a company does not exceed 30 megabytes.
4. Use Machine Learning in the cloud
From a technological point of view, the most fashionable approach is to do Machine Learning by programming in Python or R. We think this is a mistake. The reasons are numerous, but we will highlight only three. First, it requires professionals who are highly specialized in programming and algorithms, and who are usually not close enough to the business to address customers’ needs. Second, algorithms programmed in these languages are complicated to put into production and difficult to reuse (any programmer knows how hard it is to reuse code written by someone else). Finally, cloud platforms dramatically reduce costs.
Cloud platforms with API-based systems make algorithms easier to reuse: the algorithms reside in a single location, and users access their functionality, not their code.
Large companies such as Facebook, Amazon or Uber are already deploying Machine Learning systems in the cloud internally, as one more piece of company infrastructure. In the same way that for years companies have had a database service available to any department (or programmer), Machine Learning systems are being incorporated as engines accessible to any application and employee.
5. And above all, start now
Your competitors may already be benefiting from the competitive advantages of Machine Learning, so it is time to get started. The advantages in areas such as tourism, retail, banking and insurance are indisputable, and we still do not know all the possibilities it holds for business. It doesn’t matter what sector your company is in. What we do know for sure is that you don’t have to wait until you have billions of data points to create high-value business applications.
In addition, costs have fallen dramatically since cloud-based platforms became available. At CleverData we are partners with two of the leading companies offering the most advanced Machine Learning tools in the cloud: Microsoft’s AzureML and BigML. What big technology companies (Apple, Amazon, Facebook) are doing, such as predicting which customers will churn next month or the demand for products, is already within the reach of all companies. Shall we get started?
Original article (in Spanish): “5 consejos para empezar con Machine Learning en la empresa”
Translation: Sergio Paul Ramos Moreno