
The best approach to data science is to use automated machine learning (autoML) to custom-build model architectures and ensure the best fit for the problem at hand. How does this work? First, a good autoML solution automatically assesses the quality of your dataset, analyzing it to uncover problems such as missing data, low statistical variance, and columns with incorrect data types. It can then interactively advise you on how to address these problems and make your dataset suitable for the automated model-building process.
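To make the data quality step concrete, here is a minimal sketch of the three checks named above: missing data, low variance, and columns stored with the wrong type. It is an illustration only, not how any particular autoML product works. The function `assess_quality`, the `min_variance` threshold, and the sample churn-style columns are all hypothetical.

```python
from statistics import pvariance


def _is_number(text):
    """Return True if a string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def assess_quality(rows, min_variance=1e-6):
    """Return (column, issue) pairs for a dataset given as a list of dicts.

    Hypothetical helper: the checks and the threshold are assumptions
    chosen for illustration.
    """
    issues = []
    columns = sorted({key for row in rows for key in row})
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None and v != ""]
        # Missing data: at least one absent or empty cell.
        if len(present) < len(values):
            issues.append((col, "missing values"))
        # Incorrect type: every present value is text that parses as a number.
        if present and all(isinstance(v, str) and _is_number(v) for v in present):
            issues.append((col, "numeric data stored as text"))
            continue
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        # Low variance: a fully numeric column that barely changes.
        if (len(numeric) == len(present) and len(numeric) > 1
                and pvariance(numeric) < min_variance):
            issues.append((col, "low variance"))
    return issues


# Hypothetical sample data, loosely modeled on a telecom customer table.
sample_rows = [
    {"tenure_months": 12, "plan": "basic", "monthly_charge": "29.99", "region_code": 1},
    {"tenure_months": None, "plan": "premium", "monthly_charge": "59.99", "region_code": 1},
    {"tenure_months": 48, "plan": "basic", "monthly_charge": "29.99", "region_code": 1},
]

issues = assess_quality(sample_rows)
for column, issue in issues:
    print(f"{column}: {issue}")
```

In this sample, `monthly_charge` is flagged as numeric data stored as text, `region_code` as low variance, and `tenure_months` as having missing values. A real solution would go further and suggest a remedy for each finding.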
Next, it generates models suited to your dataset, using methods such as neuro-evolution to discover a neural architecture tailored to your data. Rather than simply choosing the best performer in a tournament of predefined algorithms or blueprints, autoML solutions can use an iterative genetic process to build model topologies that improve with each passing generation. This approach to autoML creates unique solutions that accurately reflect your data, translating into higher-quality predictions.
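The iterative genetic process described above can be sketched in a few lines. The example below is a toy under stated assumptions: a topology is just a tuple of hidden-layer widths, and `fitness` is a stand-in that rewards closeness to an arbitrary target shape. A real neuro-evolution system would instead train each candidate network and score it on validation data. All function names and parameters here are hypothetical.

```python
import random

MAX_LAYERS = 5
WIDTH_CHOICES = [8, 16, 32, 64]


def fitness(topology, target=(32, 16)):
    """Stand-in score, higher is better (assumption for illustration).

    A real system would train the network described by `topology`
    and return a validation metric instead.
    """
    depth_penalty = abs(len(topology) - len(target)) * 50
    width_penalty = sum(abs(a - b) for a, b in zip(topology, target))
    return -(depth_penalty + width_penalty)


def mutate(topology, rng):
    """Return a copy with one random change: resize, add, or remove a layer."""
    child = list(topology)
    choice = rng.choice(["resize", "add", "remove"])
    if choice == "resize":
        i = rng.randrange(len(child))
        child[i] = max(1, child[i] + rng.choice([-8, -4, -1, 1, 4, 8]))
    elif choice == "add" and len(child) < MAX_LAYERS:
        child.insert(rng.randrange(len(child) + 1), rng.choice(WIDTH_CHOICES))
    elif choice == "remove" and len(child) > 1:
        del child[rng.randrange(len(child))]
    return tuple(child)


def crossover(a, b, rng):
    """Single-point crossover: a prefix of one parent plus a suffix of the other."""
    child = a[: rng.randint(1, len(a))] + b[rng.randint(0, len(b)):]
    return child[:MAX_LAYERS]


def evolve(generations=30, population_size=20, seed=0):
    """Evolve topologies; return the best one and the best score per generation."""
    rng = random.Random(seed)
    population = [
        tuple(rng.choice(WIDTH_CHOICES) for _ in range(rng.randint(1, 4)))
        for _ in range(population_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Elitism: the top quarter survives unchanged into the next generation.
        survivors = population[: population_size // 4]
        children = []
        while len(survivors) + len(children) < population_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children
    return max(population, key=fitness), history


best, history = evolve()
print("best topology:", best, "score:", fitness(best))
```

Because the top candidates are carried forward unchanged, the best score never gets worse from one generation to the next, which is the sense in which topologies are optimized over generations. This differs from a tournament of predefined blueprints, where the candidate set is fixed up front.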
However, the benefits of autoML solutions are limited by the underlying infrastructure, which often dictates performance. Machine learning optimization is resource-intensive, and in an industry where delivering strong results quickly is the goal, efficient and reliable computational power is essential for shortening model prototyping cycles. With technologies such as Intel® Message Passing Interface Library (Intel® MPI Library) and 2nd Generation Intel® Xeon® Scalable processors, autoML can scale to make the most of available resources, both in cloud environments and on-premises. By standardizing on x86-based Intel systems, model building runs in a robust, accelerated environment with performance comparable to levels traditionally associated only with GPUs.
This approach has already been used to solve a wide range of data science problems, including predicting and preventing customer churn for telecom companies. In one use case, customer churn was costing a major telecom organization $1.08 billion in profit annually. An autoML solution ingested customer data, cleaned and transformed it, and generated a customer churn profile that identified the characteristics of at-risk customers and their likelihood of changing telecommunications providers. This model could then be used to predict and reduce churn by proactively reaching out to dissatisfied customers. It could also generate custom offers, based on individual customer profiles, that are optimized to prevent churn.
Using this solution, the major telecom organization was able to analyze usage patterns in real time, identify key factors that contribute to churn rates, uncover customer needs and preferences, and use this data to proactively prevent churn. Please read this report to learn more about the project and the associated results.
These are the kinds of results that can only be achieved with custom-fitted data science solutions. It's time for a new approach to automated machine learning: one that acknowledges that datasets are just as varied and unique as individuals and deserve just as much personalized attention.
SparkCognition is a proud member of the Intel AI Builders program, an ecosystem of industry-leading independent software vendors (ISVs), system integrators (SIs), original equipment manufacturers (OEMs), and enterprise end users that share a mission to accelerate the adoption of artificial intelligence across Intel® platforms.
Discover what custom-tailored models can do for your data.
Join our upcoming webinar, or stop by our demo station in the Intel booth at the O'Reilly AI Conference in San Jose.