The IoT and datacenter worlds generate information about everything from sensor data to software logs. This data carries information about environmental parameters, system health, software behavior, etc., along with their previous states. It can consequently be used to train predictive models and alert IT operations about current and potential future issues. Such predictive maintenance, or predictive analytics, helps companies prevent issues and save money.
In this blog post, we will focus on parallelizing model training for machine learning. It will be done at scale, with low code and little or no need to configure the underlying infrastructure. We will walk through the main steps to set up a training workflow, edit the training model, the input, and the output in seconds, and finally parallelize it with Activeeon Workflow & Scheduling.
This blog post complements a previous one, showing how to achieve a simple, scalable solution with less code.
To keep it simple, as shown below, we will focus on IoT sensors, simulated by tasks that generate random logs at fixed intervals.
Grey arrows represent the flow of data.
Orange arrows represent the interactions of Activeeon Workflow & Scheduling with the various components.
Streaming layer
For this IoT use case, the streaming layer is relatively simple. The objective is to send sensor data to Kafka and store the information in a MongoDB database.
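As an illustration of this layer, a minimal Kafka-to-MongoDB bridge could look like the sketch below. This is not the Activeeon pre-built task itself; the topic name, broker address, and database names are placeholders, and the kafka-python and pymongo libraries are assumed to be installed.

```python
import json

def to_document(raw: bytes) -> dict:
    """Decode one Kafka message payload into a MongoDB-ready document."""
    return json.loads(raw.decode("utf-8"))

def run_bridge(topic: str = "sensor-logs"):
    # Hypothetical wiring: broker and database addresses are placeholders.
    from kafka import KafkaConsumer    # kafka-python, assumed installed
    from pymongo import MongoClient    # pymongo, assumed installed

    consumer = KafkaConsumer(topic, bootstrap_servers="localhost:9092")
    collection = MongoClient("localhost", 27017)["iot"]["sensor_logs"]
    for message in consumer:
        collection.insert_one(to_document(message.value))

# run_bridge()  # requires a running Kafka broker and MongoDB instance
```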
With Activeeon libraries, users can deploy Kafka or Azure Event Hubs with a single click (see image below) and quickly connect the relevant services together. Note that the solution is open: if special configuration is required, the code is available to the end user.
Finally, to generate the logs, we created a simple task that generates some data and sends it to Kafka.
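A minimal sketch of such a task follows, sending a random reading to Kafka at fixed intervals. The payload fields, topic, and broker address are assumptions for illustration; kafka-python is assumed to be installed.

```python
import json
import random
import time
from datetime import datetime, timezone

def make_reading(sensor_id: str) -> dict:
    """Build one simulated sensor reading (fields are illustrative)."""
    return {
        "sensor_id": sensor_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "temperature": round(random.uniform(15.0, 35.0), 2),
        "humidity": round(random.uniform(20.0, 90.0), 2),
    }

def run_producer(topic: str = "sensor-logs", interval_s: float = 1.0, n: int = 10):
    # kafka-python assumed installed; the broker address is a placeholder.
    from kafka import KafkaProducer
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    for _ in range(n):
        producer.send(topic, make_reading("sensor-01"))
        time.sleep(interval_s)
    producer.flush()

# run_producer()  # requires a running Kafka broker
```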
Build Workflow and Train model at Scale
Activeeon's team includes a group of machine learning experts who have developed an open source Machine Learning Open Studio (aka ML-OS) for data scientists and application developers. ML-OS includes built-in, pre-packaged libraries with generic code that follows best practices. All the major ML libraries are directly available (TensorFlow, Keras, scikit-learn, PyTorch, CNTK, MLlib, Deeplearning4j, etc.), together with pre-built ML services (Azure Cognitive Services). Users can also build their own library of algorithms and ML models.
Drag-and-drop menus can be leveraged to quickly build the structure of a model training workflow, which can then be edited through the web interface for fine-tuning.
Step 1 - Connect to a data source
The solution includes a library of data connectors to ease and standardize access to data sources: files, SQL, NoSQL, etc.
In the example below, in a few clicks, the MySQL connector is replaced by a MongoDB connector. Only the connection parameters need to be updated (IP, port, query, etc.).
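Under the hood, a MongoDB connector essentially assembles those parameters into a connection and runs the query. A rough sketch, with placeholder host, port, and collection names (pymongo assumed installed):

```python
def mongo_uri(host: str, port: int, db: str) -> str:
    """Assemble a MongoDB connection URI from the connector parameters."""
    return f"mongodb://{host}:{port}/{db}"

def fetch_readings(host="localhost", port=27017, db="iot",
                   collection="sensor_logs", limit=100):
    # pymongo assumed installed; all connection details are placeholders.
    from pymongo import MongoClient
    client = MongoClient(mongo_uri(host, port, db), serverSelectionTimeoutMS=5000)
    return list(client[db][collection].find({}, {"_id": 0}).limit(limit))
```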
Step 2 - Build a simple ML training structure
Once the connection to the data source is completed, data scientists can split the data for training and prediction purposes.
Data scientists will then select a training model relevant to their use case. In our specific example, we will choose an unsupervised algorithm for clustering. Note that, thanks to generic tasks, it is simple to change the algorithm used.
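With scikit-learn, the split-then-cluster structure described above might look like the sketch below. The two-feature synthetic data stands in for sensor readings, and k-means is one possible choice of unsupervised algorithm; both are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split

# Synthetic stand-in for sensor features (temperature, humidity): two groups.
rng = np.random.default_rng(42)
X = rng.normal(loc=[[20.0, 40.0]] * 100 + [[30.0, 80.0]] * 100, scale=2.0)

# Split the data for training and prediction purposes.
X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

# Train an unsupervised clustering model, then assign clusters to held-out data.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_train)
labels = model.predict(X_test)
```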
Once the model is trained, a branch can be added to test and validate its accuracy. Another branch will be set up to enable users to download or export the model. For the latter, we include tasks to export it to the embedded Data Space as well as to Azure Blob Storage.
Export to Azure blob storage is added using a pre-built task
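Outside the pre-built task, the same export could be sketched with the azure-storage-blob SDK as below; the container name and connection string are placeholders, and serialization via pickle is an illustrative choice.

```python
import io
import pickle

def serialize_model(model) -> bytes:
    """Serialize a trained model to bytes for export."""
    buffer = io.BytesIO()
    pickle.dump(model, buffer)
    return buffer.getvalue()

def export_to_azure_blob(model, container: str, blob_name: str, conn_str: str):
    # azure-storage-blob assumed installed; conn_str is a placeholder secret.
    from azure.storage.blob import BlobServiceClient
    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container=container, blob=blob_name)
    blob.upload_blob(serialize_model(model), overwrite=True)
```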
We will change the clustering algorithm to show the flexibility of the solution.
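To illustrate how little changes when the algorithm is swapped, here is a sketch where the estimator is the only moving part. K-means, DBSCAN, and agglomerative clustering are interchangeable scikit-learn examples, not the specific algorithms used in the workflow.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans

# Changing the algorithm is a one-line change: pick a different estimator.
ALGORITHMS = {
    "kmeans": lambda: KMeans(n_clusters=2, n_init=10, random_state=0),
    "dbscan": lambda: DBSCAN(eps=1.0, min_samples=2),
    "agglomerative": lambda: AgglomerativeClustering(n_clusters=2),
}

def cluster(X, algorithm: str = "kmeans"):
    """Fit the chosen clustering algorithm and return cluster labels."""
    return ALGORITHMS[algorithm]().fit_predict(X)
```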
Step 3 - Parallelize with a replication control
One strength of the Activeeon solution is that data scientists can parallelize their code with no prior knowledge of multi-threading, scheduling, or even the underlying infrastructure.
The integrated Scheduler and Resource Manager is in charge of connecting resources on-premises or in the cloud (Azure, AWS, GCP, etc.) and scaling according to the load.
The data scientists just have to add the replication control and edit a few parameters.
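Conceptually, the replication control fans the same training task out over a list of parameter values. A plain-Python analogue (a thread pool sweeping over candidate cluster counts; this is not the Activeeon API itself) would be:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from sklearn.cluster import KMeans

def train_one(k: int, X) -> tuple:
    """Train one replica: fit k-means with k clusters, report its inertia."""
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    return k, model.inertia_

def replicate_training(X, ks=(2, 3, 4)) -> dict:
    # Each replica runs the same task with a different parameter value.
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(train_one, ks, [X] * len(ks)))
```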
Step 4 - Monitor and Control
Finally, Activeeon includes a scheduling view that enables users to visualize and monitor the execution of their workflows. Users can fetch the logs, export results, etc., while the scheduler manages priorities, distribution, and error handling when required.
Conclusion
In conclusion, with the Activeeon scheduler, data scientists can focus on adding value to their company's business applications. The Machine Learning Open Studio (aka ML-OS) dramatically increases flexibility and productivity with built-in generic code that can be used to get started quickly. ActiveEon's expertise in data science has also made it possible to organize machine learning and deep learning best practices for more flexibility and reuse. Pre-built ML tasks and workflows make users agile: they can change the data source, the model they use, or their export target in seconds.
With that level of flexibility, companies can expand the scope of their predictive maintenance and predictive analytics strategies and move towards Industry 4.0.
In a follow-up article, we will review how to build a complete lambda architecture and focus on connecting the training model to the streaming layer. And if you cannot wait, watch our latest video on our integration with Azure Cognitive Services.