Tuesday, October 17, 2017

Easily Try ProActive Integration with TORQUE

In addition to deploying and managing its own resources, ProActive can act as a meta-scheduler and benefit from infrastructures that are already deployed and configured. ProActive can interface with several schedulers and resource managers, including TORQUE. In this post, we show how ProActive can manage native scheduler node sources, whose nodes belong to the TORQUE resource manager, and how ProActive can submit jobs to these resources. We provide the ‘activeeon/proactive-torque-integration’ Docker image, which allows our users to try this particular integration easily in a Docker container. This Docker image includes an installation of TORQUE and an entrypoint that downloads and runs the latest release of ProActive.

Set Up the Docker Container

First start the Docker container:
$ docker run -d -h docker -p 10022:22 -p 18080:8080 --privileged --name proactive-container activeeon/proactive-torque-integration

Before using ProActive, you need to monitor the Docker container until the ProActive scheduler is running (this takes a few minutes). You can do this with the following command:
$ docker logs proactive-container --follow

As long as this command does not return anything, the ProActive scheduler is not yet running.
When the SchedulerStarter Java process is displayed, open a web browser and go to http://localhost:18080/. If the page cannot be displayed, wait a bit longer: ProActive is still starting.
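Instead of watching the logs by hand, the wait can be scripted. The sketch below polls the container logs until the SchedulerStarter line appears; the container name matches the docker run command above, and the 10-second polling interval is an arbitrary choice:

```shell
# wait_for_scheduler CONTAINER: polls 'docker logs' until the SchedulerStarter
# process is reported, i.e. until the ProActive scheduler is up.
wait_for_scheduler() {
  until docker logs "$1" 2>&1 | grep -q "SchedulerStarter"; do
    sleep 10
  done
  echo "scheduler is running in $1"
}
```

Running wait_for_scheduler proactive-container then blocks until the web portal is worth opening.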

Create a Native Scheduler Node Source

Now we can create a native scheduler node source that will manage the nodes of TORQUE. In order to manage the resources of a scheduler other than ProActive, and to have these resources represented as ProActive nodes, you need to create a node source with an infrastructure and a policy that are dedicated to native schedulers.

Go to the ProActive Resource Manager and log in with the admin/admin credentials. Click on ‘Add Nodes’. Choose a name for your node source and fill in the form with the following values (remove the quotes):

  • As Infrastructure, choose: NativeSchedulerInfrastructure
  • As Policy, choose: NativeSchedulerPolicy
  • Upload your ProActive credential file in RMCredentialsPath. If you do not have a credential file already, you can click on the key icon next to this entry and give your information to generate a credential file. You can then use this file both for this entry and for the SchedulerCredentialsPath entry.
  • Fill NSFrontalHostAddress with: 'docker'
  • Fill NSSchedulerHome with: '/home/testuser/proactive'
  • Fill javaHome with: '/home/testuser/proactive/jre'
  • Fill alternateRMUrl with: 'pnp://docker:64738/'
  • Fill nsSubmitCommand with: 'qsub -V -o %LOG_FILE% -S /bin/bash -N %NS_JOBNAME%'
  • Fill nsKillCommand with: 'qdel -f %NS_JOBNAME%'
  • Fill submitReturnJobId with: 'true'
  • Fill SchedulerUrl with: 'pnp://docker:64738/'
  • Upload your ProActive credential file in SchedulerCredentialsPath
Figure: Parameters of the Native Scheduler Infrastructure and the Native Scheduler Policy
Finally, click 'Ok'. At this point the new node source shows no nodes. This is because ProActive acquires the nodes of a native scheduler only when a job needs to run on them.

Submit a Native Scheduler Job

Now, we need to submit a job that is explicitly marked to run on a native scheduler, in our case TORQUE. Without this mark, the job would be executed on node sources other than the native scheduler node source. To mark the job, a generic information entry must be supplied: its name must be ‘NS’ and its value must be ‘true’. You can either provide this information when you define your own workflow within the Workflow Studio, or use the following sample job, to be submitted through the ProActive scheduler (Scheduling & Orchestration):

<?xml version="1.0" encoding="UTF-8"?>
<job xmlns="urn:proactive:jobdescriptor:3.8"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="urn:proactive:jobdescriptor:3.8 http://www.activeeon.com/public_content/schemas/proactive/jobdescriptor/3.8/schedulerjob.xsd"
     name="Untitled workflow 1" priority="normal"
     onTaskError="continueJobExecution" maxNumberOfExecution="2">
  <taskFlow>
    <task name="Groovy_Task">
      <description><![CDATA[The simplest task, ran by a groovy engine.]]></description>
      <genericInformation>
        <info name="NS" value="true" />
      </genericInformation>
      <scriptExecutable>
        <script>
          <code language="groovy"><![CDATA[
println "uname -a".execute().text
sleep(60000)
]]></code>
        </script>
      </scriptExecutable>
    </task>
  </taskFlow>
</job>

After submitting the job, a new node appears in the native scheduler node source, with the task eventually running on it. When the job is finished, the TORQUE nodes are released and disappear from the ProActive Resource Manager interface until a new job is submitted.
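The same submission can also be done without the web portal, through the scheduler's REST API. The sketch below logs in and posts the workflow file with curl; the endpoint paths are assumed from the ProActive REST API, so adapt host, port and credentials to your own setup:

```shell
# Sketch: log in to the scheduler's REST API and submit a workflow XML file.
# Endpoint paths are assumed from the ProActive REST API; adapt host, port
# and credentials to your own setup.
submit_job() {
  # $1: path to the workflow XML file; prints the scheduler's answer
  session=$(curl -s --data "username=admin&password=admin" \
    "http://localhost:18080/rest/scheduler/login")
  curl -s -H "sessionid: $session" \
    -F "file=@$1;type=application/xml" \
    "http://localhost:18080/rest/scheduler/submit"
}
```

For example: submit_job torque_job.xml, where torque_job.xml contains the workflow above.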

If you would like to try ProActive meta-scheduling on your own installation of TORQUE, download ProActive from the ActiveEon website and adapt the parameters given in this post to your own environment.

Friday, August 18, 2017

Orchestration in the Machine Learning and Big Data Ages

Modern big data and Machine Learning ecosystems are growing fast into multi-layered, heterogeneous systems. Businesses are not considering a single technology but an architecture of interconnected solutions that brings value to the company. Consolidation is expected as the market matures, but at the moment tech articles promote multiple solutions to answer specific use cases.

Fortunately, most tech companies are now embracing open development. This supports initial integration and setup through REST APIs, SDKs, CLIs, GraphQL, etc. However, managing diverse solutions at scale quickly leads to new challenges of longer-term control and governance. Indeed, each company deploys specific data flows and manages access differently; therefore, only custom solutions are suitable.

To control and orchestrate these new environments, ActiveEon developed a product to support business IT, sysadmins and data scientists. The concepts followed by the team behind the solution are:

  • “A picture paints a thousand words.” This is why, through the workflow studio, users can easily manage dependencies in order to visualize and design business processes. Advanced controls such as replication simplify the use of advanced behaviors.
  • Flexibility and agility drive successful businesses. Integration on the customer's side is supported by an open REST API, a CLI and SDKs, with a resource-agnostic solution. Moreover, users can leverage their preferred language, from bash scripts to Python and R.
  • Stability through control and governance drives further innovation. IT admins build and test universal workflows. Operators monitor the progress of workflows and manage errors at a granular level.
  • Resource consumption directly impacts projects’ ROI. Resource-aware applications improve utilization and reduce overprovisioning. Granular management of resources leads to better distribution.

Friday, August 11, 2017

Activeeon Job Planner: in depth


In the previous article, we presented the new job planner, a tool to execute jobs periodically. Now let’s explore how to add exceptions: including or excluding specific dates. In short, how to be more flexible.
For example, if you want to send analytics reports each Monday but not on days off, you can create a calendar which avoids these days. Or, if you execute a job once a month but need two extra runs, at the beginning and at the end of the summer, that is possible too.


Saturday, August 5, 2017

Pilot with your Applications (ETL, CI/CD, etc.)

As with any software solution nowadays, a key to company success is the ability to integrate all solutions into one. Indeed, as technology evolves, software solutions are becoming more and more specialized and require tight integration with each other to bring true value.

ActiveEon ProActive has been developed with an open approach. The solution is easy to integrate into any architecture and to connect to external services. To be more precise, the solution has a comprehensive open REST API that lets any third-party application integrate with it. On the other side, tasks can be developed in any language to execute REST calls or simply run a command line on the relevant host.

Benefits of using ProActive as a pilot

Without going into too much detail, some of the benefits of using ProActive to pilot third-party software are:

  • Shared resources: allocate resources dynamically depending on each service's and application's needs,
  • Priority: preempt resources from low-priority tasks to give them to urgent ones,
  • Multi-language: automate business workflows through custom-made scripts written in the most suitable language,
  • Automation of preparation tasks before starting third-party services,
  • Error handling: monitor and manage errors at a higher level for more control.

How simple is it?

Nowadays, most companies provide ways to connect to their systems through APIs, CLIs or SDKs.

  • API - If the company provides an API, a ProActive task will be responsible for connecting and submitting REST calls to the ETL endpoints. In Groovy, it is as simple as: json = ['curl', '-X', 'GET', '--header', '', 'http://api.giphy.com/v1/stickers/search?q='+input+'&api_key=dc6zaTOxFJmzC'].execute().text
  • SDK - If the company provides a library in Java/Python/etc., a ProActive task in the relevant language will be responsible for connecting and submitting the relevant requests to the ETL service. In that case, the library will have to be loaded within the ProActive folder, or a fork environment such as Docker can be used.
  • CLI - If the company provides a CLI, a ProActive task will be responsible for connecting and submitting requests to the CLI service. In that case, a selection script may be used to select the relevant host and execute the command on it, or, as explained above, a Docker container with the relevant SDK can be used.
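As an illustration of the API approach, a ProActive task could wrap such a REST call in a small shell function; the service endpoint, token variable and payload below are entirely made up:

```shell
# trigger_etl_job NAME: starts a job on a hypothetical ETL service through
# its REST API and prints the service's answer. ETL_TOKEN is assumed to hold
# an API token; the endpoint is illustrative only.
trigger_etl_job() {
  curl -s -X POST \
    -H "Authorization: Bearer $ETL_TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"job\": \"$1\"}" \
    "http://etl.example.com/api/v1/jobs"
}
```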
Do not hesitate to try these solutions on our user platform.

Successful integrations at customers

Thursday, July 27, 2017

Accelerating machining processes

The MC-SUITE project proposes a new generation of ICT enabled process simulation and optimization tools enhanced by physical measurements and monitoring that can increase the competence of the European manufacturing industry, reducing the gap between the programmed process and the real part.

Automatization of the full machining process using ProActive workflows

Figure 1: The full machining process

The workflow controls the complete execution of all the tools involved in the virtual machining process and automatically manages file transfers between tool executions. Figure 1 depicts the graphical representation of the orchestration XML file. Using dataspaces is crucial, since tasks are submitted to ProActive nodes that may live remotely. Therefore, the files required by a task must be placed in the scheduler dataspace, to be automatically transferred to the running task's temporary directory. To achieve that, ProActive provides dedicated tags (transferFromUserSpace, transferToUserSpace, ...). Moreover, files are referenced from the task script by file name, i.e. without specifying the path.
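As a sketch, the transfer tags mentioned above appear in a task definition as input/output file declarations; the file and task names here are hypothetical, and the exact element layout may vary with the workflow schema version:

```xml
<task name="Simulation_Task">
  <inputFiles>
    <!-- pulled from the user space into the task's temporary directory -->
    <files includes="part_geometry.stl" accessMode="transferFromUserSpace"/>
  </inputFiles>
  <scriptExecutable>
    <script>
      <code language="groovy"><![CDATA[
// the file is referenced by name only, without a path
println new File("part_geometry.stl").length()
]]></code>
    </script>
  </scriptExecutable>
  <outputFiles>
    <!-- pushed back to the user space when the task ends -->
    <files includes="result.out" accessMode="transferToUserSpace"/>
  </outputFiles>
</task>
```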

This workflow suffers from a lack of automatization. Indeed, the CAD task pops up a GUI, requiring parameters to be set for the Himill configuration file generation. This step breaks the full automatization of the procedure. To tackle that, we proposed an updated version of the workflow: first, by migrating all the CAD parameters into the workflow parameters section, which is easily achieved since the orchestration code follows the XML syntax and is clearly separated from the functional code; then, by removing the CAD task and the CAD installation path parameter; finally, by adding a Groovy section that dynamically generates the Himill configuration file according to the workflow parameters. Each task supports most of the main programming languages, and we used Groovy, which offers convenient methods to work with .ini files.
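As an illustration of that last step, here is a minimal sketch of generating such a configuration file from workflow variables; the actual project used a Groovy script, and the section, variable names and defaults below are made up:

```shell
# Sketch: generate the Himill .ini configuration file from workflow
# variables, replacing the interactive CAD dialog. Section and variable
# names are hypothetical.
TOOL_DIAMETER="${TOOL_DIAMETER:-10}"
FEED_RATE="${FEED_RATE:-1500}"

cat > himill.ini <<EOF
[machining]
tool_diameter = $TOOL_DIAMETER
feed_rate = $FEED_RATE
EOF
```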

Friday, July 21, 2017

High Availability / Disaster Recovery Plan

Today, let's discuss high availability (HA), or more precisely, disaster recovery plans (DRP).

As with any system, downtime can have major consequences for businesses. This quick article simply discusses two ways of achieving HA for ProActive.

Overall architecture

There are multiple ways for ProActive to be configured for High Availability (HA) / Disaster Recovery Plan (DRP).

  • ProActive stores its state in a database and includes an abstraction layer to configure the database connection. By default, the database is embedded within the ProActive folder. The objective is consequently to connect ProActive to an HA database (e.g. MariaDB configured for HA, AWS RDS, etc.).
  • With the state stored in an external database, it is important to monitor the behavior of ProActive itself. If it does not respond, it can be restarted, which will restart the scheduler and the various interfaces, and reconnect to the database.
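As a sketch of the first point, the embedded database can be swapped for an external HA one through ProActive's database configuration; the file path and property names below are illustrative, so check the documentation of your ProActive version for the exact settings:

```properties
# e.g. config/scheduler/database.properties (illustrative path and names)
hibernate.connection.driver_class=org.mariadb.jdbc.Driver
hibernate.connection.url=jdbc:mariadb://ha-db.example.com:3306/proactive
hibernate.connection.username=proactive
hibernate.connection.password=change_me
```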

Below are two simple examples.
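For the monitoring side, a minimal liveness check can be sketched as follows; the portal URL matches the default port mentioned earlier in this archive, while the restart command is an assumption to be replaced by your service manager:

```shell
# check_alive URL: succeeds when the ProActive web portal answers the request
check_alive() {
  curl -sf "$1" >/dev/null
}

# A cron job or supervisor can then restart the server when the check fails,
# e.g. (hypothetical installation path):
#   check_alive http://localhost:8080/ || /opt/proactive/bin/proactive-server restart
```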

Monday, July 10, 2017

Introduction to Job Planner

Job Planner Methods

Many situations call for running a job periodically, for example:

  • polling information from other websites,
  • updating your data regularly,
  • performing verifications and maintenance,
  • testing for new files in a folder,
  • etc.

In this article, we will review the methods available to schedule such recurrent executions and go deeper into the latest one: the job planner.