Tuesday, March 25, 2014

ProActive Cloud Connectors

ProActive Scheduler can not only launch computing jobs on the infrastructure where it’s installed but also create infrastructures on demand. This is useful when you want to isolate your computations, have root access to the machines, or use exotic software for your computations.

At the core of ProActive Scheduler is a component responsible for managing resources. All resources are split into node sources, each sharing the same infrastructure and access policy. A node source can be a set of desktop machines available through the network or a cloud infrastructure dynamically deployed on servers.

We support two major cloud management platforms: OpenStack and CloudStack. To use them with ProActive, clone and build the connectors from the dedicated repository.

Then drop the jars into the addons folder of ProActive Scheduler and enable them in the configuration by adding:

  • org.ow2.proactive.iaas.cloudstack.CloudStackInfrastructure to the config/rm/nodesource/infrastructures configuration file (CloudStack)
  • org.ow2.proactive.iaas.openstack.NovaInfrastructure to the config/rm/nodesource/infrastructures configuration file (OpenStack)
  • org.ow2.proactive.iaas.IaasPolicy to the config/rm/nodesource/policies configuration file (deployment policy)

Once enabled, you can either deploy infrastructure manually using the ProActive Resource Manager interface or configure your computations to deploy the infrastructure on demand.

In order to launch your computations on virtual machines, we need to configure them so that the ProActive daemon is launched when the VM boots. For this purpose we either use “user data” (see cloud-init) to pass a launch script to the VM instance, or rely on preconfigured images. For example, for a CloudStack infrastructure, ProActive Scheduler launches the daemon using the following script (the Scheduler later uses this daemon to run computations on the host):

This script uses a preinstalled ProActive, but it can be modified to download ProActive automatically. Once your VMs are up and running, you can submit jobs to them through ProActive Scheduler.

Sometimes it’s important to launch a set of VMs on demand for particular computations and prevent other jobs from being scheduled on these hosts. We developed a special deployment policy (IaasPolicy) for this purpose (see our doc for details). It scans the queue of jobs in the scheduler and triggers the infrastructure deployment for jobs with special markers. The infrastructure is protected by a special token (see the first parameter in the generic information below), and only jobs carrying this token will be scheduled there. Here is an example of such a job:

For this job, the policy will start the infrastructure described in the generic information. Once it is deployed, the scheduler launches tasks on these computing resources and turns them off when the computations finish. It is also possible to control the exact moment of resource deployment and undeployment from a workflow, but this will be discussed in another post.
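The token mechanism can be pictured with a small sketch. This is only an illustrative model, not the IaasPolicy code: the class, the method names and the `TOKEN` key are all hypothetical, and jobs are reduced to their generic information map.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TokenPolicySketch {

    // Hypothetical generic-information key; the real marker name may differ.
    static final String TOKEN_KEY = "TOKEN";

    /** Scans the queued jobs and collects every distinct token that requests a deployment. */
    public static Set<String> tokensToDeploy(List<Map<String, String>> queuedJobs) {
        Set<String> tokens = new LinkedHashSet<>();
        for (Map<String, String> genericInfo : queuedJobs) {
            String token = genericInfo.get(TOKEN_KEY);
            if (token != null) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** A protected node only accepts jobs that carry the same token. */
    public static boolean mayRunOn(Map<String, String> jobGenericInfo, String nodeToken) {
        return nodeToken.equals(jobGenericInfo.get(TOKEN_KEY));
    }
}
```

In this model a job without the marker is ignored by the policy scan and is also rejected by protected nodes, which is the isolation property described above.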

Monday, March 17, 2014

Native script engine with JSR 223 and ProActive

Behind the cryptic name of JSR 223 lies an interesting feature of Java: the ability to run script engines inside the JVM. You might know dynamic languages that integrate with the JVM using this JSR, such as Groovy, Jython or JRuby.

At Activeeon, we also use this capability to let users customize parts of a workflow. It is possible to specify pre, post, selection and cleaning scripts using JSR 223. These scripts are often used to customize the task execution, or to set up or clean the environment before running the task.

Last year we introduced a new type of task where you can directly write scripts. It is known as the script task and again leverages JSR 223 to easily integrate with scripting languages. You can quickly test it on try.activeeon.com by following the quick start tutorial. It uses the JDK JavaScript engine to run the well-known Hello World example.

JSR 223 is an interesting and quite old (8 years!) specification. It lacks a few features, which probably prevented it from becoming more popular. For instance, there is no way to restrict the classes and methods the script runtime has access to, i.e. sandboxing. This can become problematic if you intend to run scripts provided by end users, as it leaves a big hole in your system's security. In ProActive, it is not really an issue because users tend to have full access to the system. Most installations are dedicated to a small set of power users and are not open to everyone, and system policies are often in place to prevent abuse. For others it can become a problem, and the Riviera Groovy User Group recently organized a hackathon to secure Groovy script execution. The solution is specific to Groovy and goes beyond JSR 223; you can find more details here.

As I mentioned at the beginning, several dynamic languages running on the JVM are supported as script engines, but surprisingly I could not find any implementation that supported Bash. Sometimes you have existing scripts that you would like to use inside a ProActive workflow. Sometimes Bash is just the right tool to do simple things (pipe to the rescue!). Well, Bash is still a scripting language, so why not create a JSR 223 implementation for Bash? And by the way, could it work for BAT scripts too?

Ta-dah! Here comes the JSR 223 for native scripts: https://github.com/youribonnaffe/jsr223-nativeshell

It enables you to create a script engine inside the JVM that will handle BAT and Bash scripts. For instance it means that you can write ProActive script tasks like this one: 

The implementation is fairly simple and you can take a look at the source code on GitHub. The idea is just to take the script, write it to a file and run a native process, Bash or Cmd.exe, with this script file as a parameter. My first version passed the script content directly as a parameter, but I quickly hit a limitation on the size of the command line… You can also access the output of the script easily using JSR 223.
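To make the idea concrete, here is a minimal sketch of the "write the script to a file, then run it with Bash" approach. It is not the code of jsr223-nativeshell: the class and method names are made up for illustration, it only handles Bash, and it assumes `bash` is available on the PATH.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class NativeScriptSketch {

    /** Exit code and captured output (stdout and stderr merged) of one script run. */
    public static final class Result {
        public final int exitCode;
        public final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    public static Result runBash(String script) throws IOException, InterruptedException {
        // Writing the script to a file avoids the command-line length limit.
        Path file = Files.createTempFile("native-script-", ".sh");
        try {
            Files.write(file, script.getBytes(StandardCharsets.UTF_8));
            Process process = new ProcessBuilder("bash", file.toString())
                    .redirectErrorStream(true)
                    .start();
            // Drain the output before waiting so a chatty script cannot block on a full pipe.
            String output = readAll(process.getInputStream());
            int exitCode = process.waitFor();
            return new Result(exitCode, output);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    private static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        Result result = runBash("echo hello | tr a-z A-Z");
        System.out.print(result.output);
        System.out.println("exit code: " + result.exitCode);
    }
}
```

A real JSR 223 engine would wrap this logic behind `ScriptEngine.eval` and send the output to the writer configured in the script context; the sketch skips that plumbing.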

You can easily test it by cloning the project, building it and running the script engine directly as shown below: 

One interesting feature of script engines is the ability to pass data in and out, aka bindings. In the case of native scripts, you can define bindings that will be visible from the native script as environment variables. Because the native script runs as a separate native process, there is no easy way to modify bindings within the script and read them back once the script engine exits. The only result you get back from a native process, apart from what it prints, is the exit code. As a workaround, you can always write to a file and read it back in your application. A few common Java objects are supported as bindings, as detailed here.
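The sketch below illustrates both points: bindings exposed as environment variables, and the file workaround for getting a value back. It is a simplified model rather than the engine's actual code. The names `NativeBindingsSketch`, `run` and `RESULT_FILE` are hypothetical, values are converted with `String.valueOf` as an assumption, and the script is passed with `bash -c` only to keep the example short.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

public class NativeBindingsSketch {

    /**
     * Runs a Bash script with the given bindings exported as environment variables
     * and returns whatever the script wrote to the file named by $RESULT_FILE.
     */
    public static String run(String script, Map<String, Object> bindings)
            throws IOException, InterruptedException {
        Path resultFile = Files.createTempFile("script-result-", ".txt");
        try {
            ProcessBuilder builder = new ProcessBuilder("bash", "-c", script);
            builder.inheritIO(); // let the script print to our console

            // Each binding becomes an environment variable of the child process.
            Map<String, String> environment = builder.environment();
            for (Map.Entry<String, Object> binding : bindings.entrySet()) {
                environment.put(binding.getKey(), String.valueOf(binding.getValue()));
            }
            // Hypothetical convention: the script writes its result to this file.
            environment.put("RESULT_FILE", resultFile.toString());

            int exitCode = builder.start().waitFor();
            if (exitCode != 0) {
                throw new IOException("script exited with code " + exitCode);
            }
            return new String(Files.readAllBytes(resultFile), StandardCharsets.UTF_8).trim();
        } finally {
            Files.deleteIfExists(resultFile);
        }
    }
}
```

Calling `run("echo \"Hello $NAME\" > \"$RESULT_FILE\"", bindings)` with a binding named `NAME` shows the round trip: the value goes in through the environment and comes back through the file, since changes the script makes to its own environment are lost when the process exits.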

If you want to integrate this particular script engine in your application, it is just a matter of adding the JAR to the classpath (no external dependencies required). To integrate it with ProActive, you need to have this JAR file on every ProActive node. The script engines are automatically detected and you can access them with names or extensions.
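To check which engines the JVM has detected, you can ask the standard `ScriptEngineManager` for its factories. The sketch below uses only the `javax.script` API; the engine name `bash` in `main` is an assumption about how the native engine registers itself, so check the names printed by the listing before relying on it.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

public class ListScriptEngines {

    /** Returns one line per detected engine, with its registered names and file extensions. */
    public static List<String> describeEngines() {
        List<String> lines = new ArrayList<>();
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            lines.add(factory.getEngineName()
                    + " names=" + factory.getNames()
                    + " extensions=" + factory.getExtensions());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeEngines()) {
            System.out.println(line);
        }
        // "bash" is an assumed name; the lookup returns null if no engine is registered under it.
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("bash");
        System.out.println(engine == null
                ? "no engine named 'bash' on the classpath"
                : "found " + engine.getFactory().getEngineName());
    }
}
```

`getEngineByExtension` works the same way for lookups by file extension, and both return `null` rather than throwing when nothing matches.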