The ProActive Scheduler is open-source software for orchestrating, scaling and monitoring tasks across many hosts. It supports several languages, one of which is R, the environment for statistical computing and graphics. R is known for computationally intensive workloads, so you can write your R scripts on a laptop and execute them on different, more powerful machines.
Docker container for portability and isolation
At Cloud Expo Europe in London you can see an exciting feature of the ProActive Scheduler that is under heavy development: Docker container support. Running tasks in containers increases the isolation between tasks and provides self-defined environments in which you can run them. Taking the idea further, containers could replace tasks altogether, so that the work itself runs inside a container in your environment. The possibilities are endless, and you do not have to worry about error recovery, network outages or the other complications of running in distributed environments, because the ProActive software deals with them.
Machine Learning with ProActive and R: Local Setup
The following steps show how to install and run ProActive and then execute an R script with the ProActive Scheduler. They were carried out on an Ubuntu operating system.
Requirement: Installing the R Environment and rJava
Install the R environment and rJava by typing:
# sudo apt-get install r-base r-cran-rjava
Download ProActive
- Create an account on www.activeeon.com
- Download the current ProActive Workflows & Scheduling
- Unzip ProActiveWorkflowsScheduling-linux-x64-6.1.0.zip
- Download the ProActive-R-Connector (par-script-6.1.0.zip)
- Unzip par-script-6.1.0.zip into the ‘ProActiveWorkflowsScheduling-linux-x64-6.1.0/addons’ folder
That's it: you have just installed ProActive with R support.
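The two unzip steps above can be sketched as a small shell function. This is only a sketch under assumptions: both archives were already downloaded by hand into one directory, the main archive unpacks into a top-level folder named after itself, and the function name `unpack_proactive` is made up for illustration.

```shell
#!/bin/sh
# Sketch: unpack ProActive and the R connector from a download directory.
# Assumption: the version matches the file names used in this post.
PA_VERSION="6.1.0"
PA_HOME="ProActiveWorkflowsScheduling-linux-x64-${PA_VERSION}"

unpack_proactive() {
    # $1: directory that contains the two downloaded zip files
    dir="$1"
    [ -f "$dir/${PA_HOME}.zip" ] || {
        echo "missing ${PA_HOME}.zip" >&2; return 1; }
    [ -f "$dir/par-script-${PA_VERSION}.zip" ] || {
        echo "missing par-script-${PA_VERSION}.zip" >&2; return 1; }
    # Unpack the scheduler first, then the connector into its addons folder
    unzip -q "$dir/${PA_HOME}.zip" -d "$dir" || return 1
    unzip -q "$dir/par-script-${PA_VERSION}.zip" -d "$dir/${PA_HOME}/addons"
}
```

Calling `unpack_proactive "$HOME/Downloads"` would then leave the ProActive home directory next to the archives. If you downloaded a newer version, adjust `PA_VERSION` first.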
Start ProActive Server
Execute:
# ./ProActiveWorkflowsScheduling-linux-x64-6.1.0/bin/proactive-server
With the default settings, this starts the ProActive Scheduler and four local nodes.
Note: ProActiveWorkflowsScheduling-linux-x64-6.1.0 is the ProActive home directory. It may have a different name if you downloaded a newer version.
Wait until the output shows “Get started at” followed by the link to the web interface.
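If you start the server from a script rather than watching the terminal, the wait can be automated. The sketch below assumes the server output was redirected to a log file; the file name `server.log` and the function name `wait_for_banner` are illustrative and not part of ProActive.

```shell
#!/bin/sh
# Sketch: block until the server log contains the start-up message.
# Assumption: the server was started in the background with its output
# redirected to a file, for example:
#   ./ProActiveWorkflowsScheduling-linux-x64-6.1.0/bin/proactive-server > server.log 2>&1 &

wait_for_banner() {
    # $1: log file to watch, $2: timeout in seconds (default 300)
    log="$1"
    remaining="${2:-300}"
    while [ "$remaining" -gt 0 ]; do
        if grep -q "Get started at" "$log" 2>/dev/null; then
            # Print the line that carries the web interface link
            grep "Get started at" "$log"
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    echo "timed out waiting for the ProActive server" >&2
    return 1
}
```

A call such as `wait_for_banner server.log 120` returns successfully as soon as the message appears and fails after two minutes otherwise.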
Start the ProActive Studio
The interface shows three options. The leftmost orange circle is a link to the ProActive Studio, which is used to create workflows and execute them. Click on it to open the Studio. Log in with the username admin and the password admin.
Create an R task
After you create a workflow and open it, the interface shows a 'Tasks' drop-down menu. Select 'Language R' to create an R task.
Add your R code
Add your R code. An altered example from http://computationalfinance.lsi.upc.edu/ is available for download. Add your code under the "Execution" menu, which appears after selecting the R_Task.
Note: When R is executed on another machine, that machine must have all necessary packages installed and loaded. Ensure this by installing the packages in advance; it can also be done within the script by specifying the library and the mirror.
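One way to install packages in advance on a machine that hosts a ProActive node is sketched below. It relies on `Rscript`, which comes with r-base, and on R's `install.packages` with an explicit `repos` mirror. The mirror URL and the helper names `r_install_expr` and `install_r_package` are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: pre-install an R package on a node machine.
# Assumption: any CRAN mirror can be used here.
CRAN_MIRROR="https://cloud.r-project.org"

r_install_expr() {
    # $1: package name
    # Prints an R expression that installs the package only if it is missing
    printf 'if (!requireNamespace("%s", quietly = TRUE)) install.packages("%s", repos = "%s")\n' \
        "$1" "$1" "$CRAN_MIRROR"
}

install_r_package() {
    # $1: package name; runs the expression in a non-interactive R session
    Rscript -e "$(r_install_expr "$1")"
}
```

For example, `install_r_package zoo` would install the zoo package if it is not present; zoo is only a placeholder for whichever packages your script loads. The expression printed by `r_install_expr` can also be pasted at the top of the R script itself, followed by the matching `library(...)` call.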
Add datasets to R_Task
The script loads the SP500_Shiller dataset, which is available for download. The R script will be sent to one ProActive node, and to ensure that the node has the data, we need to specify the data dependency in the 'Data Management' settings of the R_Task. Specify SP500_Shiller.csv as an input file from the user space. The file must be copied to ProActiveWorkflowsScheduling-linux-x64-6.1.0/data/defaultuser/admin, which is the user space of the admin user.
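Copying the file into the user space can be sketched as follows. The path below the ProActive home directory is the one given above; the function name `stage_dataset` is hypothetical, and the `mkdir -p` is a precaution in case the folder does not exist yet.

```shell
#!/bin/sh
# Sketch: copy a dataset into the admin user space of a ProActive installation.

stage_dataset() {
    # $1: path to the downloaded CSV file
    # $2: ProActive home directory
    src="$1"
    user_space="$2/data/defaultuser/admin"
    [ -f "$src" ] || { echo "dataset not found: $src" >&2; return 1; }
    # Create the user space if it is missing, then copy the file into it
    mkdir -p "$user_space" || return 1
    cp "$src" "$user_space/"
}
```

With the names used in this post, the call would be `stage_dataset SP500_Shiller.csv ProActiveWorkflowsScheduling-linux-x64-6.1.0`.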