Tuesday, December 5, 2017

Agility for Data Scientists

The process followed by data scientists can be summarized by the diagram below. As shown, it is an iterative process in which hypotheses are made and the model is improved incrementally. Multiple platforms compete to support data scientists’ needs, but too often they focus on standard workflows and methods, whereas each use case is different and requires specific development.

At Activeeon, our data science team focuses on building templates that can be used and tweaked easily at each iteration.

Monday, November 27, 2017

Scheduling Recurring Jobs - Best Practices

This article briefly presents best practices and suggests tools to plan and monitor recurring jobs with Activeeon’s solution. The main concerns are addressed through different features and services:

  • Schedule management through the Job Planner service,
  • Workflow validity and management through the Catalog,
  • Notification on events through an integrated feature, to be more proactive,
  • Requirements checks prior to execution through a selection script to avoid unnecessary issues.
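
The idea of checking requirements before execution can be sketched as follows. This is only an illustration in plain Python, not ProActive's selection script API: the function name `requirements_met` and its parameters are hypothetical, and how the result is handed back to the scheduler is not shown here.

```python
import shutil


def requirements_met(commands, path=".", min_free_bytes=0):
    """Return True only if every command is on PATH and enough disk space is free.

    Hypothetical helper for illustration; not part of any ProActive API.
    """
    # Commands that cannot be found on the PATH of the candidate machine.
    missing = [name for name in commands if shutil.which(name) is None]
    # Free space, in bytes, on the file system that contains `path`.
    free = shutil.disk_usage(path).free
    return not missing and free >= min_free_bytes
```

A task would then be sent only to machines where such a check succeeds, which avoids starting a job that is bound to fail for lack of a tool or of disk space.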

Job Planner

The Job Planner is a service included in ProActive to manage recurring jobs. The main benefits are:

  • Dedicated and centralized interface,
  • Clear forecast of workflows,
  • Simplified management of exceptions: additional executions and/or exclusion periods.

Generate cron expressions

Monday, November 20, 2017

Activeeon supports Python natively

With the objective of building an open platform for Machine Learning workflows and better data analytics, the latest release of Activeeon's solution includes a native Python task.

The main benefits are:

  • Analyzing data in Python using numpy, pandas, TensorFlow, etc. is now greatly simplified.
  • Native Python tasks run 10 to 100 times faster than Jython tasks.
  • It fully integrates with existing features such as Generic Information or Variable propagation.
  • Multiple Python versions are supported, even within the same workflow.

How to

Monday, October 23, 2017

AZURE PoC in the Box

As part of the $3 billion invested in Europe, Microsoft is opening multiple datacenters to support the fast-growing cloud adoption of Azure. The partnership between Microsoft and Activeeon resulted in the “AZURE PoC in the BOX” program. Its main objective is to support companies in their transition and on their path to gaining a competitive advantage through cloud services, flexibility and open solutions.

AZURE PoC in the BOX principles

The AZURE PoC in the BOX program aims to ease the transition to the cloud by tackling adoption barriers. The challenges usually faced exist on multiple levels:

  • Management approval,
  • Investment justification and money allocation,
  • Environment configuration and time to market deployments,
  • Workload transition.

Compute resources will be allocated on the fly for the AZURE PoC to replicate your environment. Thus, investment is greatly reduced and a replicated infrastructure is instantly available.

Activeeon is an open source software vendor whose solution aims to simulate the workload on the platform. Its flexibility lets any business (i.e. “métier”) user of the PoC orchestrate application frameworks across environments (e.g. local and Azure) while controlling execution. Scenarios can then be tested in a timely manner, giving instant feedback to the testing teams.

Moreover, Activeeon supports the transition to the cloud in multiple ways. Activeeon’s engineers are Azure-certified experts. A workflow conversion tool has been developed to quickly adapt hundreds of workflows to ProActive standards. Finally, because Activeeon’s solution is agnostic to the underlying resources, the transition to the cloud is seamless and progressive, without interrupting IT processes and therefore without disrupting end users.

Leveraging those technologies, the AZURE PoC in the BOX also aims to show companies how to boost agility at both the process and infrastructure levels and to estimate real cost savings. Moreover, it will stimulate access to new services that enable new insights or free up additional resources, for instance highly available databases and user-friendly replication mechanisms.

AZURE PoC in the BOX journey

In three simple steps, get set up and explore new opportunities.

Tuesday, October 17, 2017

Easily Try ProActive Integration with TORQUE


In addition to deploying and managing its own resources, ProActive can be used as a meta-scheduler and benefit from infrastructures that are already deployed and configured. ProActive can interface with several schedulers and resource managers, including TORQUE. In this post, we show how ProActive can manage native scheduler node sources, whose nodes belong to the TORQUE resource manager, and how ProActive can submit jobs to these resources. We provide the ‘activeeon/proactive-torque-integration’ Docker image, which allows our users to try this particular integration easily in a Docker container. This Docker image includes an installation of TORQUE and an entrypoint that downloads and runs the latest release of ProActive.

Set Up the Docker Container



First start the Docker container:
$ docker run -d -h docker -p 10022:22 -p 18080:8080 --privileged --name proactive-container activeeon/proactive-torque-integration


Before using ProActive, you need to monitor the Docker container until the ProActive scheduler is running (a few minutes are needed). You can do this with the following command:
$ docker logs proactive-container --follow


As long as this command does not print anything, the ProActive scheduler is not yet running.
When the SchedulerStarter Java process is displayed, open a web browser and go to http://localhost:18080/. If the page cannot be displayed, wait a bit longer: ProActive is still starting.
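
Instead of reloading the page by hand, the wait can be scripted. The sketch below polls the URL given above until it answers with HTTP 200. The function names, the timeout and the polling interval are assumptions chosen for illustration, and treating a 200 response as "ready" is a simplification.

```python
import http.client
import time
import urllib.error
import urllib.request


def page_is_up(url):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, etc.
        return False


def wait_until_ready(url, timeout=600, interval=5,
                     probe=page_is_up, sleep=time.sleep):
    """Poll `url` until `probe` succeeds or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Example (assumes the container from this post is running):
# wait_until_ready("http://localhost:18080/")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running container.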

Create a Native Scheduler Node Source


Now we can create a native scheduler node source that will eventually manage the nodes of TORQUE. To manage the resources of a scheduler other than ProActive, and to have these resources represented as ProActive nodes, you need to create a node source with an infrastructure and a policy that are dedicated to native schedulers.

Go to the ProActive Resource Manager and log in with the admin/admin credentials. Click on ‘Add Nodes’. Choose a name for your node source and fill in the form with the following values (remove the quotes):

Friday, August 18, 2017

Orchestration in the Machine Learning and Big Data Ages

Modern big data and Machine Learning ecosystems are growing fast into multi-layered and heterogeneous systems. Businesses are no longer considering a single technology but an architecture of interconnected solutions that together bring value to the company. The ecosystem is expected to consolidate as the market matures, but at the moment tech articles promote multiple solutions to answer specific use cases.

Fortunately, most tech companies are now embracing open development. This supports initial integration and setup through REST APIs, SDKs, CLIs, GraphQL, etc. However, managing diverse solutions at scale quickly leads to new challenges for longer-term control and governance. Indeed, each company deploys specific data flows and manages access differently, so only custom solutions are suitable.

To control and orchestrate these new environments, ActiveEon developed a product to support business IT, sysadmins and data scientists. The concepts followed by the team behind the solution are:

  • “A picture paints a thousand words.” This is why, through the workflow studio, users can easily manage dependencies in order to visualize and design business processes. Advanced controls such as replication simplify the use of advanced behaviors.
  • Flexibility and agility drive successful businesses. Integration into customer environments is supported by an open REST API, CLI and SDK, together with a resource-agnostic solution. Moreover, users can work in their preferred language, from Bash scripts to Python and R.
  • Stability through control and governance drives further innovation. IT admins build and test universal workflows. Operators monitor the progress of workflows and manage errors at a granular level.
  • Resource consumption directly impacts projects’ ROI. Resource-aware applications improve utilization and reduce overprovisioning. Granular management of resources leads to better distribution.

Friday, August 11, 2017

Activeeon Job Planner: in depth

In the previous article, we presented the new Job Planner, a tool to execute jobs periodically. Now let’s explore how to add exceptions, that is, how to include or exclude specific dates for more flexibility.
For example, if you want to send analytics reports every Monday except on days off, you can create a calendar that excludes those days. Likewise, if you execute a job once a month but need two extra runs, at the beginning and at the end of the summer, that is possible too.
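
The logic of a recurring schedule with exclusions and extra runs can be sketched in a few lines. This is a standalone illustration of the idea, not the Job Planner's implementation or API: `planned_runs` and its parameters are hypothetical names, and the sketch assumes that extra dates are always added, even if they also appear in the exclusion list.

```python
from datetime import date, timedelta


def planned_runs(start, end, weekday, excluded=(), extra=()):
    """List run dates in [start, end]: every `weekday` (Monday is 0),
    minus the excluded dates, plus the extra dates."""
    excluded = set(excluded)
    runs = set()
    day = start
    while day <= end:
        if day.weekday() == weekday and day not in excluded:
            runs.add(day)
        day += timedelta(days=1)
    # Extra executions are added on top of the recurring ones.
    runs.update(d for d in extra if start <= d <= end)
    return sorted(runs)


# Every Monday of December 2017, skipping the 25th, with one extra run on the 29th.
december = planned_runs(
    date(2017, 12, 1), date(2017, 12, 31), weekday=0,
    excluded=[date(2017, 12, 25)], extra=[date(2017, 12, 29)],
)
```

Here `december` contains December 4, 11, 18 and 29.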

HOW TO USE IT?