HPCC Systems Platform 4.x can integrate with Java directly. This page walks through the steps needed to configure Java correctly, using an Ubuntu 13.04 system as the example.

Configuring for Java Integration

1. Download and Install the HPCC Systems platform 4.x

Follow the installation instructions [here]

2. Install OpenJDK 1.7

In my case, I had to install the default-jdk package.

3. Set the classpath in the HPCC Systems configuration file

By default, the configuration file is located at /etc/HPCCSystems/environment.conf.

Edit the file with your favorite editor and set the classpath.

For example, mine looks like: classpath=/opt/HPCCSystems/classes:/home/arjuna/workspace/StreamAPI/bin

where the /home/... entry points to my Eclipse project's build directory.
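A mistyped classpath entry is an easy mistake to make at this step. The following is a minimal sketch of a check, not part of HPCC Systems: the class name ClasspathCheck and the method parseEntries are hypothetical, and the sketch assumes the entries are separated by ':' as in the example line above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HPCC Systems: splits a classpath line
// from environment.conf so that each entry can be checked on disk.
class ClasspathCheck {

    // Returns the entries of a "classpath=a:b:c" line, or an empty list
    // if the line is not a classpath setting.
    // Assumption: entries are separated by ':' as in the example above.
    static List<String> parseEntries(String line) {
        List<String> entries = new ArrayList<String>();
        String trimmed = line.trim();
        if (!trimmed.startsWith("classpath=")) {
            return entries;
        }
        String value = trimmed.substring("classpath=".length());
        for (String entry : value.split(":")) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Example line copied from the configuration described above.
        String line = "classpath=/opt/HPCCSystems/classes:/home/arjuna/workspace/StreamAPI/bin";
        for (String entry : parseEntries(line)) {
            boolean found = new File(entry).exists();
            System.out.println(entry + (found ? " (found)" : " (missing)"));
        }
    }
}
```

Running it on the machine that hosts the platform reports which entries exist there.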

4. Start the HPCC Systems Platform

I am running the platform on a single node for simplicity.

Useful commands:

sudo service hpcc-init start    => command for starting

sudo service hpcc-init stop     => command for stopping

sudo service hpcc-init restart  => command for restarting

5. Test if Java integration is working correctly

The HPCC Systems platform comes bundled with a Java example class.

Execute the following example in your favorite ECL IDE (or the ECL Watch Playground):

IMPORT java;

integer add1(integer val) := IMPORT(java, 'JavaCat.add1:(I)I');

add1(10);
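The string 'JavaCat.add1:(I)I' in the ECL above names a class, a method, and a JVM type descriptor: (I)I describes a method that takes one int and returns an int. The following is a minimal sketch of a Java method with that shape. The class name AddOneSketch is hypothetical, and the body (adding one) is an assumption based on the method name; the bundled JavaCat class may differ.

```java
// Hypothetical stand-in for illustration; the bundled JavaCat class may differ.
class AddOneSketch {

    // A static method whose descriptor is (I)I: one int parameter, int return.
    // Assumption: "add1" adds one to its argument.
    static int add1(int val) {
        return val + 1;
    }

    public static void main(String[] args) {
        System.out.println(add1(10));
    }
}
```

If the bundled method behaves like this sketch, the ECL call add1(10) would be expected to return 11.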

Other examples can be located here

6. THE END

Feel free to raise an issue at http://track.hpccsystems.com if it does not work as expected. I assure you that it will be addressed promptly.

A simple Java Integration example

The idea is to create a Java class that acts as a consumer of external data (think Kafka consumer). For sanity's sake, let us start with a simple class whose static method returns a string. Turning this into a true Kafka consumer will be material for another wiki page.

The Java Consumer Class

package org.hpccsystems.streamapi.consumer;

public class DataConsumer {
	
	public static String consume() {
		return "<dataset><rows><row>sample row1</row><row>sample row2</row></rows></dataset>";
	}

}
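Before wiring the class into ECL, it can help to confirm in plain Java that the returned string is well-formed XML and contains the expected rows. The following is a minimal sketch using the JDK's built-in DOM parser. The class name ConsumeCheck and the method rowValues are hypothetical, and the payload is repeated from DataConsumer so the sketch is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical check class for illustration only.
class ConsumeCheck {

    // Same payload as DataConsumer.consume(), repeated to keep the sketch self-contained.
    static String consume() {
        return "<dataset><rows><row>sample row1</row><row>sample row2</row></rows></dataset>";
    }

    // Parses the XML string and returns the text of every <row> element.
    // Parsing failures are rethrown unchecked, since malformed XML is a bug here.
    static List<String> rowValues(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList rows = doc.getElementsByTagName("row");
            List<String> values = new ArrayList<String>();
            for (int i = 0; i < rows.getLength(); i++) {
                values.add(rows.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String value : rowValues(consume())) {
            System.out.println(value);
        }
    }
}
```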

 

Now, assuming that you have Java configured correctly (if not, read the wiki on setting up Java), the sample ECL code to call the Java class looks like this:

The ECL Script

IMPORT java;

STRING consume() := IMPORT(java, 
        'org/hpccsystems/streamapi/consumer/DataConsumer.consume:()Ljava/lang/String;');


messages := consume();

OUTPUT(messages);

messagesDS := DATASET([{messages}], {STRING line});

ExtractedRow := RECORD 
  STRING value;
END; 

ExtractedRows := RECORD
  DATASET(ExtractedRow) values;
END;

ExtractedRows RowsTrans := TRANSFORM
  SELF.values := XMLPROJECT('row', TRANSFORM(ExtractedRow, SELF.value := XMLTEXT('')));
END;

parsedData := PARSE(messagesDS, line, RowsTrans, XML('/dataset/rows'));

OUTPUT(parsedData);

 

The call to the Java consume method is accomplished in the first three statements. The rest of the code extracts the XML content into something more meaningful.
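The second argument to IMPORT(java, ...) follows the pattern used in both ECL examples on this page: a slash-separated class name, a dot, the method name, a colon, and the JVM method descriptor. Writing descriptors by hand is error-prone, so the following sketch derives the string with the JDK's java.lang.invoke.MethodType (available since Java 7). The class name DescriptorSketch and the method importString are hypothetical helpers, not part of HPCC Systems.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper for illustration only.
class DescriptorSketch {

    // Builds "package/path/Class.method:descriptor" from a fully qualified
    // class name, a method name, the return type, and the parameter types.
    static String importString(String className, String method,
                               Class<?> returnType, Class<?>... paramTypes) {
        String descriptor = MethodType.methodType(returnType, paramTypes)
                .toMethodDescriptorString();
        return className.replace('.', '/') + "." + method + ":" + descriptor;
    }

    public static void main(String[] args) {
        // The consume() method from this page: no parameters, returns String.
        System.out.println(importString(
                "org.hpccsystems.streamapi.consumer.DataConsumer", "consume", String.class));
        // The add1 example: one int parameter, returns int.
        System.out.println(importString("JavaCat", "add1", int.class, int.class));
    }
}
```

The two printed strings match the ones used in the ECL IMPORT statements on this page.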

Give it a try and see how easy it is to extend HPCC using Java libraries. HPCC provides the framework for Big Data analytics, and this example shows how ECL can be extended with Java libraries to perform advanced tasks such as streaming data, text extraction, and sentiment analysis.