Welcome to the Red Book for HPCC Systems® 7.0.0 Gold. There are several sections in this Red Book as follows:

  • General HPCC Systems core platform
  • Significant new features in the core platform
  • ECL IDE

Users may benefit from glancing at other Red Book entries when making a large jump between releases.

Here's how to contact us if you find an issue or want to add something to the Red Book:

  • To raise an issue, use our Community Issue Tracker. If you don't already have an account, please create one to get automatic updates as your issue progresses through the workflow.
  • To ask a developer a technical question about something you are doing or have encountered, post in the Developer Forum.
  • To add a note to the Red Book, please contact Lorraine Chapman with full details.

General HPCC Systems core platform

New cross platform SOAPCALL behaviour in HPCC Systems 7.0.0

Prior to version 7.0.0 of the HPCC Systems platform, credentials were automatically sent with a SOAPCALL or HTTPCALL.

Versions 7.0.0 and later use security tokens, and credentials are no longer available to be sent with a SOAPCALL. This is transparent when a 7.0.0 system is communicating with another 7.0.0 ESP in the same environment (same Dali). However, when a SOAPCALL is sent from a 7.0.0 system to a 6.x.x ESP (or to an ESP in another environment), credentials are not sent and the recipient cannot authenticate the request. There are two ways to remedy this:

  1. Upgrade the system receiving the SOAPCALL to 7.0.0
    This is the optimal solution, but may not fit into your upgrade schedule or needs. This only works if the ESP is in the same environment.
  2. Embed encoded credentials in the SOAPCALL either in the URL or using HTTPHEADER.
    This is less secure, but can serve as a temporary solution. We strongly recommend using TLS (HTTPS) to send a SOAPCALL with embedded credentials.

Example:

IMPORT STD;

ip    := 'http://127.0.0.1:8002/';                     // Not secure
ips   := 'https://127.0.0.1:18002/';                   // TLS encryption
ipspw := 'https://username:password@127.0.0.1:18002/'; // Credentials in URL
svc   := 'MyModule.SomeService';

creds := D'username:password'; // Never put actual credentials in ECL code;
                               // this is for example purposes only
encodedCreds := STD.Str.EncodeBase64(creds);

OutRec := RECORD // Hypothetical response layout; use the record your service actually returns
  STRING500 OutData;
END;

// Using credentials in the URL
OUTPUT(SOAPCALL(ipspw, svc,
                {STRING500 InData := 'Some Input Data'}, OutRec));

// Using HTTPHEADER to pass Authorization info
OUTPUT(SOAPCALL(ips, svc,
                {STRING500 InData := 'Some Input Data'}, OutRec,
                HTTPHEADER('Authorization', 'Basic ' + encodedCreds)));

https://track.hpccsystems.com/browse/HPCC-21204

Spark Connector - Invalid null pointer exception might occur when creating an HpccFile object

An invalid null pointer exception might occur when creating an HpccFile object using the following constructor:

public HpccFile(String fileName, String connectionString, String user, String pass)

The workaround for this is to use the following alternative constructors:

public HpccFile(String fileName, Connection espconninfo) throws HpccFileException

or

public HpccFile(String fileName, Connection espconninfo, String targetColumnList, String filter, RemapInfo remap_info, int maxParts, String targetfilecluster) throws HpccFileException

These constructors require a Connection object, which can be created as follows:

Connection espcon = new Connection("http", "myespaddress", "8010");
espcon.setUserName("myuser");
espcon.setPassword("mypass");
HpccFile file = new HpccFile("myscope::myfile", espcon); // hypothetical logical file name

https://track.hpccsystems.com/browse/JAPI-136

Uninstall HPCC Systems including Spark to successfully install HPCC Systems with plugins but without Spark

Installing HPCC Systems 7.0.0 with plugins but without Spark over a previous 7.0.0 version of HPCC Systems with Spark (including pre-Gold release candidates) causes multiple conflicts. To avoid these conflicts, please completely uninstall your previous HPCC Systems 7.0.0 version first.

https://track.hpccsystems.com/browse/HPCC-20726

Spark-shell must be started using the actual IP address

When you start the spark-shell using the following command, it will fail to connect to the local host:

/opt/HPCCSystems/externals/spark-hadoop/bin/spark-shell --master=spark://localhost:7077

You must supply the actual IP address as follows:

/opt/HPCCSystems/externals/spark-hadoop/bin/spark-shell --master=spark://10.1.2.34:7077

https://track.hpccsystems.com/browse/HPCC-20730

The way disk files are projected has changed in HPCC Systems 7.0.0


A new flag has been added to disk reads and indexes to control whether the new or old method of translating file formats is used. It is primarily used for testing. Occasionally, if a dataset contains an alien data type and the results are projected, it may be necessary to add the following to a dataset definition:

__OPTION__(LEGACY)

For example:

d := dataset('imgfile', { rawLayout x, unsigned8 _fpos{virtual(fileposition)} }, FLAT, __OPTION__(legacy)); 

The compiler will report an error message indicating that the option is required:

"This dataset contains deprecated record formats and virtual fields. Remove the alien data types, or temporarily add __OPTION__(LEGACY) to the table definition"

https://track.hpccsystems.com/browse/HPCC-20296

HPCC Systems 7.0.0 now allows index translation when merging index parts

This change potentially alters the order in which records are returned from stepped index reads with multiple input keys. Previously the payload was used for ordering; now ordering is restricted to the keyed portion.

Previously the order was determined by the trailing keyed components, then a memcmp of the payload, and finally any leading keyed components. This caused problems if payload fields were removed or added, since the ordering would no longer be consistent. (The previous code could also return data in an undefined order, since it was possible for exact matches to access uninitialised data.)

The order is now determined by the trailing keyed components and then the leading keyed components. If there are exact matches between multiple keys, all results will be returned from the first matching key before exact matches from the next key.

https://track.hpccsystems.com/browse/HPCC-20126

Inconsistent UTF8 comparisons

In versions of the platform prior to HPCC Systems 7.0.0, UTF8 strings that did not contain characters above 0x80 ordered 'A' before 'a', but the Unicode collation orders 'a' before 'A'. From HPCC Systems 7.0.0, UTF8 strings always use the ICU collation order. This may cause the results of some queries to change, because values may now be sorted in a different order.
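
For example, a minimal sketch (with a hypothetical inline dataset) of how SORT output changes:

ds := DATASET([{U8'a'}, {U8'A'}, {U8'b'}, {U8'B'}], {UTF8 v});
OUTPUT(SORT(ds, v)); // pre-7.0.0 memcmp order: A, B, a, b
                     // 7.0.0 ICU collation:    a, A, b, B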

https://track.hpccsystems.com/browse/HPCC-20083

Changes to Std.Date in HPCC Systems 7.0.0

An is_local_time parameter has been added to some functions, and issues with daylight saving time have now been resolved.
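
For illustration only, a hedged sketch using STD.Date.CurrentDate (consult the Standard Library Reference for the full list of functions that accept the flag), where passing TRUE requests local time:

IMPORT STD;
utcToday   := STD.Date.CurrentDate();     // default: UTC
localToday := STD.Date.CurrentDate(TRUE); // local time requested via the flag
OUTPUT(utcToday);
OUTPUT(localToday);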

https://track.hpccsystems.com/browse/HPCC-18148

SEQUENTIAL is now converted to ORDERED inside a child query

Using SEQUENTIAL in an APPLY was reported as causing the following error:

value global('gl3NV181') in workunit is undefined (code generator issue)

In most cases where SEQUENTIAL is used, ORDERED would be the better choice. However, there may be times when the previous functionality is needed, such as cases where the conversion changes something that happened to work as expected. To revert to the previous behaviour, use the following:

#option('transformNestedSequential', false);

https://track.hpccsystems.com/browse/HPCC-19879

Dali checks file scopes when CheckScopeScans is disabled

CheckScopeScans is meant to control scope enforcement when a user accesses a file list.  However, in HPCC Systems 5.6.8, https://track.hpccsystems.com/browse/HPCC-16185 introduced a regression whereby if CheckScopeScans was disabled, scopes would not be checked even for a single file request. This is now resolved in HPCC Systems 7.0.0.

https://track.hpccsystems.com/browse/HPCC-19726

The default implementation of GROUP(dataset, fields, ALL) has changed

Previously the output of the operation happened to be globally sorted by the grouping fields, but this is no longer the case. If any ECL code had been relying on this side-effect, that ECL may generate different results in HPCC Systems 7.0.0. Adding SORTED to the GROUP statement will cause the system to use the pre-7.0 implementation and generate globally sorted results.
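
For example, a minimal sketch (with a hypothetical inline dataset) of restoring the pre-7.0 behaviour:

rec := {STRING2 state, UNSIGNED1 id};
ds  := DATASET([{'NY', 1}, {'FL', 2}, {'NY', 3}, {'FL', 4}], rec);
g   := GROUP(ds, state, ALL, SORTED); // SORTED requests the pre-7.0 globally sorted output
OUTPUT(g);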

https://track.hpccsystems.com/browse/HPCC-17415

Uninstall WS-SQL before downloading HPCC Systems 7.0.0 Beta 1 to avoid compatibility errors

WS-SQL is no longer a standalone module with a separate installation process; it has been integrated into the HPCC Systems platform as a core feature. If you previously installed WS-SQL as a standalone add-on to a version of HPCC Systems that predates 7.0.0 Beta 1, please uninstall the earlier version of WS-SQL before installing HPCC Systems 7.0.0 Beta 1 to avoid compatibility errors.

Updates and new features to WS-SQL will now be recorded in the HPCC Systems changelog.

https://track.hpccsystems.com/browse/HPCC-19033

Internet Explorer 8, 9 and 10 are not supported in HPCC Systems 7.0.0

Microsoft stopped supporting these browser versions in January 2016, leading us to stop supporting them from HPCC Systems 7.0.0 Beta.

https://track.hpccsystems.com/browse/HPCC-16869

Unicode TRIM now removes only space characters rather than all whitespace

Prior to HPCC Systems 7.0.0, the whitespace handling of TRIM on unicode strings was inconsistent with the implementation applied to standard STRING values: unicode TRIM removed all whitespace, while STRING TRIM removed only space characters. This inconsistency has been fixed in HPCC Systems 7.0.0. For reference, whitespace in STRING values means space (0x20), horizontal tab, vertical tab, line feed, form feed, carriage return (0x09 to 0x0D) and non-breaking space (0xA0); in UNICODE values, it means all characters with the white space property.

However, it's possible some code may have been relying on the previous behaviour. If so, you can specify WHITESPACE on any TRIM expression where you wish to use the prior semantics of removing other whitespace characters as well as space characters.
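
For example, a minimal sketch of the difference on a UNICODE value with a trailing tab:

u := U'abc\t   '; // trailing tab followed by spaces
OUTPUT(TRIM(u));             // 7.0.0 default: removes the trailing spaces but keeps the tab
OUTPUT(TRIM(u, WHITESPACE)); // prior semantics: removes the tab as well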

https://track.hpccsystems.com/browse/HPCC-19075 and https://track.hpccsystems.com/browse/HPCC-19008

New option to disable built-in scheduler in eclserver

A 'schedulerDisabled' option, which defaults to FALSE, has been added to the eclserver configuration.

If an eclscheduler component has been added to the environment, schedulerDisabled should be set to TRUE on eclserver components to disable the built-in scheduler and avoid the situation where multiple schedulers are dealing with the same eclserver. The same problem may also arise where multiple load-balanced eclservers are used.

Eventually, we plan to remove the built-in scheduler from eclserver completely.

https://track.hpccsystems.com/browse/HPCC-18793

Converting to a varstring from an integer is now always done via a string

This change has one known effect: (varstring2)123 used to produce ' **'; it now produces '12'.

https://track.hpccsystems.com/browse/HPCC-18034

Root scope for workunit is now blank

If a statistic or timer previously had a scope of "workunit" in HPCC Systems 6.x.x, it will now have a scope of "" in HPCC Systems 7.x.x. This may affect code that is processing timings from existing workunits.

https://track.hpccsystems.com/browse/HPCC-17329

Assigning to the same field more than once in a transform now produces an error instead of a warning

Inside a transform:

SELF.x := ...
SELF.x.y := ...

Now generates an error that SELF.x.y has already been assigned.
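
For example, a minimal sketch (with hypothetical record layouts) that now triggers the error:

ChildRec := RECORD
  STRING10 y;
END;
ParentRec := RECORD
  ChildRec x;
END;
ParentRec makeRow() := TRANSFORM
  SELF.x   := ROW({'first'}, ChildRec); // assigns all of x, including x.y
  SELF.x.y := 'second';                 // error in 7.0.0: x.y has already been assigned
END;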

https://track.hpccsystems.com/browse/HPCC-16203

Significant new features in the core platform

New attribute in EMBED treating it as an activity, allowing it to stream output

This feature allows embeds to behave in similar ways to the inbuilt activities. When the activity attribute is specified on an EMBED, the system will generate an activity in the graph for that embed. The inputs to the activity are any leading parameters which have a streamed dataset type. (Any streamed dataset parameters that follow parameters of other types are not processed as input activities.) If the result has a dataset type, then the output from the activity will be fed into the activities that use that dataset.

The code implementing the embed will need to behave differently depending on whether it is a global activity (only generating a portion of the dataset on each node) or executed within a child dataset, where each activity will generate the complete dataset. The way this information is passed to the embedded code depends on the language:

  • C++ - An extra parameter IThorActivityContext * activity is passed to the function.  It has the following members:

    interface IThorActivityContext
    {
    public:
        virtual bool isLocal() const = 0;         // is the activity local
        virtual unsigned numSlaves() const = 0;   // How many slaves is this activity executed on
        virtual unsigned numStrands() const = 0;  // How many strands per slave (currently 1)
        virtual unsigned querySlave() const = 0;  // 0 based 0..numSlaves-1
        virtual unsigned queryStrand() const = 0; // 0 based 0..numStrands-1
    };
     
  • Python and other embedded languages - An extra local variable _activity_ is passed to the function.  It has the following members:

    isLocal         - is the activity local
    numSlaves       - How many slaves is this activity executed on
    numStrands      - How many strands per slave (currently 1)
    slave           - which slave is this call for (0..numSlaves-1)
    strand          - which strand is the call for (0..numStrands-1) 


Examples are included in the regression suite, for example, testing/regress/ecl/embedactivity*.ecl and testing/regress/ecl/pyembedactivity.ecl.
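
As a minimal hypothetical sketch (not taken from the regression suite), here is a Python embed activity where each global instance generates only its share of the rows, while a child-query instance generates them all:

IMPORT Python;

r := RECORD
  UNSIGNED4 value;
END;

DATASET(r) genRows(UNSIGNED4 n) := EMBED(Python : activity)
  # _activity_ reports where this instance of the embed is running
  first = 0 if _activity_.isLocal else _activity_.slave
  step  = 1 if _activity_.isLocal else _activity_.numSlaves
  for i in range(first, n, step):
      yield (i,)
ENDEMBED;

OUTPUT(genRows(10));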

https://track.hpccsystems.com/browse/HPCC-13036


