HPCC Systems 5.2.x Releases

Welcome to HPCC Systems® 5.2. Please review the information included in both categories shown below.

General - Details of the impact of a change and information about any modifications you may need to make to your ECL code. Some items may contain workarounds with the fix coming in a later release. Where this is the case, the expected release version is shown. Go to the HPCC Systems® download page for the latest releases. 

Issues found in the entire 5.2.x series are listed here. Where possible, we indicate which release is affected by the issue and the most recently reported items are shown at the top of the list.

Significant New Features - We recommend you consider using these features right away. Please note: There is a new security option in DALI (details below).

General

NEW - Spilling streams under heavy memory pressure can cause a deadlock (Fixed in 5.2.8)

When the memory manager is actively trying to free memory at the same time that a spilling stream is spilling and in the process of deregistering its callback, a deadlock can occur (it can block on a rows lock).

https://track.hpccsystems.com/browse/HPCC-14067

NEW - Default newBalancedSpotter to OFF to avoid deadlock (Fixed in 5.2.8)

A new balanced splitter spotter was introduced in 5.2.0 (see https://track.hpccsystems.com/browse/HPCC-11400) to prevent a deadlocking problem caused by balanced splitters being pulled downstream in an unbalanced way. However, in some cases, particularly involving joins downstream, the newly generated graphs with balanced splitters may stall causing a deadlock. This affects all 5.2.x releases and will be fixed in HPCC Systems 6.0.0.

The workaround in the HPCC Systems 5.2.x series is to turn the new spotter off by adding the following to affected jobs:

#option('newBalancedSpotter', false);

https://track.hpccsystems.com/browse/HPCC-14047

NEW - Race condition involving shared spills can cause an assert and job failure (Fixed in 5.2.8)

This affects HPCC Systems 5.2.4 and occurs when dealing with a shared spilling stream. In the JIRA issue below, the failure was triggered when the stream was used by Graph Results called from a helper function.

Currently, if an OOM event occurs while the stream is being created, it causes a spilling callback to the stream's owner. The stream construction will block on a mutex and the spilling callback will not have knowledge of the pending new stream.

As a consequence, the spilling callback does not set up the stream correctly (it must do so beforehand to deal with the pending disk read). When the spilling callback is done, the new stream constructor completes. The next time the stream is read, it hits the "assertex(((offset_t)-1) != outputOffset);" assert, which is an indication that it was not set up.

Note: This bug is likely to only occur when there is a lot of memory contention and spilling.

https://track.hpccsystems.com/browse/HPCC-14095

NEW - Roxie compatibility check missing (Fixed in 5.4.6)

The code for checking whether a WU was compiled with too old a version of the compiler is missing in versions 5.2.0 – 5.4.4, though the check is still performed when loading the slave queries. If an incompatible workunit is loaded you may therefore see "Query suspended on slave" errors even though it appears to load successfully on the server. In more severe cases the server may crash when attempting to load the incompatible workunit on the server.

Any deployed workunit created with a compiler OLDER than 5.2.0 needs to be deleted before starting a 5.2.0 or later Roxie. 

https://track.hpccsystems.com/browse/HPCC-14525

Upgrading from previous versions of HPCC Systems

The HPCC platform package naming convention was changed at version 5.2.0. This can cause issues with upgrade installs (e.g., rpm -Uvh or dpkg -i) by leaving multiple packages installed which can prevent the platform from functioning properly.

If you have an installed version prior to 5.2.0, you must uninstall it before installing version 5.2.0 or later.

For details about the name change, see:

https://track.hpccsystems.com/browse/HPCC-12157

Issue deleting logical filenames with leading digits on scopes or names (Fixed in 5.4.0)

In some situations, spraying may create logical filenames whose scopes or file names have leading digits. Deleting logical filenames in this situation produces an error similar to the one below and the delete action fails:

Cannot delete .::2015050100.filename__filename: [ 25: SDS: IPropertyTree exception 
SDS Reply Error : SDS: IPropertyTree exception 
IPropertyTree: xpath parse error
XPath Exception: Qualifier expected e.g. [..]
in xpath = 2015050100_filename_95_filename^ in xpath '/_Locks/SuperOwnerLock/2015050100_filename_95_filename']

Logical filenames of this type can only be removed using Daliadmin.

This issue was introduced in HPCC Systems 5.0 as a result of https://track.hpccsystems.com/browse/HPCC-10770 and still exists in HPCC Systems 5.2.0. However, it has been fixed in HPCC Systems 5.4.0.
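To find out in advance which logical filenames may hit this problem, a list of names can be screened for scopes or file names that start with a digit. The following is a minimal sketch only: the function name and the two "thor::" sample names are hypothetical, and it assumes scopes are separated by "::" with an optional leading "~".

```python
def has_leading_digit_component(logical_name: str) -> bool:
    """Return True if any scope or the file name starts with a digit."""
    # Assumption: scopes are separated by '::' and a leading '~' is optional.
    parts = logical_name.lstrip("~").split("::")
    return any(part[:1].isdigit() for part in parts)


names = [
    ".::2015050100.filename__filename",  # name taken from the error above
    "thor::2015::monthly",               # hypothetical: scope with a leading digit
    "thor::reports::monthly",            # hypothetical: not affected
]

# Keep only the names that may need to be removed using Daliadmin.
affected = [n for n in names if has_leading_digit_component(n)]
print(affected)
```

Names flagged by this check are candidates only; whether the delete actually fails still depends on the platform version in use.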

https://track.hpccsystems.com/browse/HPCC-13981 

Restart needed for ECL IDE after downloading GraphViewer in ECL Watch

When installing a clean version of the platform, you may find that graphs are not displaying in ECL IDE and you need to install the new Graph Viewer. To do this, go to ECL Watch and you will be prompted to confirm that you want to use the new Graph Viewer. After confirming in ECL Watch, restart the ECL IDE and your graphs will be displayed.

https://track.hpccsystems.com/browse/IDE-405

Blank row tag changes in HPCC Systems® 5.x

In versions prior to 5.0, specifying a blank row on an xml output using row('') would cause invalid xml to be generated containing blank row tags of the form <>.

In 5.0 this issue has been fixed and the empty row tag is suppressed. However, this has revealed an issue in the despray process whereby an xml file which has no row tag and has a header or footer specified cannot successfully be desprayed. A zero byte file is desprayed instead.

The simplest way to work around this is by adding a dummy, non-blank row tag which can be removed by post processing after despraying.

https://track.hpccsystems.com/browse/HPCC-13331

Roxie published queries referencing foreign files no longer always access those files remotely

In HPCC Systems® 5.2.0 and later, a foreign reference is resolved in the same way that remote files are resolved when explicitly using a remote DALI IP at publish time.

  1. DFS meta data for the file is cloned to the local DALI DFS.
  2. Roxie will access the file as it does other remote files depending on the roxie configuration values:
    • copyResources is true, useRemoteResources is true - the file will be copied to the roxie cluster in the background, and will be accessed remotely while copying.
    • copyResources is false, useRemoteResources is true - the file will be accessed remotely.
    • copyResources is true, useRemoteResources is false - the file will be copied to the roxie cluster, and then accessed.
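The three combinations listed above can be read as a small decision table. The sketch below is illustrative only and is not Roxie code: the function name and the returned descriptions are hypothetical, and the case where both values are false is not described in these notes, so it is deliberately left unhandled.

```python
def foreign_file_access(copy_resources: bool, use_remote_resources: bool) -> str:
    """Describe how Roxie accesses a foreign file for the given settings."""
    if copy_resources and use_remote_resources:
        return "copy in background, access remotely while copying"
    if use_remote_resources:
        return "access remotely"
    if copy_resources:
        return "copy to the roxie cluster, then access"
    # Both false: this combination is not covered by the list above.
    raise ValueError("combination not described in these release notes")


print(foreign_file_access(copy_resources=False, use_remote_resources=True))
```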

https://track.hpccsystems.com/browse/HPCC-12623

Option added to INDEX to ignore the legacy fpos field

Changes to INDEX have been made to make it easier and less confusing to use, as well as improving some index sizes.

Previously,

  • If the last field in an index payload is an integer it is stored differently (as an unsigned8) in a different location in the index file.
  • If the last field in an index payload is NOT an integer, an implicit file position field is added to the index.

Adding ,FILEPOSITION(FALSE) to the index definition, or to the build statement that creates it, prevents this implicit fileposition field from being created and stops a trailing integer field from being treated any differently from the rest of the payload. (The use of a boolean as a parameter to an attribute is new, but it is a concept that we hope to gradually introduce whenever we can.)

This new option is only valid on an INDEX or on a BUILD used to build an index.

https://track.hpccsystems.com/browse/HPCC-10230
https://track.hpccsystems.com/browse/HPCC-10793

varstrings should not be used as key fields

Varstrings are terminated with a character with value \0, and can cause issues if they are used as keyed fields in an index. For fixed length varstrings the memory after the terminating \0 isn’t initialised, but it is compared when searching the key.  This could cause inconsistencies and problems in the index code in that rows used to create an index could be in the correct order if the strings were compared, but out of order when the uninitialized data was also compared.  From 5.2 an error will be reported if they are used as key fields in this way. The suggested course of action is to create new indexes using fixed length fields.
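The ordering problem can be illustrated outside ECL. The sketch below is not platform code: it models three hypothetical 8-byte fixed length varstring buffers, where the bytes after the terminating \0 stand in for uninitialised memory, and checks the same row order first as strings and then as raw bytes.

```python
def as_c_string(buf: bytes) -> bytes:
    """Return the bytes up to (not including) the first \\0 terminator."""
    return buf.split(b"\x00", 1)[0]


def is_sorted(seq, key=lambda x: x) -> bool:
    """Return True if seq is in non-descending order under key."""
    return all(key(a) <= key(b) for a, b in zip(seq, seq[1:]))


# Hypothetical buffers: the bytes after \0 imitate uninitialised memory.
row_a = b"abc\x00\xff\xff\xff\xff"
row_b = b"abc\x00\x01\x01\x01\x01"
row_c = b"abd\x00\x00\x00\x00\x00"
rows = [row_a, row_b, row_c]

# Compared as strings, the rows are in order ("abc", "abc", "abd").
print(is_sorted(rows, key=as_c_string))  # True
# Compared over the full buffer, row_a sorts after row_b, so the order is wrong.
print(is_sorted(rows))                   # False
```

With fixed length fields that are fully initialised, the two comparisons agree, which is why new indexes built on such fields avoid the problem.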

https://track.hpccsystems.com/browse/HPCC-12528

Multiple jvmoptions field separator change

From version 4.2 onwards, when using the java plugin to make calls to java from within ECL, it has been possible to specify configuration options for the jvm in the environment.conf file.

Prior to version 5.2, multiple options to be passed to the jvm were separated by a colon (:). However since many jvm options actually contain colon characters, this limited the options that could be set in this way. Therefore from version 5.2 we use a space as a separator between multiple options in the environment.conf file.

If you have used this feature to specify options to the JVM, and have separated multiple options using a colon, you will need to change these colons to spaces when upgrading to 5.2.
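The change from colons to spaces can be scripted when many environment.conf files need updating. The following is a rough sketch under stated assumptions: the key is assumed to be named jvmoptions (as in the heading above), the function name is hypothetical, and the option values in the example are sample JVM flags only. It must be applied once, to a pre-5.2 value; running it on a value that already uses spaces and contains options with colons would break those options.

```python
def migrate_jvmoptions_line(line: str) -> str:
    """Rewrite a pre-5.2 'jvmoptions=a:b' line as 'jvmoptions=a b'."""
    key, sep, value = line.partition("=")
    # Assumption: the setting is named 'jvmoptions'; other lines are untouched.
    if sep and key.strip() == "jvmoptions":
        options = [opt for opt in value.strip().split(":") if opt]
        return key + "=" + " ".join(options)
    return line


print(migrate_jvmoptions_line("jvmoptions=-Xms64m:-Xmx512m"))
# jvmoptions=-Xms64m -Xmx512m
```

Commented-out lines and all other settings are returned unchanged, so the function can be mapped over every line of the file.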

https://track.hpccsystems.com/browse/HPCC-12782

List of files in a superfile is not complete in new ECL Watch

When viewing input superfiles for a workunit, you can open the superfile to view the list of files. However, the list shown in new ECL Watch is the “current” list (at the time of opening). In legacy ECL Watch, the list shown was the list as of the start of the job.

If you want to see the list as of the start of the job, go to the Advanced tab in new ECL Watch and select Open Legacy ECL Watch, which is displayed in a separate browser window.

https://track.hpccsystems.com/browse/HPCC-13202

Refining nodes in a Thor Cluster

If you need to reconfigure a Thor cluster, to replace existing nodes (with new IPs) or to add or remove nodes, you must take an additional step to restructure the group. Dali will not automatically restructure the existing group. This is because existing published files reference the previous cluster group state by name, therefore changing its structure would invalidate those files and make the physical files inaccessible.

There are a couple of scenarios to consider:

Replacing a faulty node or nodes

If the files are replicated, replacing a node and forcing the new group to be used by existing files may be desirable. In this case, reading an existing file will fail over to the part on the replicate node when the physical file cannot be found on the new replacement node. To force the new group to be used, use the following command:

updtdalienv <environment file> -f

Note: In cases where there is no replication, data loss may be unavoidable and forcing the new group may still be the best option.

Resizing the cluster

When cluster nodes are added or removed (but all previous nodes remain part of the environment and are accessible), rename the group that is associated with the Thor cluster (or the cluster name if there is no group name).

This will ensure that all previously existing files continue to use the old group structure, whilst new files will use the new group structure.

https://track.hpccsystems.com/browse/HPCC-9691

Significant New Features

New dafilesrv configuration option for improved security authentication

The security around dafilesrv has been enhanced by providing a new set of configuration options which take the location of the certificate and key files and provide encryption in transport.

By default, dafilesrv uses a nonsecure socket to handle remote file requests.  Beginning with HPCC Systems® Release 5.2.0, dafilesrv can be configured to communicate with remote processes on a secure SSL socket, which ensures that the file contents are encrypted on transport.

To enable this security feature, your HPCC Systems® Admin should edit the environment.conf file, usually located in /etc/HPCCSystems, and uncomment and fill out the SSL parameters at the end of the file.

Note: The keyfile must not have a passphrase associated with it.

The parameters are provided as follows:

#enable SSL for dafilesrv remote file access
dfsUseSSL=false
dfsSSLCertFile=/certfilepath/certfile
dfsSSLPrivateKeyFile=/keyfilepath/keyfile

Once the certificate files are created and installed, and their information is provided in the environment file, restart the platform so that they take effect.  Communications with remote clients will now take place on a different secure port, using SSL.  dafilesrv will continue to listen for requests on the legacy nonsecure port, and will immediately reject any of these requests.
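Before restarting the platform, it can be useful to confirm that the SSL parameters are complete. The sketch below is a hypothetical helper, not an HPCC Systems tool: it reads the text of environment.conf, uses only the three parameter names shown above, and assumes that the value "true" is what enables dfsUseSSL. It does not check whether the key file has a passphrase.

```python
def check_dafilesrv_ssl(conf_text: str) -> list:
    """Return a list of problems found in the dafilesrv SSL settings."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and lines that are not key=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()

    problems = []
    # Assumption: 'true' enables SSL; anything else leaves it disabled.
    if settings.get("dfsUseSSL", "false").lower() != "true":
        return problems
    for key in ("dfsSSLCertFile", "dfsSSLPrivateKeyFile"):
        if not settings.get(key):
            problems.append(key + " is not set")
    return problems


sample = """#enable SSL for dafilesrv remote file access
dfsUseSSL=true
dfsSSLCertFile=/certfilepath/certfile
"""
print(check_dafilesrv_ssl(sample))
```

In this sample the private key entry is missing, so the helper reports it; an empty list means no problem was detected by these basic checks.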

https://track.hpccsystems.com/browse/HPCC-11764

New eclcc option for generating dependency information

There is a new option that allows dependency information to be available for each query that is run. When using this option, a new entry appears on the workunit Helpers tab in ECL Watch which is a link to an xml file containing all the dependencies.

To generate the information contained in this file, set the debug option exportDependencies in the debug options for the workunit. To enable this feature for all workunits, add it to the eclserver default options.

The benefit of this change is that it is now easier to track dependencies between the ECL definitions, which in turn makes it easier to understand the structure of ECL code and to see which queries might be affected by changing a particular definition. There is a blog with more information about this here: http://hpccsystems.com/blog/definition-dependencies.

https://track.hpccsystems.com/browse/HPCC-460

