
LOOP( dataset, loopcondition, loopbody )


The ECL LOOP statement allows a series of transforms and other operations to be executed repeatedly against a dataset. Each iteration operates on the entire dataset (to step through the records within a dataset we would use ITERATE instead).
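For contrast, here is a minimal sketch of the record-by-record case that ITERATE covers. The record layout, the sample values and the names numRec, nums and addUp are made up for illustration and are not part of the examples that follow.

```ecl
// Hypothetical layout: a value and a running total to be filled in
numRec := {INTEGER num, INTEGER runningTotal};
nums := DATASET([{1, 0}, {2, 0}, {3, 0}], numRec);

// L is the result produced for the previous record, R is the current record.
// For the first record L is an empty row, so its runningTotal is 0.
numRec addUp(numRec L, numRec R) := TRANSFORM
    SELF.runningTotal := L.runningTotal + R.num;
    SELF := R;
END;

totals := ITERATE(nums, addUp(LEFT, RIGHT));
OUTPUT(totals);
```

ITERATE makes one pass and carries a value from record to record, whereas LOOP feeds the whole dataset through the loop body again on every iteration.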

 

A recursive function, loosely speaking, is a function which calls itself, supplying its own output as input. ECL does not directly support recursion, but the LOOP statement has similar properties and can be used to emulate it. I will show simple examples here that calculate factorial and Fibonacci numbers. First, the code.

 

Factorial 

integerRec:= {integer factorialValue};
f0:=dataset([1],integerRec); //base condition: single-record dataset initialized to one
//the step function, applied to the whole dataset on each iteration; it returns a dataset, so it is typed DATASET(integerRec)
dataset(integerRec) factStep(dataset(integerRec) fact,integer cnt):= PROJECT(fact,
        TRANSFORM(integerRec,
               self.factorialValue:=left.factorialValue*cnt; ));
LOOP(f0,10,factStep(ROWS(LEFT),COUNTER)); //10 factorial = 3628800

 

The general method to arrive at this:

  1. Define a dataset holding the base conditions, in this case a single record containing the integer 1 (1 factorial = 1)
  2. Set up a block of ECL code which
    1. results in a dataset of the same record format as the original dataset, and
    2. whose resulting dataset is the input to the same code on the next iteration
  3. Encapsulate that code in a function which takes the dataset as input and returns the next iteration.
  4. The function can then be used as the loop body in the LOOP statement.
    1. for example, LOOP(ds, (some condition), doThisFunction(ROWS(LEFT),COUNTER) )
      1. ROWS(LEFT) is a keyword meaning the input dataset: the output of the previous loop iteration, or the original dataset on the first iteration
      2. COUNTER is a keyword meaning the loop counter; passing it is optional
      3. alternatively, use a single inline PROJECT with a TRANSFORM instead of a function
      4. note: remember to declare the parameter as DATASET(record) in the function definition
      5. (some condition) may be an integer specifying the number of loop iterations
  5. Manually check the function over several iterations, in this case:
    1. the code is:

      f1:=factStep(f0,1);
      f2:=factStep(f1,2);
      f3:=factStep(f2,3);
      f3; //output 3 factorial
    2. the second parameter corresponds to COUNTER in the LOOP statement. Written out like this, it looks like recursion and behaves similarly
  6. When that works, use the function in the LOOP statement
    1. LOOP(base case dataset, iterations or exit condition, myFunction(ROWS(LEFT),COUNTER));
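The exit-condition variant mentioned in the list can be sketched with the same factorial step. This is a minimal sketch that assumes the loop-filter form of LOOP, in which only the records satisfying the filter are passed to the loop body again, and the loop ends once no record satisfies it. The threshold of 1000 and the name firstOverThousand are arbitrary choices for illustration.

```ecl
// Same record and step function as the factorial example
integerRec := {INTEGER factorialValue};
f0 := DATASET([1], integerRec);   // base condition

DATASET(integerRec) factStep(DATASET(integerRec) fact, INTEGER cnt) :=
    PROJECT(fact,
            TRANSFORM(integerRec,
                      SELF.factorialValue := LEFT.factorialValue * cnt));

// Keep multiplying while the value is still below the (arbitrary) threshold.
// Records that no longer satisfy the filter are left alone and returned
// in the final result.
firstOverThousand := LOOP(f0,
                          LEFT.factorialValue < 1000,
                          factStep(ROWS(LEFT), COUNTER));
OUTPUT(firstOverThousand);
```

With a filter, the number of iterations is decided by the data rather than fixed in advance, which is closer to how a recursive function stops at its terminating condition.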

 

A FUNCTION structure may be used in place of the bare PROJECT; it allows multiple transforms, joins and other operations to be performed on each iteration. For factorial, for example, just wrap the step inside a function.

 

  1. The code is:

    dataset(integerRec) factFunction(dataset(integerRec) fact,integer cnt):=function
          step:=factStep(fact,cnt);
          //add more lines here to do anything else
          return step;
    end;
    loop(f0,10,factFunction(rows(left),counter));

 


Fibonacci

fibRec:={integer predValue, integer curValue};
fib0:=dataset([{0,1}],fibRec); //initial values are base conditions
DATASET(fibRec) fibonacciStep(DATASET(fibRec) fib):=PROJECT(fib,
      TRANSFORM(fibRec,
            self.curValue:=left.curValue+left.predValue;
            self.predValue:=left.curValue;
      ));
/*
//check manually
fib1:=fibonacciStep(fib0);
fib2:=fibonacciStep(fib1);
fib3:=fibonacciStep(fib2);
fib1;fib2;fib3;
//when this works, change to LOOP statement
*/

LOOP(fib0,6,fibonacciStep(ROWS(LEFT))); //sixth Fibonacci pair after the base pair: {8,13}

Although these examples operate on a single record, conceptually the base conditions could be any number of records, and HPCC will optimize the processing of those as usual. However, true recursion is usually not the most efficient method of calculation, and ECL is neither designed for recursion nor optimized for iterative loops. We should also keep in mind that ECL creates a new dataset on each execution of the loop body, with the associated impact on memory.
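To illustrate the multi-record case, here is a minimal sketch that runs the same Fibonacci step over a base dataset with two records. The second starting pair {2,1} (which generates the Lucas numbers) and the name seeds are additions made purely for illustration.

```ecl
fibRec := {INTEGER predValue, INTEGER curValue};

// Two base pairs: {0,1} is the Fibonacci base pair used above,
// {2,1} is a second, illustrative starting point.
seeds := DATASET([{0, 1}, {2, 1}], fibRec);

DATASET(fibRec) fibonacciStep(DATASET(fibRec) fib) :=
    PROJECT(fib,
            TRANSFORM(fibRec,
                      SELF.curValue  := LEFT.curValue + LEFT.predValue;
                      SELF.predValue := LEFT.curValue));

// Each of the six iterations advances every record in the dataset
OUTPUT(LOOP(seeds, 6, fibonacciStep(ROWS(LEFT))));
```

The step function is unchanged; only the base dataset differs, because the loop body always receives the whole dataset through ROWS(LEFT).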


