This project is available as a student work experience opportunity with HPCC Systems this summer. Curious about other projects we are offering? Take a look at our Ideas List. Find out about the HPCC Systems Summer Internship Program.

The project proposal application period for 2019 summer internships is opening soon! To get notifications, subscribe to our Community Forum.

These projects involve the development of machine learning algorithms to extend the existing ECL-ML (ECL Machine Learning) library on the HPCC Systems platform, using ECL and the underlying Parallel Block BLAS (PB-BLAS) infrastructure.

While the general aim is to develop these algorithms mostly in ECL, using other languages that can be embedded in HPCC Systems (Python, Java, JavaScript, R, C++) is also acceptable. A few functions are written in C++ with a C++/ECL wrapper around them.

The HPCC Systems Machine Learning Library is a work in progress. However, it already contains many classification, regression and clustering algorithms.

There are currently two available projects in this area:

  1. Implement an approximate n-tile algorithm
  2. Text Search Bundle

Objectives for the ML functions

  • To be able to handle large training sets in a timely manner by distributing the training set across the nodes of an HPCC Systems Thor cluster.
  • To produce statistics that measure the goodness of fit of the models created by the ML functions.
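
As a rough illustration of how these two objectives can fit together, the sketch below computes one common goodness-of-fit statistic, the coefficient of determination (R²), from per-partition partial sums. It is written in Python rather than ECL and does not use any ECL-ML or PB-BLAS API. The function names `partial_sums` and `r_squared` are hypothetical, and the in-memory lists only stand in for the slices of a training set that would be held on separate Thor nodes.

```python
from typing import List, Sequence, Tuple

# (row count, sum of y, sum of y squared, residual sum of squares)
Partial = Tuple[int, float, float, float]


def partial_sums(y_true: Sequence[float], y_pred: Sequence[float]) -> Partial:
    """Summarise one partition of the data (hypothetical helper).

    Each partition only needs to report four numbers, so the work can be
    done locally wherever that slice of the data lives.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    sum_y = float(sum(y_true))
    sum_y2 = float(sum(v * v for v in y_true))
    ss_res = float(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)))
    return n, sum_y, sum_y2, ss_res


def r_squared(partials: List[Partial]) -> float:
    """Combine the per-partition summaries into a single R² value."""
    n = sum(p[0] for p in partials)
    if n == 0:
        raise ValueError("no observations")
    sum_y = sum(p[1] for p in partials)
    sum_y2 = sum(p[2] for p in partials)
    ss_res = sum(p[3] for p in partials)
    # Total sum of squares about the global mean: sum(y^2) - (sum(y))^2 / n
    ss_tot = sum_y2 - (sum_y * sum_y) / n
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Two partitions standing in for data held on two different nodes.
    node_a = partial_sums([1.0, 2.0], [1.1, 1.9])
    node_b = partial_sums([3.0, 4.0], [3.2, 3.8])
    print(r_squared([node_a, node_b]))
```

Only the four summary values per partition need to be brought together, which is the general pattern a distributed implementation might follow. Note that the one-pass formula used for the total sum of squares can lose precision when the values are large relative to their spread, so a production version would likely use a more numerically stable approach.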

Available Resources

Machine learning projects previously completed by students

  1. Implement a Gradient Trees Algorithm
    In progress. To be completed by George Mathew (NCSU PhD student in Computer Science) as part of the 2017 HPCC Systems Intern Program
  2. Documentation Generator for ECL Code
    In progress. To be completed by Sarthak Jain (Northwestern University PhD student in Computer Science) as part of the 2017 HPCC Systems Intern Program
  3. Implement a Latent Semantic Analysis Algorithm in ECL
    Completed as part of the HPCC Systems Summer Internship Program 2016
  4. Empower ECL-ML: How to make the HPCC Systems Machine Learning Library easier to use
    Completed as part of the HPCC Systems Summer Internship Program 2016
  5. Implement a YinYang K-Means Clustering Algorithm in ECL
    Completed as part of the HPCC Systems Summer Internship Program 2016
  6. Implement the Converse Sparse Cholesky Selection Algorithm in ECL
    Completed as part of the HPCC Systems Summer Internship Program 2016
  7. Analyse which algorithms may provide the best results for the implementation of Non-Negative Matrix Factorisations (NMF) in ECL
    Completed as an independent voluntary contribution in 2016
  8. Add new statistics to the Linear and Logistic Regression Modules
    Completed as part of the GSoC Program 2015
  9. Implement the CONCORD Algorithm
    Completed as part of the HPCC Systems Summer Internship Program 2015



