
The projects listed here are available as student work experience opportunities with HPCC Systems as part of our summer intern program and Google Summer of Code. 

The project proposal application period for 2020 summer internships is now open. Please see our list of Available Projects. Contact the project mentor for more information and to discuss your ideas. You may suggest a project idea of your own but it must leverage HPCC Systems in some way. Contact us for support from an HPCC Systems mentor with experience in your chosen project area.

Find out more about the HPCC Systems Summer Intern Program.

  1. Additional Embedded Languages in ECL
    Clojure, Haskell, MariaDB, MATLAB, MongoDB, ODBC, Postgres, SAS, Scala, SQL, or suggest one!
  2. Additional external data stores
    Ceph, S3, or suggest one!
  3. Google cloud and Microsoft cloud - Extend instance cloud to new AWS regions
  4. DFU Spray from zip/gzip files
    Create a plugin for spraying from a ZIP/GZIP archive without decompressing the content
  5. Implement an IoT pluggable protocol for ROXIE
    Add support for pluggable protocols currently being used in IoT projects
  6. Machine Learning Algorithms on the HPCC Platform
    Data Series Classification
    Implement an approximate n-tile algorithm
    Word Vectorization
    Extend the HPCC Systems ML matrix operation to include complex numbers
    Linear/Logistic Regression Enhancements
    Anomaly Detection Algorithms
    Generative Adversarial Networks (GANs)
    Adaptive Density Based Clustering
    Independence Testing Bundle
    Predictive Model Markup Language (PMML) Processor
  7. Port Roxie to a different UDP layer
    Investigate how well the current implementation achieves the goal of getting data from slaves to servers as quickly and reliably as possible on today's systems, and whether a third-party library or alternative protocol is worth considering.
  8. System self health check
    Design and implement a tool that checks whether everything is working as expected across components, run from a button within ECL Watch
  9. Provide SELinux Policies for the HPCC-Platform installation on Linux environments
    Build SELinux domains for hpccsystems-platform services.
  10. Locking engine to replace DALI - Investigative project
    Research, test and build a proof of concept (POC) of a third-party inter-machine/process locking engine, for example ZooKeeper, HashiCorp's Consul or other suitable contenders.
  11. Replace existing socket-based message passing interface with an open source package
    Explore whether a different message layer (an open source package such as ZeroMQ) offers improved performance, robustness and code maintainability

These projects are new for 2018. They are still under development and more details will be added soon. To find out more about any of these projects, view the associated JIRA issue and contact Lorraine Chapman or the mentor of the project:

  1. Implement ECL Pretty Print
  2. Implement reference dafilesrv in other languages
  3. Implement a Reverse activity
  4. Incorporating self test code into a bundle
  5. Provide test code for bundles with no self test
  6. VS Code extension for DESDL and other languages
  7. Add Arrow support to dafilesrv
  8. Add ORC support to HPCC Systems 
  9. Using HPCC Systems as a data lake for the Deep Cloud platform
  10. Applying HPCC Systems Word Vectors to SEC Filings 
