The projects listed here are available as student work experience opportunities with HPCC Systems this summer. Here's the list of available projects in alphabetical order. Some entries have multiple projects associated with them. You can also view the list by project type.
Find out more about the HPCC Systems Summer Internship Program.
Deadline for proposals - Monday April 3rd 2017
- Additional Embedded Languages in ECL
Scala, Haskell, Clojure, SAS, MatLab, MongoDB, Postgres, MariaDB or suggest one!
- Additional external data stores
Ceph, S3 or suggest one!
- Analysing workunit performance
Identify the most useful workunit statistics, analyse them and present them to users as visualizations within ECL Watch
- Cluster Deployment with Juju Charm
Convert our current implementation to use the new Charm Helpers framework (Python) and add support for new HPCC components
- DFU Spray from zip/gzip files
Create a plugin for spraying from a ZIP/GZIP archive without decompressing the content
- Implement a global sort and distribution optimiser
Optimize the use of sorts and distributions by looking at the entire graph and tracking which sorts and distributions are actually used by downstream activities.
- Implement an IOT pluggable protocol for ROXIE
Add support for pluggable protocols currently being used in IOT projects
- Provide Unicode implementations for HPCC Systems standard library functions - The project is no longer available for 2017 Internships
Improve the way HPCC Systems handles unstructured text
- Log Visualisation Tool
Create visualizations of the top counts for specific types of issues within a log file, showing severity and details
- Machine Learning Algorithms on the HPCC Platform
Approximate n-tile, Text Search Bundle
- Implement a Jupyter kernel for HPCC/ECL
Implement a kernel to enable the embedding/execution of source code and displaying the results
- MPI Proof of Concept
Replace the existing socket-based message passing API with an open-source MPI implementation
- Text Search Bundle
- Port Roxie to a different UDP layer
Investigate how well the current implementation achieves the goal of getting data from slaves to servers as quickly and reliably as possible on today's systems, and investigate whether a third-party library or alternative protocol may be worth considering.
- Continuous Integration of Roxie query / data deployments using Jenkins
- System self health check
Design and implement a tool, launched from a button within ECL Watch, that provides an overall check that everything is working as expected across components
- Provide SELinux Policies for the HPCC-Platform installation on Linux environments
Build SELinux domains for hpccsystems-platform services.