HPCC Systems Intern Program - Class of 2020

Find out more about the HPCC Systems Summer Intern Program, including how to apply, and read this blog introducing the students and their projects.

7 students joined our intern program in 2020. They presented their projects at our Tech Talk webcasts during the year and entered our 2020 Poster Contest, held at our virtual HPCC Systems Community Day Summit in October 2020.

Due to COVID-19 all internships were completed remotely.

Meet the Class of 2020

Name

Project Title

Description

Mentor(s)

Resources

Jack Fields

High School Student
American Heritage School of Boca/Delray, FL, USA

Using the GNN Bundle with TensorFlow to train a model to find known faces

Process the data from collected images using our Generalized Neural Network (GNN) Bundle with TensorFlow to train a model that can recognise known faces. This supports the work of the AHS Robotics Team, who are building an Autonomous Security Robot (Watch Demo) that can recognise potential risks on a school campus that might otherwise be missed by the human eye. Using object and facial recognition, they can capture faces and recognise them with 93% accuracy using TensorFlow.

David DeHilster
Xiaoming Wang

Tech Talk Presentation, August 2020
 
Poster 

Community Day Presentation 2020

Blog Journal

Jefferson Mao

High School Student
Lambert High School, Georgia, USA

Establish HPCC Systems on the Google Cloud Platform

Work through the steps required to use HPCC Systems on the Google Cloud Platform. Design a web application for creating a new HPCC Systems cluster on this cloud service. Explore Google Cloud Anthos (a new Google Kubernetes deployment platform) with an HPCC Systems cluster. Analyse how running HPCC Systems on Google Cloud compares with other cloud services (such as AWS) in terms of performance, security and cost effectiveness.

Xiaoming Wang
Godson Fortil

Tech Talk Presentation, August 2020 

Poster

Blog Journal

HPCC Systems Blog Post

Matthias Murray

Masters in Data Science
New College of Florida, USA

Applying HPCC Systems Word Vectors to SEC Filings

Report on the current status of vectorisation and NLP representation of SEC filings and then compile identified SEC filing cases and their intersection from a LexisNexis perspective. Sort and transform SEC data, creating a function to convert the data into a format required by the HPCC Systems Word Vectors ML bundle.

Lili Xu
Arjuna Chala
Roger Dev

Tech Talk Presentation, September 2020

Poster

HPCC Systems Blog Post

Nathan Halliday

High School Student


Execute Multiple Workflow Items in Parallel

Restructure the workflow engine to create a graph of tasks that can be used to track which tasks have been executed and which tasks should be executed next. Ensure that there are no multi-threading issues in the workflow engine. The plan is to support ROXIE and Thor.

Gavin Halliday

Tech Talk Presentation, August 2020  

Poster

Blog Journal

HPCC Systems Blog Post

Robert Kennedy

Masters in Computer Science 
Florida Atlantic University, USA

Implement a Multi-node, Multi-GPU Accelerated Deep Learning Algorithm using GNN 

Expand on our existing GNN bundle to improve our GPU-accelerated neural network training. The aim is for HPCC Systems to be able to train neural networks at scale, across many GPUs and many GPU-enabled nodes, using different parallelisation techniques that are suited to deep learning tasks. Increase the robustness of the underlying GNN library by identifying areas for improvement, while documenting best practices to be used when training neural networks on GPUs using the GNN bundle.

Tim Humphrey
Dr Taghi Khoshgoftaar (Florida Atlantic University)

Tech Talk Presentation, September 2020

Poster

Blog Journal

Community Day Presentation 2020

Vannel Zeufack

Masters in Computer Science
Kennesaw State University, USA

Implement a Preprocessing Bundle for the HPCC Systems ML Library

Make the data preprocessing phase of machine learning on HPCC Systems easier and faster. Produce a preprocessing bundle tutorial to demonstrate how the different modules in the preprocessing bundle can be used together to easily prepare data for a machine learning project.

Arjuna Chala
Lili Xu

Tech Talk Presentation, September 2020

Poster 

Blog Journal

Yash Mishra

Masters in Computer Science
Clemson University, USA

Leveraging and evaluating Kubernetes support on Microsoft Azure

Use our new cloud native platform to leverage the Kubernetes support for HPCC Systems, focusing on performance measurements, cost analysis and the various configuration options. Provide a comparison of running the HPCC Systems bare metal version and the new Kubernetes support of cloud native HPCC Systems on Microsoft Azure.

Dan Camper
Dr Amy Apon (Clemson University)

Poster

Community Day Presentation 2020

Blog Journal

HPCC Systems Blog Post

Profile of our intern program in 2020

  • 7 students - 3 High School, 3 Masters, 1 PhD

  • Global and inclusive program, with one student located in Europe (UK) and 2 international students studying in the USA.

  • 2 returning students

  • All remote working

  • Spread of projects: 2 Cloud, 4 Machine Learning, 1 Core Platform

  • 14 mentors involved including 2 academic mentors

HPCC Systems platform related projects

  • Establish HPCC Systems on the Google Cloud Platform

  • Execute Multiple Workflow Items in Parallel

  • Leveraging and evaluating Kubernetes support on Microsoft Azure

Machine learning related projects

  • Applying HPCC Systems Word Vectors to SEC Filings

  • Implement a Multi-node, Multi-GPU Accelerated Deep Learning Algorithm using GNN *

  • Implement a Preprocessing Bundle for the HPCC Systems ML Library

  • Using the GNN Bundle with TensorFlow to train a model to find known faces *

*   Projects suggested by students themselves
