HPCC Systems Storage Support With Container Storage Interface (CSI)

This project is already taken and is no longer available for the 2023 HPCC Systems Intern Program.

This project is available as a student work experience opportunity with HPCC Systems. Curious about other projects we are offering? Take a look at our Ideas List.

Student work experience opportunities also exist for students who want to suggest their own project idea. Project suggestions must be relevant to HPCC Systems and of benefit to our open source community. 

Find out about the HPCC Systems Summer Internship Program.

Project Description

The objective of this project is to provide Helm chart examples for CSI driver support on various cloud providers and storage types, for example Azure File, AWS EFS and AWS FSx for Lustre.

While adding support for CSI, you will need to consider three possible variations of each scenario:

1) Automatically create the CSI PV/PVC when HPCC Systems starts. The life cycle is in sync with the HPCC Systems cluster, which means that when the HPCC Systems cluster is destroyed, the PV/PVC is automatically deleted.

2) Create the CSI PV/PVC with a Helm chart before starting HPCC Systems. The PV/PVC stays alive after the HPCC Systems cluster is destroyed, but deleting the Kubernetes cluster destroys the PV/PVC, so its life cycle is that of the Kubernetes cluster itself.

3) Create the CSI PV/PVC with a Helm chart before starting HPCC Systems. In this case the PV/PVC can persist even if the Kubernetes cluster is destroyed, so its life cycle extends beyond the Kubernetes cluster.

A persistent CSI PV/PVC can be reused by subsequent Kubernetes and HPCC Systems clusters, which is very useful for data that is processed once but then used by various later applications.
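As a rough illustration of variation 3), the sketch below builds a statically provisioned PV/PVC pair that points at an EFS file system created outside Kubernetes. It is not taken from the HPCC Systems Helm chart: the function name, the resource names, the size and the file system ID are all hypothetical placeholders. It assumes the AWS EFS CSI driver (driver name efs.csi.aws.com) is already installed in the cluster.

```python
import json


def efs_static_volume(name, file_system_id, size="5Gi", namespace="default"):
    """Build PV and PVC manifests (as dicts) for a pre-existing EFS file system.

    All names and values here are illustrative assumptions, not HPCC Systems
    chart values.
    """
    pv_name = f"{name}-pv"
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": pv_name},
        "spec": {
            # Kubernetes requires a capacity value even though EFS is elastic.
            "capacity": {"storage": size},
            "volumeMode": "Filesystem",
            "accessModes": ["ReadWriteMany"],
            # Retain: deleting the PVC or PV leaves the data in place.
            "persistentVolumeReclaimPolicy": "Retain",
            # Empty class name: bind statically, no dynamic provisioning.
            "storageClassName": "",
            "csi": {
                "driver": "efs.csi.aws.com",
                # The EFS file system ID, created outside Kubernetes.
                "volumeHandle": file_system_id,
            },
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": f"{name}-pvc", "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",
            # Bind explicitly to the PV defined above.
            "volumeName": pv_name,
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


if __name__ == "__main__":
    # "fs-0123456789abcdef0" is a placeholder file system ID.
    pv, pvc = efs_static_volume("hpcc-data", "fs-0123456789abcdef0")
    manifest = {"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl apply -f, since kubectl accepts JSON manifests as well as YAML. Because the file system itself lives in AWS and the PV only references it by ID, a new Kubernetes cluster can recreate the same PV/PVC and see the existing data.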

Please note that Azure File CSI support has already been incorporated into HPCC Systems, so this project will focus on AWS EFS and FSx.

The examples in HPCC-Platform/helm/examples/azure (master branch of hpcc-systems/HPCC-Platform on github.com) implement all three cases for Azure File.

Students can also use the current Azure File and EFS examples as a starting point and adapt them to the CSI driver.

Note as well that the HPCC Systems Helm chart currently has basic EFS CSI driver support for AWS; however, AWS has recently updated the configuration and documentation of this driver.

The example in HPCC-Platform/helm/examples/efs (master branch of hpcc-systems/HPCC-Platform on github.com) implements case 2).

Azure and AWS accounts will be provided for the student as part of this project.

Completion of this project involves:

By the midterm review, we would expect you to have:

  • Understood the CSI driver and completed the basic AWS EFS case 3) with a CSI implementation for HPCC Systems cluster deployment

Mentor

Xiaoming Wang
Xiaoming.Wang@lexisnexisrisk.com

Backup Mentor: Godson Fortil
Godji.Fortil@lexisnexisrisk.com

Skills needed
  • General Cloud Environment knowledge

  • AWS EC2, Client API (shell), S3, Docker, Jenkins, Packer

  • Unix Shell, Python

  • Ability to build and test the HPCC Systems platform (guidance will be provided).

  • Ability to write test code. Knowledge of ECL is not a requirement, since it should be possible to reuse existing code with minimal changes for this purpose. Links are provided below to our ECL training documentation and online courses should you wish to become familiar with the ECL language.

Deliverables

Midterm

AWS EFS Helm Example

End of project

AWS EFS and FSx Helm Examples. Documentation (README.md) and a PPT for the HPCC Platform Development team, plus a Tech Talk Presentation.

Other resources
