Toxicity Detection Platform Integrated with HPCC Systems Cloud and GitOps

This project was completed by a student accepted onto the 2021 HPCC Systems Intern Program.

Student work experience opportunities also exist for students who want to suggest their own project idea. Project suggestions must be relevant to HPCC Systems and of benefit to our open source community. 

Find out about the HPCC Systems Summer Internship Program.

Project Description

Azure Arc is similar to Google Anthos, which addresses multi-cloud and hybrid cloud management issues (see https://docs.microsoft.com/en-us/azure/azure-arc/kubernetes/overview). This project aims to explore the following (wish list):

  • Study general web application integration with HPCC Systems Cloud

  • Research and design Toxicity Detection Model using HPCC Systems GNN Bundle

  • Deploy applications and apply configuration by using GitOps-based configuration management.
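
As a rough illustration of the GitOps idea in the last item, the sketch below compares a desired state (as it might be declared in a Git repository) with the live state of a cluster and plans the actions needed to bring them in line. The resource names, the dictionary layout, and the plan_sync function are hypothetical and chosen only for illustration; they are not part of HPCC Systems, Azure Arc, or any GitOps tool.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: resource name -> configuration mapping.
State = Dict[str, dict]


def plan_sync(desired: State, live: State) -> List[Tuple[str, str]]:
    """Return the (action, resource) steps that would make live match desired."""
    actions: List[Tuple[str, str]] = []
    # Anything declared in Git but missing or different in the cluster
    # must be created or updated.
    for name in sorted(desired):
        if name not in live:
            actions.append(("create", name))
        elif desired[name] != live[name]:
            actions.append(("update", name))
    # Anything running in the cluster but no longer declared in Git is removed.
    for name in sorted(live):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Illustrative data only; names and versions are made up.
desired = {
    "hpcc-cluster": {"replicas": 3},
    "toxicity-web": {"image": "toxicity-web:1.1"},
}
live = {
    "toxicity-web": {"image": "toxicity-web:1.0"},
    "old-job": {"image": "legacy:0.9"},
}

for action, name in plan_sync(desired, live):
    print(action, name)
```

In a real GitOps setup, an agent running in the cluster would perform this comparison continuously against the repository, so that a commit to the configuration files is what triggers a deployment change.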

If you are interested in this project, please contact the mentors listed below.

Completion of this project involves:

  • Creating a GitHub project for deployment and configuration files

  • Creating an HPCC Systems Cloud cluster with GNN set up

  • Implementing a Toxicity Detection Model application using HPCC Systems Cloud as the backend

By the midterm review, we would expect you to have:

  • A working HPCC Systems Cloud on Azure with GNN and basic Toxicity Detection Model training

Mentor

Foreman, Robert (RIS-BCT) <Robert.Foreman@lexisnexisrisk.com>

Backup Mentor: Xiaoming Wang (RIS-BCT) <Xiaoming.Wang@lexisnexis.com>

Skills needed

  • General Cloud Environment knowledge including Docker, Kubernetes

  • Azure Kubernetes Service (AKS) 

  • Windows PowerShell, Unix shell, Python

  • Ability to build and test the HPCC Systems platform (guidance will be provided).

  • Ability to write test code. Knowledge of ECL is not a requirement, since it should be possible to reuse existing code with minimal changes for this purpose. Links to our ECL training documentation and online courses are provided below, should you wish to become familiar with the ECL language.

Deliverables

Midterm

  • A working HPCC Systems Cloud on Azure with GNN and basic Toxicity Detection Model training

End of project

  • A complete Toxicity Detection web application backed by HPCC Systems Cloud

Other resources

All pages in this wiki are subject to our site usage guidelines.