Atreya joined the HPCC Systems Intern Program in 2021 to improve the HSQL (HPCC Systems Structured Query Language) project. Atreya was a member of the team that created the HSQL project in 2020, as part of an academic collaboration between HPCC Systems and RVCE, under the supervision of Dr Shobha G and Arjuna Chala (Senior Director, Operations, LexisNexis Risk Solutions Group). Atreya's internship involved implementing the following improvements:
- Define an initial syntax set for HSQL
- Provide a working compiler that can convert HSQL to ECL
- Provide a VSCode extension for use with HSQL
As well as the resources included here, read Atreya's intern blog journal, which takes a more in-depth look at his work during his 2021 internship. To learn more about the initial work carried out on this project, view the poster Atreya entered into our 2020 Poster Contest.
Big Data has become an important field, and learning to handle it involves a steep learning curve, especially in distributed systems. HSQL for HPCC Systems is a solution developed to help users get used to the HPCC Systems architecture and ECL (Enterprise Control Language), the language with which it primarily operates. HSQL aims to provide a seamless interface for data science developers working with data. It is designed to work in conjunction with ECL, the primary programming language for HPCC Systems, and should prove easy to work with and robust for general purpose analysis.
HSQL provides a compact, easy to comprehend, SQL-like syntax for performing visualizations, general exploratory data analysis and training of Machine Learning models, while also allowing such programs to have a modular structure. Functions can be written to allow for code reuse. HSQL can also integrate with the VSCode IDE to provide Syntax Highlighting and Code Completion features.
Francisco Ciol Rodrigues Aveiro is studying for a Bachelor of Computer Engineering at INSPER, Sao Paulo, Brazil.
Francisco's course covers both software and hardware development, including artificial intelligence, project design and onboard development. Francisco is working with Alysson Oliveira (Software Engineer, LexisNexis Risk Solutions Group) and is completing a six-month internship in the Brazil office. His project reflects the contribution he will make to the HPCC Systems Platform.
In the current information era, cloud computing has become a necessity because of the amount of computational power needed. Low-cost storage and processing power, combined with ease of access, are among the advantages of such services, which are available as platform as a service (PaaS), software as a service (SaaS), infrastructure as a service (IaaS), and hardware as a service (HaaS). In the IaaS model, payment normally follows a pay-as-you-go policy, where you pay only for what you use. Although unit pricing may be low, misused resources and unnecessary uptime can drive up the cost. To minimize those costs, autoscaling features and containerized applications can be used to keep resource usage under control.
An example of a containerized system with autoscaling capability is Kubernetes, which can be configured to scale the cluster up to what is needed and then return to a minimal state when demand goes down, all in a relatively simple manner. This prevents unnecessary resource usage while still scaling to handle whatever tasks arrive. Another cost associated with a cloud cluster is the number of external IPs needed. Since you pay for each external IP exposed to the web, a full cluster with a unique IP for every service can be expensive. To reduce the number of external IPs used, a single entry point can be configured, which then redirects to each service based on a subdomain or path.
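The routing decision made by a single entry point can be sketched in a few lines of Python. This is only an illustration of subdomain and path based redirection, not the project's implementation: the host names, path prefixes and service names in `ROUTES` are hypothetical placeholders.

```python
from typing import Optional

# Hypothetical routing table: (host, path prefix, internal service name).
# All host names and service names here are illustrative assumptions.
ROUTES = [
    ("eclwatch.example.com", "/", "eclwatch"),
    ("roxie.example.com", "/", "eclqueries"),
    ("example.com", "/eclwatch", "eclwatch"),
    ("example.com", "/roxie", "eclqueries"),
]


def resolve_backend(host: str, path: str) -> Optional[str]:
    """Return the service for a request, preferring the longest matching prefix."""
    best = None
    best_len = -1
    for rule_host, prefix, service in ROUTES:
        if host != rule_host:
            continue
        # A prefix matches on whole path segments only, so "/roxie" matches
        # "/roxie" and "/roxie/query" but not "/roxiex".
        matches = (
            prefix == "/"
            or path == prefix
            or path.startswith(prefix.rstrip("/") + "/")
        )
        if matches and len(prefix) > best_len:
            best, best_len = service, len(prefix)
    return best
```

With this table, every request arrives at one public address and is forwarded internally, so only the entry point needs an external IP.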
Such an implementation also centralizes access management, improving security and governance over the cluster. The objective of this study is to achieve a single entry point for accessing an AWS Kubernetes cluster by configuring Ingress and AWS ALB to manage and redirect user access to the correct service in the cluster. A Helm chart is then created to replicate this structure in other HPCC Systems clusters. With this implementation, an HPCC Systems cluster would use only one external IP for its multiple services, reducing cost and improving security.
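As a rough sketch of what such a configuration might look like, the Python below builds a Kubernetes Ingress manifest with one host rule per service and prints it as JSON, which `kubectl apply -f -` accepts. It assumes the AWS Load Balancer Controller is installed with an IngressClass named `alb`. The host names, service names and ports passed in are assumptions for illustration, not values taken from the poster. In a Helm chart, the same structure would typically live in a template fed from chart values.

```python
import json


def build_ingress(name, routes):
    """Build an Ingress manifest that routes each host to one service.

    routes is a list of (host, service_name, service_port) tuples.
    """
    rules = []
    for host, service, port in routes:
        rules.append({
            "host": host,
            "http": {"paths": [{
                "path": "/",
                "pathType": "Prefix",
                "backend": {
                    "service": {"name": service, "port": {"number": port}},
                },
            }]},
        })
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "annotations": {
                # AWS Load Balancer Controller annotations: expose one
                # public ALB and send traffic straight to pod IPs.
                "alb.ingress.kubernetes.io/scheme": "internet-facing",
                "alb.ingress.kubernetes.io/target-type": "ip",
            },
        },
        # Assumes an IngressClass called "alb" exists in the cluster.
        "spec": {"ingressClassName": "alb", "rules": rules},
    }


if __name__ == "__main__":
    # Hypothetical hosts, services and ports, for illustration only.
    manifest = build_ingress("hpcc-ingress", [
        ("eclwatch.example.com", "eclwatch", 8010),
        ("roxie.example.com", "eclqueries", 8002),
    ])
    print(json.dumps(manifest, indent=2))
```

Because every rule hangs off one Ingress resource, the controller provisions a single load balancer for all listed services rather than one external address per service.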
In this Video Recording, Francisco provides a tour and explanation of his poster content.
HPCC Systems Ingress Configuration with AWS ALB