Swaraj Sanjay Somanache is a student at the RV College of Engineering, Bangalore, India.
He is in his 3rd year pursuing a degree in Computer Science and Engineering.
Concrete is one of the most frequently used building materials, and its quality is determined by its compressive strength. This is measured by loading a concrete cube or cylinder until it starts cracking and is eventually crushed. The pressure at which the specimen fails is called the concrete compressive strength and is measured in megapascals (MPa). Testing this way takes 28 days. Distributed machine learning allows an engineer to estimate the strength of a concrete mix in just a few seconds.
In this proposal, we intend to use a distributed platform, the HPCC Systems platform, to create a model that predicts the compressive strength of concrete.
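As a worked illustration of the measurement described above, compressive strength is the failure load divided by the loaded cross-sectional area. The short sketch below assumes a cube specimen; the function name and the sample load and cube size are hypothetical and are not taken from the project data.

```python
def compressive_strength_mpa(failure_load_kn: float, side_mm: float) -> float:
    """Return the compressive strength in MPa for a cube specimen.

    Strength = failure load / loaded cross-sectional area.
    1 MPa equals 1 N/mm^2, so the load is converted from kN to N
    and the area is expressed in mm^2.
    """
    if side_mm <= 0:
        raise ValueError("side_mm must be positive")
    area_mm2 = side_mm * side_mm
    return failure_load_kn * 1000.0 / area_mm2


# Hypothetical test result: a 150 mm cube that fails at 675 kN.
strength = compressive_strength_mpa(675.0, 150.0)
print(f"{strength:.1f} MPa")
```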
The goals for this project were:
- To predict the compressive strength of concrete using a Random Forest Regressor model.
- To compare the time the HPCC cluster needs to train the model with the time needed by Python libraries.
- To compare accuracy metrics (MSE, R2 score, and RMSE) between the HPCC platform and Python libraries.
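The three accuracy metrics named in the goals can be computed as in the minimal sketch below. It uses plain Python and the standard definitions of MSE, RMSE, and R2; the sample values are made up for illustration and are not results from this project.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target (MPa here)."""
    return math.sqrt(mse(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up strengths in MPa, for illustration only.
y_true = [30.0, 40.0, 50.0]
y_pred = [32.0, 38.0, 50.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred), r2_score(y_true, y_pred))
```

Running the same three functions on the predictions from each platform gives directly comparable numbers.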
For training and testing the model, we used the public data set available on Kaggle, “Concrete Compressive Strength Data Set” by Ahiale Darlington. The data is multivariate: concrete compressive strength is a highly nonlinear function of age and ingredients. These ingredients include cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, and fine aggregate.
Random Forest is a popular supervised machine learning algorithm. It is used for both classification and regression problems. It is based on ensemble learning, which combines multiple models (decision trees, in this case) to solve a complex problem and improve the performance of the overall model.
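The ensemble idea can be sketched in a few lines of plain Python. This is a simplified illustration, not the HPCC Systems implementation: it trains one-split regression trees (stumps) on bootstrap samples and averages their predictions, and it omits the per-split feature subsampling that a full Random Forest adds. All function names and the toy rows (cement content, age in days) are hypothetical.

```python
import random


def fit_stump(rows, targets):
    """Fit a one-split regression tree that minimises squared error."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            err = sum((t - left_mean) ** 2 for t in left)
            err += sum((t - right_mean) ** 2 for t in right)
            if best is None or err < best[0]:
                best = (err, f, threshold, left_mean, right_mean)
    if best is None:
        # No valid split (all sampled rows identical): predict the mean.
        mean = sum(targets) / len(targets)
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_mean, right_mean = stump
    return left_mean if row[f] <= threshold else right_mean


def fit_bagged_ensemble(rows, targets, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([rows[i] for i in idx],
                                  [targets[i] for i in idx]))
    return ensemble


def predict_ensemble(ensemble, row):
    """Average the predictions of all members of the ensemble."""
    return sum(predict_stump(s, row) for s in ensemble) / len(ensemble)


# Toy data, for illustration only: [cement, age in days] -> strength in MPa.
rows = [[200, 7], [200, 28], [300, 7], [300, 28], [400, 7], [400, 28]]
targets = [15.0, 25.0, 22.0, 35.0, 30.0, 48.0]
ensemble = fit_bagged_ensemble(rows, targets)
print(predict_ensemble(ensemble, [350, 28]))
```

Averaging many weak models trained on different resamples reduces the variance of any single tree, which is the main reason the ensemble tends to outperform its individual members.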
We used BoostedRegForest from the LearningTrees bundle of the HPCC Systems Machine Learning Library for our problem statement.
We trained the model on the above-mentioned dataset to predict the compressive strength of concrete in MPa.
Training the model and predicting the target variable took 36.881 seconds on the HPCC cluster, compared with 72 seconds using ensemble learning libraries in Python. Metrics such as MSE, R2 score, and RMSE will be analyzed.
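The two timings above can be turned into a speedup factor, as in the sketch below. The `time_call` helper is a hypothetical illustration of how a training call might be timed on the Python side; it is not the measurement code used for the poster.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Timings reported above, in seconds.
hpcc_seconds = 36.881
python_seconds = 72.0

speedup = python_seconds / hpcc_seconds
print(f"speedup: {speedup:.2f}x")
```

With the reported numbers, the HPCC cluster run is a little under twice as fast as the Python run.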
In this Video Recording, Swaraj provides a tour and explanation of his poster content.