The project proposal application period for 2020 summer internships is now open. Please see our list of Available Projects. Contact the project mentor for more information and to discuss your ideas. You may suggest a project idea of your own, but it must leverage HPCC Systems in some way. Contact us for support from an HPCC Systems mentor with experience in your chosen project area.
This project was completed by Sarthak Jain as part of the 2015 GSoC Program. Some of the new statistics have already been added to the HPCC Systems® Machine Learning Library, and others will be available as part of the HPCC Systems 6.0.0 release in 2016. Machine learning statistics are important to the big data world: they provide a way to drill down into data using complex queries and produce meaningful results that help businesses maintain their competitive edge in the marketplace. The HPCC Systems® Machine Learning Library has been around for a while now, and we are always looking for ways to improve it.
By the GSoC midterm review, we would expect you to have completed one table in Microsoft Word, for either the Linear or Logistic Regression statistics, comparing the values generated by the code for each statistic. You will also be expected to have created the test code that generates those values (including the dataset used). Your GitHub pull requests for both the test code and the regression module containing the new statistics need to have been accepted by this point.
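As a rough illustration of what "test code that generates those values" might look like, the sketch below computes a few common simple linear regression statistics from a small hard-coded dataset. It is written in Python rather than ECL and does not use the HPCC Systems Machine Learning Library. The dataset, the function name `simple_linear_regression_stats`, and the choice of statistics are all assumptions made for illustration, not the project's actual deliverables.

```python
# Illustrative sketch only: reference values for simple linear regression,
# computed from first principles with no external libraries.
# The function name, dataset, and chosen statistics are hypothetical.

def simple_linear_regression_stats(xs, ys):
    """Fit y = intercept + slope * x by least squares and return basic statistics."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    fitted = [intercept + slope * x for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)

    r_squared = 1.0 - ss_res / ss_tot
    # Adjusted R-squared for a model with one predictor (n - 2 residual degrees of freedom).
    adj_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / (n - 2)

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "adj_r_squared": adj_r_squared,
    }


# Hypothetical dataset, kept alongside the code so the values can be reproduced.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

stats = simple_linear_regression_stats(xs, ys)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Values printed by a script like this could fill one column of the comparison table, with the corresponding values from the regression module in the other.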
Backup mentor: John Holt