Provide a standard HPCC Systems ECL math library

This project was completed by Everett Matthew Upchurch Butler, an undergraduate studying for a BS in Information Technology. Matt completed this project as an intern in 2018. He joined us earlier than most, in January of that year, and finished his internship before the summer started. Most students join over the summer months, but we can be flexible to accommodate study schedules and commitments.

Find out about the HPCC Systems Summer Internship Program.

Project Description

A standard Math library would greatly expand the ECL language. Mathematics is stable: under normal circumstances its definitions do not change, so a library built on them stays valid over time. It is also a critical component of nearly every aspect of society and is therefore useful in countless applications. ECL has already shown a massive improvement over its predecessors, and this project would widen that gap further in terms of completeness and efficiency. For these reasons, a standard Math library should be created. Specifically, the functions involving probability distributions need to be implemented. The probability distributions discussed in this proposal cover the vast majority of those considered useful and will go a long way toward extending the capabilities of ECL.

If you are interested in this project, please contact John Holt.

Completion of this project involves:

    1. FUNCTION to add Beta Distribution

    2. FUNCTION to add Binomial Distribution

    3. FUNCTION to add Non-central Chi square Distribution

    4. FUNCTION to add F Distribution

    5. FUNCTION to add Non-central F Distribution

    6. FUNCTION to add Gamma Distribution

    7. FUNCTION to add Negative Binomial Distribution

    8. FUNCTION to add Poisson Distribution

    9. Test to determine accuracy and speed of PDFs

    10. Documentation regarding the sources of these approximations

    11. FUNCTION to estimate parameters of the PDFs

    12. FUNCTION to calculate the inverse of the PDFs
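To make the scope of these items more concrete, the following is a minimal sketch of what one distribution might involve, using the Poisson distribution as the example. It is written in Python rather than ECL purely for illustration, and the function names (`poisson_pmf`, `poisson_cdf`, `poisson_quantile`) are hypothetical; they are not part of ML_Core or any existing HPCC Systems library. The mass function is evaluated in log space, one common way to avoid overflow for large counts.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), evaluated in log space."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if k < 0:
        return 0.0
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    # log P = k*log(lam) - lam - log(k!), with log(k!) = lgamma(k + 1)
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k), obtained by summing the mass function."""
    if k < 0:
        return 0.0
    total = math.fsum(poisson_pmf(i, lam) for i in range(k + 1))
    return min(1.0, total)


def poisson_quantile(p: float, lam: float) -> int:
    """Inverse CDF: the smallest k such that P(X <= k) >= p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    k = 0
    total = poisson_pmf(0, lam)
    while total < p:
        k += 1
        term = poisson_pmf(k, lam)
        # Guard: stop once the tail terms underflow to zero.
        if term == 0.0 and k > lam:
            break
        total += term
    return k


if __name__ == "__main__":
    print(poisson_pmf(2, 2.0))
    print(poisson_cdf(3, 2.0))
    print(poisson_quantile(0.5, 2.0))  # 2
```

A simple accuracy test in the spirit of item 9 could compare values such as these against a trusted reference table, and a speed test could time repeated evaluations over a large dataset.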

Wish list

  • FUNCTION to find the best fitting distribution given a random dataset

  • FUNCTION to add Trigonometric Functions

  • FUNCTION to add Logarithmic Functions

  • FUNCTION to add additional Probability Distributions
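As an illustration of the best-fit item in the wish list, one possible approach is to estimate the parameters of each candidate distribution from the data and then compare log-likelihoods. The sketch below is in Python, not ECL, and makes several assumptions: it considers only two candidates (Poisson and the geometric distribution, which is the negative binomial with one required success), both on the support 0, 1, 2, ..., and all function names are hypothetical. Because both candidates have a single parameter, their maximized log-likelihoods can be compared directly.

```python
import math
from typing import Dict, Sequence, Tuple


def poisson_loglik(data: Sequence[int], lam: float) -> float:
    """Log-likelihood of count data under Poisson(lam)."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in data)


def geometric_loglik(data: Sequence[int], p: float) -> float:
    """Log-likelihood under P(X = k) = (1 - p)^k * p, k = 0, 1, 2, ..."""
    return sum(k * math.log1p(-p) + math.log(p) for k in data)


def best_fit(data: Sequence[int]) -> Tuple[str, Dict[str, float]]:
    """Return the name and estimated parameters of the better candidate."""
    if not data:
        raise ValueError("data must not be empty")
    mean = sum(data) / len(data)
    if mean <= 0:
        raise ValueError("sample mean must be positive")
    lam = mean              # maximum-likelihood estimate for Poisson
    p = 1.0 / (1.0 + mean)  # maximum-likelihood estimate for geometric
    scores = {
        "poisson": (poisson_loglik(data, lam), {"lam": lam}),
        "geometric": (geometric_loglik(data, p), {"p": p}),
    }
    name = max(scores, key=lambda n: scores[n][0])
    return name, scores[name][1]


if __name__ == "__main__":
    clustered = [2, 3, 2, 1, 2, 3, 2, 2]
    print(best_fit(clustered))
```

A fuller version would cover more candidates and, where they differ in parameter count, would need a penalized criterion rather than the raw log-likelihood.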

By the midterm review, we would expect you to:

  • Establish familiarity and agreement with your mentor on project goals, implementation timeline, and action plan. Assess the current efficiency and structure of the existing probability distributions. There are currently three in the ML_Core repo: Normal, Chi squared, and Student's t. Develop and design the workflow from data gathering to final implementation and testing.

  • Perform further in-depth research on the different Probability Distributions. Decide which Probability Distributions should be included; priority should be based on perceived usefulness. Design the PDF framework. Additional Distributions can be added later, once the framework has been created. Begin searching for algorithms that are not restricted by intellectual property rights.

  • Continue gathering the necessary mathematical algorithms. A period of two weeks is dedicated to this task because it is likely to be difficult to find algorithms that are not protected by intellectual property rights. Documentation of the sources of these approximations is critical here.

  • Create a pseudocode implementation. This step is critical because copying an existing implementation of an algorithm from another language is generally prohibited.

Mentor

Roger Dev
Contact Details

Backup Mentor: John Holt
Contact Details

Skills needed
  • Ability to build and test the HPCC system (guidance will be provided).

  • Ability to write test code. Knowledge of ECL is not a requirement, since it should be possible to re-use existing code with minimal changes for this purpose. Links are provided below to our ECL training documentation and online courses should you wish to become familiar with the ECL language.

Deliverables

Midterm

  • See above

End of project

  • See above

Other resources
