This project is already taken and is no longer available for the 2023 HPCC Systems Intern Program.
If you are interested in this project, contact David Dehilster.
Find out about the HPCC Systems Summer Internship Program.
Project Description
With the new NLP++ plugin for HPCC Systems, one of the tasks is to update the English dictionary that resides in the VisualText repository. The original dictionary came from WordNet, which is somewhat out of date. The English dictionary needs more linguistic information and new vocabulary. This will require creative thinking and an interest in coming up with novel ideas for implementing enhancements.
Completion of this project involves:
- Become familiar with the current NLP++ dictionary at: https://github.com/VisualText/dict-en-us
- Research online English dictionaries, repositories, and websites to find possible enhancements
- Identify the best sources for enhancement
- Implement the enhancement using VisualText
- Run tests using the NLP++ Plugin in ECL to show enhancements
- Merge enhancement into the NLP++ English dictionary repository
By the midterm review, we would expect you to have:
- A good idea as to what and how enhancements are to be implemented
Mentor |
Skills needed |
Deliverables | Midterm; End of project
Other resources |