
This project is already taken and is no longer available for the 2023 HPCC Systems Intern Program.

If you are interested in this project, contact David Dehilster.

Find out about the HPCC Systems Summer Internship Program.

Project Description

With the new NLP++ plugin for HPCC Systems, one of the tasks is to update the English dictionary that resides in the VisualText repository. The original dictionary came from WordNet, which is somewhat out of date. The English dictionary needs more linguistic information and new vocabulary. This will require creative thinking and an interest in coming up with novel ideas for implementing enhancements.

If you are interested in this project, please contact the mentor, David Dehilster.

Completion of this project involves:

  • Becoming familiar with the current NLP++ dictionary at: https://github.com/VisualText/dict-en-us
  • Researching online English dictionaries, repositories, and websites to find possible enhancements
  • Identifying the best sources for enhancement
  • Implementing the enhancement using VisualText
  • Running tests using the NLP++ Plugin in ECL to demonstrate the enhancements
  • Merging the enhancement into the NLP++ English dictionary repository
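
One rough way to compare candidate sources during the research steps above is to count how many headwords each source would add to the current dictionary. The Python sketch below is only an illustration under stated assumptions: it treats the first whitespace-separated token on each line as the headword and skips lines starting with "#". Neither assumption has been checked against the actual dict-en-us file layout, and the sample entries are hypothetical.

```python
def headwords(lines):
    """Collect lowercase headwords from dictionary-style lines.

    Assumption (not verified against dict-en-us): the first
    whitespace-separated token on each line is the headword, and
    lines starting with "#" are comments.
    """
    words = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        words.add(line.split()[0].lower())
    return words


def new_vocabulary(current_lines, candidate_lines):
    """Return candidate headwords missing from the current dictionary, sorted."""
    return sorted(headwords(candidate_lines) - headwords(current_lines))


# Hypothetical sample data; real input would be read from files.
current = ["abandon pos=v", "book pos=n", "run pos=v"]
candidate = ["book", "selfie", "Run", "podcast"]

missing = new_vocabulary(current, candidate)
print(missing)
```

A count of missing headwords is only a first filter. A source that adds few words but carries richer linguistic information may still be the better choice.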

By the midterm review we would expect you to have:

  • A clear idea of which enhancements will be implemented and how
Mentor

David Dehilster

Skills needed
  • Ability to do research on the internet
  • Ability to learn and program in NLP++
  • Ability to write test code in ECL using the NLP++ plugin to test the enhanced dictionary
Deliverables

Midterm

  • The best sources for enhancement identified, with a plan for adding them to the current dictionary.

End of project

Other resources