
The proposal period for 2022 internships is now closed
The proposal period for 2023 internships will open in November 2022

This is a new project; more information is coming soon. If you are interested in this project, contact Lorraine Chapman.

Find out about the HPCC Systems Summer Internship Program.

Project Description

With the new NLP++ plugin for HPCC Systems, one of the tasks is to update the English dictionary that resides in the VisualText repository. The original dictionary came from WordNet, which is somewhat out of date. The English dictionary needs more linguistic information and new vocabulary. This will require creative thinking and an interest in coming up with novel ideas to implement enhancements.

If you are interested in this project, please contact the mentor, David de Hilster (david.dehilster@lexisnexisrisk.com).

To complete this project, you will:

  • Become familiar with the current NLP++ dictionary at: https://github.com/VisualText/dict-en-us
  • Research online English dictionaries, repositories, and websites for finding enhancements
  • Identify the best sources for enhancement
  • Implement the enhancement using VisualText
  • Run tests using the NLP++ Plugin in ECL to show enhancements
  • Merge enhancement into the NLP++ English dictionary repository
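
As a rough illustration of the "identify the best sources" step above, the following Python sketch compares a candidate word source against the current dictionary and reports what the candidate would add. The line format used here (`word key=value ...`), the function names, and the sample words are assumptions made for illustration only; check the actual file format in the dict-en-us repository before adapting it.

```python
def parse_entries(lines):
    """Parse hypothetical 'word key=value ...' lines into {word: {key: value}}.

    Assumption: one single-word entry per line, attributes separated by
    whitespace. Multi-word entries are not handled by this sketch.
    """
    entries = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        word, *attrs = line.split()
        info = entries.setdefault(word.lower(), {})
        for attr in attrs:
            key, sep, value = attr.partition("=")
            if sep:  # keep only well-formed key=value attributes
                info[key] = value
    return entries


def coverage_report(current, candidate):
    """Summarize what a candidate source would add to the current dictionary."""
    # Words the candidate has that the current dictionary lacks entirely.
    new_words = sorted(set(candidate) - set(current))
    # Shared words for which the candidate supplies at least one new attribute.
    enriched = sorted(
        word
        for word in set(candidate) & set(current)
        if set(candidate[word]) - set(current[word])
    )
    return {"new_words": new_words, "enriched_words": enriched}


# Made-up sample data, not taken from the real dictionary.
current = parse_entries(["run pos=v", "dog pos=n", "blog"])
candidate = parse_entries(["run pos=v", "blog pos=n", "selfie pos=n"])
report = coverage_report(current, candidate)
print(report)
```

Running a report like this over several candidate sources gives a simple way to rank them by new vocabulary and by added linguistic information before committing to one.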

By the midterm review, we would expect you to have:

  • A clear idea of which enhancements to implement and how to implement them

Mentor

David de Hilster
david.dehilster@lexisnexisrisk.com 

Backup Mentor: TBA
 

Skills needed
  • Ability to do research on the internet
  • Ability to learn and program in NLP++
  • Ability to write test code in ECL using the NLP++ plugin to test the enhanced dictionary
Deliverables

Midterm

  • Identified the best sources for enhancement and a plan to add them to the current dictionary.

End of project

Other resources