Natural Language Processing Specialist (R-00080531)
The National Solutions Operation is seeking an experienced Natural Language Processing (NLP) / Human Language Technology (HLT) Specialist to support the HLT initiatives of a Department of Defense customer in the National Capital Region. NLP Specialists will support a program performing Document and Media Exploitation (DOMEX) operations and are expected to leverage language, analytical, and programming skills.
Must currently possess a TS/SCI clearance. The candidate will be required to pass a polygraph and subject interview before starting on the project.
Required Skills and Experience:
- Proficiency in one or more programming languages, such as Python, R, Java, or Scala, and the knowledge to apply those skills to automate processes involving multilingual data
- Experience working on large data sets and machine learning models
- Intellectual curiosity: strong personal initiative to pursue learning opportunities and expand knowledge and skill sets
- Knowledge of core and emerging NLP techniques
- Ability to adapt to new technologies and grasp concepts quickly
- Proactive problem-solving skills, ability to work in teams and independently
- Excellent communication skills and the ability to convey technical concepts to non-technical audiences in multiple formats
Desired Skills and Experience:
- Formal education in linguistics, computational linguistics, and/or computer programming
- Experience working with ML/AI products and techniques pertaining to NLP, such as topic modelling, document classification, named entity recognition, and machine translation
- Familiarity with machine learning libraries (e.g., TensorFlow, PyTorch, or scikit-learn) and NLP libraries (e.g., spaCy, Hugging Face)
- Knowledge and understanding of pre-trained language models like BERT and GPT
- Experience with extract, transform, and load (ETL) processes, including manipulating and cleaning diverse data types (e.g., parsing XML documents and/or working with JSON)
- Knowledge of translation management processes, translation data formats (e.g., TMX, TBX) and conversion processes
- Familiarity with Computer Assisted Translation (CAT) tools, workflows, and data formats