Job Details


Leidos

High-Performance Computing (HPC) Application Support Engineer

Defense

All

Full Time

On Site

No

Arlington, Virginia, United States

Description

Job Description:

Maritime Systems Division has an immediate opening for a High-Performance Computing (HPC) Application Support Engineer. This is an exciting opportunity to use your skills and experience in the development and integration of a critical HPC environment. As the HPC Application Support Engineer, you will work alongside our government customer ensuring the successful delivery of this vital capability.

Primary Responsibilities

  • Manage, deploy, and support applications on Red Hat Enterprise Linux (RHEL)
  • Work with users to customize applications and development environments to specification
  • Work with the team to define and implement best practices
  • Monitor internally developed applications for impact to system performance and resource utilization
  • Tune applications to optimize performance and reliability of services across the High-Performance Computing (HPC) ecosystem
  • Diagnose application problems quickly and effectively
  • Automate administration procedures for routine and complex tasks
  • Provide backup HPC system administration support
  • Coordinate with vendors to resolve software problems

Basic Qualifications

  • Requires BS and 4 – 8 years of prior relevant experience or Master's with 2 – 6 years of prior relevant experience, and a minimum of 2 years of experience in Linux/UNIX Systems Administration.
  • Experience supporting internally developed applications in C, C++, Java, and Python
  • An equivalent combination of education and experience will be considered.
  • This position requires the ability to obtain and maintain a clearance from the Department of Defense.

Preferred Qualifications

  • Excellent interpersonal/communication skills, and the ability to work as part of a team
  • 5 years of experience supporting HPC applications and development environments on RHEL
  • Certifications: Security+, RHCSA or RHCE
  • Experience troubleshooting application execution through resource managers such as PBS Pro and Slurm
  • Experience with utilities such as Git, Bitbucket, Confluence
  • An understanding of code review, compilers, and debugging tools including Intel Parallel Studio, GCC, GDB, TotalView
  • Experience supporting applications based on CUDA, OpenCL, OpenMPI, OpenMP, IntelMPI
  • Experience using tools such as Nagios, Zabbix, and SNMP to monitor systems and metrics and to create dashboards
  • Ability to develop and maintain programs and scripts that aid in the operation and automation of administrative tasks and workflows using Bash and Python
  • Ability to identify requirements and to define, plan, and implement requisite solutions
  • Ability to plan, organize, prioritize tasks, and complete assigned projects with minimal supervision

Pay Range: