Sensitive-Data Engineer
Description

Active Top Secret/SCI Clearance with Polygraph (REQUIRED)


Are you passionate about harnessing data to solve some of the nation’s most critical challenges? Do you thrive on innovation, collaboration, and building resilient solutions in complex environments?


Join a high-impact team at the forefront of national security, where your work directly supports mission success. We’re seeking a Data Engineer with a rare mix of curiosity, craftsmanship, and commitment to excellence. In this role, you’ll design and optimize secure, scalable data pipelines while working alongside elite engineers, mission partners, and data experts to unlock actionable insights from diverse datasets.

Requirements
  • Engineer robust, secure, and scalable data pipelines using Apache Spark, Apache Hudi, AWS EMR, and Kubernetes
  • Maintain data provenance and access controls to ensure full lineage and auditability of mission-critical datasets
  • Clean, transform, and condition data using tools such as dbt, Apache NiFi, or Pandas
  • Build and orchestrate repeatable ETL workflows using Apache Airflow, Dagster, or Prefect
  • Develop API connectors for ingesting structured and unstructured data sources
  • Collaborate with data stewards, architects, and mission teams to align on data standards, quality, and integrity
  • Provide advanced database administration for Oracle, PostgreSQL, MongoDB, Elasticsearch, and others
  • Ingest and analyze streaming data using tools like Apache Kafka, AWS Kinesis, or Apache Flink
  • Perform real-time and batch processing on large datasets in secure cloud environments (e.g., AWS GovCloud, C2S)
  • Implement and monitor data quality and validation checks using tools such as Great Expectations or Deequ
  • Work across agile teams using DevSecOps practices to build resilient full-stack solutions with Python, Java, or Scala

Required Skills

  • Experience building and maintaining data pipelines using Apache Spark, Airflow, NiFi, or dbt
  • Proficiency in Python and SQL, plus at least one of Java or Scala
  • Strong understanding of cloud services (especially AWS and GovCloud), including S3, EC2, Lambda, EMR, Glue, Redshift, or Snowflake
  • Hands-on experience with streaming frameworks such as Apache Kafka, Kafka Connect, or Flink
  • Familiarity with data lakehouse formats (e.g., Apache Hudi, Delta Lake, or Iceberg)
  • Experience with NoSQL and RDBMS technologies such as MongoDB, DynamoDB, PostgreSQL, or MySQL
  • Ability to implement and maintain data validation frameworks (e.g., Great Expectations, Deequ)
  • Comfortable working in Linux/Unix environments, using bash scripting, Git, and CI/CD tools
  • Knowledge of containerization and orchestration tools like Docker and Kubernetes
  • Collaborative mindset with experience working in Agile/Scrum environments using Jira, Confluence, and Git-based workflows

Salary Description
$190k-$245k