Senior Software Engineer, Data Platform
Fully Remote Philadelphia
Description

We're looking for a Senior Software Engineer, Data Platform, to join our quest to provide market-transforming solutions for businesses, care teams, and consumers to interactively manage health and care.  


 At Medecision, each person contributes uniquely to our mission and our ability to raise the level of experience we provide to all our customers and colleagues. The Senior Software Engineer on the Data Platform team plays a pivotal role in designing, building, and evolving the Data Platform — the multi-tenant, cloud-native data backbone that powers clinical analytics, population management, and reporting across Medecision's health plan customers. This includes owning data ingestion pipelines, standardization and curation workflows, analytics data delivery, and the infrastructure that makes it all run reliably at scale. They also serve as a key practitioner in Medecision's AI-native SDLC, leveraging Claude Code and AI agents to accelerate development, code review, and documentation at scale. 


What We're Looking For: 

The ideal candidate for the role of Senior Software Engineer, Data Platform, will demonstrate a true passion for building reliable, scalable, and secure data pipelines and platform services in a regulated healthcare environment. They will share our passion for driving improvements in healthcare through better data. 

 

The Senior Software Engineer must have the ability to perform at the highest levels, often partnering cross-functionally with Solution Architects, Tech Leads, Data Engineers, Business Analysts, and QA Engineers to ensure that complex, data-intensive features are delivered with optimal quality, reliability, and clinical correctness. Critically, they embrace AI as an amplifier, not a shortcut, and apply it with judgment, rigor, and accountability.


  • Reports to: VP/Director, Engineering
  • Location(s): Remote, US
Requirements

What You'll Do:


 Core Engineering Responsibilities 

  • Design, develop, and maintain production-grade Python and Java data services and pipelines deployed on Google Cloud Platform, following established architectural conventions, coding standards, and data platform patterns. 
  • Build and evolve Google Cloud Dataflow batch and streaming pipelines for data ingestion, standardization, curation, and analytics load, handling deduplication, validation, member-reference integrity, and incremental/full-reload modes.
  • Implement and maintain event-driven data workflows using Google Cloud Pub/Sub, including file-complete notification topics, FHIR ingest topics, and Pub/Sub → BigQuery export subscriptions.
  • Design and manage BigQuery datasets, table schemas, partitioning strategies (range-bucket by member partition), clustering, and reporting views — across standard analytics, custom analytics, and curated storage layers. 
  • Build and maintain Cloud Composer (Apache Airflow) DAGs for workflow orchestration — including file ingestion DAGs, test execution DAGs, and third-party processing DAGs (e.g., MEG). 
  • Develop and maintain Cloud Run microservices (e.g., Ingestion Event Service, Custom Dataset Management Service) and Cloud Functions (inbound GCS bucket triggers). 
  • Participate in design and code reviews; mentor junior engineers and contribute to shared coding standards, patterns, and the team knowledge base.
  • Collaborate with on-shore and off-shore teams, architects, and tech leads to ensure on-time delivery and best engineering practices. 
  • Contribute to CI/CD pipeline improvements using GitLab CI/CD, including build, test, containerization (Docker), and deployment automation to GCP environments. 
  • Engage proactively in the triage and resolution of escalated production issues — diagnosing failures, investigating root causes, and driving durable fixes with a sense of urgency, clear communication to stakeholders, and a commitment to preventing recurrence. 
  • Follow and comply with all security policies and procedures established by the organization, including adherence to HIPAA and HITRUST regulations.
  • Applicants must be authorized to work for any employer in the US. This position does not offer sponsorship for employment visas. 

 Data Platform Domain Focus 

 Medecision's unified Data Platform is a multi-tenant, cloud-native system that ingests, standardizes, curates, and delivers clinical data (Member, Provider, Claims, Lab Results, Rx, and custom datasets) to power analytics, population management, and reporting for health plan customers. Key domain responsibilities include: 


  • Own and evolve the end-to-end data ingestion pipeline: from SFTP/GCS file receipt through ingestion, standardization, curation, and load to BigQuery analytics datasets. 
  • Design and implement custom dataset ingestion capabilities — including dynamic schema mapping, configuration-driven pipeline execution, and automatic BigQuery table/view provisioning on dataset activation. 
  • Maintain and improve the Ingestion Event Service (IES), the platform's central tracking service for data ingestion progress, including Pub/Sub async processing and Firestore document management.
  • Implement FHIR R4 data ingestion workflows via Pub/Sub and GCP Healthcare Datasets, including XSD/schema validation and HL7 resource mapping. 
  • Support population matching data flows, ensuring custom dataset filters integrate correctly with population builder services and BigQuery analytics queries. 
  • Build and maintain reporting dataset views (BI-compatible BigQuery views) for tenant data exposure. 
  • Integrate with third-party clinical analytics engines via GCE VM instance templates, Airflow DAGs, and GCS-based data exchange. 
  • Ensure all data services meet HIPAA requirements: PHI handling, tenant data isolation, audit logging, and data classification.

 AI-Native Delivery (Required) 

 Medecision operates an AI-native SDLC. The Senior Software Engineer is expected to be a practitioner — not just a beneficiary — of AI-assisted delivery:  

  • Demonstrate a solid understanding of AI concepts, capabilities, and limitations as they apply to software engineering workflows, including code generation, test scaffolding, and documentation. 
  • Use Claude Code as a primary productivity tool for code drafting, refactoring, test generation, and technical documentation — applying it with judgment, rigor, and accountability. 
  • Leverage AI-assisted workflows to accelerate implementation, surface edge cases, generate structured artifacts, and conduct at-scale analysis of service dependencies and API contracts. 
  • Contribute to building and exposing MCP-wrapped APIs that enable AI agents to safely interact with platform services. 
  • Maintain strict HIPAA discipline in all AI-assisted work: no real PHI in prompts or AI-generated artifacts; adhere to managed-settings policies and complete mandatory HIPAA + AI training. 
  • Contribute to the team's shared AI knowledge base (validated prompts, skills, and workflows) and participate in the AI Champions community of practice.

What You'll Bring: 

 Required  

  • Bachelor's degree in Computer Science, Software Engineering, Data Engineering, or equivalent practical experience. 
  • 5+ years of data engineering or backend software engineering experience building production data pipelines and platform services. 
  • Proven hands-on experience with Google Cloud Platform data services: BigQuery (schema design, partitioning, clustering, query optimization), Cloud Storage, Cloud Pub/Sub, Cloud Dataflow (Apache Beam), Cloud Composer (Airflow), Cloud Run, Cloud Functions, Firestore, Cloud SQL (PostgreSQL), and Secret Manager. 
  • Strong proficiency in Java for data pipeline development.
  • Proficiency in Python for Airflow DAG authoring and automation scripting.
  • Experience designing and implementing batch and streaming data pipelines — including file-based ingestion, event-driven processing, deduplication, validation, incremental load, and full-reload patterns. 
  • Proficiency with BigQuery data modeling: partitioned and clustered tables, dataset organization (standardized, curated, analytics, custom analytics, reporting layers), and SQL query optimization. 
  • Experience with Apache Airflow / Cloud Composer — authoring, deploying, and maintaining production DAGs with parameterized configurations and robust error handling. 
  • Experience with containerization (Docker) and deploying services to cloud-native environments. 
  • Proficiency with GitLab CI/CD for pipeline automation and multi-environment deployment. 
  • Excellent communication skills — able to articulate technical decisions, participate in design reviews, and collaborate effectively with cross-functional teams. 
  • Familiarity with Datadog for service monitoring, alerting, and observability in a cloud-native data platform. 
  • Familiarity with Sisense or equivalent BI/reporting platforms and BigQuery view-based reporting patterns. 

 AI-Native Mindset (Required) 

  • Solid understanding of AI concepts, capabilities, and limitations as they apply to software engineering and product delivery workflows.
  • Hands-on experience with Claude Code or equivalent AI-assisted tools, used as a primary productivity tool for code generation, refactoring, test scaffolding, and documentation, not just experimentally.
  • Ability to evaluate AI-generated code critically: identifying hallucinations, logic errors, security gaps, and missing edge cases before they reach production. 
  • Practical understanding of MCP (Model Context Protocol) or strong willingness to learn — for building tool wrappers that expose platform APIs to AI agents safely and with appropriate guardrails. 
  • Commitment to responsible AI use: applying AI with judgment, rigor, and personal accountability, consistent with the principle that humans own decisions and agents own toil.
  • HIPAA discipline in AI-assisted work: understanding of PHI boundaries in AI workflows and commitment to managed-settings policies and mandatory HIPAA + AI training. 
  • Openness to contributing to and learning from a shared AI knowledge base (validated prompts, skills, and workflows) and active participation in the AI Champions community of practice.

 

Strongly Preferred  

  • Knowledge of HIPAA and experience working in HIPAA-regulated product environments, including PHI handling, data classification, and audit requirements. 
  • Hands-on experience with HAPI FHIR R4 and healthcare interoperability standards (HL7, FHIR resource mapping, validation workflows). 
  • Understanding of multi-tenant SaaS architecture patterns — tenant context propagation, per-tenant feature flags, and data isolation.
Salary Description
$130,000 - $155,000