Principal Engineer
Fully Remote
Description


MSI Data is embarking on a transformative initiative to better serve customers with advanced Artificial Intelligence across our entire application suite. We are seeking a Principal Engineer to be the primary technical builder behind various initiatives.


While the strategy is being defined, we need an expert practitioner to execute it. This is a high-impact, individual contributor (IC) role for an engineer who loves to code and build complex systems. You will not be managing people; you will be managing code, coding agents, app performance, and production implementation. You will work at the intersection of complex algorithms, agentic workflows, and production-grade software engineering to bring our AI-powered product line to life.


What We Are Looking For

  • Experience: 10+ years of software engineering experience, with a heavy emphasis on backend systems. At least 3 years of hands-on coding experience specifically with AI orchestration and integrating LLMs into production apps.
  • Execution Focus: You are an IC at heart. You prefer opening an IDE to opening a spreadsheet. You want to be measured by the code you ship, not the size of the team you manage.
  • Engineering Mastery: You are a polyglot or have deep expertise in Python/TypeScript. You understand asynchronous programming, memory management, and how to write clean, maintainable code.
  • AI Proficiency: Deep, hands-on experience implementing solutions with OpenAI/Anthropic APIs, LangChain/LangGraph, and Vector Databases (Pinecone/Milvus). You understand the "gotchas" of these tools (hallucinations, context limits, rate limits).
  • Data Engineering: Ability to build the data pipelines required to feed the AI. You are comfortable writing complex SQL and handling unstructured data.
  • Scale: Experience building high-throughput systems. You know how to cache results, manage database connections, and keep the system running when traffic spikes.

Why Join MSI Data?

  • Builder’s Paradise: This is a "greenfield" coding opportunity. You get to build the core engines of a new product line without dealing with years of legacy technical debt in the AI layer.
  • High Impact: Your code will be the foundation of the product. The efficiency and intelligence of our application will be a direct result of your engineering skills.
  • Technical Autonomy: While you won't manage people, you will have high autonomy over how features are implemented, the libraries used, and the code structure.

Requirements


1. Hands-on Execution & Implementation

  • Primary Builder: Act as the lead developer for major features included in MSI’s ML & AI initiatives. You will be responsible for writing the core logic, integrating APIs, and building the "guts" of multiple systems.
  • Rapid Prototyping to Production: Quickly move from proof-of-concept Python scripts and vibe-coded demos to robust, production-ready code (Python/TypeScript). You bridge the gap between "it works in a notebook" and "it works at scale."
  • Code Quality & Review: Set the standard for code quality. You will conduct deep code reviews and implement CI/CD and testing pipelines specifically designed for AI workflows (online and offline evals, regression testing for prompts, cost and latency budget management, etc.).

2. Core Systems Engineering

  • Building Agents: Write the logic for sophisticated, autonomous AI agents, implementing tool-use (function calling) and managing conversation state/context windows programmatically.
  • Feature Development: Implement major features in MSI Data’s AI application using REST APIs, React, Drizzle, PostgreSQL, and Next.js.
  • Latency Optimization: Take ownership of the "speed" of the application. Manage Core Web Vitals against an SLA, optimize inference calls, manage asynchronous tasks, and implement streaming responses (WebSockets/SSE) to ensure a fluid user experience.

3. Integration & Reliability

  • System Integration: Connect AI services with our existing application suite. You will design and implement the internal APIs that allow our existing systems to talk to new AI microservices.
  • Reliability Engineering: Implement robust error handling for non-deterministic AI outputs. Build retry logic, fallback mechanisms, and guardrails to ensure the AI behaves predictably in production.
  • Observability: Instrument the code to provide attribution for token usage, cost per request, and response quality, ensuring we have full visibility into the system's performance.
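
To give a concrete sense of the reliability work described above, here is a minimal Python sketch of retry logic, a fallback chain, and an output guardrail. It is illustrative only and not MSI Data's actual implementation: the names `validate_reply`, `call_with_fallback`, and `GuardrailError`, the JSON reply shape with an "answer" field, and the idea of passing providers as plain callables are all assumptions made for this example.

```python
import json
import time
from typing import Callable, Optional, Sequence


class GuardrailError(ValueError):
    """Raised when a model reply fails validation (hypothetical name)."""


def validate_reply(raw: str) -> dict:
    """Guardrail: accept only a JSON object with a non-empty string 'answer'.

    The expected reply shape is an assumption for this sketch.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise GuardrailError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise GuardrailError("reply is not a JSON object")
    answer = data.get("answer")
    if not isinstance(answer, str) or not answer.strip():
        raise GuardrailError("reply has a missing or empty 'answer' field")
    return data


def call_with_fallback(
    prompt: str,
    providers: Sequence[Callable[[str], str]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Try each provider in order, retrying with exponential backoff.

    Each provider is a hypothetical callable that takes a prompt and returns
    the raw model text. A reply that fails the guardrail is treated like a
    transient error and retried.
    """
    last_error: Optional[Exception] = None
    for provider in providers:
        for attempt in range(max_attempts):
            try:
                return validate_reply(provider(prompt))
            except (GuardrailError, TimeoutError, ConnectionError) as exc:
                last_error = exc
                # Back off before the next attempt on the same provider;
                # skip the wait when moving on to the next provider.
                if attempt < max_attempts - 1:
                    sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all providers failed") from last_error
```

In practice the provider callables would wrap real model clients, and the `sleep` parameter is injectable so that tests can run without waiting.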