Whitepages is hiring a Platform & IT Lead to support and grow a small team of mid-level engineers responsible for our production platform and internal IT systems. This is a working lead role: you will contribute directly to infrastructure engineering, provide technical direction, and help the team build good habits and strong fundamentals.
This role is perfect for an engineer who loves diving into the work, solving real operational problems, and consistently raising the engineering bar, rather than leading from a distance.
What You’ll Do
Platform Engineering (Majority of the Work)
- Design, operate, and improve the cloud infrastructure behind our products, leveraging automation and AI-assisted tooling to increase reliability and reduce manual effort.
- Work closely with engineering to ensure systems are secure, reliable, observable, cost-efficient, and increasingly self-diagnosing using modern telemetry, anomaly detection, and predictive insights.
- Lead platform projects: improving deployments, modernizing systems, automating operations, reducing technical debt, and introducing AI-enhanced workflows where appropriate (e.g., automated runbook generation, log analysis, predictive scaling, or drift detection).
- Guide the team through issues they may not yet have the experience to solve independently, using AI tools to accelerate troubleshooting while teaching engineers how to interpret and validate machine-generated recommendations.
- Provide clear documentation, reference architectures, and examples the team can follow.
- Participate in on-call and ensure the team has strong incident response practices by incorporating AI-supported alert triage, root cause summarization, and context-rich incident reports.
IT & Internal Operations (Supporting Function)
- Oversee core internal IT: identity management, devices, office networking, and SaaS tools, while implementing AI-assisted monitoring, automated workflows, and security posture checks to maintain a clean, well-managed environment.
- Support the team in troubleshooting and maintaining reliable internal systems using AI/ML where beneficial (e.g., automated device baselines, configuration anomaly detection, or Tier-0 helpdesk automation).
- Maintain a simple, secure, well-managed environment for employees, with increasing levels of automation and smart alerting.
Operational Excellence & Standards
- Set practical expectations for uptime, security, monitoring, and operational hygiene, incorporating ML-based anomaly detection, predictive capacity planning, and intelligent alert routing to reduce noise and improve response times.
- Help the team adopt consistent processes for deployment, alerting, and incident reviews, streamlining workflows with automation, templates, and AI-powered quality checks where appropriate.
- Teach and model strong fundamentals: clear commits, documentation, tests, and ownership, while also guiding the team in responsible, safe use of AI tools (e.g., code generation, log insights, documentation assistance).
- Ensure monitoring and capacity planning are proactive rather than reactive, using data-driven insights, forecasting, and automated reporting.
Team Leadership (Mid-level Team Context)
- Provide day-to-day technical mentorship and coaching, especially in areas where the team lacks senior depth, including helping engineers leverage AI tools effectively while maintaining strong judgment and architectural rigor.
- Review designs and code to help the team make solid, predictable choices, using AI-assisted analysis as a supplement—not a substitute—for thoughtful engineering review.
- Give clear, timely feedback and help engineers develop stronger instincts, including prudent use of automation and AI.
- Create clarity around priorities, breaking down work so the team can progress steadily.
- Address performance or teamwork issues early and constructively, ensuring communication and collaboration remain high-quality even with increasing automation.
- Hire thoughtfully, focusing on strengthening the team’s core competencies and valuing candidates who demonstrate strong fundamentals and the ability to leverage modern tooling responsibly.
Cost & Vendor Management
- Own AWS and IT-related budgets, using AI-supported cost-visibility tools and forecasting models to keep spend predictable and identify optimization opportunities.
- Evaluate tools and vendors—particularly AI platforms, observability solutions, and automation frameworks—through the lens of long-term value, reliability, and simplicity.
- Manage vendor relationships and contracts as needed, ensuring service quality and cost alignment with company needs.
Qualifications
- Education: Applicable degree or equivalent experience.
- Extensive hands-on experience with Kubernetes, Linux systems, AWS infrastructure, and networking fundamentals.
- A track record of running and improving large/complex production systems in a small-team context.
- Comfort mentoring mid-level engineers and helping them develop good engineering habits.
- Ability to balance hands-on work with guiding others — you know when to delegate and when to step in.
- Solid communication skills and a calm, professional approach to operational challenges.
- Experience with internal IT systems or willingness to build that competency.
Whitepages is a nimble, empowered team of ~48 with a big impact. Our remote-first, hybrid work model allows flexibility while keeping teams well-connected. Seattle-based employees enjoy a beautifully designed downtown office with all the perks and work-from-home opportunities. This role requires Tuesdays in the office, as well as a designated in-office week each quarter.
Salary consists of base pay and participation in a quarterly and annual bonus based on the performance of the company. The range for this role is $150k - $200k (plus bonus opportunity), depending on experience.
Whitepages is proud to be an equal opportunity employer and is committed to creating a diverse, inclusive, and equitable workplace. We value authenticity and welcome people of all backgrounds to join our mission.