Director, Cloud Infrastructure (US)
Fully Remote
Description

AEM (Advanced Environmental Monitoring) is the global leader in innovative, mission-critical weather, wildfire, and water monitoring and intelligence solutions. We aim to be the world’s essential source for environmental insights – enabling decisive action and positive outcomes for our customers and their constituents. Our family of innovators offers world-class hydrometeorological technologies and services, including sensors, dataloggers, telemetry, and advanced analytics and software. Our technology and services empower communities and organizations to survive – and thrive – in the face of escalating environmental risks.


We are seeking a Director of Cloud Operations to lead and optimize the availability, security, scalability, and modernization of our multi-region, multi-vendor cloud infrastructure. This role is crucial in ensuring the reliability and performance of our SaaS products, primarily on AWS, with additional services running in Microsoft Azure.


As a key leader, you will manage a team of Systems Administrators, Database Administrators (DBAs), and Network Operations Center (NOC) personnel, fostering operational excellence in a 24x7 high-availability environment. You will drive automation, monitoring, incident response, cost optimization, and security compliance while collaborating across departments to align cloud operations with business goals.


Key Responsibilities:

  • Lead the day-to-day operations of our SaaS cloud infrastructure, ensuring high availability and reliability.
  • Oversee and mentor a team of Systems Engineers, DBAs, and NOC staff, ensuring effective incident management, troubleshooting, and operational improvements.
  • Manage monitoring and observability tools to identify and resolve system bottlenecks, improving performance and uptime.
  • Optimize AWS and Azure cloud environments, ensuring scalability and cost efficiency while meeting service level objectives.
  • Supervise the 24x7 Network Operations Center (NOC) to ensure proactive response to infrastructure issues.
  • Ensure secure cloud operations by implementing and enforcing security policies aligned with SOC 2 controls, ISO 27001 and other compliance standards.
  • Lead incident response efforts, triaging critical issues and working with engineering teams to resolve production-impacting events.
  • Manage CI/CD pipelines, configuration management, and infrastructure automation to support rapid software delivery.
  • Collaborate with cross-functional teams to support software development, security, and IT infrastructure initiatives.
  • Develop and maintain audit-ready documentation for security compliance and operational excellence.
  • Analyze and optimize cloud resource costs, ensuring efficient budget utilization.
  • Track and report on key performance indicators (KPIs), driving continuous improvement.
  • Ensure effective post-deployment support and operational readiness for new releases.
  • Be available for on-call support as needed.

This job description may not be inclusive of all assigned duties, responsibilities, or aspects of the job described, and may be amended at any time at the sole discretion of the Employer. 

Requirements
  • Bachelor’s degree in a related field or equivalent experience.
  • 5+ years of hands-on experience managing cloud operations, infrastructure, and IT teams in a high-availability SaaS environment.
  • Proven experience leading a team of 5+ IT professionals, including Systems Engineering, Network Engineering, Database Administration, and NOC functions.
  • Deep expertise in AWS (primary) and experience with Microsoft Azure.
  • Experience managing large-scale, high-volume data processing environments with strict SLAs.
  • Strong background in infrastructure automation, CI/CD pipelines, and configuration management.
  • Experience implementing and maintaining IT security policies and compliance frameworks (e.g., ISO 27001, SOC 2).
  • Track record of optimizing cloud spend and resource allocation.
  • Familiarity with incident management, disaster recovery, and business continuity planning.
  • Experience supporting Agile/Scrum software development teams.
  • Strong communication and leadership skills, with the ability to interact with executive stakeholders, technical teams, and external vendors.
  • A recent AWS Associate-level certification is required.

Preferred Qualifications:

  • Experience with containers and container orchestration (e.g., Docker, Kubernetes).
  • Knowledge of infrastructure-as-code tools (e.g., Terraform, CloudFormation, Ansible).
  • Prior experience with Site Reliability Engineering (SRE) methodologies.
  • AWS Professional-level certification.

Additional Information:

  • This is a remote opportunity that can be done from anywhere in the continental United States and/or Canada.
  • Must be eligible to work in the United States without company sponsorship, now or in the future, for employment-based work authorization. F-1 visa holders with Optional Practical Training (OPT) who will require H-1B status, TN visa holders, and current H-1B visa holders will not be considered. H-1B and green card sponsorship are not available for this position.

US Benefits include: Medical, Dental, Vision, Life Insurance, Short-Term and Long-Term Disability, and a 401(k) match of up to 3%.


US Compensation Range: A reasonable estimate of the current salary range for this position is $150,000.00 - $210,000.00 per year. Please note that the salary information is a general guideline only. When extending an offer, AEM considers a wide range of factors, including (but not limited to) the scope and responsibilities of the position; the candidate's work experience, education, licensure and certifications, and key skills; and other market and business considerations. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled.


This position will accept applications on an ongoing basis and will be closed once the position is filled.

AEM is an Equal Opportunity Employer.