Cloud Ops Manager
Fully Remote
Job Type
Full-time
Description

Job Summary

The Cloud Ops Manager is a working supervisor responsible for the reliable delivery of CaseWorthy’s software-as-a-service offerings, ensuring their world-class availability, security, and performance. The incumbent manages a team of Cloud Engineers and Administrators who design, build, operate, and maintain the cloud infrastructure and support the innovations of the larger engineering organization.

Responsibilities

  • Manages a team of Cloud Engineers and Administrators.
  • Ensures 24x7 on-call and escalation coverage for cloud environments.
  • Owns monitoring and observability solutions.
  • Fosters a DevOps culture across CaseWorthy R&D teams.
  • Works in tandem with the engineering teams to identify and implement optimal cloud-based solutions for the company.
  • Provides guidance, thought leadership, and mentorship both to the Cloud Ops team and to other development teams to build cloud competencies. 
  • Educates teams on the implementation of new cloud-based initiatives, providing associated training as required. 
  • Monitors and ensures the performance, uptime, and scale of systems and proactively addresses issues. 
  • Troubleshoots incidents, identifies root causes, fixes and documents problems, and implements preventive measures.
  • Operates and manages cloud environments in accordance with company security guidelines. 
  • Consults in system design to meet security, cost, reliability, and capacity requirements. 
  • Manages cloud costs and provides cost projections as needed. 
  • Manages data backup operations and maintains data disaster recovery plans. 
  • Manages deployments to UAT and production environments.  
  • Automates infrastructure and configuration management using CI/CD and infrastructure as code.
  • Travels nationwide up to 10% of the time annually.
  • Performs other duties as assigned.
Requirements

Required Skills & Qualifications

  • Deep understanding of delivering software-as-a-service solutions
  • 5+ years of experience in a systems administration, software engineering, or site reliability role
  • 3+ years of experience involving management and design of infrastructure within AWS or Azure
  • 2+ years of experience supervising employees in an engineering organization
  • Understanding of and experience with the pillars of the AWS or Azure Well-Architected Framework
  • Knowledge of a variety of security domains, such as problem management, security vulnerability assessments, business continuity, security audits and standards, and identity management
  • Experience in hosting web applications, container orchestration, serverless compute, and ETL jobs within AWS or Azure. 
  • Working knowledge of infrastructure-as-code (ARM/Bicep, Terraform, CloudFormation, and/or CDK) and pipeline automation (GitHub Actions, AWS CodeBuild/CodeDeploy, Jenkins, Azure Pipelines, Bitbucket Pipelines, or similar)
  • Knowledge of the NIST SP 800-53, HITRUST, and ISO regulatory frameworks
  • Strong written and verbal communication skills for both technical and non-technical audiences
  • Proven ability and eagerness to learn new technologies and stay current with the field

Preferred Skills & Qualifications

  • AWS and/or Azure certifications
Salary Description
$125,000-$150,000