CloudBees provides the leading software delivery platform for enterprises, enabling them to continuously innovate, compete, and win in a world powered by the digital experience. Designed for the world's largest organizations with the most complex requirements, CloudBees enables software development organizations to deliver scalable, compliant, governed, and secure software from the code a developer writes to the people who use it. The platform connects with other best-of-breed tools, improves the developer experience, and enables organizations to bring digital innovation to life continuously, adapt quickly, and unlock business outcomes that create market leaders and disruptors.
CloudBees was founded in 2010 and is backed by Goldman Sachs, Morgan Stanley, Bridgepoint Credit, HSBC, Golub Capital, Delta-v Capital, Matrix Partners, and Lightspeed Venture Partners. Visit www.cloudbees.com and follow us on Twitter, LinkedIn, and Facebook.
About the role
As a Staff Site Reliability Engineer, you will play a pivotal role in maintaining and enhancing the reliability, scalability, and performance of our systems. You will collaborate closely with cross-functional teams, including software engineers, DevOps engineers, and product managers, to design, implement, and optimize the infrastructure that powers our applications. Your expertise will contribute to achieving high availability and seamless user experiences for our customers.
What You’ll Do
- Lead efforts to design, implement, and manage highly available, scalable, and fault-tolerant systems and services.
- Drive the automation of processes, deployments, monitoring, and incident response to improve efficiency and reliability.
- Collaborate with development teams to ensure the architecture and applications are designed with scalability, reliability, and cost in mind.
- Develop and maintain monitoring, alerting, and logging solutions to proactively identify and address performance issues and outages.
- Participate regularly in a follow-the-sun on-call rotation, responding to incidents, conducting post-incident reviews, and contributing to incident response improvements.
- Analyze system performance data, identify bottlenecks, and recommend solutions to optimize performance and resource utilization.
- Contribute to the design and implementation of disaster recovery strategies and backup solutions.
- Mentor and provide guidance to junior SREs and other team members, fostering a culture of continuous learning and improvement.
- Stay current with industry trends, emerging technologies, and best practices to drive innovation and improvements in system reliability.
Qualifications
- Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
- Bachelor's degree in Computer Science, Engineering, or related field (or equivalent practical experience).
- 10+ years of experience in a Site Reliability Engineering or similar role, with a proven track record of managing complex systems in a production environment.
- Proficiency in programming/scripting languages such as Go, Python, or similar.
- Strong experience with cloud platforms (e.g., AWS, Google Cloud, Azure) and infrastructure-as-code tools (e.g., Terraform, CloudFormation).
- Solid knowledge of networking concepts, including load balancing, DNS, routing, and security.
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog).
- Strong problem-solving skills and the ability to troubleshoot complex issues under pressure.
- Excellent communication and collaboration skills to work effectively across teams.
- Experience with CI/CD pipelines and version control systems (e.g., Jenkins, GitHub Actions).
- Relevant certifications (e.g., AWS Certified DevOps Engineer, Google Professional DevOps Engineer) are a plus.
- Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
- Passion for reliability, demonstrated through participation in architectural design.
We’re invested in you!
We offer generous paid time off to give our employees time to rest, recharge, and be present with family and friends throughout the year.
At CloudBees, we truly believe that the more diverse we are, the better we serve our customers. A global community like Jenkins demands a global focus from CloudBees. Organizations with greater diversity—gender, racial, ethnic, and global—are stronger partners to their customers. Whether by creating more innovative products, or better understanding our worldwide customers, or establishing a stronger cross-section of cultural leadership skills, diversity strengthens all aspects of the CloudBees organization.
In the technology industry, diversity creates a competitive advantage. CloudBees customers demand technologies from us that solve their software development problems, and therefore their business problems, so that they can better serve their own customers. CloudBees attributes much of its success to its worldwide workforce and commitment to global diversity, which opens our proprietary software to innovative ideas from anywhere. Along the way, we have witnessed firsthand how employees, partners, and customers with diverse perspectives and experiences contribute to creative problem-solving and better solutions for our customers and their businesses.
Please be aware that there are individuals and organizations that may attempt to scam job seekers by offering fraudulent employment opportunities in the name of CloudBees. These scams may involve fake job postings, unsolicited emails, or messages claiming to be from our recruiters or hiring managers.
Please note that CloudBees will never ask for any personal account information, such as cell phone account details, credit card details, or bank account numbers, during the recruitment process. Additionally, CloudBees will never send you a check for any equipment prior to employment. All communication from our recruiters and hiring managers will come from official company email addresses (@cloudbees.com) and will never ask the job seeker to make a payment, pay a fee, or make a purchase.
If you are contacted by anyone claiming to represent CloudBees and you are unsure of their authenticity, please do not provide any personal/financial information and contact us immediately at firstname.lastname@example.org.
We take these matters very seriously and will work to ensure that any fraudulent activity is reported and dealt with appropriately.
Some signs of a recruitment scam:
- The sender's email address has other domains before or after @cloudbees.com. For example: “name.dr.cloudbees.com”
- Documents contain poor spelling and grammar, which is often a sign that fraudsters are at work.
- You are given a generic email address, such as @Yahoo or @Hotmail, as a point of contact.
- You are asked for money, an “administration fee”, “security fee” or an “accreditation fee”.
- You are asked for cell phone account information.
- You are asked to cash a check for “equipment” prior to start.
- You are offered a job immediately or without an interview.