Lead DevOps transformation initiatives by building resilient multi-cloud infrastructure, automating deployments at scale, and driving platform reliability for enterprise SaaS products.
San Francisco, USA
Full-time
Cloud & Infrastructure
Responsibilities
Architect and maintain multi-cloud CI/CD pipelines supporting hundreds of microservices across AWS and GCP
Lead infrastructure reliability improvements targeting 99.99% uptime SLAs for production systems
Design and implement zero-downtime deployment strategies including blue-green and canary releases
Mentor junior engineers on DevOps best practices, IaC patterns, and incident response procedures
Drive cloud cost optimization initiatives, including automated rightsizing recommendations
Establish and maintain infrastructure security guardrails using policy-as-code tools such as OPA and Kyverno
Requirements
6-9 years of DevOps or infrastructure engineering experience in high-scale production environments
Deep expertise in AWS or GCP, including multi-account (AWS) or multi-project (GCP) governance and networking
Advanced Kubernetes knowledge including custom operators, admission controllers, and cluster autoscaling
Strong experience with Terraform at scale, including module design, state management, and CI integration
Proven track record of reducing MTTR and improving deployment frequency in enterprise settings
Excellent communication skills for collaborating with engineering leadership and cross-functional teams
Nice to Have
Experience with platform engineering and internal developer portals such as Backstage
Background in distributed systems observability with OpenTelemetry