Site Reliability Engineer
Own SLOs, incident response, and resilient architectures for production systems.
Role responsibilities
You will blend software and systems thinking to keep services dependable, working with error budgets, game days, and a blameless culture.
- Define and monitor SLOs/SLIs; drive error budget policies and use them to prioritize reliability work.
- Lead incident command, postmortems, and action-item tracking to prevent recurrence.
- Improve reliability through automation, load testing, and chaos exercises.
- Collaborate with product and platform on capacity and graceful degradation.