Senior Site Reliability Engineer (Application / API Focused)


Salary: £100,000 per annum + 25% Bonus + Excellent Benefits

We are hiring a Senior SRE to support a large-scale digital organisation undergoing a major commercial re-platforming across web and mobile channels.

This role sits much closer to the application layer than traditional infrastructure SRE positions. You will work directly with product and engineering teams across customer-facing platforms (web, mobile, payment journeys, APIs) to improve reliability, resilience, and service behaviour in production.

This is not a ticket-driven operational role and not a pure platform engineering post. It is about embedding measurable reliability into distributed systems at service level.

What You’ll Be Doing

- Embed SRE practices across API and microservices-based architectures
- Define and own meaningful SLIs/SLOs aligned to customer journeys and business-critical flows
- Improve service reliability through proactive observability, tracing, telemetry and alert tuning
- Partner closely with backend and platform engineers to reduce systemic failure modes
- Lead and contribute to incident response, post-incident reviews and resilience improvements
- Move the organisation from symptom-based alerting to customer-impact driven diagnostics
- Contribute to release safety, progressive deployments and production guardrails

What We’re Looking For

- Proven experience operating as an SRE within digital product environments
- Strong understanding of API architectures, microservices and distributed systems behaviour
- Hands-on experience defining and implementing SLIs, SLOs and error budgets
- Deep observability exposure (e.g. Datadog, Splunk, Prometheus, tracing/APM platforms)
- Experience working closely with application engineering teams, not just infrastructure teams
- Background in high-availability, customer-facing systems where outages have commercial impact
- Cloud-native exposure (AWS preferred) with practical understanding of Kubernetes environments

Important

This role is best suited to engineers who care deeply about production behaviour, customer experience in failure scenarios, and reliability as a first-class product feature, rather than engineers focused purely on infrastructure provisioning or CI/CD enablement.

Please get in touch with Benjamin Applewhaite to discuss the role in confidence.

Location: Greater London
Job Type: Full-time
