Site Reliability Engineer (SRE)

About Us
We are a leading provider of gaming and gambling software solutions with a strong presence in the USA, UK, and Europe.
Through partnerships with global gaming companies, we build cutting-edge technical platforms across sportsbooks, lottery, casino, virtual gaming, and financial trading.
Our vision is to shape the future of gaming by transforming operations into intelligent, data-driven solutions that deliver exceptional customer experiences and create sustainable value for all stakeholders.
We believe in teamwork, knowledge sharing, and transparency with accountability.
The Role
We're looking for a Site Reliability Engineer (SRE) to help shape and drive how we build and operate reliable, observable, and cost-efficient systems.
You'll work closely with development, platform, and incident management teams to define what "reliable" means in measurable terms and build the tooling and processes to achieve it.
Your work will directly influence the speed, stability, and scalability of our platform.
Key Responsibilities
Partner with development teams to define and manage SLOs/SLIs, and use error budgets to guide engineering decisions.
Enhance observability, ensuring metrics, logs, and tracing are in place to detect and fix issues proactively.
Lead cost optimisation initiatives: monitor spend, rightsize workloads, tune autoscaling, and drive efficient infrastructure usage.
Strengthen production readiness with pre-deployment checks, post-release validation, and robust platform guardrails.
Introduce and run chaos engineering experiments to improve system resilience.
Automate operational processes to reduce manual intervention across the stack.
Contribute to major incident response, providing engineering expertise.
Collaborate cross-functionally to raise the bar on platform stability, security, and performance.
Required Skills & Experience
3+ years in SRE, Platform, or DevOps roles.
Strong operational experience with Kubernetes (on-prem and AWS EKS).
Proven track record defining and working with SLOs/SLIs in production environments.
Deep understanding of observability (metrics, logging, tracing, telemetry
Location:
City Of London
Salary:
£75,000
Job Type:
Full-time
Category:
IT
