Senior Site Reliability Engineer in Manchester


Energy Jobline is the largest and fastest-growing global energy job board and energy hub. We reach an audience of over 7 million energy professionals, advertise 400,000+ global energy and engineering jobs each month, and work with the leading energy companies worldwide.
We focus on the Oil & Gas, Renewables, Engineering, Power, and Nuclear markets as well as emerging technologies in EV, Battery, and Fusion. We are committed to ensuring that we offer the most exciting career opportunities from around the world for our jobseekers.
Job Description
Who is our client?
Our client is a leading payment orchestration platform that resolves numerous costly issues for merchants navigating disparate, fast-moving, and fragmented payment service providers. The company is Series A funded and is now focused on Series B and on creating and expanding its Manchester hub. The team is growing and is looking for a Senior Site Reliability Engineer to join it.
What is our client looking for?
You will be the guardian of our production environment, responsible for its health, performance, and scalability. You will apply software engineering principles to solve operational problems, automate everything, and ensure our platform exceeds the reliability expectations of our customers.
Responsibilities:
Architect & Automate: Design, build, and maintain our core infrastructure using Infrastructure as Code (IaC) principles. You will be instrumental in evolving our CI/CD pipelines to ensure safe, rapid, and reliable releases.
Enhance Reliability & Scalability: Proactively identify and address performance bottlenecks, single points of failure, and scalability limits. You will define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to maintain and improve platform health.
Champion Observability: Implement and manage comprehensive monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK Stack) to provide deep insights into system behavior and ensure rapid incident detection.
Lead Incident Management: Participate in our on-call rotation, acting as a key player in incident response and resolution. You will lead blameless post-mortems to identify root causes and implement preventative measures.
Collaborate & Empower: Work closely with software engineering teams to foster a culture of reliability. You will provide guidance on building resilient services, implementing best practices for observability, and improving the developer experience.
Secure the Foundation: Implement and maintain security best practices across our cloud infrastructure, ensuring our platform is robust and compliant.
If this sounds like you, get in touch. We would love to hear from you.
***Please note: we are recruiting on behalf of our client. We will host a screening call with you and, should you be successful, you will then enter a three-stage interview process with our client.***
If you are interested in applying for this job please press the Apply Button and follow the application process. Energy Jobline wishes you the very best of luck in your next career move.
Location:
Manchester
Job Type:
Full-time
Category:
Engineer, Reliability Engineer, Reliability, Senior, Engineering, Site
