Site Reliability Engineer (SRE) Sunderland - Hybrid
Position
Site Reliability Engineer (SRE) | Sunderland (Hybrid) | Full-time
Responsibilities
System Reliability & Availability Hero: Guardian of uptime, ensuring critical systems are always available and meet SLAs. Lead incident management and investigate root causes.
Monitoring & Alerting Maestro: Set up and maintain monitoring systems (Dynatrace). Create alerting that preemptively detects problems and define key metrics for system health.
Incident Response Ace: Resolve incidents quickly to minimize downtime. Conduct root cause analysis to prevent recurrence.
Automation Whizz: Automate repetitive tasks using Terraform, Git, and TeamCity; build efficient CI/CD pipelines.
Capacity Planning Pro: Scale systems to meet demand, optimize resource usage, forecast future needs.
Performance Optimiser: Tune databases, improve response times, run load and stress tests to handle peak periods.
Infrastructure Guru: Manage AWS cloud resources, ensure scalability, cost‑effectiveness, resilience, and develop disaster recovery plans.
Collaboration King/Queen: Work closely with development teams, embed reliability into new features, champion service ownership, provide feedback for operational improvement.
Security & Compliance Captain: Integrate security best practices, ensure adherence to regulations, protect production environments.
Documentation Dynamo: Produce clear, concise documentation for infrastructure, procedures, and runbooks.
Continuous Improvement Enthusiast: Seek new technologies and improved practices to enhance reliability, performance, and efficiency.
If you are an experienced SRE who thrives on building reliable, scalable, and efficient systems and enjoys a collaborative environment, we would like to hear from you.
- Location: Sunderland
- Job Type: Full-time