Site Reliability Engineer (SRE)

Equifax

Salary Not Specified

Equifax, Leeds

  • Full time
  • Permanent
  • Onsite working

Posted 13 May

Closing date: Not specified

Job Ref: 2336332416a74c1b9392be27e1fce7b4

Full Job Description

We have implemented Site Reliability Engineering (SRE) to combine software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. The SRE plays a key role in ensuring that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering principles.

As an SRE, you will be responsible for applying an engineering approach to building and running our production systems - we engineer solutions to operational problems. Our SREs are responsible for overall system operation, and we use a breadth of tools and approaches to solve a broad set of problems. You will be passionate about practices such as limiting time spent on operational work, running blameless postmortems, and proactively identifying and preventing potential outages.

Our SRE culture of diversity, intellectual curiosity, problem solving and openness is key to our success. Equifax brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big, and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to build an environment that provides the support and mentorship needed to learn, grow and take pride in our work.

Don't worry if you haven't used GCP before. We will give you the opportunity to develop your skills, get certified and receive hands-on training, all while influencing and supporting the exciting programme of work during this transition, in collaboration with international colleagues.

You will be joining energised, supportive teams who drive a positive culture and are immersed in leading-edge technology. We develop secure, regulated, scalable technology for both our commercial and consumer customers, with a major focus on keeping it highly available and on delighting our clients by delivering to service level objectives.

  • Engage in and improve the software development lifecycle - from inception and design, through development, deployment, operation and refinement

  • Influence and design infrastructure, architecture, standards and methods for large-scale systems

  • Support services prior to production via infrastructure design, software platform development, load testing, capacity planning and launch reviews

  • Maintain services during deployment and in production by measuring and monitoring key performance and service level indicators including availability, latency, and overall system health

  • Automate system scalability and continually work to improve system resiliency, performance and efficiency

  • Practice sustainable incident response as part of an on-call rotation and through blameless postmortems

  • Remediate tasks within corrective action plans through sustainable, preventative and automated measures whenever possible

  • Potential on-call duty as and when required

  • Analytical and troubleshooting skills

  • Experience developing and/or administering software in public cloud

  • Experience monitoring infrastructure and application uptime and availability to meet functional and performance objectives, using tools such as PagerDuty, Datadog or Grafana

  • Experience with languages such as Python, Bash, Java or Go

  • Cross-functional knowledge of systems, storage, networking, security and databases

  • System administration skills, including automation and orchestration of Linux/Windows with tools such as Puppet and Ansible, and of containers (Docker, Kubernetes, etc.)

  • Proficiency with continuous integration and continuous delivery tooling and practices

  • Expertise in designing, analysing and troubleshooting large-scale distributed systems

  • A systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive

  • Experience managing infrastructure as code with tools such as Terraform or CloudFormation

  • Passion for automation, with a desire to eliminate toil wherever possible

  • You've built software or maintained systems in a highly secure, regulated or compliant industry

  • You thrive in a DevOps culture, with experience of and passion for working in one as part of a team