SITE RELIABILITY ENGINEER JOBS

The following job description is a snapshot of one of our successful placements.

For our latest career opportunities, please visit our JOB BOARD.


SITE RELIABILITY ENGINEER (SRE)

Are you an experienced and proactive Site Reliability Engineer? Join our team and play a critical role in ensuring the reliability, performance, and scalability of our systems and services. As an SRE, you'll contribute to the design, building, and maintenance of our infrastructure while implementing best practices for availability, incident response, and system reliability. Your expertise in systems engineering, automation, and monitoring will be pivotal in ensuring the optimal performance of our applications.


Responsibilities:


  • Collaborate closely with development, operations, and other teams to understand system architecture and requirements.
  • Design and implement robust, scalable, and resilient infrastructure solutions using cloud platforms (e.g., AWS, Azure, GCP).
  • Develop and maintain monitoring, alerting, and observability tools to ensure proactive identification of system issues.
  • Implement and optimise CI/CD pipelines and deployment processes.
  • Automate routine operational tasks using scripting and configuration management tools.
  • Implement disaster recovery and business continuity plans to ensure high availability.
  • Collaborate with cross-functional teams to conduct incident response and post-incident analysis.
  • Identify areas for improvement in system architecture, performance, and reliability.
  • Stay up to date with industry trends, emerging technologies, and best practices in site reliability engineering.
  • Provide support for on-call rotations to address system incidents and ensure system availability.


Qualifications:


  • Bachelor's degree in Computer Science, Information Technology, or a related field. Relevant work experience can be considered.
  • Proven experience as a Site Reliability Engineer or in a similar role, with a strong understanding of systems engineering and automation.
  • Proficiency in cloud technologies such as AWS, Azure, or GCP.
  • Experience with infrastructure as code tools such as Terraform, Ansible, or equivalent.
  • Familiarity with continuous integration and continuous deployment (CI/CD) practices.
  • Strong scripting skills in languages such as Python, Bash, or PowerShell.
  • Knowledge of containerisation technologies (e.g., Docker) and orchestration tools (e.g., Kubernetes).
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack).
  • Understanding of security best practices and measures for cloud-based applications.
  • Excellent problem-solving skills and the ability to troubleshoot complex issues.
  • Strong communication skills to collaborate with cross-functional teams.
  • Ability to work both independently and collaboratively in a team environment.


Join our team as a Site Reliability Engineer (SRE) and play a critical role in ensuring the reliability, performance, and scalability of our systems and services. Apply now to utilise your expertise in systems engineering, automation, and monitoring to ensure optimal system performance.




This is an example of one of the expert roles the team at S2M have placed previously. If you are looking for Site Reliability Engineer job opportunities in Sydney, Brisbane, Melbourne or any other Australian city, simply register your resume with S2M.


OPPORTUNITIES ACROSS THE TECHNOLOGY SECTOR

If this is not quite the role you are looking for, we have expertise across the Media, Digital, Sales & Communications and Design & Product sectors.
