ZaloPay – Principal Site Reliability Engineer (SRE)

Job Expired
  • Implementing SRE automation, developing automation across the stack, and reducing operational hours by cutting manual work.
  • Eliminating toil through automation across all layers: infrastructure provisioning, configuration management, deployment, testing, and operations, on premises and in public clouds (Google Cloud and AWS).
  • Retooling our infrastructure to provide an agile, cloud-based foundation with a common infrastructure management and automation framework.
  • Interfacing directly with senior staff members within the organization to discuss and assess compliance with IT policies, standards and procedures, suggest opportunities for improvement, and report on the status of specific.
  • Working with development teams throughout the software life cycle to ensure sustainable software releases.
  • Practicing sustainable incident response and blameless postmortems
  • Helping to build a methodology for managing infrastructure and platform costs
  • Training junior SRE members
  • Managing a small SRE team (4-6 members) to drive automation, scalability, high availability and performance of ZaloPay


  • Bachelor’s degree with five or more years of work experience.
  • Six or more years of SRE-relevant work experience.
  • Experience in systems architecture; in-depth knowledge of SRE, IT operations, and cloud; coding and scripting experience with Golang, Java, and Python; and experience with automation tools such as Terraform and Ansible.
  • Strong experience with Google Cloud and AWS environments, with working knowledge of standard cloud services, features and tools, and certification in appropriate areas.
  • Strong experience automating the provisioning of dependency software on premises.
  • Experience building disaster recovery solutions is preferred.



  • Five or more years of experience working with middleware technologies such as Kafka/RabbitMQ, Spring Boot, Redis, Elasticsearch, MySQL, and etcd.
  • Automation experience and the ability to code or script at an advanced level.
  • Experience in cloud and container platform strategy, design, architecture and migration.
  • Experience designing and implementing CI/CD DevOps solutions with Jenkins pipelines, using Python, Git, Shell, YAML, Kubernetes and Docker.
  • Configuration Management experience with Chef, Puppet, Ansible or Python.
  • Experience serving as both a mentor and advocate for your team.
  • Experience performing analytics on previous incidents and usage patterns to better predict issues and take proactive actions.

Company Information
  • Full Address: Z06, Street No. 13, Tân Thuận Đông Ward, District 7, Ho Chi Minh City.

Job Portal, Internships, Skills of International University – Vietnam National University Ho Chi Minh City.
