Site Reliability Engineering - Fully remote work!

Job Details

Location: Japan
Salary: Negotiable
Employment type: Permanent (full-time)
Specialization: IT
Reference: BBBH52754_1727326912

Our client is a Tokyo-based Web3 company that develops and provides an innovative fan platform designed for the Web3 era, aimed at maximizing fan enthusiasm. Since its founding in 2018, the company has been delivering community services built on advanced technologies, particularly blockchain, to major entertainment companies in Japan.

Responsibilities

  • Implement and maintain continuous integration and continuous deployment (CI/CD) pipelines to automate software delivery processes.
  • Monitor system performance, troubleshoot issues, and implement solutions to enhance reliability, performance, and scalability.
  • Design and implement infrastructure as code (IaC) using tools like Terraform, Ansible, or CloudFormation for automated provisioning and configuration management.
  • Manage containerized environments with Docker and orchestration tools such as Kubernetes for deployment, scaling, and management.
  • Implement and maintain monitoring, logging, and alerting systems for proactive issue detection and resolution.
  • Collaborate with cross-functional teams to define and execute disaster recovery and business continuity plans.
  • Ensure compliance with security best practices and industry standards for data protection and system security.
  • Stay updated on emerging technologies and industry trends, evaluating and recommending tools to enhance efficiency and scalability.

Must-Have

  • Minimum of 5 years of experience in a DevOps, SRE, or related role.
  • Hands-on expertise in Google Cloud Platform (GCP) and Google Kubernetes Engine (GKE).
  • Experience with infrastructure automation tools like Terraform, Ansible, or CloudFormation.
  • Proficiency in containerization technologies such as Docker and container orchestration tools like Kubernetes.
  • Familiarity with CI/CD pipelines and tools such as Jenkins, GitHub Actions, and ArgoCD.
  • Strong knowledge of Linux/Unix systems and shell scripting.
  • Experience with monitoring, logging, and alerting tools like Prometheus, Grafana, ELK Stack, or Splunk.
