THECOO

Site Reliability Engineer

Salary: 7 million - 12 million JPY (700万 - 1200万)
Location: Tokyo
Remote: On-site / hybrid
Visa: Sponsorship available
Language: Japanese (conversational), English (business level)
Posted: Feb 6, 2025
Skills: Python, JavaScript, TypeScript, Kubernetes, AWS

Overview

About Us

At THECOO, we're leveraging cutting-edge technology to build Fanicon, an innovative and interactive fan community platform. Launched in December 2017, Fanicon is a subscription-based, premium fan community platform where fans can engage directly with their favorite artists and personalities (“icons”), who are directly involved in content design and creation. Fanicon has grown continuously since its launch. As of March 2022, Fanicon hosted over 2,300 communities and over 180,000 fans.

We're currently looking for a Site Reliability Engineer (SRE) to join our team and help us scale. This is an opportunity to grow with a team of experts passionate about building robust, scalable, and efficient systems. We want an SRE who is interested in making a big impact, both by collaborating closely with internal stakeholders (dev teams, product managers) and by demonstrating technical leadership (sharing technical knowledge, proposing new ideas, and participating in design discussions). We are also looking for someone friendly and easy to get along with: we have a very welcoming and positive engineering culture, and we value that highly!

Responsibilities

  • The SRE team is looking for a software engineer who can keep systems that handle heavy traffic stable. The role involves automating systems, responding to system failures, and doing development work that improves the reliability, performance, and scalability of those systems.
  • Specific responsibilities include the following:
      • Contribute optimizations to the backend codebase
      • Assist in designing, implementing, and maintaining our multi-cloud infrastructure, specifically:
          • GCP: Cloud Run, Networking, GKE, Cloud SQL
          • AWS: IVS and CloudFront
      • Work with Docker containers and orchestration tools (knowledge of Kubernetes is a plus)
      • Use Terraform for infrastructure-as-code deployments and tool configuration
      • Use CircleCI for continuous integration and delivery pipelines
      • Fine-tune application performance monitoring with Datadog, proactively identifying and resolving issues as part of a department-wide on-call rotation
      • Own production releases together with development teams
      • Participate in incident management, post-mortem analysis, and system optimization
      • Develop and maintain documentation on system configurations, operations, and troubleshooting procedures
      • Fine-tune database deployments (MySQL with InnoDB)

Required Skills

  • Minimum Qualifications
      • Bachelor's degree in Computer Science or equivalent practical experience
      • Ability to work independently, learn quickly, and proactively collaborate
      • Strong written and verbal communication skills
      • Strong problem-solving and troubleshooting skills
      • Ability to work calmly under pressure (such as during a production outage)
      • Experience with at least one public cloud platform
      • Experience with at least one programming language (preferably Python, Golang, or JavaScript/TypeScript)
      • Familiarity with Linux administration and shell scripting

Preferred Skills

  • Experience with monitoring and infrastructure tooling (Datadog, PagerDuty, Terraform, CircleCI)
  • GCP design and administration experience
  • Understanding of networking principles and protocols
  • Ability to write clear, concise, and informative documentation
  • Experience in developing and operating large-scale web applications
  • Experience collaborating with product managers and designers
  • Bonus Points!
      • AWS design and administration experience
      • Kubernetes design and administration experience
      • Experience with Docker, containerization, and microservice architectures
      • Experience with Stripe or other payment processors
      • Experience designing, implementing, and maintaining deployment tooling (canary, blue/green)
      • Strong communication skills in Japanese, both written and verbal
      • Passion for efficiency, scalability, and technology in general
  • Language Requirements
      • English: Business level
      • Japanese: Daily conversational level
