Shizuku AI

MLOps Engineer

Salary
¥8,000,000 – ¥20,000,000
Location
Tokyo
Remote
On-site / hybrid
Visa
Sponsorship available
Language
Japanese: Business Level / English: Fluent
Posted
Apr 10, 2026
Python
C++
TypeScript
Hadoop
OpenStack
Apply now

Review the role details and submit your application.

Apply Now

Overview

MISSION

As the founding MLOps engineer, design and build Shizuku’s ML infrastructure from the ground up. Establish the complete pipeline — from data ingestion through training environments to model serving — creating an internal platform that empowers ML engineers to iterate on models at maximum velocity.

Replace individual, siloed development environments with a unified team-scale ML development platform, maximizing the speed of Shizuku’s evolution.

Responsibilities

  • Design, build, and operate the end-to-end ML training pipeline: data collection/preprocessing → training → evaluation → deployment
  • Design and build GPU training infrastructure on AWS (A100, L4, etc.) with cost optimization
  • Build an internal ML platform for engineers: experiment tracking, model versioning, and reproducibility guarantees
  • Design and build model serving infrastructure: inference APIs, auto-scaling, and latency management
  • Establish training data management and quality assurance pipelines
  • Design and implement CI/CD for ML: automated training, model testing/evaluation, and staged rollouts
  • Drive production integration of models in collaboration with ML Engineer and SWE teams
  • Build monitoring and visibility infrastructure for long-term compute cost and GPU utilization tracking

Required Skills

  • 3+ years of experience designing, building, and operating cloud infrastructure on AWS, GCP, or equivalent platforms
  • Experience building ML/DL pipelines and infrastructure
  • Hands-on experience designing and operating production environments using container technologies (Docker/Kubernetes)
  • Experience managing infrastructure as code (Terraform, Pulumi, etc.)
  • Strong Python skills for building tools and pipelines
  • Ability to work on-site at our Tokyo office (primarily in-office with flexible remote arrangements)
  • Founding Engineer Mentality — You don’t wait for established systems to improve; you define the design philosophy and build the foundation from zero. You’re energized by creating the system itself, not just refining an existing one
  • ML-Literate Infrastructure Engineer — You understand the unique characteristics of ML training and inference workloads, and you translate that understanding into optimally designed infrastructure
  • Purpose-Driven Ownership — You reverse-engineer from “maximizing ML team velocity,” set your own priorities, and drive execution autonomously
  • Comfort with Ambiguity — You design for a world where model count, training frequency, and data volume are still being defined — starting small and scaling architecturally as the picture clarifies
  • Resilience & Respect — You engage as an equal partner with ML Engineers and SWEs, elevating the entire team’s productivity through collaboration

Preferred Skills

  • Experience building, operating, and cost-optimizing GPU clusters (A100, H100, L4, etc.)
  • Experience with ML platforms: SageMaker, Vertex AI, Ray, Kubeflow, etc.
  • Experience deploying and operating experiment tracking infrastructure: MLflow, Weights & Biases, DVC, etc.
  • Experience building model serving infrastructure: Triton Inference Server, TorchServe, vLLM, SGLang, etc.
  • Experience designing and building internal ML development platforms
  • Domain-specific knowledge of ML workloads in speech, NLP, or vision
  • Experience as a founding infrastructure/MLOps engineer at a startup
  • Technical communication skills in English (currently Japanese-first internally; transitioning to a global environment in the mid-term)

About Shizuku AI

Shizuku is a Japan-born AI companion actively engaging audiences on YouTube and X (formerly Twitter). Already running live streams and cultivating a growing community, Shizuku is now entering its next phase of rapid scale.

As the first Japanese startup to receive investment from a16z, we closed our seed round and are on a mission to bring Japanese entertainment × AI to the global stage.

TEAM STRUCTURE

You will work closely with founder Aki (ML engineer and researcher, ex-Meta, ex-Luma AI) and Engineering Director Ohno to drive the design and construction of our ML infrastructure. As the first MLOps engineer, you’ll have significant autonomy — from technology selection to operational design.

Post-foundation, career paths include both a management track leading a growing team and an IC track deepening technical expertise, tailored to your aspirations.

CURRENT STATE & WHAT YOU’LL BUILD

Infrastructure Status: Modern application infrastructure is in place, but ML training and MLOps tooling are not yet established. AWS adoption is planned.

What You’ll Build: An internal platform for the ML engineers developing Shizuku’s AI models. The goal: eliminate siloed, ad-hoc local workflows and individually owned code, replacing them with a team-oriented ML development foundation.

Quick Facts

Company: Shizuku AI
Location: Tokyo
Salary: ¥8,000,000 – ¥20,000,000
Remote: On-site / hybrid
Visa: Sponsorship available
Language: Japanese: Business Level / English: Fluent