Shizuku AI

MLOps Engineer

Salary
¥8,000,000 – ¥20,000,000
Location
Tokyo
Remote
On-site / hybrid
Visa
Sponsorship available
Language
Japanese: Business Level / English: Fluent
Posted
Apr 10, 2026
Python
C++
TypeScript
Hadoop
OpenStack
Apply now

Review the role details and submit your application.

Apply Now

Overview

MISSION

As the founding MLOps engineer, design and build Shizuku’s ML infrastructure from the ground up. Establish the complete pipeline — from data ingestion through training environments to model serving — creating an internal platform that empowers ML engineers to iterate on models at maximum velocity.

Replace individual, siloed development environments with a unified team-scale ML development platform, maximizing the speed of Shizuku’s evolution.

Responsibilities

  • Design, build, and operate the end-to-end ML training pipeline: data collection/preprocessing → training → evaluation → deployment
  • Design and build GPU training infrastructure on AWS (A100, L4, etc.) with cost optimization
  • Build an internal ML platform for engineers: experiment tracking, model versioning, and reproducibility guarantees
  • Design and build model serving infrastructure: inference APIs, auto-scaling, and latency management
  • Establish training data management and quality assurance pipelines
  • Design and implement CI/CD for ML: automated training, model testing/evaluation, and staged rollouts
  • Drive production integration of models in collaboration with ML Engineer and SWE teams
  • Build monitoring and visibility infrastructure for long-term compute cost and GPU utilization tracking

Required Skills

  • 3+ years of experience designing, building, and operating cloud infrastructure on AWS, GCP, or equivalent platforms
  • Experience building ML/DL pipelines and infrastructure
  • Hands-on experience designing and operating production environments using container technologies (Docker/Kubernetes)
  • Experience managing infrastructure as code (Terraform, Pulumi, etc.)
  • Strong Python skills for building tools and pipelines
  • Ability to work on-site at our Tokyo office (primarily in-office with flexible remote arrangements)
  • Founding Engineer Mentality — You don’t wait for established systems to improve; you define the design philosophy and build the foundation from zero. You’re energized by creating the system itself, not just refining an existing one
  • ML-Literate Infrastructure Engineer — You understand the unique characteristics of ML training and inference workloads, and you translate that understanding into optimally designed infrastructure
  • Purpose-Driven Ownership — You reverse-engineer from “maximizing ML team velocity,” set your own priorities, and drive execution autonomously
  • Comfort with Ambiguity — You design for a world where model count, training frequency, and data volume are still being defined — starting small and scaling architecturally as the picture clarifies
  • Resilience & Respect — You engage as an equal partner with ML Engineers and SWEs, elevating the entire team’s productivity through collaboration

Preferred Skills

  • Experience building, operating, and cost-optimizing GPU clusters (A100, H100, L4, etc.)
  • Experience with ML platforms: SageMaker, Vertex AI, Ray, Kubeflow, etc.
  • Experience deploying and operating experiment tracking infrastructure: MLflow, Weights & Biases, DVC, etc.
  • Experience building model serving infrastructure: Triton Inference Server, TorchServe, vLLM, SGLang, etc.
  • Experience designing and building internal ML development platforms
  • Domain-specific knowledge of ML workloads in speech, NLP, or vision
  • Experience as a founding infrastructure/MLOps engineer at a startup
  • Technical communication skills in English (currently Japanese-first internally; transitioning to a global environment in the mid-term)

About Shizuku AI

Shizuku is a Japan-born AI companion actively engaging audiences on YouTube and X (formerly Twitter). Already running live streams and cultivating a growing community, Shizuku is now entering its next phase of rapid scale.

As the first Japanese startup to receive investment from a16z, we closed our seed round and are on a mission to bring Japanese entertainment × AI to the global stage.

TEAM STRUCTURE

You will work closely with founder Aki (ML engineer and researcher, ex-Meta, ex-Luma AI) and Engineering Director Ohno to drive the design and construction of our ML infrastructure. As the first MLOps engineer, you’ll have significant autonomy — from technology selection to operational design.

Post-foundation, career paths include both a management track leading a growing team and an IC track deepening technical expertise, tailored to your aspirations.

CURRENT STATE & WHAT YOU’LL BUILD

Infrastructure Status: Modern application infrastructure is in place, but ML training and MLOps tooling are not yet established. AWS adoption is planned.

What You’ll Build: An internal platform for the ML engineers developing Shizuku’s AI models. The goal: eliminate siloed, ad-hoc local workflows and individually owned code, replacing them with a team-oriented ML development foundation.

Quick Facts

Company: Shizuku AI
Location: Tokyo
Salary: ¥8,000,000 – ¥20,000,000
Remote: On-site / hybrid
Visa: Sponsorship available
Language: Japanese: Business Level / English: Fluent