Research Engineer (Data Infrastructure)

PARIS, 75

1 day ago

Responsibilities

We are a dynamic, collaborative team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation.
The Data Infrastructure team at Mistral AI is architecting the backbone of our frontier model training and fine-tuning ecosystem
We are building the specialized compute and data fabrics required to power the development of world-class AI
This role focuses on building and operating the next generation of data infrastructure at Mistral AI
You will be a core contributor to our evolution, helping us design and scale massive compute fleets and storage systems built for high performance
You will help us move toward a future of decoupled control and data planes, scaling big data compute and storage platforms while ensuring secure and governed data access for MLOps and research
You will take full lifecycle ownership: from architecting the migration away from legacy orchestrators to implementing production-grade pipelines and participating in on-call rotations for critical training jobs
Build & Scale: Help us reach our goal of operating massive distributed compute and storage systems
Global Orchestration: Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions
Design Future-Proof Storage: Architect our transition to modern storage formats to handle fine-tuning datasets at a scale that anticipates exabyte growth
Platform Engineering: Contribute to the development of our internal training platform, ensuring seamless model training and fine-tuning capabilities across Kubernetes- and SLURM-based environments
Metadata & Lineage: Implement and manage systems to provide clear visibility and lineage as our data and model pipelines grow in complexity
Operational Excellence: Use modern deployment workflows to manage cloud-native deployments, ensuring our data platform can scale by orders of magnitude while remaining reliable and efficient

Benefits

Competitive bonus structure
Equity
Opportunities for professional growth and development

Qualifications

Take pride in building and operating scalable, reliable, and secure systems from the ground up
Are proficient in Python and enjoy solving the “brittle data lake” problem with modern, columnar storage standards
Have 4+ years of experience in Data Infrastructure, MLOps, or Infrastructure Engineering
Are comfortable with ambiguity and the challenges of building high-scale infrastructure in a rapid-growth AI environment
Are well-versed in Kubernetes-native tooling and excited to debug large-scale distributed systems across multi-cluster environments
Have experience or a strong interest in supporting foundational compute and storage platforms

Company

Anonymized uhYaVC

Publishing platform

WHATJOBS
