Research Engineer (Data Infrastructure)
PARIS, 75
1 day ago
Responsibilities
- We are a dynamic, collaborative team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation
- The Data Infrastructure team at Mistral AI is architecting the backbone of our frontier model training and fine-tuning ecosystem
- We are building the specialized compute and data fabrics required to power the development of world-class AI
- This role focuses on building and operating the next generation of data infrastructure at Mistral AI
- You will be a core contributor to our evolution, helping us design and scale massive compute fleets and storage systems built for high performance
- You will help us move toward a future of decoupled control and data planes, scaling big data compute and storage platforms while ensuring secure and governed data access for MLOps and research
- You will take full lifecycle ownership: from architecting the migration away from legacy orchestrators to implementing production-grade pipelines and participating in on-call rotations for critical training jobs
- Build & Scale: Help us reach our goal of operating massive distributed compute and storage systems
- Global Orchestration: Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions
- Design Future-Proof Storage: Architect our transition to modern storage formats to handle fine-tuning datasets at a scale that anticipates exabyte growth
- Platform Engineering: Contribute to the development of our internal training platform, ensuring seamless model training and fine-tuning capabilities across Kubernetes- and SLURM-based environments
- Metadata & Lineage: Implement and manage systems to provide clear visibility and lineage as our data and model pipelines grow in complexity
- Operational Excellence: Use modern deployment workflows to manage cloud-native deployments, ensuring our data platform can scale by orders of magnitude while remaining reliable and efficient
Benefits
- Competitive bonus structure
- Equity
- Opportunities for professional growth and development
Qualifications
- Take pride in building and operating scalable, reliable, and secure systems from the ground up
- Are proficient in Python and enjoy solving the “brittle data lake” problem with modern, columnar storage standards
- Have 4+ years of experience in Data Infrastructure, MLOps, or Infrastructure Engineering
- Are comfortable with ambiguity and the challenges of building high-scale infrastructure in a rapid-growth AI environment
- Are well-versed in Kubernetes-native tooling and excited to debug large-scale distributed systems across multi-cluster environments
- Have experience or a strong interest in supporting foundational compute and storage platforms
Company
Anonymized uhYaVC
Publication platform
WHATJOBS