
Compute Infrastructure and HPC Expert Engineer

PARIS, 75
2 days ago

Level of qualifications required: Graduate degree or equivalent

Function: Support functions

Level of experience: From 3 to 5 years

Context

Following the priorities established in May 2024 by the Seoul Declaration for Safe, Innovative and Inclusive AI, to which France is a signatory, the French government decided to create INESIA, an institute whose mission is to bring together, without creating a new legal entity, the national stakeholders involved in AI evaluation and safety, in particular:

  • the French Cybersecurity Agency (ANSSI),
  • the National Laboratory of Metrology and Testing (LNE),
  • the Digital Regulation Expertise Center (PEReN),
  • and the French National Institute for Research in Digital Science and Technology (Inria).

Within this framework, Inria primarily contributes to activities related to systemic risk analysis in the field of national security, as well as the evaluation of the performance and reliability of AI models.

This work is strategically coordinated with Inria’s AI Evaluation research program and materializes through the design and development of an AI evaluation platform, particularly focused on systems based on Large Language Models (LLMs).

The platform aims to provide an integrated, secure, and robust environment supporting the program’s research projects, while enabling the development of evaluation applications such as benchmarking campaigns and red teaming exercises. It relies on open-source tools from the AI ecosystem as well as internally developed components.

You will join a team operating in a fast-paced, iterative development environment: the platform will evolve progressively through regular operational deliverables. We are looking for individuals capable of proposing solutions, making technical trade-offs, and transforming technical requirements into operational systems.

As a Compute Infrastructure and HPC Expert, you will play a key role in operating and optimizing the computing resources used by the platform.

This position offers the opportunity to contribute to a strategic and ambitious project at the heart of current challenges related to AI safety, transparency, and governance, spanning technical, scientific, and societal dimensions.

Assignment

Operate available computing clusters (Abaca, Jean Zay, Adastra, etc.) and potentially deploy additional computing resources to ensure performance, reproducibility, and security.

Main activities

  • Deploy and maintain the infrastructure enabling the platform to leverage available computing resources
  • Optimize job execution through parallelization, resource allocation, and scheduling
  • Set up and maintain monitoring and performance tracking tools

Skills

Required Skills

  • Expertise in the use of shared computing infrastructures, with strong knowledge of job schedulers (OAR, Slurm)
  • Knowledge of multi-GPU and multi-node parallelization techniques, including multi-GPU profiling
  • Understanding of GPU architectures and constraints related to ML/LLM workloads
  • Familiarity with software development best practices (Git versioning, CI/CD, documentation)
  • Ability to write technical documentation

Preferred Skills

  • Familiarity with containerized deployment tools (Singularity, Docker, docker-compose, CI/CD)
  • Knowledge of inference and optimized deployment tools for large language models (vLLM, SGLang, etc.)

Additional Appreciated Skills

  • Experience in academic research
  • Technical English proficiency, both written and spoken
  • Awareness of AI trustworthiness and safety challenges

We encourage you to apply even if you do not meet every requirement: we value candidates who are eager to learn and develop new skills.

Benefits package

Warning: you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. Processing of applications sent through other channels is not guaranteed.

Instruction to apply

Defence Security: This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. relating to the protection of national scientific and technical potential (PPST). Authorisation to enter an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.

Recruitment Policy: As part of its diversity policy, all Inria positions are accessible to people with disabilities.

Company
Inria
Publication platform
WHATJOBS