AI Platform Engineering (M/F)
MARIGNANE, 13
2 days ago
Overview
You will join our Digital Transformation Division, working alongside a team of experts in software solution development, the design of robust, scalable and secure infrastructures, agile methodologies, and data management, supporting our clients in delivering their strategic projects.
You will be seconded to a major player in the industrial sector, within a team dedicated to the engineering and operations of an AI Platform based on Kubernetes / OpenShift.
This team is responsible for setting up, operating, and continuously improving a reliable and secure environment for the deployment, monitoring, and management of Machine Learning models in production.
Responsibilities
- Platform Maintenance and Operations: Support the day-to-day operations of the Kubernetes/OpenShift (OCP) based AI platform. Apply changes and upgrades to the AI Platform and its components to keep it running reliably.
- Deployment and Supervision: Deploy and supervise AI models within a Kubernetes/OpenShift environment, automating the deployment, scaling, and management of ML models.
- CI/CD Pipeline Management: Design, build, and maintain automated pipelines using Tekton or Kubeflow for key workflows like model retraining and inference. Implement solutions for Continuous Integration/Continuous Delivery (CI/CD) and ongoing monitoring.
- Technical Support and Troubleshooting: Diagnose and resolve problems related to pod crashes, resource allocation, and pipeline failures.
- Customization and Security: Customize and rebuild platform components, including workbench and runtime images, to integrate new tools and libraries. Ensure the compliance and security of all production environments.
Required Expertise and Qualifications
- Strong expertise in containerization technologies such as Docker and orchestration platforms like Kubernetes (OpenShift experience is a plus).
- Proficiency in DevOps principles and best practices, including CI/CD, Infrastructure as Code, and GitOps.
- Familiarity with MLOps tools, including but not limited to Kubeflow and Elyra, and with Large Language Models (LLMs).
- Scripting proficiency, particularly in Python and Go.
- Knowledge of various Machine Learning frameworks.
- Knowledge of Tekton and Artifactory.
- Location: Marignane (13)
- Start date: ASAP
- English required
Company
Act Digital France
Publishing platform
WHATJOBS