AI Architect: Generative AI Integration, RediLab

Posted 9 hours ago

Role Overview

We are an established tech company scaling our core enterprise platform. We are looking for an experienced AI Architect to design generative AI features and integrate them into our existing high-load backend infrastructure.

We are not looking for a Data Scientist to train foundation models from scratch. We need a seasoned software architect who treats AI components as integration points and builds resilient, secure, low-latency MLOps infrastructure around them. You will lead the transition to an AI-native architecture, ensuring our enterprise applications handle high-throughput AI requests reliably.

Tech Stack & Core Responsibilities

LLM Integration & Fundamentals

  • Deep understanding of how LLM APIs work under the hood, with a strong focus on optimizing streaming, maximizing throughput, and minimizing Time-to-First-Token (TTFT).
  • Advanced tuning of context-window usage, token budgets, and sampling parameters (temperature, top-p) for enterprise accuracy.
  • Experience evaluating and deploying both proprietary foundation models (GPT-4o, Claude 3.5) and open-source models (Llama 3, Mistral) based on cost and privacy requirements.

.NET & AI Ecosystem Integration

  • Designing AI-native microservices on the Microsoft stack using Semantic Kernel (developing Plugins, Planners, and managing Memory).
  • Managing Azure OpenAI Service deployments, including enterprise security configurations, quota management, and Provisioned Throughput Units (PTU).
  • Familiarity with alternative orchestration frameworks like LangChain and LlamaIndex for architectural benchmarking.

Enterprise RAG & Data Infrastructure

  • Architecting robust Retrieval-Augmented Generation (RAG) systems, building high-throughput Ingestion (vectorization) and Retrieval (search) pipelines.
  • Implementing complex document chunking strategies and mitigating the "lost in the middle" problem, where models underuse information placed in the middle of a long context.
  • Hands-on expertise with Vector Databases (Pinecone, Qdrant, Milvus, pgvector) and managing embeddings for similarity search.
  • Optimizing search relevance through Hybrid Search architectures (combining Keyword and Vector search).

Advanced AI Patterns

  • Designing GraphRAG architectures utilizing graph databases (like Neo4j) to maintain complex contextual relationships for domain-specific queries.
  • Orchestrating Multi-Agent Systems (using frameworks like Microsoft AutoGen) to automate multi-step operational workflows.
  • Implementing Agentic Workflows leveraging advanced Function Calling and Tool Calling capabilities.

High-Load AI Infrastructure & MLOps

  • Deploying and optimizing LLMs in self-hosted (local or on-premises) environments using vLLM or TensorRT-LLM for efficient request batching.
  • Integrating LLMs into event-driven architectures utilizing message brokers (Kafka or RabbitMQ) for asynchronous inference and load leveling.
  • Establishing comprehensive LLM Observability to monitor pipelines, gather metrics, and automatically evaluate hallucination rates.

Why Join Us?

  • Full autonomy to design the AI architecture for a mission-critical, revenue-generating system.
  • Direct influence on the company's long-term technical strategy and AI integration roadmap.
  • Remote-first culture surrounded by high-level engineering peers.

Company: Madfish
Publishing platform: WHATJOBS