AI Data Engineer YozmaTech

France

1 day ago

Our client is looking for an AI Data Engineer to help build, maintain, and improve internal and client-facing LLM-powered systems. This role sits at the intersection of data engineering, retrieval infrastructure, and production AI operations, focusing on reliability, retrieval quality, scalability, and operational excellence.

Key Requirements

Strong programming skills, especially in Python.
Experience building ETL and data pipelines in production environments.
Strong SQL skills and experience with relational databases, preferably PostgreSQL.
Experience with search and retrieval systems, including OpenSearch, Elasticsearch, or similar platforms.
Familiarity with vector databases, embeddings workflows, and large-scale document indexing.
Experience with cloud platforms such as AWS and related infrastructure services.
Familiarity with Git, CI/CD pipelines, and modern engineering workflows.
Strong problem-solving skills and comfort working across data, infrastructure, and AI application layers.
English – Upper-Intermediate or higher.

Will be a plus

Experience working on RAG systems, internal knowledge assistants, or search-heavy AI applications.
Familiarity with observability stacks, distributed systems, and workflow orchestration tools.
Experience with access control, permission-aware systems, and auditability in enterprise environments.
Exposure to evaluation frameworks for LLM systems and model benchmarking.

What you will do

Maintain and improve ingestion and enrichment pipelines for internal and client content, including parsing, extraction, normalization, metadata enrichment, deduplication, and quality monitoring.
Improve indexing and retrieval quality through chunking and segmentation refinements, embedding and index update workflows, metadata filtering, and caching.
Support hybrid retrieval architectures combining vector search, keyword or BM25 search, and metadata-aware filtering.
Implement and maintain access-aware retrieval by propagating and enforcing document permissions at indexing and query time, including audit logs and validation tests.
Improve source attribution so responses consistently point to the correct documents, sections, and references in a reliable format.
Extend and harden tool execution, workflow orchestration, and automations, including retries, timeouts, idempotency, concurrency controls, and run history.
Develop and maintain evaluation and regression testing frameworks, including golden datasets, automated scoring, and structured comparisons across LLM providers and models.
Operate AI systems in production, including logs, metrics, tracing, alerting, incident response, performance tuning, cost monitoring, and runbook documentation.
Build scalable infrastructure to process, embed, index, and search very large document collections efficiently.

Benefits

Direct access to clients and meaningful products.
Flexibility to work remotely or from our offices.
A-team colleagues and a zero-bureaucracy culture.
Opportunities to grow, lead, and make your mark.

Everyone’s welcome – diversity makes us better. We create a space where you can thrive as you are.

Company

Madfish

Publishing platform

WHATJOBS
