AI Data Engineer YozmaTech
FRANCE
1 day ago
Our client is looking for an AI Data Engineer to help build, maintain, and improve internal and client-facing LLM-powered systems. This role sits at the intersection of data engineering, retrieval infrastructure, and production AI operations, focusing on reliability, retrieval quality, scalability, and operational excellence.
Key Requirements
- Strong programming skills, especially in Python.
- Experience building ETL and data pipelines in production environments.
- Strong SQL skills and experience with relational databases, preferably PostgreSQL.
- Experience with search and retrieval systems, including OpenSearch, Elasticsearch, or similar platforms.
- Familiarity with vector databases, embeddings workflows, and large-scale document indexing.
- Experience with cloud platforms such as AWS and related infrastructure services.
- Familiarity with Git, CI/CD pipelines, and modern engineering workflows.
- Strong problem-solving skills and comfort working across data, infrastructure, and AI application layers.
- English – Upper-Intermediate or higher.
Nice to have
- Experience working on RAG systems, internal knowledge assistants, or search-heavy AI applications.
- Familiarity with observability stacks, distributed systems, and workflow orchestration tools.
- Experience with access control, permission-aware systems, and auditability in enterprise environments.
- Exposure to evaluation frameworks for LLM systems and model benchmarking.
What you will do
- Maintain and improve ingestion and enrichment pipelines for internal and client content, including parsing, extraction, normalization, metadata enrichment, deduplication, and quality monitoring.
- Improve indexing and retrieval quality through chunking and segmentation refinements, embedding and index update workflows, metadata filtering, and caching.
- Support hybrid retrieval architectures combining vector search, keyword or BM25 search, and metadata-aware filtering.
- Implement and maintain access-aware retrieval by propagating and enforcing document permissions at indexing and query time, including audit logs and validation tests.
- Improve source attribution so responses consistently point to the correct documents, sections, and references in a reliable format.
- Extend and harden tool execution, workflow orchestration, and automations, including retries, timeouts, idempotency, concurrency controls, and run history.
- Develop and maintain evaluation and regression testing frameworks, including golden datasets, automated scoring, and structured comparisons across LLM providers and models.
- Operate AI systems in production, including logs, metrics, tracing, alerting, incident response, performance tuning, cost monitoring, and runbook documentation.
- Build scalable infrastructure to process, embed, index, and search very large document collections efficiently.
Benefits
- Direct access to clients and meaningful products.
- Flexibility to work remotely or from our offices.
- A-team colleagues and a zero-bureaucracy culture.
- Opportunities to grow, lead, and make your mark.
Everyone’s welcome – diversity makes us better. We create a space where you can thrive as you are.
Company
Madfish
Publishing platform
WHATJOBS