Big Data Developer
BWI (Blue Water Intelligence), founded in June 2022 in Toulouse, France, specializes in monitoring global inland surface water reserves by combining in‑situ data, spaceborne observations, and machine learning to enhance hydrological and hydrodynamical models and deliver data and insights to clients. As a provider of scalable, subscription‑based services, BWI focuses on improving hydrological forecasts to address climate change‑induced water stress and support better water management.
BWI is at the forefront of cutting‑edge hydrological research and innovation. We combine advanced mathematics, physics, and computational expertise to address complex challenges in hydrology, environmental science, and water resource management. We are seeking a dynamic and highly skilled Big Data Engineer to join our team, bridging the gap between science and software development.
Location: Paris or Toulouse (primarily onsite) – Professional English – minimum 5 years of experience required for an application to be considered.
Role: Big Data Engineer – building and operating data ingestion and processing pipelines for terabytes of weather and hydrological data, ensuring scalable, reproducible, production‑grade delivery of model inputs and outputs.
Core responsibilities
- Implement ingestion (batch/streaming), ELT/ETL steps, and data publishing workflows.
- Handle scientific formats (netCDF, GRIB2) and columnar storage (Parquet); optimize I/O and algorithms.
- Design storage with eventual‑consistency patterns (atomic publishes, manifests, versioned paths) and a metadata catalog.
- Partition and parallelize workloads for distributed compute; compact small files and tune for cost/performance.
- Build and run containerized services and orchestrated workflows; ensure observability, retries, idempotency, and runbooks.
- Collaborate with scientists to define data models and validation rules.
Top paradigms & architectural patterns required
- ELT-first with ETL where needed; streaming/micro‑batch for low‑latency sources.
- Idempotent, restartable workflows with orchestration.
- Versioned datasets, atomic publish patterns, and the catalog as source of truth.
- Observability‑driven ops and infra‑as‑code.
- Python (xarray, netCDF4, pyarrow), PySpark or Dask.
- AWS (S3, EKS, EC2) or equivalent cloud; Terraform for IaC.
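The "atomic publish" and "versioned datasets" patterns above can be sketched as follows. This is a hypothetical illustration using the local filesystem in place of S3; the path layout (`v=<version>/`) and the `_MANIFEST.json` marker are assumptions for the example, not BWI's actual scheme. The idea: write data files under a new versioned prefix first, write a small manifest last, and have readers trust only versions whose manifest exists, so a crashed run never leaves half-visible data.

```python
import json
import os
import tempfile
from typing import Optional


def publish_version(root: str, version: str, files: dict) -> str:
    """Idempotently write a dataset version, publishing it atomically
    via a manifest written last."""
    prefix = os.path.join(root, f"v={version}")
    os.makedirs(prefix, exist_ok=True)
    # 1) Write (or idempotently rewrite) the data files first.
    for name, payload in files.items():
        with open(os.path.join(prefix, name), "wb") as f:
            f.write(payload)
    # 2) Write the manifest last; its presence marks the version
    #    as complete and safe to read.
    manifest = {"version": version, "files": sorted(files)}
    with open(os.path.join(prefix, "_MANIFEST.json"), "w") as f:
        json.dump(manifest, f)
    return prefix


def latest_published(root: str) -> Optional[str]:
    """Return the newest version prefix that has a manifest,
    ignoring any incomplete (unpublished) writes."""
    versions = sorted(
        d for d in os.listdir(root)
        if os.path.exists(os.path.join(root, d, "_MANIFEST.json"))
    )
    return versions[-1] if versions else None


root = tempfile.mkdtemp()
publish_version(root, "2024-01-01", {"data.parquet": b"..."})
```

On S3 the same pattern works because each object PUT is atomic: the manifest object either exists in full or not at all, which is what makes eventually consistent listings safe to read against.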
Must‑haves
- 3+ years of relevant experience building/operating large data pipelines, within at least 5 years of computer science‑related experience overall (ideally as a full‑stack/back‑end/web software engineer).
- Strong software engineering in Python with testing and CI/CD.
- Practical experience with partitioning and parallelism.
- Computer science education.
- Understanding of architectural patterns in storing & processing large volumes of data.
- Humility and eagerness to accept either a fully hands‑on coding position or technical leadership, depending on circumstances.
- Clear communication, teamwork, and analytical problem‑solving.
Nice‑to‑haves
- Hydrology, meteorology, remote‑sensing, or space‑ground‑segment experience.
- Understanding of low‑level subjects (memory management, HTTP, S3 implementation).
- Willingness to develop managerial skills.
Seniority level
Mid‑Senior level
Employment type
Full‑time
Job function
Engineering and Information Technology