Staff Engineer Data

PARIS, 75
Posted 1 day ago

Requirements

  • If you think you match at least 70% of these criteria, please apply!
  • 7+ years of experience in data engineering, with a strong track record building and operating large-scale, production-grade data pipelines
  • Strong expertise in distributed systems and real-time architectures (streaming, event-driven systems)
  • Strong experience with AWS, Terraform, Docker, and Kubernetes in production environments
  • Strong experience with ClickHouse or similar large-scale analytical databases
  • Hands‑on engineering mindset: ability to contribute directly to production code on critical systems, not just design
  • Proven experience redesigning or refactoring existing production architectures
  • Strong ownership mindset: you proactively identify problems and drive solutions end‑to‑end
  • Experience mentoring engineers and raising the technical level of a team
  • Excellent communication skills and ability to work effectively with backend, ML, Data and Product teams
  • Fluent English in an international environment
  • (Desirable) Experience with large‑scale search systems (Elasticsearch, OpenSearch)
  • (Desirable) Familiarity with agentic systems or LLM‑based architectures applied to data pipelines
  • (Desirable) Experience in high‑growth startups or scale‑ups

What the job involves

  • You will join the Public Intelligence team, whose mission is to leverage public data to detect exposed secrets, map them to the correct company, assess their severity, and enable timely and relevant alerts for customers and prospects
  • The team works on ingesting public data (notably from GitHub and other sources), identifying the owning organization behind exposed secrets, analyzing the impact of these exposures, and evolving current systems toward more agentic and real‑time architectures
  • The existing systems are mature and battle‑tested
  • We are now at a pivotal moment: the goal is to redesign the end‑to‑end architecture to make it more robust, scalable, and aligned with significantly larger ambitions, including an agentic layer
  • Evolve the system from a deterministic approach to agentic systems, improving secret‑to‑company mapping accuracy and impact analysis
  • Redesign an existing multi‑service architecture into a horizontally scalable and maintainable system
  • Move from batch processing to real‑time processing, enabling secrets to be qualified within minutes of detection on GitHub
  • Extend the pipeline to new public data sources (Docker Hub, NPM, PyPI, etc.), beyond GitHub
  • Build a search‑oriented data architecture capable of handling hundreds of millions of secrets
  • Scale the system to cover the full dataset; only a subset is currently processed
  • In short: you will have real ownership over the architectural decisions that will define the next generation of the pipeline
  • Design and implement the end‑to‑end data architecture
  • Build real‑time systems, from design through production deployment
  • Be hands‑on on the most complex and critical technical challenges
  • Design monitoring, maintenance, and alerting systems around the data pipeline
  • Mentor and raise the technical bar within the team through code reviews and knowledge sharing
  • Structure engineering processes and facilitate collaboration across backend, ML, Data, and Product teams
  • Contribute to the technical roadmap in close collaboration with the Engineering Manager and Product Manager
Company
GitGuardian
Publishing platform
WHATJOBS