Site Reliability Engineer

PARIS, 75
2 days ago

Requirements

  • Linux Expertise: Knowledge of Linux operating system internals, networking stack, and kernel tuning
  • Infrastructure as Code: Strong experience with Kubernetes orchestration and with automation using Terraform, Ansible, Puppet, or similar tools
  • Programming Skills: Python proficiency for developing automation tools and operational scripts
  • Architecture: Ability to design highly available, scalable architectures for complex distributed systems
  • Observability Tools: Hands‑on experience with monitoring solutions such as Prometheus and Grafana
  • Security Knowledge: Understanding of security principles and best practices for infrastructure
  • Adaptability and willingness to learn new technologies as our infrastructure evolves
  • Strong problem‑solving abilities, especially in high‑pressure situations
  • Collaborative mindset for working with the whole Infrastructure department
  • (Desirable) Experience with bare metal provisioning
  • (Desirable) On‑premise cloud solutions experience
  • (Desirable) Contribution to open‑source projects

What the job involves

  • You will be part of the Infrastructure Systems team, which provides all base platforms at Proton (Kubernetes, VM orchestration, bare metal provisioning) as well as the critical services they depend on (DNS, DHCP, SoT, monitoring, and others)
  • Infrastructure Management: Oversee and optimize our rapidly expanding global network of thousands of servers
  • Automation & Reliability: Design and implement automation systems that ensure 99.95%+ uptime across our infrastructure
  • Problem Solving: Serve as an active member of the on‑call rotation, troubleshooting complex technical challenges across different network environments and censorship scenarios
  • Monitoring Excellence: Develop and enhance sophisticated monitoring and alerting systems that provide real‑time visibility into our global infrastructure
  • Technical Implementation: Engineer solutions to improve connectivity, stability, performance, scalability, security, and resilience, particularly in adversarial network environments
Company
Proton
Publishing platform
WHATJOBS