Site Reliability Engineer (Network Products)
PARIS, 75
Posted 21 hours ago
Requirements
- Strong experience with Infrastructure as Code (IaC) and CI/CD pipelines
- Solid expertise in Linux systems and production troubleshooting
- Proficiency with monitoring/logging tools (OpenMetrics, OpenTelemetry)
- Programming skills in Python, Go, or Rust
- A good understanding of network systems (BGP, BGP EVPN, VXLAN) is a strong plus
- Collaborative mindset and team-first approach
- Curious, continuous learner with a drive for operational excellence
- Clear and effective communicator (written & verbal)
- Comfortable working in English and French and across multidisciplinary teams
What the job involves
- Our growth is driving us to strengthen our Network SRE team to ensure the high reliability, performance, and scalability of our storage platforms
- Your mission will be to automate, monitor, and improve distributed storage systems and infrastructure in order to maximize availability and efficiency while reducing operational overhead
- We work in a collaborative and international environment where the diversity of Scalers, combined with a spirit of sharing, helps bring new projects to life every day, advancing our ambitions together. You will be part of a team of Site Reliability Engineers reporting to a Lead SRE and integrated into the SRE Guild, a collective focused on fostering best practices across engineering
- The team collaborates daily with Dev, Product, and Ops teams to improve resiliency, support service scalability, and ensure a seamless customer experience across our network solutions
- Develop automation tools and frameworks to streamline infrastructure management
- Build and maintain CI/CD pipelines using Infrastructure as Code best practices
- Implement and refine monitoring and alerting systems (OpenMetrics, OpenTelemetry)
- Ensure system reliability through incident response and root cause analysis
- Collaborate with developers and product teams to bake resilience into network systems
- Participate in architecture reviews and provide an SRE perspective early in the design process
- Apply principles of fault-tolerance, load balancing, and energy efficiency optimization
- Share knowledge within the team and broader engineering org via the SRE Guild
- Contribute to the reliability and performance of services
Company
Deepstreamtech
Publishing platform
WHATJOBS