Platform Reliability Engineer (M/F)
Overview
Sonepar is an independent family-owned group and a world leader in the distribution of electrical equipment, solutions and related services to the trade. In 2024, Sonepar generated a turnover of €32.5 billion. With a presence in 40 countries and an extensive network of retail outlets, the Group is undertaking an ambitious transformation to make life easier for its customers by offering them an omnichannel experience and a range of sustainable solutions for the industrial, construction and energy sectors. Our 46,000 employees are committed to the electrification of the world and united by a shared Purpose: “Power Progress for Future Generations.”
The role
The Platform Reliability Engineer is responsible for ensuring the reliability, availability, and performance of the organization’s digital platform. Working under the guidance of the Platform SRE Lead, this role applies Site Reliability Engineering principles to day-to-day operations, incident management, and continuous improvement initiatives. The position combines strong hands-on operational expertise with an engineering mindset focused on automation, resilience, and scalable operations across the platform.
Key responsibilities
- Platform Operations & Reliability
  - Monitor, operate, and maintain platform health, availability, and performance.
  - Apply SRE principles to improve system reliability, scalability, and fault tolerance.
  - Contribute to the definition, tracking, and reporting of Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets.
  - Continuously improve operational processes to enhance platform stability and customer experience.
- Incident & Problem Management
  - Participate in the resolution of incidents and support cross-functional incident response.
  - Perform root cause analysis and contribute to preventive and corrective actions.
  - Use monitoring, observability tools, and data analysis to detect and mitigate issues before they impact service availability.
  - Contribute to post-incident reviews and operational documentation.
- Automation & Tooling
  - Develop and maintain automation scripts to support deployment, monitoring, and recovery processes.
  - Reduce manual operational work through Infrastructure as Code (IaC) and CI/CD pipelines.
  - Improve operational efficiency by automating repetitive tasks and workflows.
- Observability
  - Implement and enhance monitoring, logging, and alerting solutions for real-time system visibility.
  - Contribute to distributed tracing and telemetry to support root cause analysis and troubleshooting.
  - Ensure observability data supports SLO tracking and operational decision-making.
- Resilience Engineering
  - Participate in resilience testing and chaos engineering experiments to validate system robustness.
  - Contribute to improving failover mechanisms and disaster recovery procedures.
  - Identify single points of failure and propose reliability improvements.
- Performance & Scalability
  - Assist in optimizing platform performance, resource utilization, and cost efficiency.
  - Collaborate with development teams to embed reliability and operability into application design.
  - Support scalability testing and performance validation activities.
- Cross-functional Activities
  - Work closely with the Platform SRE Lead, platform engineers, and development teams.
  - Communicate operational status, incidents, and improvements clearly within the team.
  - Contribute to change management, release planning, and operational readiness activities.
  - Share knowledge through documentation, runbooks, and best practices.
What success looks like
- Platform services meet defined reliability and availability targets.
- Incidents are detected early, resolved efficiently, and followed by meaningful improvements.
- Automation reduces operational toil and improves response times.
- Observability provides clear insights into platform health and risks.
- Development teams trust the platform’s reliability and operational maturity.
Experience you bring
- Strong communication skills, able to collaborate effectively in a technical environment.
- Ability to work autonomously while contributing actively to a cross-functional team.
- Fluent in English & French (written and spoken).
- 7–10 years of professional experience in SRE, operations, DevOps, or platform engineering roles.
Soft Skills
- Proactive, curious, and solution-driven mindset.
- Strong focus on reliability, automation, and continuous improvement.
- Comfortable operating in a fast-paced and evolving digital environment.
- Ability to challenge existing practices constructively and propose better ways of working.
Recruitment Process
- Pre-screening Interview
- Managerial Interview with the Cloud Platform Director
- Technical Interview
- HR Interview
Work Mode & Location
Hybrid: 3 days per week in Paris (8th arrondissement) after the trial period
Benefits
- 75% reimbursement of your monthly or annual transport pass.
- Swile meal voucher card
- Company-only gym, available to employees free of charge.
- Sustainable mobility package
- Health insurance & Welfare
- Employee Savings Plan & Profit Sharing Bonus.
Diversity and Inclusion
We encourage you to apply even if your experience doesn't align perfectly with the job description. We value your skills and journey. We are committed to an inclusive environment and are proud to hold the GEEIS Label and to be a partner of AGEFIPH.
Interested?
Interested in this challenge? Join us and help build a better future!