
Site Reliability Engineer (SRE)

PARIS, 75

Posted 1 day ago

Job Description

We are looking for a Site Reliability Engineer to strengthen our Infrastructure & Security department and help us scale our internal and customer-facing platforms.

In this role, you will contribute to both run and build activities: operating production environments, improving reliability, leading technical transformation initiatives, and designing modern, scalable, secure, and observable infrastructure.

You will work across cloud and on-premise environments, collaborate closely with Engineering, QA, Data, Security, and Product teams, and help improve our developer experience, operational excellence, production mindset, and infrastructure maturity.

Responsibilities

Operate, maintain, and improve production and internal infrastructure environments across cloud and on-premise platforms.
Contribute to both run activities, such as incident response, monitoring, support, troubleshooting, maintenance, and reliability improvements, and build activities, such as architecture evolution, automation, migration, tooling, and platform transformation.
Help design, build, and maintain resilient, scalable, secure, observable, and cost-efficient infrastructure.
Lead or contribute to technical migrations, modernization projects, and architecture transformation initiatives.
Strengthen operational processes: incident management, change management, backup and restore, disaster recovery, on-call practices, documentation, and post-incident reviews.
Improve observability across systems, services, and infrastructure through metrics, logs, traces, dashboards, alerting, and SLOs.
Promote a strong production mindset across teams, with a focus on reliability, performance, security, customer impact, and operational simplicity.
Collaborate closely with Engineering, QA, Data, Security, and Product teams to improve delivery quality and platform reliability.
Contribute to Developer Experience by improving tooling, CI/CD workflows, infrastructure automation, environments, deployment processes, and self-service capabilities.
Support FinOps practices by monitoring costs, optimizing infrastructure usage, and helping teams make cost-aware decisions.
Build automation and tooling to reduce manual work, improve repeatability, and make infrastructure easier to operate.
Participate in technical architecture discussions and provide guidance to infrastructure and engineering teams.
Contribute to infrastructure roadmaps, technical standards, best practices, and long-term platform strategy.
Maintain strong documentation and knowledge sharing practices.

Success Criteria (6 to 12 months)

Built strong trust with Infrastructure, Engineering, Security, and Product teams.
Demonstrated strong ownership of production systems and contributed to improving reliability, stability, and operational maturity.
Helped improve observability through better dashboards, alerts, metrics, logs, traces, or SLOs.
Contributed to reducing operational toil through automation, documentation, tooling, or process improvements.
Helped improve incident response, post-incident reviews, change management, or on-call practices.
Contributed to one or more meaningful build initiatives: migration, architecture improvement, platform modernization, CI/CD improvement, internal tooling, or developer experience enhancement.
Shown strong ability to work across both cloud and on-premise environments.
Contributed to making infrastructure more secure, scalable, performant, cost-efficient, and easier to operate.
Helped Engineering teams improve delivery quality and production readiness.
Recognized as a collaborative, structured, pragmatic, and reliable technical partner.

Requirements

Technical Skills

Strong experience as a Site Reliability Engineer or in a similar role.
Solid experience operating production environments with high reliability, availability, and performance expectations.
Good understanding of cloud infrastructure, ideally AWS.
Strong knowledge of systems, networking, DNS, load balancing, security fundamentals, and infrastructure troubleshooting.
Experience with infrastructure as code, automation, configuration management, and CI/CD pipelines.
Experience with observability tools: metrics, logs, traces, alerting, dashboards, SLOs, SLIs.
Good understanding of containers, orchestration, service discovery, secrets management, and modern platform architecture.
Experience with incident management, post-incident reviews, backup and restore, disaster recovery, capacity planning, and operational processes.
Ability to write scripts, automation, or internal tooling to reduce manual work and improve reliability.
Understanding of security best practices for infrastructure, cloud, identity, secrets, network segmentation, and production access.
Interest or experience in FinOps, cost optimization, performance optimization, and infrastructure efficiency.
Experience with developer experience, internal platforms, self-service tooling, or platform engineering is a strong plus.

Soft Skills

Strong production mindset: reliability, customer impact, resilience, security, and operational excellence.
Excellent communication skills with both technical and non-technical stakeholders.
Structured, rigorous, autonomous, and pragmatic approach.
Ability to lead technical initiatives, migrations, or architecture discussions.
Collaborative, curious, and committed to continuous improvement.


Company

StrangeBee SAS

Publishing platform

WHATJOBS
