L3 Support Engineer (Data Center)
PARIS, 75
5 days ago
The role
We are building our L3 Support Line from scratch to serve as the center of expertise for data center servers, firmware (BIOS/BMC), and deep Linux diagnostics across Europe and the US.
This is a senior technical role focused on deep investigations, cross-site pattern detection, and driving permanent fixes with R&D and ODM vendors. You will turn complex incidents into scalable solutions and elevate L1/L2 capabilities through strong technical enablement.
You’re welcome to work in our data center in Paris or Béthune, France.
Responsibilities
- Lead root cause analysis beyond L2 depth (GPU failures, firmware issues, Linux-level faults, HW/SW interactions).
- Detect recurring patterns across sites and convert findings into durable fixes.
- Own technical workstreams during high-severity incidents.
- Build evidence packs and drive escalations with ODM and R&D.
- Push for firmware, component, and platform-level resolutions.
- Track outcomes and ensure knowledge flows back to operations.
- Support validation and rollout of firmware updates (risk assessment, staging, rollback planning).
- Create scalable runbooks, troubleshooting guides, and error catalogs.
- Turn investigations into playbooks that elevate L1/L2 teams.
- Travel to data centers for complex troubleshooting, new platform readiness, or incident containment.
Qualifications
- Strong hands-on experience with data center servers and deep Linux troubleshooting.
- Ability to diagnose across hardware, BIOS/BMC firmware, and Linux (logs, drivers, storage basics, performance triage).
- Structured incident response experience and clear communication under pressure.
- Experience driving evidence-based escalations with vendors/R&D.
- Fluent English (written and spoken).
- Optional: strong familiarity with GPU server platforms and tooling (e.g., nvidia-smi, dcgmi, Linux log correlation).
- Optional: experience with ipmitool and Redfish workflows, firmware lifecycle, and staged rollouts.
- Optional: scripting skills (bash and basic Python) for log collection, triage automation, and simple reliability analysis.
- Optional: exposure to OCP-based platforms and ODM manufacturing ecosystems.
- Optional: experience supporting enterprise bare metal customers under contractual SLAs.
Benefits
- Competitive salary and comprehensive benefits package.
- Opportunities for professional growth within Nebius.
- Flexible working arrangements.
- A dynamic and collaborative work environment that values initiative and innovation.
Company
Nebius
Publishing platform
WHATJOBS