L3 Support Team Lead

The role

We are establishing a global L3 Support Line from scratch to own the highest level of technical escalation for server and rack infrastructure across Europe and the US. Operating at the intersection of datacenter operations, R&D engineering, and ODM partners, this team will take full ownership of complex server and firmware incidents — driving root-cause resolution and converting recurring failures into scalable architectural improvements.

You will lead a team of around 10 L3 engineers in Europe (Amsterdam HQ and other datacenter locations), partnering closely with the regional L3 Lead to deliver 24/7 global coverage.

In this role, you will act as Incident Commander for high-severity production events, establish formal problem management practices, and design enterprise-grade support frameworks for contracted bare-metal customers — including two large FAANG clients at launch.

This is a managerial role with deep technical accountability: you will lead people and processes while retaining the capability to drive advanced Linux, hardware, and firmware investigations when L2 reaches its technical ceiling.

This role is based on-site at our office in Amsterdam, the Netherlands.

Your responsibilities will include:

Incident Command (Highest Priority)

Act as Incident Commander for high-severity infrastructure incidents
Lead structured triage and drive permanent root-cause fixes
Align L2, Cloud Ops, R&D, NOC, DC Automation, and ODM vendors during critical events
Establish a clear postmortem process and follow-through mechanisms

Problem Management & Reliability

Identify recurring failure patterns and convert them into scalable fixes
Build structured escalation loops with R&D and vendors
Lead quarterly reliability reviews across platforms, firmware, and hardware
Translate analytics into preventive improvements

Build & Scale the L3 Function

Design the L3 operating model (intake, prioritization, ownership, escalation)
Hire and grow a distributed team across EU and US
Define collaboration models across internal teams and external vendors
Influence cross-functional outcomes without direct authority

Enterprise Bare Metal Support

Define enterprise-grade support processes (SLA handling, escalation paths, severity models)
Act as senior escalation interface for complex customer-impacting issues

We expect you to have:

Experience building or leading L3 / escalation support for datacenter server infrastructure
Strong Incident Commander experience in production environments
Background supporting enterprise customers under contractual SLAs
Proven ability to build incident & problem management processes from scratch
People leadership experience (hiring, coaching, scaling teams)
Strong English communication skills

It will be a bonus if you have:

Deep Linux, hardware, and firmware troubleshooting capability
GPU server platform experience (e.g., NVIDIA diagnostics)
Experience managing ODM/OEM escalations
Bash / basic Python scripting
Exposure to OCP-based platforms

Details

Location

On-site