Staff Platform Engineer, AI/ML Infrastructure
Role Summary
The Staff Platform Engineer, AI/ML Infrastructure will provide technical leadership for cloud platforms, deployment systems, and operational foundations that power enterprise‑scale generative AI applications. This role will define and evolve the infrastructure architecture for AI/ML platforms running across AWS, Kubernetes, serverless, and containerized environments. The engineer will lead platform standards for reliability, scalability, observability, CI/CD, security, and developer enablement, while partnering closely with software engineering, AI engineering, security, and operations teams. The ideal candidate combines deep hands‑on cloud engineering experience with staff‑level technical influence.
Key Responsibilities
- Define and drive the technical strategy for AI/ML platform infrastructure supporting generative AI applications, LLM integrations, model routing, and enterprise AI services.
- Architect, build, and operate scalable cloud platforms using AWS services such as EKS, ECS Fargate, Lambda, DynamoDB, S3, OpenSearch, Secrets Manager, CloudWatch, ALB, and MWAA.
- Establish reusable infrastructure patterns using CloudFormation, Helm, and Terraform to support reliable multi‑environment and multi‑region deployments.
- Lead CI/CD architecture using GitHub Actions, reusable workflows, OIDC‑based AWS authentication, automated quality gates, deployment promotion, and environment approvals.
- Design and improve observability across AI platforms, including CloudWatch dashboards, logs, alarms, Prometheus/Grafana, OpenSearch, Langfuse, and LLM‑specific operational metrics.
- Build platform capabilities for GenAI workloads, including model availability monitoring.
- Partner with software engineering teams to improve deployment reliability, rollback strategies, health checks, autoscaling, load testing, and runtime performance.
- Define and enforce security and compliance practices for infrastructure, including IAM permission boundaries, Secrets Manager usage, secret scanning, audit logging, tagging standards, and change‑management controls.
- Provide technical leadership for cost optimization, capacity planning, environment standardization, and operational resilience across development, test, production, and sandbox environments.
- Mentor engineers, review architecture and infrastructure designs, and influence platform engineering practices across teams.
Basic Qualifications
- Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related technical field, or equivalent practical experience.
- 7+ years of experience in DevOps, platform engineering, cloud infrastructure, site reliability engineering, or software engineering roles.
- Strong hands‑on experience with AWS, Azure, or GCP infrastructure and services, including container, serverless, networking, storage, observability, and security services.
- Experience designing and operating production systems on Kubernetes, ECS Fargate, or comparable container orchestration platforms.
- Proficiency with infrastructure‑as‑code, especially CloudFormation, Terraform, Helm, or similar tooling.
- Strong CI/CD experience with GitHub Actions or similar platforms, including reusable workflows, automated testing, deployment gates, and cloud authentication.
- Experience building and operating observability solutions using CloudWatch, Prometheus/Grafana, OpenSearch, or similar tools.
- Strong understanding of cloud security practices, IAM, secrets management, least‑privilege access, audit logging, and compliance requirements.
- Experience supporting distributed systems, microservices, APIs, asynchronous workloads, and multi‑environment deployments.
- Demonstrated ability to lead technical design, mentor engineers, and influence engineering practices across teams.
Preferred Qualifications
- Experience supporting AI/ML or generative AI platforms, including LLM gateways, model routing, prompt observability, token metering, or model failover.
- Experience operating platforms in regulated enterprise environments, ideally healthcare, pharmaceutical, finance, or life sciences.
- Experience with multi‑account, multi‑region AWS architectures and enterprise governance patterns.
- Experience with cost optimization, autoscaling strategies, capacity planning, and cloud budget monitoring.
- Experience with load testing and performance validation using tools such as Locust or comparable frameworks.
- Strong Python or scripting skills for platform automation, operational tooling, and CI/CD extensions.
- Ability to communicate complex technical decisions clearly to engineering, security, operations, and leadership audiences.
Technical Environment
- Cloud: AWS EKS, ECS Fargate, Lambda, DynamoDB, S3, OpenSearch, CloudWatch, Secrets Manager, ALB, VPC, IAM
- Infrastructure‑as‑Code: CloudFormation, Helm, Terraform
- CI/CD: GitHub Actions, reusable workflows, OIDC federation, environment approvals, automated release promotion
- AI/ML Platform: AWS Bedrock, Azure OpenAI, LiteLLM, Langfuse
- Observability: CloudWatch dashboards and alarms, Prometheus, Grafana, OpenSearch, Langfuse, custom metrics
- Security & Governance: IAM permission boundaries, secret scanning, audit logging, tagging compliance, change‑management automation
- Engineering Practices: Docker, Python, pre‑commit, automated testing, load testing, code quality gates, monorepo service standards
Leadership Expectations
As a staff‑level engineer, this role is expected to operate beyond individual delivery. The engineer will identify systemic platform gaps, define technical direction, create reusable standards, and raise engineering maturity across multiple teams. Success requires strong judgment, ownership, and communication. The engineer should balance hands‑on implementation with architectural leadership, guide teams through ambiguous technical decisions, and build platform capabilities that make AI product teams faster, safer, and more reliable.
Benefits
Pfizer offers competitive compensation and benefits programs designed to meet the diverse needs of our colleagues. The annual base salary for this position ranges from €65,250.00 to €108,750.00 for the France – Rives de Paris location. Other benefits may include health care coverage, retirement savings plans, insurance benefits, an Employee Assistance Program, wellness benefits, and more.
Equal Opportunity Employer
Pfizer is an equal‑opportunity employer and complies with all applicable equal employment opportunity legislation in each jurisdiction in which it operates.