POSITION Senior Software Production Engineer – Infrastructure Software for AI
POSITION SUMMARY
A newly established AI infrastructure center in Silicon Valley is seeking a Senior Software Production Engineer to lead the development of test automation infrastructure for GPU-based systems supporting AI workloads. The role focuses on accelerating production readiness of large-scale Kubernetes deployments and AI systems.
RESPONSIBILITIES
Design and build test automation infrastructure for Kubernetes on GPU clusters
Develop scalable system, stress, and milestone-based testing plans
Ensure release quality in collaboration with engineering and program teams
Mentor downstream production engineering talent
Promote a culture of humility and innovation
QUALIFICATIONS
Bachelor’s degree in CS, EE, or a related field (Master’s/PhD preferred)
7+ years of experience in software or hardware engineering, including distributed systems
Deep experience in Kubernetes automation and cloud infrastructure
Familiarity with GPU platforms (DGX, H100, GB200) and AI developer tools
SALARY USD 150,000-250,000
LOCATION San Jose, California (Hybrid Work Style)