Senior Architect, AI Solutions Engineering
Company: NVIDIA
Location: Santa Clara
Posted on: July 14, 2025
Job Description:
NVIDIA is seeking an AI Solutions Architect to join its
Infrastructure, Planning and Process (IPP) team! This role will focus
on the extensive scale-up of key AI solutions for NVIDIA's internal
cloud infrastructure. IPP is a global organization within NVIDIA that
works closely with teams such as Graphics Processors, Mobile
Processors, Deep Learning, Artificial Intelligence, and Driverless
Cars to meet their infrastructure needs. The cloud services support
nearly half a million automated jobs daily on five thousand servers,
enhancing the productivity of thousands of NVIDIA software developers
worldwide. The cloud hosts a diverse mix of machines and devices
with various operating systems (Windows/Linux/Android) and hardware
platforms, including NVIDIA GPUs and Tegra processors.

As an AI Solutions Architect, you will manage the tools NVIDIANs use
to deliver solutions quickly, and identify any gaps in these tools.
You will also need to understand the overall movement of data across
the entire platform: identifying bottlenecks, defining solutions,
developing key pieces, writing APIs, and owning deployment. You will
collaborate with internal and external development teams to discover
opportunities and solve complex problems. Your role will also involve
guiding engineers in solving complex problems, developing acceptance
tests, and reviewing their work and test results. Exceptional
technical leadership, communication, organizational, and analytical
skills are required, along with a passion for solving large and
complex problems, e.g. petabytes of fast storage, a million cores,
100,000 builds, and 100,000 tests.

What you’ll be doing:
- Serve as an Architect developing internal AI systems used by
  thousands of NVIDIANs globally.
- Identify gaps and issues, and determine which are better suited to
  AI solutions than to conventional approaches.
- Further divide the AI category into 'buy versus build' options by
  researching available tools in the market.
- Align with teams across NVIDIA to establish overall AI system goals
  and break them down into specific objectives for each sub-system.
- Drive, motivate, convince, and mentor sub-system leads to achieve
  improvements with agility and speed.
- Identify performance bottlenecks and optimize the speed and cost
  efficiency of AI development and testing systems.
- Drive the planning of software/hardware capacity, covering both
  internal and public cloud, addressing the balance between time and
  utilization.
- Introduce technologies enabling massively parallel systems to
  improve turnaround time by an order of magnitude.
- Collaborate with AI product vendors to gain deep insights into the
  AI industry, and share them with leaders and developers internally.

What we need to see:
- BS in EE/CS or equivalent experience, with 10 years of systems
  software development and at least 1 year of experience developing
  or exploring AI.
- Development experience with large language models (LLMs),
  retrieval-augmented generation (RAG), LLM fine-tuning, agentic AI
  workflows, LangChain, LangGraph, and cascading models.
- Experience deploying in hybrid, multi-cloud architectures and edge
  computing.
- Extensive experience architecting and shipping large-scale
  distributed software systems.
- Ability to identify gaps and bottlenecks, and develop solutions to
  optimize performance.
- Strong programming and software development skills in Java, Python,
  and shell scripting, along with a good understanding of distributed
  systems and REST APIs.
- Experience working with SQL/NoSQL database systems such as MySQL,
  Cassandra, MongoDB, or Elasticsearch.
- Excellent knowledge of and working experience with Docker
  containers and virtual machines.
- Good background in cloud technologies such as OpenStack, Docker,
  Kubernetes, Chef/Puppet, Hadoop/Ceph/SwiftStack, LXC, Git, Perforce,
  JFrog, and Kafka.
- Ability to work effectively across organizational boundaries to
  improve alignment and productivity between teams in a
  multi-national, multi-time-zone corporate environment.

Ways to stand out from the crowd:
- MS or PhD in EE/CS.
- Depth in AI, machine learning, and deep learning algorithms and
  techniques.
- Strong collaborative and interpersonal skills, with a consistent
  record of guiding and influencing others in dynamic environments.
- Experience developing large-scale software systems using
  service-oriented architecture under real-time performance
  requirements.
- Background in designing high-performance, scalable software systems
  with a strong focus on hardware cost optimization.

With competitive salaries and a generous benefits package, we are
widely considered to be one of the technology world’s most
desirable employers. We have some of the most forward-thinking and
hardworking people in the world working for us and, due to
unprecedented growth, our best-in-class engineering teams are
rapidly growing. If you're a creative and autonomous engineer with
a real passion for technology, we want to hear from you.

The base salary range is 184,000 USD - 356,500 USD. Your base salary
will be determined based on your location, experience, and the pay of
employees in similar positions. You will also be eligible for equity
and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud
to be an equal opportunity employer. As we highly value diversity in
our current and future employees, we do not discriminate (including
in our hiring and promotion practices) on the basis of race,
religion, color, national origin, gender, gender expression, sexual
orientation, age, marital status, veteran status, disability status
or any other characteristic protected by law.