Senior GPU Cluster Developer
1 month ago
We are seeking a highly skilled Senior GPU Cluster Software Engineer to join our team at NVIDIA. This is a unique opportunity to work on large-scale distributed systems infrastructure with monitoring, logging, visualization, and alerting capabilities.
About the Role
As a key member of our System Software team, you will be responsible for building profiling solutions for real-world applications running on GPU compute clusters. Your primary goal will be to improve the user experience for customers and engineers supporting the cluster.
Responsibilities
- Work in an agile and fast-paced global environment to gather requirements, architect, design, implement, test, deploy, release, and support large-scale distributed systems infrastructure with promised uptime.
- Build internal profiling tools for real-world ML/DL applications running on HPC GPU clusters for failure and efficiency analysis.
- Understand state-of-the-art improvements in the ML/DL domain and work with various application owners and research teams to add/improve profiling needs for current and potential future supported features.
To succeed in this role, you will need:
- Bachelor's degree in Computer Science or related field (or equivalent experience) and 5+ years of software development experience in Python.
- Experience with GitLab (or another source code management tool), including branch/release workflows, CI/CD pipelines, etc.
- Solid understanding of algorithms, data structures, and runtime/space complexity.
- Experience working with distributed system software architecture.
- Basic understanding of HPC GPU clusters and Slurm.
- Basic understanding of machine learning concepts and terminology.
- Background with SQL and NoSQL databases (Prometheus, Elasticsearch, OpenSearch, Redis, etc.).
- Experience with distributed data pipelines, telemetry, visualization (Kibana, Grafana, etc.), and alerting (PagerDuty, etc.).
The estimated annual salary for this role is around $150,000 - $200,000, depending on your level of experience and qualifications.
-
GPU Cluster Software Engineer
2 months ago
Shanghai, Shanghai, China NVIDIA Full time
As a member of the System Software team at NVIDIA, you will be responsible for building and optimizing large-scale distributed systems infrastructure with monitoring, logging, visualization, and alerting capabilities. Your focus will be on creating profiling solutions for real-world applications running on GPU compute clusters to improve efficiency and user...
-
Senior Machine Learning Infrastructure Specialist
2 months ago
Shanghai, Shanghai, China Optiver Full time
About the Role: Optiver is a global market maker with a presence in multiple continents, and our Shanghai office is a rapidly growing participant in the Chinese markets. We are seeking a highly skilled Senior Machine Learning Platform Engineer to join our team and help shape the future of our company.
Key Responsibilities: Design and develop the infrastructure...
-
Shanghai, Shanghai, China Bosch Full time
Job Overview: We are seeking a highly skilled Software Development Engineer to join our team at Bosch, focusing on the development of automotive instrument clusters. This role offers an exciting opportunity to work on cutting-edge technologies and collaborate with cross-functional teams.
Salary: The estimated annual salary for this position is $120,000 -...
-
Automotive Cluster Software Expert
4 weeks ago
Shanghai, Shanghai, China Bosch Group Full time
We are seeking a highly skilled Automotive Cluster Software Expert to join our team at Bosch Group. As a key member of our software development team, you will play a crucial role in designing and developing cutting-edge automotive instrument clusters.
Job Summary: This is an exciting opportunity for an experienced software engineer to lead the development of...
-
GPU Graphics Architecture Engineer
1 month ago
Shanghai, Shanghai, China NVIDIA Full time
About NVIDIA: NVIDIA is a leader in the technology industry, renowned for its innovative and cutting-edge products.
Job Overview: We are seeking a skilled GPU Graphics Performance Architect to join our team. The successful candidate will be responsible for investigating and studying state-of-the-art real-time rendering techniques and their implementation on GPU,...
-
Chief Data Platform Specialist
1 month ago
Shanghai, Shanghai, China Optiver Full time
Company Overview: Optiver is a global market maker with offices around the world, united in its commitment to improving the market through competitive pricing, execution, and risk management. By providing liquidity on multiple exchanges across the globe, Optiver participates in safeguarding healthy and efficient markets.
Salary: The estimated salary for this...
-
GPU Architectural Performance Innovator
1 month ago
Shanghai, Shanghai, China NVIDIA Full time
NVIDIA is a global leader in the technology industry, renowned for its innovative and high-performance graphics solutions. As a GPU Graphics Performance Architect, you will be part of a dynamic team that drives the development of cutting-edge graphics architecture.
What You Will Do: Investigate and study state-of-the-art real-time rendering techniques to...
-
High Performance GPU Architectural Engineer
1 month ago
Shanghai, Shanghai, China NVIDIA Full time
NVIDIA - High Performance GPU Architectural Engineer
We are seeking a skilled GPU C++ Modeling Engineer to join our team. As a key member of our organization, you will play a crucial role in designing and developing high-performance GPU architectures.
About the Role: You will investigate and propose innovative architecture ideas based on thorough quantitative...
-
GPU Architect Specializing in Real-Time Rendering
2 months ago
Shanghai, Shanghai, China NVIDIA Full time
Job Title: GPU Graphics Performance Architect
About the Role: As a member of the graphics performance team at NVIDIA, you will contribute to the development of efficient and powerful graphics architectures. Your work will involve studying graphics workloads, testing innovative hardware and software solutions, and identifying areas for improvement. The goal is...
-
GPU Performance Architect
2 months ago
Shanghai, Shanghai, China NVIDIA Full time
Graphics Performance Team
NVIDIA's Graphics Performance Team is responsible for delivering efficient and powerful graphics architecture every generation. The team studies graphics workloads and tests innovative HW/SW solutions on various platforms to address inefficiencies in the current architecture. Our work paves the path for real-time rendering of complex...
-
Shanghai, Shanghai, China NVIDIA Full time
NVIDIA, a leader in the technology world, is seeking a highly skilled and innovative GPU Graphics Performance Architect Intern. As a member of our team, you will play a crucial role in delivering cutting-edge graphics architectures that set new standards for efficiency and performance. We're looking for a talented individual with a strong background in...
-
Senior Graphics Architecture Engineer
2 months ago
Shanghai, Shanghai, China Amazon Innovation Center (Shenzhen) Company Limited Shanghai Branch - O93 Full time
Job Title: Senior Graphics Architecture Engineer
About the Role: We are seeking an experienced Senior Graphics Architecture Engineer to join our team at Amazon Innovation Center (Shenzhen) Company Limited Shanghai Branch - O93. As a key member of our graphics software development team, you will be responsible for designing and implementing high-performance...
-
Shanghai, Shanghai, China NVIDIA Full time
NVIDIA is now looking for an exceptional individual to join its Compute Developer Technology team as a Deep Learning Expert Intern. This role offers the opportunity to work on cutting-edge techniques in deep learning, graphs, machine learning, and data analytics.
About NVIDIA: As a pioneer in the field of AI computing, NVIDIA has established itself as a leader...
-
Senior Embedded Systems Development Engineer
2 months ago
Shanghai, Shanghai, China Amazon Innovation Center (Shenzhen) Company Limited Shanghai Branch - O93 Full time
Job Summary: As a key member of the Amazon Innovation Center (Shenzhen) Company Limited Shanghai Branch - O93 team, you will play a pivotal role in designing, implementing, and optimizing multimedia functionalities for embedded systems. Your expertise in Linux BSP development and multimedia integration will drive the success of our projects.
Key...
-
Power Efficiency Expert for Next-Gen GPUs
1 month ago
Shanghai, Shanghai, China NVIDIA Full time
About the Role: We are seeking a Power Methodology and Analysis engineer to join our team at NVIDIA. Our company prides itself on having energy-efficient products, and we believe that maintaining this advantage over competition is key to our continued success. Our team is responsible for researching, developing, and deploying methodologies to help NVIDIA's...
-
Shanghai, Shanghai, China NVIDIA Full time
NVIDIA's success lies in its cutting-edge analysis tools, empowering engineers to optimize performance and power efficiency. We seek innovative individuals to join our software team, characterized by high standards and multifaceted challenges. This software engineering role involves developing analysis tools for various OS and hardware combinations, from...
-
High-Performance AI Software Developer
4 weeks ago
Shanghai, Shanghai, China NVIDIA Full time
We are seeking a skilled Deep Learning Performance Software Engineer to expand our research and development in Inference. This role involves developing highly optimized deep learning kernels for inference, working with cross-collaborative teams, and occasionally traveling to conferences and customers. As a Deep Learning Performance Software Engineer at...
-
Senior SW QA Automation Engineer
1 month ago
Shanghai, Shanghai, China NVIDIA Full time
NVIDIA is a world-leading innovator in GPU computing. Our mission is to fuel the advancements in gaming, automotive, professional visualization, HPC, datacenters, and networking. We are seeking an experienced Senior Software QA Test Development Engineer to join our team. In this role, you will collaborate with multi-functional groups to design, develop, and...
-
Shanghai, Shanghai, China NVIDIA Full time
Transform AI Training Performance
NVIDIA is seeking senior engineers who excel at performance analysis and optimization to drive AI training efficiency. If you're passionate about squeezing every last clock cycle out of AI training, we want to hear from you. This role offers the opportunity to directly impact the hardware and software roadmap in a...
-
Senior IP Verification Specialist
1 month ago
Shanghai, Shanghai, China NVIDIA Full time
Job Title: Senior Custom SOC IP Verification Engineer
About the Role: NVIDIA seeks a seasoned Senior IP Verification Specialist to drive the verification of cutting-edge SoC and IP solutions. As part of our team, you will contribute to delivering innovative products that transform lives. You will be responsible for ASIC design verification for various IPs at...