AI Framework Engineer - Pudong - Advanced Micro Devices, Inc
3 weeks ago
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
Position Overview
We are seeking a highly experienced engineer specializing in large language model (LLM) inference performance optimization. As a core member of our team, you will be responsible for building and optimizing high-throughput, low-latency LLM inference on AMD Instinct GPUs. If you are passionate about pushing performance boundaries and have deep, hands-on expertise with cutting-edge technologies like vLLM or SGLang, we invite you to join us.
Key Responsibilities
1. Core System Optimization: Lead the development, tuning, and customization of LLM inference on AMD GPUs, leveraging and extending frameworks like vLLM or SGLang to address performance bottlenecks in production environments.
2. Performance Analysis & Tuning: Conduct end-to-end performance profiling using specialized tools. Perform deep optimization of compute-bound operators (e.g., Attention), memory I/O, and communication to significantly increase throughput and reduce latency.
3. Model Architecture Adaptation: Demonstrate expertise in mainstream LLM architectures (e.g., DeepSeek, Qwen, Llama, ChatGLM) and optimize inference for their specific characteristics (e.g., RoPE, SWA, MoE, GQA).
4. Algorithm & Principle Application: Leverage your deep understanding of core algorithms (Transformer, Attention, MoE) to implement advanced optimization techniques such as PagedAttention, FlashAttention, continuous batching, quantization, and model compression.
5. Technology Foresight & Implementation: Research and prototype state-of-the-art optimization techniques (e.g., Speculative Decoding, Weight-Only Quantization) and drive their adoption into production systems.
Qualifications:
Mandatory Requirements:
1. Expertise in Inference Frameworks: Proven, hands-on experience with vLLM or SGLang, including deep understanding of their source code, deployment, configuration, and performance tuning. (Please describe relevant projects in your resume).
2. Mastery of Model Architectures: In-depth understanding and practical experience with inference workflows of mainstream LLMs (e.g., DeepSeek, Qwen), including their tokenizers, model configurations, and architecture definitions.
3. Strong Theoretical Foundation: Solid grasp of the principles behind Transformer, Self-Attention, MoE, KV Cache, and their impact on inference performance.
4. Proven Optimization Experience: Familiarity with end-to-end LLM inference optimization techniques such as PagedAttention, FlashAttention, continuous/dynamic batching, and quantization (INT8/INT4/GPTQ/AWQ), demonstrated with successful case studies.
5. Programming Skills: Proficiency in Python and a strong command of software engineering best practices.
Preferred Qualifications (Plus):
1. Low-Level Development Skills: Experience with CUDA C++ programming for writing and debugging high-performance GPU kernels; or practical experience using Triton to develop and optimize deep learning operators.
2. Compiler Knowledge: Understanding or practical experience with compiler technologies like TVM or MLIR is a significant advantage.
3. Distributed Systems Experience: Hands-on experience with distributed inference for large-scale models (e.g., Tensor Parallel, Pipeline Parallel).
4. Education: Master's or Ph.D. in Computer Science, Artificial Intelligence, Electrical Engineering, or a related field.
Benefits offered are described in "AMD benefits at a glance".
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.