AI Framework Engineer

3 weeks ago


Pudong, China Advanced Micro Devices, Inc Full time


WHAT YOU DO AT AMD CHANGES EVERYTHING 

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.



Position Overview
We are seeking a highly experienced engineer specializing in large language model (LLM) inference performance optimization. As a core member of our team, you will be responsible for building and optimizing high-throughput, low-latency LLM inference on AMD Instinct GPUs. If you are passionate about pushing performance boundaries and have deep, hands-on expertise with cutting-edge technologies like vLLM or SGLang, we invite you to join us.
Key Responsibilities
1. Core System Optimization: Lead the development, tuning, and customization of LLM inference optimizations on AMD GPUs, leveraging and extending frameworks such as vLLM or SGLang to address performance bottlenecks in production environments.
2. Performance Analysis & Tuning: Conduct end-to-end performance profiling using specialized tools. Perform deep optimization of compute-bound operators (e.g., Attention), memory I/O, and communication to significantly increase throughput and reduce latency.
3. Model Architecture Adaptation: Demonstrate expertise in mainstream LLM architectures (e.g., DeepSeek, Qwen, Llama, ChatGLM) and optimize inference for their specific characteristics (e.g., RoPE, SWA, MoE, GQA).
4. Algorithm & Principle Application: Leverage your deep understanding of core algorithms (Transformer, Attention, MoE) to implement advanced optimization techniques such as PagedAttention, FlashAttention, continuous batching, quantization, and model compression.
5. Technology Foresight & Implementation: Research and prototype state-of-the-art optimization techniques (e.g., Speculative Decoding, Weight-Only Quantization) and drive their adoption into production systems.


Qualifications:
Mandatory Requirements:
1. Expertise in Inference Frameworks: Proven, hands-on experience with vLLM or SGLang, including a deep understanding of their source code, deployment, configuration, and performance tuning. (Please describe relevant projects in your resume.)
2. Mastery of Model Architectures: In-depth understanding and practical experience with inference workflows of mainstream LLMs (e.g., DeepSeek, Qwen), including their tokenizers, model configurations, and architecture definitions.
3. Strong Theoretical Foundation: Solid grasp of the principles behind Transformer, Self-Attention, MoE, KV Cache, and their impact on inference performance.
4. Proven Optimization Experience: Familiarity with end-to-end LLM inference optimization techniques such as PagedAttention, FlashAttention, continuous/dynamic batching, and quantization (INT8/INT4/GPTQ/AWQ), demonstrated with successful case studies.
5. Programming Skills: Proficiency in Python and strong adherence to software engineering best practices.
Preferred Qualifications (Plus):
1. Low-Level Development Skills: Experience with CUDA C++ programming for writing and debugging high-performance GPU kernels; or practical experience using Triton to develop and optimize deep learning operators.
2. Compiler Knowledge: Understanding of, or practical experience with, compiler technologies such as TVM or MLIR is a significant advantage.
3. Distributed Systems Experience: Hands-on experience with distributed inference for large-scale models (e.g., Tensor Parallel, Pipeline Parallel).
4. Education: Master's or Ph.D. in Computer Science, Artificial Intelligence, Electrical Engineering, or a related field.




Benefits offered are described in: AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.