DataLake AI Platform Operation Engineer
3 days ago
We help the world run better
At SAP, we enable you to bring out your best. Our company culture is focused on collaboration and a shared passion to help the world run better. How? We focus every day on building the foundation for tomorrow and creating a workplace that embraces differences, values flexibility, and is aligned to our purpose-driven and future-focused work. We offer a highly collaborative, caring team environment with a strong focus on learning and development, recognition for your individual contributions, and a variety of benefit options for you to choose from.
We are seeking a skilled and motivated individual to join our team as a DataLake AI Platform Operations Engineer. This role focuses on Cloud Infrastructure, Kubernetes (K8S), and Machine Learning, as well as AI Model Training tooling solutions. In this position, you will be responsible for setting up and managing AI and general computing infrastructure connected to an OpenStack-based private cloud, provisioning cloud resources from IaaS, implementing various service components to support distributed model training tasks and productive use-case serving instances across K8S clusters, and overseeing the runtime metrics of each component while continuously optimizing them.
What You'll Do:
----------------
- Infrastructure Operation: Utilize OpenStack-based IaaS resources and optimize their provisioning to ensure efficient infrastructure operations.
- Cross-Node Resource Management: Manage Kubernetes clusters across different regions and availability zones, ensuring optimal performance for use-cases and shared services while minimizing resource consumption.
- Logging, Auditing, and Metrics: Implement distributed logging solutions using Loki and OpenSearch. Configure auditing for each use-case and collect Prometheus-based metrics from both platform services and use-cases.
- Dashboarding and Monitoring: Develop dashboards tailored to specific needs and monitor the platform using the dashboard tools you create.
- Support Platform Use-Cases: Assist use-case development teams in maximizing the platform's capabilities for their projects.
- TCO Management: Automate the calculation of the total cost of ownership for platform infrastructure and licenses, and allocate these costs to each specific use-case.
- Collaboration, Documentation, and Training: Collaborate with peers across regions to support various projects, document new changes, and provide training to platform users.
What You Bring:
----------------
- Bachelor's degree in Computer Science, Engineering, or a related field; advanced degrees are a plus.
- Basic understanding of GPU-based computing concepts, and familiarity with AI/ML frameworks and tools such as CUDA, Kubeflow, Spark, or PyTorch.
- Solid knowledge of Kubernetes and container orchestration concepts.
- Proficiency in coding languages (e.g., Python, Go, Shell) for automation and infrastructure management.
- Proven experience in infrastructure and operations management for cloud service solutions.
- Strong problem-solving skills and the ability to diagnose and resolve complex technical issues.
- Excellent communication and collaboration skills to work effectively with cross-functional teams.
- Strong attention to detail and the ability to manage multiple priorities in a fast-paced environment.
Join our dynamic team and contribute to cutting-edge solutions in AI and cloud infrastructure.
Bring out your best
SAP innovations help more than four hundred thousand customers worldwide work together more efficiently and use business insight more effectively. Originally known for leadership in enterprise resource planning (ERP) software, SAP has evolved to become a market leader in end-to-end business application software and related services for database, analytics, intelligent technologies, and experience management. As a cloud company with two hundred million users and more than one hundred thousand employees worldwide, we are purpose-driven and future-focused, with a highly collaborative team ethic and commitment to personal development. Whether connecting global industries, people, or platforms, we help ensure every challenge gets the solution it deserves. At SAP, you can bring out your best.
We win with inclusion
SAP’s culture of inclusion, focus on health and well-being, and flexible working models help ensure that everyone – regardless of background – feels included and can run at their best. At SAP, we believe we are made stronger by the unique capabilities and qualities that each person brings to our company, and we invest in our employees to inspire confidence and help everyone realize their full potential. We ultimately believe in unleashing all talent and creating a better and more equitable world.
SAP is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to the values of Equal Employment Opportunity and provide accessibility accommodations to applicants with physical and/or mental disabilities. If you are interested in applying for employment with SAP and are in need of accommodation or special assistance to navigate our website or to complete your application, please send an e-mail with your request to Recruiting Operations Team: Careers@sap.com
For SAP employees: Only permanent roles are eligible for the SAP Employee Referral Program, according to the eligibility rules set in the SAP Referral Policy. Specific conditions may apply for roles in Vocational Training.
EOE AA M/F/Vet/Disability:
Qualified applicants will receive consideration for employment without regard to their age, race, religion, national origin, ethnicity, gender (including pregnancy, childbirth, et al.), sexual orientation, gender identity or expression, protected veteran status, or disability.
Successful candidates might be required to undergo a background verification with an external vendor.
Requisition ID: 398759 | Work Area: Software-Development Operations | Expected Travel: 0 - 10% | Career Status: Professional | Employment Type: Regular Full Time | Additional Locations: #LI-Hybrid.
Location: Shanghai, China | Company: SAP | Employment Type: Full time