Data Engineer (Guangzhou)



Guangzhou, China | TEKsystems | Full time

Position Overview

AWS Cloud experience preferred, though experience with other cloud platforms is also acceptable. Key skills: Python, Airflow, Kafka. English must be fluent; senior candidates preferred.

Job Description:

We are seeking a highly motivated Data Engineer to join our dynamic Data Engineering team. As a Data Engineer, you will play a crucial role in managing and optimizing our data infrastructure, ensuring the reliability and availability of data for our analytics and business intelligence needs. You will work with cutting-edge technologies including Amazon S3, Amazon Athena, Apache Airflow, and Python to build and maintain data pipelines.

Responsibilities:

Data Pipeline Development: Design, develop, and maintain data pipelines to extract, transform, and load (ETL) data from various sources into our data lake on Amazon S3.

Data Quality Assurance: Implement data validation and quality checks to ensure the accuracy and consistency of data throughout the ETL process.

Amazon Athena Expertise: Use Amazon Athena for ad-hoc querying and analysis of data stored in Amazon S3, optimizing query performance and efficiency.

Automation: Develop and maintain automated workflows in Apache Airflow to schedule and orchestrate data processing tasks, ensuring timely and reliable data delivery (a brief illustrative sketch follows this list).

Python Programming: Write clean, efficient Python code to support data transformation, integration, and other data engineering tasks.

Data Architecture: Collaborate with senior data engineers to design and evolve the data architecture, ensuring the scalability and performance of the data infrastructure.

Documentation: Maintain comprehensive documentation of data pipelines, workflows, and processes for future reference.

Troubleshooting: Identify and resolve data pipeline issues, performance bottlenecks, and data quality concerns in a timely manner.

Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and provide data engineering support.
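
For context, here is a minimal sketch of how these pieces might fit together in a daily pipeline, assuming Airflow 2.4+ and boto3; the bucket, database, table, and output-location names are placeholders for illustration only, not part of this posting:

# Illustrative sketch only: a daily Airflow DAG that lands data in S3 and
# runs a simple validation query through Athena. Names are hypothetical.
from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator

RAW_BUCKET = "example-data-lake"                           # hypothetical S3 bucket
ATHENA_DB = "analytics"                                    # hypothetical Athena database
ATHENA_OUTPUT = "s3://example-data-lake/athena-results/"   # query result location


def extract_and_load(**_):
    """Extract a small payload and land it in the S3 data lake."""
    payload = b"id,value\n1,42\n"  # stand-in for data pulled from a source system
    boto3.client("s3").put_object(
        Bucket=RAW_BUCKET,
        Key=f"raw/sample/{datetime.utcnow():%Y-%m-%d}.csv",
        Body=payload,
    )


def validate_with_athena(**_):
    """Run a simple row-count check over the landed data via Athena."""
    boto3.client("athena").start_query_execution(
        QueryString="SELECT COUNT(*) FROM sample_table",  # hypothetical table
        QueryExecutionContext={"Database": ATHENA_DB},
        ResultConfiguration={"OutputLocation": ATHENA_OUTPUT},
    )


with DAG(
    dag_id="sample_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    load = PythonOperator(task_id="extract_and_load", python_callable=extract_and_load)
    check = PythonOperator(task_id="validate_with_athena", python_callable=validate_with_athena)
    load >> check

In a real pipeline the extract step would pull from actual source systems and the Athena query would encode genuine data quality rules, but the structure above reflects the day-to-day work described in this role.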

Qualifications:

Bachelor's degree in Computer Science, Information Technology, or a related field.

Strong knowledge of Amazon S3, Amazon Athena, and Apache Airflow.

Proficiency in Python programming for data manipulation and scripting.

Basic understanding of data modelling, SQL, and database concepts.

Excellent problem-solving and troubleshooting skills.

Strong communication and teamwork abilities.

Eagerness to learn and adapt to new technologies and tools.