The Metromax Group

Data Engineer - AWS Glue/EMR

Job Location

Bangalore, India

Job Description

Key Responsibilities:

- Design and develop data pipelines using AWS Glue, EMR, Spark Scala, and S3 to support both batch and real-time data processing needs.
- Implement ETL processes to extract, transform, and load data from various sources (structured and unstructured) into the data lake.
- Leverage Apache Spark on EMR for big data processing and transformations using Spark Scala.
- Manage and optimize data storage on S3, ensuring proper data partitioning, file formats (Parquet, ORC, Avro), and lifecycle policies for cost-effective storage.
- Monitor, troubleshoot, and optimize EMR clusters for performance, scalability, and cost efficiency.
- Collaborate with data architects and analysts to ensure seamless data integration and to support advanced analytics and machine learning models.
- Automate data workflows using AWS Step Functions, Lambda, and other orchestration tools.

Required Qualifications:

- Bachelor's or master's degree in Computer Science, Data Engineering, or a related technical field.
- 3-5 years of experience in data engineering, particularly using AWS services (EMR, Glue, S3, Lambda).
- Strong expertise in Apache Spark for distributed data processing, with hands-on coding experience in Scala and Python.
- Experience building ETL pipelines and working with big data in a cloud-based lakehouse environment.
- Deep understanding of data formats (Parquet, Avro, ORC) and file optimization techniques.
- Familiarity with data modeling principles, including partitioning, bucketing, and schema management in the AWS Glue Data Catalog.
- Strong knowledge of SQL and query optimization for working with large datasets.
- Experience with AWS security services such as IAM and KMS (Key Management Service), and with encryption best practices.
- Proficiency in troubleshooting and performance tuning of Spark and EMR clusters for large-scale data processing.
- Familiarity with CI/CD pipelines and infrastructure-as-code (Terraform, CloudFormation) for managing AWS environments.

Preferred Qualifications:

- AWS Certified Data Analytics, Developer, or Solutions Architect certification.
- Experience with streaming data technologies such as Kinesis or Kafka for real-time data ingestion.
- Knowledge of serverless computing and experience with AWS Lambda, Step Functions, and DynamoDB.
- Familiarity with DevOps and automation tools (e.g., Jenkins, Git, Docker).

Soft Skills:

- Strong problem-solving and analytical thinking skills.
- Ability to work collaboratively in a fast-paced, cross-functional environment.
- Excellent communication skills to explain complex technical issues to both technical and non-technical stakeholders.

(ref:hirist.tech)

Location: Bangalore, IN

Posted Date: 11/21/2024

Contact Information

Contact Human Resources
The Metromax Group

Posted

November 21, 2024
UID: 4918208112
