ATech
Big Data Engineer - Spark/Scala
Job Location
India
Job Description
Designation: BIG DATA ENGINEER
Job Description:
Your Role and Responsibilities:
- Understand data warehousing solutions and be able to work independently in such an environment
- Responsible for project development and delivery, with experience delivering several good-sized projects
- Design, build, optimize, and support new and existing data models and ETL processes based on our clients' business requirements
- Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing, data-driven organization
- Coordinate data access and security to enable data scientists and analysts to easily access data whenever they need to
- Experience developing scalable Big Data applications or solutions on distributed platforms
- Able to partner with others in solving complex problems by taking a broad perspective to identify innovative solutions
- Strong skills in building positive relationships across Product and Engineering
- Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders
- Able to quickly pick up new programming languages, technologies, and frameworks
- Experience working in Agile and Scrum development processes
- Experience working in a fast-paced, results-oriented environment
- Experience with Amazon Web Services (AWS) or other cloud platform tools
- Experience working with data warehousing tools, including DynamoDB, SQL, Amazon Redshift, and Snowflake
- Experience architecting data products on streaming, serverless, and microservices architectures and platforms
- Experience working with data platforms, including EMR, Databricks, etc.
- Experience working with distributed technology tools, including Spark, Presto, Scala, Python, Databricks, and Airflow
- Developed PySpark code for AWS Glue jobs and for EMR; worked on scalable distributed data systems using the Hadoop ecosystem on AWS EMR and the MapR distribution
- Developed Python and PySpark programs for data analysis; good working experience using Python to develop a custom framework for generating rules (similar to a rules engine)
- Developed Hadoop streaming jobs using Python for integrating Python API-supported applications
- Developed Python code to gather data from HBase and designed the solution for implementation using PySpark; Apache Spark DataFrames/RDDs were used to apply business transformations, and HiveContext objects were used to perform read/write operations
- Rewrote some Hive queries in Spark SQL to reduce the overall batch time
Required Technical and Professional Expertise:
- First and most important: a sound understanding of data structures and SQL concepts, with experience writing complex SQL, especially around OLAP systems
- Sound knowledge of an ETL tool such as Informatica (5 years of experience) and of Big Data technologies such as the Hadoop ecosystem and its various components, along with tools including Spark, Hive, Sqoop, etc.
- In-depth knowledge of MPP/distributed systems
Preferred Technical and Professional Expertise:
- The ability to write precise, scalable, and high-performance code
- Knowledge of/exposure to data modeling with OLAP (optional)
(ref:hirist.tech)
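The posting does not specify how the custom rule-generation framework mentioned above is built. As a rough, hypothetical sketch only (the `Rule` and `RuleEngine` names and the sample rules are illustrative, not the employer's actual framework), a minimal rules engine in plain Python might pair named predicates with actions applied to each record:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Rule:
    """A named predicate paired with an action to apply when it matches."""
    name: str
    predicate: Callable[[dict], bool]
    action: Callable[[dict], dict]

class RuleEngine:
    """Applies each matching rule's action to a record, in registration order."""
    def __init__(self) -> None:
        self.rules: list[Rule] = []

    def register(self, rule: Rule) -> None:
        self.rules.append(rule)

    def apply(self, record: dict) -> dict:
        for rule in self.rules:
            if rule.predicate(record):
                record = rule.action(record)
        return record

# Hypothetical sample rules: flag large orders and normalize country codes.
engine = RuleEngine()
engine.register(Rule(
    name="flag_large_order",
    predicate=lambda r: r.get("amount", 0) > 1000,
    action=lambda r: {**r, "large_order": True},
))
engine.register(Rule(
    name="normalize_country",
    predicate=lambda r: "country" in r,
    action=lambda r: {**r, "country": r["country"].upper()},
))

result = engine.apply({"amount": 2500, "country": "in"})
```

In a PySpark context, an engine like this would typically be applied per record inside a DataFrame transformation, with the rule definitions generated from configuration rather than hard-coded.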
Location: India
Posted Date: 11/21/2024
Contact Information
Contact: Human Resources, ATech