ANLAGE
Data Engineer - Scala
Job Location
India
Job Description
Job Overview:
We are seeking a highly skilled and motivated Scala Data Engineer to join our team. In this role, you will design, build, and maintain large-scale data processing systems. The ideal candidate will have a strong background in Scala programming and distributed data processing frameworks, and a deep understanding of data architecture and pipelines. You will collaborate closely with data scientists, data analysts, and other engineering teams to optimize the data infrastructure.

Key Responsibilities:
- Data Pipeline Development: Design, develop, and maintain efficient, reliable, and scalable data pipelines using Scala and distributed frameworks such as Apache Spark or Kafka (a brief sketch follows this description).
- Data Integration: Integrate data from various sources into data lakes or data warehouses, ensuring high data quality and performance.
- Performance Optimization: Optimize existing processes and code for performance, scalability, and reliability, especially in distributed environments.
- Data Architecture: Help define and implement data architecture that supports business analytics, machine learning, and reporting needs.
- Collaboration: Work closely with data scientists, analysts, and other engineers to translate data requirements into technical solutions.
- Monitoring & Maintenance: Build tools and processes for monitoring, alerting, and ensuring the reliability of the data infrastructure.
- Testing & Debugging: Write unit, integration, and end-to-end tests for data pipelines and ensure data integrity across systems.
- Documentation: Provide clear, concise documentation for both technical and non-technical stakeholders.

Requirements:
- Experience: 7 years of experience in data engineering, with a focus on Scala and big data technologies.
- Programming Languages: Proficient in Scala; experience with Java, Python, or other JVM-based languages is a plus.
- Big Data Frameworks: Hands-on experience with Apache Spark, Kafka, Hadoop, or similar frameworks.
- Data Storage Systems: Experience with relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., Cassandra, HBase).
- Data Warehousing: Familiarity with data warehousing solutions such as Amazon Redshift or Google BigQuery.
- Cloud Platforms: Experience with cloud-based data platforms (AWS, Azure, GCP).
- Data Processing: Strong understanding of distributed, batch, and streaming data processing techniques.
- Version Control: Proficiency with version control systems such as Git.
- Problem-Solving Skills: Strong analytical and problem-solving skills with a focus on continuous improvement.
- Communication: Excellent communication and teamwork skills, with the ability to put technical issues into business context.

Preferred Qualifications:
- Machine Learning: Experience with data pipelines supporting machine learning models.
- ETL Tools: Familiarity with ETL tools such as Apache NiFi, Talend, or Airflow.
- CI/CD: Experience with continuous integration and deployment for data pipelines.
- Microservices: Experience building and managing microservices for data-intensive applications.
- DevOps: Familiarity with DevOps practices, especially as they apply to data engineering.

(ref:hirist.tech)
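To give a concrete sense of the pipeline work described above, here is a minimal sketch of a Spark batch job in Scala; the app name, input/output paths, and column names are hypothetical placeholders, and the details would depend on the actual data and platform.

// Illustrative only: a minimal Spark batch pipeline in Scala.
// App name, paths, and column names are hypothetical placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object EventPipeline {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("event-pipeline")
      .getOrCreate()

    // Read raw events from a (hypothetical) landing zone.
    val events = spark.read.json("s3://example-bucket/raw/events/")

    // Basic cleanup and a daily aggregate, the kind of batch transform
    // described in the responsibilities above.
    val daily = events
      .filter(col("userId").isNotNull)
      .withColumn("day", to_date(col("timestamp")))
      .groupBy("day", "eventType")
      .agg(count("*").as("eventCount"))

    // Write partitioned Parquet to a (hypothetical) curated zone.
    daily.write
      .mode("overwrite")
      .partitionBy("day")
      .parquet("s3://example-bucket/curated/daily_event_counts/")

    spark.stop()
  }
}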
Location: India
Posted Date: 11/28/2024
Contact Information
Contact: Human Resources, ANLAGE