CygnusPro Software Solutions Pvt. Ltd
Data Engineering Architect - Google Cloud Platform
Job Location
Bangalore, India
Job Description
Role: Data Engineering Architect (GCP & Apache Spark)
Job Summary:
We are seeking a highly skilled Data Engineering Architect with deep expertise in Google Cloud Platform (GCP) and Apache Spark to architect, design, and implement scalable, high-performance data lake solutions. The ideal candidate will have extensive experience in building data ingestion pipelines and managing big data processing using Apache Spark.
Key Requirements:
- Over 12 years of professional experience in data engineering, specializing in implementing large-scale enterprise data engineering projects with the latest technologies.
- Over 5 years of hands-on experience with GCP technologies and over 3 years of experience as an architect.
- Design and implement end-to-end data architectures leveraging GCP services (e.g., BigQuery, Cloud Storage, Dataflow, Pub/Sub, Cloud Composer) for large-scale data ingestion and processing.
- Build and optimize large-scale data pipelines using Apache Spark on GCP (via Dataproc or other Spark services).
- Ensure high performance and scalability in Spark-based data processing workloads.
- Lead the integration of SAP S/4HANA data with GCP for real-time and batch data processing.
- Manage extract, transform, and load (ETL) processes from SAP S/4HANA into cloud storage and data lakes.
- Develop and manage scalable data ingestion pipelines for structured and unstructured data using tools such as Cloud Dataflow, Cloud Pub/Sub, and Apache Spark.
- Provide architectural guidance for designing secure, scalable, and efficient data solutions on Google Cloud Platform, integrating with on-premises and cloud systems such as SAP S/4HANA.
- Implement both real-time streaming and batch processing pipelines using Apache Spark, Dataflow, and other GCP services to meet business requirements.
- Implement data governance, access controls, and security best practices to ensure the integrity, confidentiality, and compliance of data across systems.
- Collaborate with business stakeholders, data scientists, and engineering teams to define data requirements, ensuring the architecture aligns with business goals.
- Optimize Apache Spark jobs for performance, scalability, and cost-efficiency, ensuring that the architecture can handle growing data volumes.
- Provide technical leadership to the data engineering team, mentoring junior engineers in data architecture, Apache Spark development, and GCP best practices.
Technical Expertise:
- Expert-level programming proficiency in Python, Java, and Scala.
- Extensive hands-on experience with big data technologies, including Spark, Hadoop, Hive, YARN, MapReduce, Pig, Kafka, and PySpark.
- Proficient in Google Cloud Platform services such as BigQuery, Dataflow, Cloud Storage, Dataproc, Cloud Composer, Pub/Sub, and Cloud Functions.
- Expertise in Apache Spark for both batch and real-time processing, as well as proficiency in Apache Beam, Hadoop, or other big data frameworks.
- Experienced in using Cloud SQL, BigQuery, and Looker Studio (formerly Google Data Studio) for cloud-based data solutions.
- Skilled in orchestration and deployment tools such as Cloud Composer, Airflow, and Jenkins for continuous integration and deployment (CI/CD).
- Expertise in designing and developing integration solutions involving Hadoop/HDFS, real-time systems, data warehouses, and analytics solutions.
- Experience with DevOps practices, including version control (Git), CI/CD pipelines, and infrastructure-as-code (e.g., Terraform, Cloud Deployment Manager).
- Strong background in working with relational databases, NoSQL databases, and in-memory databases.
- Experience managing large datasets within Data Lake and Data Fabric architectures.
- Strong knowledge of security best practices, IAM, encryption mechanisms, and compliance frameworks (GDPR, HIPAA) within GCP environments.
- Experience in implementing data governance, data lineage, and data quality frameworks.
- In-depth knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development, and big data solutions.
- Experience leading the end-to-end design, development, deployment, and maintenance of data engineering projects.
- Excellent debugging and problem-solving skills.
- Retail and e-commerce domain knowledge is a plus.
- Positive attitude with strong analytical skills and the ability to guide teams effectively.
Preferred Qualifications:
- GCP certifications such as Professional Data Engineer or Professional Cloud Architect.
- Apache Spark and Python certifications.
- Experience with data visualization tools such as Tableau and Power BI.
(ref:hirist.tech)
Location: Bangalore, IN
Posted Date: 11/28/2024
Contact Information
Contact: Human Resources, CygnusPro Software Solutions Pvt. Ltd