Prama.ai
Prama.ai - Python/PySpark Developer - Data Engineering
Job Location
India
Job Description
About the Role:

We are seeking a highly skilled and motivated Python/PySpark Data Engineer to join our growing data engineering team. In this role, you will play a crucial part in building and maintaining robust, efficient data pipelines that power our data-driven decision making. You will work closely with data engineers, analysts, and other stakeholders to design, develop, and deploy high-performance data solutions on cloud platforms, primarily AWS.

Responsibilities:

Data Pipeline Development & Maintenance:
- Design, develop, and maintain data pipelines using PySpark on cloud platforms such as AWS EMR, AWS Glue, and Databricks.
- Extract, transform, and load (ETL) large datasets from various sources (e.g., databases, APIs, cloud storage) into data warehouses and data lakes.
- Optimize data pipelines for performance, scalability, and cost-effectiveness using techniques such as data partitioning, caching, and indexing.
- Implement data quality checks and validation procedures to ensure data accuracy and integrity.
- Troubleshoot and resolve data pipeline issues promptly and effectively.

Python & PySpark Proficiency:
- Write clean, efficient, and well-documented Python code for data processing, transformation, and analysis.
- Leverage advanced PySpark features such as DataFrames and Spark SQL for data manipulation and aggregation.
- Experience with Spark Streaming and real-time data processing is a plus.

Cloud Technologies:
- Hands-on experience with AWS services such as S3, Redshift, Glue, EMR, and IAM.
- Familiarity with cloud-native data platforms and tools (e.g., AWS Glue Data Catalog, AWS Athena) is a plus.

Data Warehousing & ETL/ELT:
- Strong understanding of data warehousing concepts, including dimensional modeling, data marts, and data lakes.
- Experience with ETL/ELT processes and orchestration tools (e.g., Airflow, Prefect).
Collaboration & Communication:
- Collaborate effectively with data engineers, data analysts, data scientists, and business stakeholders to understand data requirements and translate them into technical solutions.
- Clearly communicate technical concepts and project progress to both technical and non-technical audiences.

Continuous Learning:
- Stay up to date with the latest advancements in data engineering technologies, best practices, and industry trends.

Qualifications:
- Bachelor's degree in Computer Science, Computer Engineering, or a related field.
- 3 years of professional experience in Python development.
- 2 years of hands-on experience with PySpark and the Spark ecosystem.
- Strong understanding of data structures, algorithms, and object-oriented programming principles.
- Proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL, Oracle).
- Experience with data warehousing concepts, ETL/ELT processes, and data modeling techniques.
- Excellent analytical and problem-solving skills, with the ability to identify and resolve complex data issues.
- Strong communication and interpersonal skills, with the ability to work effectively in a collaborative team environment.
- Experience with Agile development methodologies is a plus.

Bonus Points:
- Experience with containerization technologies such as Docker and Kubernetes.
- Knowledge of machine learning and data science concepts.
- Experience with data visualization tools (e.g., Tableau, Power BI).

(ref:hirist.tech)
Location: India
Posted Date: 2/19/2025
Contact Information
Contact: Human Resources, Prama.ai