Data Engineer

Location : Cochin

Employment Type : Full Time

Work Mode : Hybrid

Experience : 3-5 yrs

Job Code : BEO-9687

Posted Date : 11/10/2024

Job Description

Responsibilities

The Data Engineer plays a critical role in enabling data-driven decision-making by designing, building, and maintaining data architecture and pipelines. The role spans the full data lifecycle: building data platforms from scratch, transitioning to new data designs, and securing existing systems.

Key Responsibilities:

Data Pipeline Development: 

·        Develop, optimize, and maintain data pipelines.

·        Experience building, optimizing, and maintaining data pipelines, including familiarity with ETL (Extract, Transform, Load) and ELT processes.

·        Strong knowledge of batch processing, event-based architectures, and real-time data streaming.

·        Automate data collection and processing workflows.

Database Management: 

·        Develop and maintain database schemas (SQL and NoSQL). 

·        Advanced SQL skills for writing efficient, powerful queries.

·        Ensure data integrity and optimize database performance.

Version Control and DevOps Practices:

·        Experience using version control tools like Git for collaboration.

·        Familiarity with CI/CD pipelines to automate data pipeline deployments.

·        Exposure to infrastructure-as-code tools (e.g., Terraform or CloudFormation) for managing cloud resources.

Data Integration: 

·        Collaborate with data architects and business analysts to understand and fulfill data requirements.

Data Warehousing: 

·        Build and maintain data warehouses or lakes for analytical purposes using tools like Azure Data Factory, Azure Synapse, Snowflake, and Terraform scripts. 

·        Structure data for efficient querying and analysis.

Monitoring and Maintenance: 

·        Monitor data systems for performance and reliability. 

·        Troubleshoot issues and optimize data processes.

Documentation and Compliance: 

·        Document data architecture processes and workflows. 

·        Ensure compliance with data governance and security policies.

Desired Candidate Profile

Programming Languages: 

·        Proficiency in Python.

Modern Tools and Technologies: 

·        Familiarity with the Azure platform, Terraform, dbt, and Vertex AI, with Python as the primary programming language.

Data Management: 

·        Experience with SQL and NoSQL databases (e.g., MongoDB, Cassandra).

Big Data Technologies: 

·        Familiarity with tools like Kafka, Hadoop, and Spark.

Cloud Platforms: 

·        Experience with Azure services (or AWS, Google Cloud).

Data Modelling: 

·        Understanding of data modelling techniques and concepts.

Version Control: 

·        Knowledge of Git for code management.

Education and Experience:

Education: 

Bachelor’s degree in Computer Science, Data Science, or a related field. An advanced degree may be preferred.

Experience: 

Typically 3-5 years of experience in data engineering or a related role.

Problem Solving & Soft Skills:

·        Adept at troubleshooting pipeline failures, optimizing data flows, and ensuring scalability. 

·        Experience with incident management in production environments. 

·        Strong analytical thinking for solving complex problems and analyzing large datasets. 

·        Excellent verbal and written communication skills for collaborating with cross-functional teams. 

·        Attention to detail to ensure high data quality and integrity. 

·        Flexibility in working across time zones and with distributed teams. 

·        Ability to prioritize tasks and meet deadlines with minimal supervision. 

·        Familiarity with agile methodologies and project management tools like Jira is a plus.

Nice-to-Haves:

Experience with Production Systems and Client Data:

·        Familiarity with real-world production-level systems.

Agile/Scrum Knowledge:

·        Hands-on experience with tools like Jira and Confluence.

ML/AI Data Pipeline Experience: 

·        Familiarity with technologies like Airflow, Spark, and LangChain.

Database Expertise: 

·        Ability to optimize databases for specific use cases.

Data Integration Platforms: 

·        Experience with Kafka and similar platforms.

MLOps:

·        Knowledge of managing and monitoring machine learning models in production.

Cloud Experience:

·        Proficiency in cloud environments, especially Azure, with data engineering components (e.g., Dataflow, CI/CD, Azure DevOps, Git, Python, PySpark, Streamlit, Vertex AI, Dataproc, SageMaker, HDInsight, Databricks).

General Networking and Security Knowledge: 

·        Expected to have a basic understanding of networking and security principles.

Career Path:

·        Start as a Data Engineer or Data Analyst. 

·        Progress to Senior Data Engineer or Data Scientist roles based on career goals and performance.
