The Data Engineer will play a critical role in enabling data-driven strategic decision-making by designing, building, and maintaining data architecture and pipelines. The role engages with every stage of the data lifecycle, with responsibilities ranging from building data platforms from scratch to migrating to new data designs and securing existing systems.
Key Responsibilities:
Data Pipeline Development:
· Develop, optimize, and maintain data pipelines.
· Apply experience with ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes to pipeline design.
· Strong knowledge of batch processing, event-based architectures, and real-time data streaming.
· Automate data collection and processing workflows.
Database Management:
· Develop and maintain database schemas (SQL and NoSQL).
· Apply advanced SQL skills to write efficient, performant queries.
· Ensure data integrity and optimize database performance.
Version Control and DevOps Practices:
· Use version control tools such as Git for collaborative development.
· Familiarity with CI/CD pipelines to automate data pipeline deployments.
· Exposure to infrastructure-as-code (e.g., Terraform or CloudFormation) for managing cloud resources.
Data Integration:
· Collaborate with data architects and business analysts to understand and fulfill data requirements.
Data Warehousing:
· Build and maintain data warehouses or data lakes for analytical purposes using tools such as Azure Data Factory, Azure Synapse, and Snowflake, with infrastructure provisioned via Terraform.
· Structure data for efficient querying and analysis.
Monitoring and Maintenance:
· Monitor data systems for performance and reliability.
· Troubleshoot issues and optimize data processes.
Documentation and Compliance:
· Document data architecture processes and workflows.
· Ensure compliance with data governance and security policies.