About the Role
The Data Analytics team enables data-driven decision making at GrowSari. As a data engineer, you will play a valuable role on the team by building ELT data pipelines that capture data from various sources, load it into our data infrastructure, and transform it into forms that are useful and analytics-friendly.
The primary goal of the Data Engineer is to unlock productivity gains for the analytics team by bringing new datasets into our data warehouse quickly, maintaining optimal performance of our data infrastructure, and ensuring that all our datasets are available on time and are of high quality.
We are looking for someone who is up for any kind of challenge, is excited about new technologies, understands the importance of reliable data, and has a strong sense of ownership.
- Work closely with the Data Analytics Team members to clearly define data requirements
- Develop data pipelines that capture new datasets from production systems, internal teams, and external sources, and make them available in our data warehouse
- Maintain existing data pipelines and infrastructure, ensuring high levels of data quality and system health
- Upgrade our main data pipelines from incremental batch jobs to a streaming data architecture that will enable real-time analytics at scale
- Provide data expertise and guidance to other team members on how to efficiently write SQL statements and process data for analytics
- Collaborate with the team to assess the rapidly evolving data needs of GrowSari, explore new technologies, and initiate improvement projects
- Data analytics excites you.
- You are an expert in SQL and Python.
- You know the difference between OLTP and OLAP.
- You can design and implement an ELT/ETL pipeline from scratch.
- You are skilled in at least one data orchestration tool, e.g. Apache Airflow.
- You have at least 2 years of relevant work experience in data engineering.
- You are a self-starter. You know how to add value to the team without passively waiting for tasks to be assigned to you.
- You have prior experience with the following AWS services: Amazon Redshift, EC2, S3, RDS, DynamoDB, and Kinesis.
- You are familiar with data streaming architectures and relevant technologies.
- You have previous experience in other analytics roles, e.g. data analyst or data scientist.