THE ROLE & THE TEAM
As a Data Engineer in the Transport team, you will have end-to-end ownership of discovering, designing, maintaining and orchestrating the team’s data and machine learning pipelines, as well as provisioning data via cloud-based processing technologies and extracting data from different source systems.
You will collaborate in a cross-functional environment with Product Managers, operations and engineers to enable analysis of, and insights into, our transport performance.
The Team:
- We are a team of data engineers within the Transport domain under Logistics. We work with Transport’s product and engineering teams to deliver the business KPIs. We are here to:
- Build reports and dashboards that present actionable insights to leadership.
- Research business requirements and translate them into source-to-target mappings.
- Conduct data discovery and determine the right strategy for ingesting data into the Datalake.
- Integrate the required datasets into the Datalake in a compliant way, sourcing them from disparate source systems owned by different product/domain teams.
- Monitor data integration and reporting jobs to ensure seamless reporting of KPIs.
WHY YOU SHOULD BE INTERESTED…
You will be joining a relatively young and growing team of data engineers spearheading Data Engineering & Analytics within the Transport domain. We trust your entrepreneurial capabilities!
You will work in a multi-functional setup to identify and support our stakeholders’ needs and use cases, and collaborate with other product and engineering teams to develop long-term, data-oriented solutions for our stakeholders.
You will share knowledge, promote clean data engineering methodologies and collaborate with your colleagues.
You will design, implement and maintain ETL pipelines that power the dashboards and reports our stakeholders use to make informed decisions about all aspects of Transport operations.
WE’D LOVE TO MEET YOU IF…
You have 3-5 years of experience in data engineering and a good understanding of data structures and algorithms.
You have an analytical mindset and solid coding skills in Python, with a focus on clean, readable and maintainable code.
You are experienced in distributed data processing using Spark SQL and PySpark (Databricks is a plus).
You are knowledgeable in SQL, ETL design and development, data modelling and interface specifications, quality assurance and testing methods.
You have strong knowledge of relational DBMSs; Redshift and Exasol are preferred.
You have experience building data solutions in the cloud, preferably on AWS.
You have experience with orchestration tools, preferably Apache Airflow.
You have excellent verbal and written communication skills and a detail-oriented, self-starter mindset.
NICE TO HAVE…
Git, CI/CD and DevOps;
BI tools such as Power BI, Google Looker Studio, MicroStrategy or Tableau;
Hands-on experience with machine learning models and AI agents.