Data Engineer (Fixed Term Contract)

  • BeZero Carbon
  • London E2, UK
  • 16/04/2024
Full time · Data Engineering · Business Intelligence · Software Engineering · Data Warehouse

Job Description

About Us

BeZero Carbon is a global ratings agency for the Voluntary Carbon Market. We distribute our ratings via our SaaS product, BeZero Carbon Markets, informing all market participants how to price and manage risk. Our ratings and research tools support buyers, intermediaries, investors and carbon project developers.

Founded in April 2020, our 150+ strong team combines climatic and earth sciences, sell-side financial research, earth observation, machine learning, data and technology, engineering, and public policy expertise. We work from five continents.

We raised a significant Series B funding round in late 2022, and are growing rapidly as a company, accelerating the Net-Zero transition through ratings.

www.bezerocarbon.com

BeZero is looking for a data engineer to join our existing data products and tooling team for a fixed-term contract of 3–4 months.

This team is part of our broader data organisation and is focused on developing carbon offset-related data products for our clients, as well as building internal data tools to increase the efficiency of our Ratings teams. This is a cross-functional role: you will work with colleagues from our product, ratings, and software engineering teams every day.

BeZero has invested heavily in the development of internal tools to increase the operational efficiency of its ratings production process. As part of this investment, the team has been working on our in-house central data portal, which enables rating analysts to access prepared and curated data essential for evaluating carbon offset projects. Alongside helping our team with day-to-day maintenance of the existing data pipelines, this contract role would be responsible for delivering two projects:

1. Rearchitecting our dbt project: We currently have a single dbt project that manages our analytical data model. We want to adjust our project to achieve the following:

a. Allow smaller slices of the project to run independently and on different frequencies.

b. Better protect against bad data making its way to our presentation layers.

c. Unify and define patterns to build a common toolset that can be applied to different use cases.

2. Extend our analytical stack to power in-house analyses: We currently expose datasets and analyses to our ratings and analytics teams via Metabase to feed into their ratings process. Ratings processes differ substantially across projects, resulting in varying requirements for datasets and analyses.

a. Re-assess and define patterns to ingest and make third-party data available to analytics users

b. Extend our ingest and dbt pipelines to surface and process the data

c. Work with our analytics stakeholders to fulfil their business requirements
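A common pattern for item (a) above — making third-party data available to analytics users in a consistent shape — is to normalise each provider's payload into one shared record type before loading, so onboarding a new source only requires a field mapping rather than a new pipeline. A minimal, stdlib-only sketch; all source and field names here are hypothetical, not BeZero's actual schema:

```python
from dataclasses import dataclass
from datetime import date
from typing import Any

@dataclass(frozen=True)
class ProjectRecord:
    """Common shape that downstream dbt models would consume (hypothetical)."""
    project_id: str
    vintage_year: int
    credits_issued: int
    as_of: date

# Per-source field mappings: (id key, vintage key, issued key).
# Both registries and their field names are invented for illustration.
FIELD_MAPPINGS = {
    "registry_a": ("projectId", "vintage", "issued"),
    "registry_b": ("id", "vintage_year", "credits"),
}

def normalise(raw: dict[str, Any], source: str) -> ProjectRecord:
    """Map one third-party payload onto the common schema."""
    id_key, vintage_key, issued_key = FIELD_MAPPINGS[source]
    return ProjectRecord(
        project_id=str(raw[id_key]),
        vintage_year=int(raw[vintage_key]),
        credits_issued=int(raw[issued_key]),
        as_of=date.today(),
    )
```

With this shape, the ingest job stays a thin loop over payloads, and schema differences between providers are confined to the mapping table.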

We have ideas on how to achieve the projects above but are looking for a dbt expert to assess what we have, work with us to find the best approach for our team, define best practices and ultimately execute the required changes. You’d be a great fit if you’ve worked on similar projects before, perhaps as a Data Engineer or Analytics Engineer.
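For context, goals (a) and (b) of the first project map naturally onto standard dbt features: tags plus YAML selectors let slices of the DAG run independently on their own schedules, and schema tests block bad data before it reaches the presentation layer. A sketch, where the model, tag, and selector names are hypothetical:

```yaml
# selectors.yml — run one slice of the DAG on its own schedule,
# e.g. `dbt build --selector daily_marts`
selectors:
  - name: daily_marts
    description: Models refreshed daily (hypothetical tag)
    definition:
      method: tag
      value: daily

# models/marts/schema.yml — tests gate the presentation layer
models:
  - name: fct_project_ratings   # hypothetical model
    columns:
      - name: project_id
        tests:
          - not_null
          - unique
```

Running `dbt build` (rather than `dbt run`) executes tests alongside models, so a failing test stops bad data from propagating downstream.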

Tech stack

Our data stack includes the following technologies:

  • AWS serves as our cloud infrastructure provider.
  • Snowflake acts as our central data warehouse for tabular data. AWS S3 is used for our geospatial raster data, and we use AWS RDS instances with PostGIS for storing and querying geospatial vector data.
  • We heavily use dbt for building SQL data models and Python jobs for any non-SQL data ingestion and transformations (typically for API integrations).
  • Our computational jobs are executed in Docker containers on AWS ECS, and we use Prefect as our workflow orchestration engine.
  • GitHub Actions handles CI/CD.
  • Metabase serves as a dashboarding solution.

We are a remote-friendly company and many of our colleagues work fully remote; however, for this position, we will only consider applications from candidates based in the UK.

You’ll be our ideal candidate if:

  • You have at least 4 years of experience building ELT/ETL pipelines in production, using Python and SQL.
  • You are deeply familiar with dbt and have experience scaling dbt repositories beyond a couple of hundred models.
  • You’ve designed back-end services and deployed APIs yourself, ideally using a framework like FastAPI.
  • You have experience deploying and maintaining cloud resources in production using tools such as AWS CloudFormation, Terraform, or others.
  • You have hands-on experience with workflow orchestration tools (e.g., Airflow, Prefect, Dagster), containerization using Docker, and a cloud platform like AWS.
  • You can write clean, maintainable, scalable, and robust code in Python and SQL, and are familiar with collaborative coding best practices and continuous integration tooling.
  • You are well-versed in code version control and have experience working in teams on production code repositories.

The ideal candidate will be able to begin a full-time contract at short notice. This role can be performed on a hybrid basis from our London office, or remotely within the UK.

Our interview process:

  • Initial screening interview with recruiter (15 mins)
  • Introduction call with Chief Data Officer (30 mins)
  • Technical interview with members from the data engineering team (90 mins)
  • Reference checks + offer

We value diversity at BeZero Carbon. We need a team that brings different perspectives and backgrounds together to build the tools needed to make the voluntary carbon market transparent. We’re therefore committed to not discriminating based on race, religion, colour, national origin, sex, sexual orientation, gender identity, marital status, veteran status, age, or disability.