SENIOR DATA ENGINEER – PYTHON – AIRFLOW – £600 PD – 6 MONTHS

  • Intec Select
  • London Bridge Station, London, UK
  • 27/01/2021
Full time | Data Science | Data Engineering | Data Analytics | Big Data | Data Management | Statistics

Job Description

An excellent opportunity has arisen with a global financial firm for an experienced data engineer to help re-architect one of their flagship data products: a platform that aggregates many different data sources, applies machine learning to generate predictions, and presents the results in a web application.

Initially remote, then onsite from their London office (likely from June), this is an exceptional opportunity for anyone who wants to work on state-of-the-art R&D and be challenged.

Responsibilities:

  • Architect and build distributed, scalable, and reliable data pipelines to ingest and process data
  • Collaborate closely with stakeholders and researchers to support machine learning, analytical and product use cases
  • On-board, document and curate external datasets for internal usage
  • Engage in data interpretation and forensic data analysis and troubleshooting

Requirements

  • Self-motivated and creative – the team is building from the ground up, not tweaking legacy systems
  • Good communicator and team player
  • Solid understanding of algorithms and data structures
  • Proficiency in scripting languages (especially Python)
  • Experience of cloud platforms (e.g. Amazon Web Services)
  • Commercial experience of writing and optimizing SQL queries
  • Experience of data orchestration platforms (e.g. Airflow)

The ideal candidate will also have (though these are not required):

  • Experience in feature engineering for Machine Learning applications
  • Knowledge of data versioning libraries/methods (e.g. DVC)
  • Familiarity with data lakes
  • Experience of cloud data warehouses (e.g. Snowflake)
  • Familiarity with migration frameworks for change management (e.g. Flyway, Alembic)
  • Experience of dynamic schemas/schema evolution
  • Experience building streaming pipelines (e.g. Kafka, Kinesis)
  • Familiarity with continuous deployment
  • Experience ingesting data from the web (e.g. HTTP requests, HTML/XML parsing)
  • Familiarity with the Big Data ecosystem (e.g. Spark/MapReduce)
  • Experience of load testing data pipelines
  • Experience of geo-indexing technologies (e.g. PostGIS)
  • Knowledge of columnar data storage formats (e.g. Parquet, ORC)
  • Experience working within an agile team