For one of our clients, we are looking for a Consultant (m/f) Azure Databricks
Project description:
Collecting large volumes of data from different data sources (IoT data, process data, etc.) into Azure Data Lake Storage on a continuous basis (e.g., hourly or daily)
Joining/transforming (part of) the data and copying it to different folders in the Data Lake (e.g., for third-party access)
Activities:
Technical consultation on and implementation of Azure Databricks transformations (SQL or Python) of data stored in Azure Data Lake (from several data sources, e.g., IoT and process data) so that it can be accessed efficiently
Technical consultation on and implementation of “delta transformations”, i.e., transformations that process only the data added since the last run (e.g., daily or hourly)
Setting up Databricks jobs for the above transformations, taking dependencies between them into account (e.g., Job A must finish before Job B can start)
Creation of (automated) tests for the above solutions
Creation of hand-over documents (operational task list, operational handbook, infrastructure sheet) for the maintenance team, which will be provided to the client for verification and approval
Remote training of the maintenance team on the implemented transformations (probably 1-3 sessions, depending on the complexity)
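
To illustrate what is meant by a “delta transformation” in the activities above, here is a minimal sketch in plain Python. It is not the client's implementation and does not use Databricks APIs; the record layout, the field name ingested_at, and the functions run_increment and tag_source are assumptions made only for this example. The idea is to remember a watermark (the newest timestamp already processed) and to transform only records that arrived after it.

```python
from datetime import datetime, timezone


def run_increment(records, last_watermark, transform):
    """Transform only records newer than last_watermark.

    Returns the transformed records and the watermark for the next run.
    All names here are hypothetical and chosen for illustration.
    """
    # Keep only records that arrived after the previous run.
    new_records = [r for r in records if r["ingested_at"] > last_watermark]
    # Advance the watermark to the newest record seen; keep it unchanged if nothing is new.
    new_watermark = max(
        (r["ingested_at"] for r in new_records), default=last_watermark
    )
    return [transform(r) for r in new_records], new_watermark


def tag_source(record):
    # Placeholder transformation: add a normalized source label.
    return {**record, "source": record["source"].upper()}


def ts(hour):
    # Helper that builds a UTC timestamp on a fixed example day.
    return datetime(2024, 1, 1, hour, tzinfo=timezone.utc)


records = [
    {"source": "iot", "value": 1, "ingested_at": ts(1)},
    {"source": "process", "value": 2, "ingested_at": ts(2)},
    {"source": "iot", "value": 3, "ingested_at": ts(3)},
]

# First run: everything up to 02:00 has already been processed.
result, watermark = run_increment(records, ts(2), tag_source)

# Second run without new data: nothing is reprocessed.
result_again, watermark_again = run_increment(records, watermark, tag_source)
```

In a real project the watermark would be persisted between job runs (for example in a small control table), and the filtering would run on the data platform rather than in Python lists; both points are left open here.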