PRINCIPAL CONSULTANT If you develop applications that deal with data fetching and processing, you will know that there...
Blog
Code
Data engineer’s guide to data governance (part 3/3)
DATA ENGINEERS This is our last blog on the topic of data governance and there is still a lot to cover, so let’s get going! In...
Processing API Data in Azure with Python (part 1/2)
DATA ENGINEERS This can be a huge accelerator when kicking off a new project and helps to easily and quickly create an MVP...
Message platform patterns
CHIEF DATA ARCHITECT Message platforms are one of the main tools used in data object distribution and streaming data processing....
High Performance Computing on GPUs (GCP vs AWS)
DATA ENGINEERS In this third part of the HPC blog series, we train the same machine learning model as in the previous two blog...
High Performance Computing with Slurm on AWS
DATA ENGINEERS As the second part of the blog series about HPC, we test the performance of a Slurm cluster deployed on AWS cloud...
Cloudera cluster on Alibaba Cloud
CLOUD INFRASTRUCTURE ENGINEER Cloudera Enterprise is a modern platform for machine learning and analytics, optimized for the...
Serverless ETL orchestration using AWS Step Functions and on-demand Redshift cluster
DATA ENGINEER We're building a recommendation engine that is based on customer usage, billing data and a set of product offers...
GCP pipeline: pub/sub-lookup-storage (part 2/2)
DATA ENGINEERS This post will briefly describe how to create a Cloud Run service and showcase two different cases for both...