Georg Heiler
python
OSI layers for the data ecosystem
Blueprint pillars for a data mesh architecture.
Georg Heiler, Martin Windisch, ChatGPT
Last updated on Jan 18, 2023
7 min read
Governance and pipelines in the modern data stack
The data orchestrator is at the heart of the data pipelines. We start by exploring how a modern data orchestrator drastically eases the development of pipelines. Then we will see how governance can be conducted efficiently in a setup based on the modern data stack (MDS).
Dec 8, 2022 4:00 PM — 8:00 PM
weXelerate
Georg Heiler
AI-based root cause analysis of CPD interference sources in DOCSIS networks
Good quality network connectivity is ever more important. For hybrid fiber coaxial (HFC) networks, searching for upstream high noise has in the past been cumbersome and time-consuming. Even with machine learning, the task remains challenging due to the heterogeneity of the network and its topological structure. We present the automation of a simple business rule (largest change of a specific value), compare its performance with state-of-the-art machine-learning methods, and conclude that the precision@1 can be improved by a factor of 2.3. Because it is best when a fault does not occur in the first place, we also evaluate multiple approaches to forecasting network faults, which would allow predictive maintenance on the network.
May 10, 2022 12:00 AM — May 12, 2022 12:00 AM
Georg Heiler
PDF
Slides
Orchestrating data in the mesh of the fragmented modern data stack
The fragmented modern data stack has emerged as the unbundling of Airflow. Various tools operate in silos. Dagster as a next-generation data orchestrator allows you to clearly see the data dependencies of the individual pipelines on your data factory floor.
Apr 27, 2022 12:18 AM — 12:20 AM
Georg Heiler
Slides
Video
Making BigData small again (and green)
Towards simpler and perhaps more energy efficient data platforms with increased developer productivity.
Georg Heiler
Last updated on Apr 3, 2022
9 min read
Comparing SQL-based streaming approaches
Comparing established and up-and-coming streaming approaches for an integrated real-time data model
Georg Heiler
Last updated on Apr 25, 2022
26 min read
Identifying the root cause of cable network problems with machine learning
Good quality network connectivity is ever more important. For hybrid fiber coaxial (HFC) networks, searching for upstream high noise in …
Georg Heiler, Thassilo Gadermaier, Thomas Haider, Allan Hanbury, Peter Filzmoser
Preprint
Cite
SFTP sensor
Far too many data pipelines still rely on SFTP file transfer. Even a modern data orchestrator needs to interface well with it.
Georg Heiler, Sandy Ryza
Last updated on Apr 30, 2022
5 min read
Connector goodness from Airbyte E2E lineage
Simplify data ingestion with the plentiful connectors of Airbyte without compromising on data lineage
Georg Heiler, Sandy Ryza
Last updated on Mar 17, 2022
3 min read
Scalable data pipelines from dagster with pyspark
Getting started with simple dagster pipelines.
Georg Heiler, Sandy Ryza
Last updated on Mar 17, 2022
5 min read