The Avenga Team
Data Science combines mathematics, statistics, and programming, applied to collected data, with the activities needed to clean, prepare, and stage that data. In short, it is the scientific approach to extracting knowledge from data. Ideally, the knowledge extracted is aligned with business needs and of real value!
Done correctly, Data Science provides actionable, valuable intelligence from massive volumes of data and delivers predictive and prescriptive analytics that help organizations make better decisions.
Avenga has applied Data Science in a number of scenarios to help our clients gain actionable insights. In one instance, our data team was asked to analyze a complicated data pipeline for a large organization. The organization struggled to meet its daily data delivery SLA because various nightly data cleansing and processing jobs were running long, failing altogether, or behaving in unexpected ways. Over time, as scale and complexity increased with computing moving to the cloud and the adoption of microservice architectures, their existing monitoring techniques and tools needed to be extended.
Challenges: data comes from different data sources (Netezza, Oracle, Redshift, and PostgreSQL), yet reporting should be similar for all of them. With many metrics, applications, and high-performance systems, keeping track of performance became a difficult task.
Outcomes from the Data Science Engagement:
Reports are built in Tableau and distributed daily to mailing lists.
Thus, we proposed a full cycle of System Performance Data Analysis: data collection, data preprocessing, data modeling and threshold identification, anomaly notification and alarms, and a data visualization and reporting solution.
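To make the threshold identification and anomaly notification steps more concrete, here is a minimal Python sketch. It is not the method used in the engagement. It assumes a simple rule (a job is flagged when its duration exceeds the historical mean plus three sample standard deviations), and the job names, durations, and function names are all hypothetical.

```python
from statistics import mean, stdev


def duration_threshold(history_minutes, k=3.0):
    """Upper threshold: mean + k * sample standard deviation (assumed rule)."""
    return mean(history_minutes) + k * stdev(history_minutes)


def find_anomalies(runs, history, k=3.0):
    """Return (job, duration) pairs whose duration exceeds the job's threshold."""
    alerts = []
    for job, duration in runs:
        past = history.get(job, [])
        if len(past) < 2:
            # stdev needs at least two data points; skip jobs without history
            continue
        if duration > duration_threshold(past, k):
            alerts.append((job, duration))
    return alerts


# Hypothetical historical run times (minutes) per nightly job
history = {
    "nightly_cleanse": [42, 45, 40, 44, 43, 41],
    "load_sales": [15, 16, 14, 15, 17, 15],
}

# Hypothetical run times from the latest night
tonight = [("nightly_cleanse", 95), ("load_sales", 16)]

alerts = find_anomalies(tonight, history)
for job, duration in alerts:
    print(f"ALERT: {job} ran {duration} min, above its threshold")
```

In practice the threshold rule would likely be tuned per metric and per data source, and the alerts would feed a notification channel rather than standard output.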