Data governance solves multiple KM challenges
Data governance is at the hub of multiple KM functions. It ensures that information is reliable, discoverable, and secure from unauthorized use. In addition, it supports digital transformation because it entails integration across multiple sources of data, promotes the maintenance of quality data, and catalyzes a deep understanding of how data is used and shared. The data governance market is predicted to reach $2.73 billion in 2023 and to grow at about 20% per year to reach $6.71 billion by 2028, according to Mordor Intelligence. This growth is driven by the need to automate compliance and to use the constantly growing volume of data to achieve positive business outcomes.
Data quality sets the stage
Backcountry, founded in Park City, Utah, in 1996, sells outdoor sports clothing and equipment from specialty brands such as Stoic and Helinox. Its products are sold online and in stores in Seattle; Boulder, Colo.; Chicago; Washington, D.C.; and other U.S. locations. With hundreds of products and an extensive customer base, managing data is a priority for Backcountry. Its repositories include data for customer acquisition, segmentation and personalization, marketing, and customer support.
To contend with its many data sources, Backcountry moved to a cloud-based data warehouse that could accommodate its burgeoning volume of data. However, the increased scale and complexity of its data operations and pipelines created data quality issues, which prompted Backcountry to explore Monte Carlo's Data Observability Platform. Although each application could track data as it flowed through its own model, Backcountry wanted a solution that would integrate with all parts of the data stack and pipelines.
By using the Monte Carlo Data Observability Platform, Backcountry gained visibility across its entire data stack. It also reduced costs by automating tasks that had previously been performed manually and resolved data issues more quickly. For example, the data team was immediately notified when 50,000 rows in a table were unexpectedly deleted. The platform pointed the team to a problem with a data source, and the error was quickly mitigated.
The Monte Carlo Data Observability Platform integrates with key elements of an organization’s data stack and monitors data flows throughout the enterprise. “The goal is to find anomalies that indicate data quality issues before they have time to have an impact,” said Lior Gavish, CTO and co-founder of Monte Carlo Data. The data observability tool analyzes patterns in data and detects deviations from the expected patterns. “Alerts can be sent out to the appropriate data stewards via Slack, Teams, or whatever communication mode the company desires,” continued Gavish.
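To illustrate the principle, the sketch below shows how pattern-based monitoring of this kind can work in its simplest form: learn the expected pattern from recent history, flag a statistically unusual observation, and route an alert to a chat webhook. This is a generic example, not Monte Carlo's implementation; the webhook endpoint, thresholds, and sample numbers are all hypothetical.

```python
import json
import statistics
import urllib.request

# Hypothetical webhook endpoint (e.g., a Slack or Teams incoming webhook).
# Leave as None to simply print alerts instead of posting them.
WEBHOOK_URL = None

def is_anomalous(history, latest, threshold=3.0):
    """Flag the latest observation if it deviates more than `threshold`
    standard deviations from the historical mean (a simple z-score test)."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > threshold

def send_alert(message):
    """Deliver an alert to the configured webhook, or print it if none is set."""
    if WEBHOOK_URL is None:
        print("ALERT:", message)
        return
    payload = json.dumps({"text": message}).encode("utf-8")
    request = urllib.request.Request(
        WEBHOOK_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(request)

# Example: daily row counts for a table; today's load is far below the norm.
daily_row_counts = [61_800, 62_350, 60_900, 63_100, 62_000, 61_500, 62_700]
todays_count = 12_000
if is_anomalous(daily_row_counts, todays_count):
    expected = int(statistics.mean(daily_row_counts))
    send_alert(f"Volume anomaly: expected roughly {expected:,} rows, observed {todays_count:,}")
```

Commercial observability platforms replace the fixed z-score threshold with models learned from each table's history, but the basic loop of expectation, deviation, and notification is the same.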
Potential data incidents fall into four categories: freshness (how often data is updated), volume (the expected quantity of data), data quality (ensuring data is complete and accurate), and schema (accounting for any changes in the data's structure). Each of these factors needs to be observable, along with data lineage—where the data came from and where it is going. Once a problem is identified, Monte Carlo provides context about the data, zeroing in on all potential root causes. Data lineage provides a visualization of the data flow and shows how errors in one table may cascade downstream to dashboards consumed by business users.
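The four categories can be made concrete with a small example. The sketch below expresses each one as a simple check against a snapshot of table metadata; the field names, thresholds, and sample values are assumptions made for illustration and do not reflect any vendor's code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

@dataclass
class TableSnapshot:
    last_updated: datetime          # freshness
    row_count: int                  # volume
    null_rate: float                # one simple proxy for data quality
    columns: dict = field(default_factory=dict)  # schema: column name -> type

def check_table(snap, expected_columns, min_rows, max_null_rate, max_staleness):
    """Return a list of human-readable incidents, one per failed category."""
    incidents = []
    now = datetime.now(timezone.utc)
    if now - snap.last_updated > max_staleness:
        incidents.append(f"freshness: table not updated for {now - snap.last_updated}")
    if snap.row_count < min_rows:
        incidents.append(f"volume: {snap.row_count:,} rows, expected at least {min_rows:,}")
    if snap.null_rate > max_null_rate:
        incidents.append(f"quality: null rate {snap.null_rate:.1%} exceeds {max_null_rate:.1%}")
    if snap.columns != expected_columns:
        incidents.append(f"schema: columns changed from {expected_columns} to {snap.columns}")
    return incidents

# Example: a snapshot whose volume dropped sharply and whose schema gained a column.
snapshot = TableSnapshot(
    last_updated=datetime.now(timezone.utc) - timedelta(hours=2),
    row_count=11_500,
    null_rate=0.02,
    columns={"order_id": "int", "total": "float", "region": "string"},
)
for incident in check_table(
    snapshot,
    expected_columns={"order_id": "int", "total": "float"},
    min_rows=60_000,
    max_null_rate=0.05,
    max_staleness=timedelta(hours=24),
):
    print(incident)
```

In a production platform, each flagged incident would be enriched with lineage information so that a broken upstream table can be traced to the downstream dashboards it affects.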