Plan, Build, Manage: A Model for Understanding Data Observability
September 11, 2023
Several months back, Willem Koenders of ZS Associates authored a Medium article titled "The best way to explain data governance to beginners." In this piece, he used an analogy that likened different aspects of data management and governance to the components of a tangible structure, encompassing its assets and operations. The article also featured a perfectly detailed illustration that linked various elements to their real estate and construction counterparts. I’ve included the visual here:
In my role as the Chief Product Officer at Acceldata, I am dedicated to raising awareness about a new, critical technology category known as enterprise data observability (EDO), a category that Acceldata created and in which it stands as a market leader.
By aligning Koenders’ perspective with the point of view of EDO, I’ll explain how data observability functions as a supervisory layer that elevates every facet of data management. This analogy could be summarized like this:
Consider what’s required to ensure property reliability in real time: tracking wiring and plumbing status, detecting anomalies such as structural changes and unusual activities, optimizing operating costs, assigning charges to individual tenants, implementing real-time rule-based alert systems, and taking proactive measures.
Enterprise data observability can be likened to a cutting-edge building management system for your data assets. In the following sections, I'll refurbish (pun intended) and outline how EDO enhances and bolsters each facet of data management and governance, as elegantly articulated by Koenders. Each section below remains unchanged from Willem's article, and I've introduced an enterprise data observability bullet point to highlight the additional capabilities that enhance each element.
Like real estate, data assets, such as datasets, are valuable resources that require proper management to yield benefits and avoid risks.
Enterprise Data Observability: Ensures data assets remain reliable and valuable, which is critical to maintaining a well-kept building for its occupants.
Data (Product) Ownership
Just as a property has an owner, data also needs a designated owner or team responsible for its management.
Enterprise Data Observability: Enterprise data observability uses real-time alerts to notify data owners of issues, preventing potential damage, similar to proactive building maintenance to avoid costly repairs.
Much like property managers overseeing real estate, data stewards manage data assets, ensuring their quality and usage.
Enterprise Data Observability: Data teams must optimize data usage and quality, much as property managers benefit from real-time building system monitoring. Efficiency in both cases significantly impacts profitability by reducing waste and unnecessary expenditures. Just as leaving a tap running, neglecting to turn off lights, or failing to optimize heating and cooling can result in substantial waste in the real estate domain, in data and analytics a considerable amount of resources and expenditure is squandered on inefficient, protracted, and even duplicate SQL queries, as well as neglected Spark instances that should be shut down.
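To make the duplicate-query point above concrete, here is a minimal Python sketch of how repeated SQL statements might be spotted in a query log. The function names (`normalize_sql`, `find_duplicate_queries`) and the sample queries are hypothetical illustrations, not part of any Acceldata product, and the normalization is deliberately crude: it lowercases everything, including string literals, so it is only an approximation.

```python
import re
from collections import Counter


def normalize_sql(sql: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase.

    Crude on purpose: lowercasing also changes string literals, so two
    queries that differ only in literal case are treated as identical.
    """
    collapsed = re.sub(r"\s+", " ", sql).strip()
    return collapsed.rstrip(";").strip().lower()


def find_duplicate_queries(queries: list[str]) -> dict[str, int]:
    """Return each normalized query that appears more than once, with its count."""
    counts = Counter(normalize_sql(q) for q in queries)
    return {query: n for query, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical query log entries.
    log = [
        "SELECT * FROM orders;",
        "select *  from ORDERS",
        "SELECT id FROM users",
    ]
    print(find_duplicate_queries(log))
```

A real platform would look at far more than text equality (execution plans, cost, runtime), but even this simple count shows where compute is being spent twice on the same question.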
Data Consumers / Users
Property managers need to know their tenants, employees, contractors, and others who are stakeholders in their properties.
Enterprise Data Observability gives data consumers confidence by detecting anomalies, much as safety measures give tenants confidence in a building. It also transparently reports usage costs to promote responsible data practices, which is much like utility billing in real estate.
In the same way that owners generate income from a property, data monetization involves making revenue from data assets, such as selling data to other organizations.
Enterprise Data Observability: Data teams can use data monetization insights from enterprise data observability to ensure data quality and reliability, which is like maintaining a building to attract tenants or buyers. It's essential in a landscape where enterprises aim to monetize their data and create valuable data products, benefiting large commercial data providers.
Like a lease agreement for a property, a data contract is a formal agreement between a data producer and consumer, specifying data exchange terms, format, and quality requirements.
Enterprise Data Observability: The right data observability solution will establish trust in data contracts through real-time reliability and alerts, similar to fostering confidence between a landlord and tenant in a well-maintained property. It also enforces contract compliance by promptly notifying of rule deviations.
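As an illustration of what notifying on rule deviations could look like, the following is a minimal Python sketch of a data contract check. The `ColumnRule` class, the `check_contract` function, and the example columns are all hypothetical names chosen for this example; real data contracts typically cover much more, such as freshness, volume, and semantic rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRule:
    """One contract term: the expected Python type and whether nulls are allowed."""
    dtype: type
    nullable: bool = False


def check_contract(rows: list[dict], contract: dict[str, ColumnRule]) -> list[str]:
    """Return a human-readable violation for every row that breaks the contract."""
    violations = []
    for i, row in enumerate(rows):
        for name, rule in contract.items():
            if name not in row:
                violations.append(f"row {i}: missing column '{name}'")
                continue
            value = row[name]
            if value is None:
                if not rule.nullable:
                    violations.append(f"row {i}: '{name}' must not be null")
            elif not isinstance(value, rule.dtype):
                violations.append(
                    f"row {i}: '{name}' expected {rule.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
    return violations


if __name__ == "__main__":
    # Hypothetical contract agreed between a producer and a consumer.
    contract = {"order_id": ColumnRule(int), "email": ColumnRule(str, nullable=True)}
    batch = [
        {"order_id": 1, "email": None},
        {"order_id": "2", "email": "a@b.c"},
        {"email": "x"},
    ]
    for violation in check_contract(batch, contract):
        print(violation)
```

In the lease analogy, each returned string is the equivalent of a notice that a specific clause has been breached, raised as soon as the batch arrives rather than at the next audit.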
Whether it's property or data, estimating value is important, with property value tied to location and condition, and data value linked to relevance, accuracy, and accessibility.
Enterprise Data Observability: Data teams need metrics for precise data value quantification, similar to property valuation considering condition and location. Value quantification ensures cost optimization aligns with budgets, crucial in a generative AI era with uncertain processing requirements for large language models.
Data Security and Access Controls
Data security safeguards data from unauthorized access, in the same way that locks and alarms protect a property.
Enterprise Data Observability: Enterprise data observability uses alerts to detect unusual data behavior or "drifts," potentially indicating issues like insider data manipulation for malicious purposes, even though it's not directly related to cybersecurity or hacking.
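One simple way to picture such a drift alert is a z-score check on a metric like daily row counts. The Python sketch below is an assumption-laden illustration, not a description of how any particular observability product works: the function name `is_anomalous`, the threshold of 3 standard deviations, and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], latest: float, threshold: float = 3.0) -> bool:
    """Flag `latest` if it sits more than `threshold` sample standard
    deviations away from the mean of `history`.
    """
    if len(history) < 2:
        # Not enough data to estimate spread; do not raise an alert.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is unusual.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


if __name__ == "__main__":
    # Hypothetical daily row counts for one table.
    daily_rows = [1000, 1020, 980, 1010, 990]
    print(is_anomalous(daily_rows, 1005))  # within the normal range
    print(is_anomalous(daily_rows, 400))   # sudden drop in volume
```

A sudden drop or spike does not say why the data changed, only that it did. Whether the cause is a broken pipeline or deliberate manipulation is left for a person to investigate.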
Like a blueprint for a building, data architecture defines the structure of data storage and retrieval systems, guided by standards and best practices.
Enterprise Data Observability: Data pipeline observability monitors the integrity and functionality of data architecture. Schema drift is similar to unplanned changes to a building's blueprint, and pipeline interruptions are similar to plumbing blockages or electrical disruptions in a building.
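Schema drift in particular is easy to sketch. Assuming a schema can be represented as a simple mapping from column name to type name (a simplification made for this example), a hypothetical `diff_schema` helper in Python could compare the expected "blueprint" with what a pipeline actually observes:

```python
def diff_schema(expected: dict[str, str], observed: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {column: type} mappings and report added, removed,
    and type-changed columns, each as a sorted list of column names.
    """
    expected_cols = set(expected)
    observed_cols = set(observed)
    return {
        "added": sorted(observed_cols - expected_cols),
        "removed": sorted(expected_cols - observed_cols),
        "changed": sorted(
            col for col in expected_cols & observed_cols
            if expected[col] != observed[col]
        ),
    }


if __name__ == "__main__":
    # Hypothetical schemas: the agreed blueprint versus today's load.
    blueprint = {"id": "int", "amount": "decimal", "created_at": "timestamp"}
    todays_load = {"id": "int", "amount": "string", "region": "string"}
    print(diff_schema(blueprint, todays_load))
```

Any non-empty list in the result is a candidate for an alert, in the same way an inspector would flag a wall that appears on site but not on the plans.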
Data is categorized into domains, much like neighborhoods in a city, each with unique attributes and requirements. Data domain owners, kind of like a homeowners association, oversee these requirements.
Enterprise Data Observability: One of the key features of enterprise data observability is maintaining data domain integrity and quality through real-time reliability and anomaly detection, mirroring a homeowners association's role in preserving neighborhood well-being.
As with zoning and environmental regulations for properties, data policies set rules for managing data in organizations, often based on data privacy and protection regulations.
Enterprise Data Observability helps maintain regulatory compliance by real-time monitoring of data against policies, similar to immediate responses to building code violations during routine inspections.
Metadata is information about data, detailing attributes, ownership, and access.
Enterprise Data Observability: Effective data observability ensures metadata precision and timeliness, which is like maintaining property records diligently. It also identifies potential schema drift and raises alerts.
Data quality assesses accuracy, completeness, and consistency, similar to property condition assessments in real estate management, which check for defects and safety hazards.
Enterprise Data Observability: Data teams are in continuous pursuit of data quality, which is achieved by detecting anomalies and addressing issues early, much like property inspections that identify and fix flaws. Enterprise data observability predicts and prevents data problems, potentially saving downstream resources.
Data remediation is about finding and fixing data quality problems, much like repairing property defects to maintain property value and safety.
Enterprise Data Observability aids data remediation by quickly spotting issues through anomaly detection, acting as an early warning system to prevent significant data quality problems and reduce the need for extensive cleanup.
Much like measuring property occupancy and reviewing visitor logs to assess value, data usage involves tracking how, by whom, and to what extent data is used in an organization.
Enterprise Data Observability: Data usage metrics provide value assessment to enable efficient data processing. They also help optimize data usage costs to improve business margins and reduce resource consumption.
Just as property managers need to know a building's compatibility with utilities and infrastructure, data interoperability enables seamless data exchange with other systems and applications based on common standards.
Enterprise Data Observability: Data teams need interoperability among data sources and assets. Data observability monitors data asset interactions, pipeline flows, schema mapping, drifts, and data freshness to facilitate seamless integration with other systems.
Data storage refers to the capacity in databases, warehouses, or data lakes, whether physical or virtual.
Enterprise Data Observability: With data observability, data teams can optimize data storage by identifying unused data for archiving or removal, similar to maximizing building space use by property managers.
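Identifying unused data can be as simple as comparing each table's last access time with an idle threshold. The Python sketch below assumes that last-access timestamps are available from somewhere (for example, warehouse audit logs). The function name `archive_candidates`, the 90-day default, and the table names are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def archive_candidates(
    last_access: dict[str, datetime],
    now: datetime,
    max_idle: timedelta = timedelta(days=90),
) -> list[str]:
    """Return the names of tables not accessed within `max_idle`, sorted by name."""
    return sorted(
        name for name, accessed_at in last_access.items()
        if now - accessed_at > max_idle
    )


if __name__ == "__main__":
    # Hypothetical last-access times pulled from an audit log.
    tables = {
        "orders": datetime(2023, 9, 1),
        "legacy_export": datetime(2023, 1, 15),
        "tmp_backfill": datetime(2023, 5, 1),
    }
    print(archive_candidates(tables, now=datetime(2023, 9, 11)))
```

The output is a review list, not a deletion list: like an empty floor in a building, an idle table may still be reserved for something, so a human decision should follow.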
Properties age, experience renovations, and adapt. Similarly, data goes through stages like creation, storage, usage, archiving, and disposal.
Enterprise Data Observability: Data observability provides proactive insights across the data lifecycle, ensuring performance, value retention, and compliance through historical data and anomaly detection.
Consider the importance of connecting different parts of a building's infrastructure to make sure it all works. In the same way, data integration involves linking and harmonizing various data sources and systems.
Enterprise Data Observability: Data observability ensures seamless data integration by monitoring data flows and ensuring data compatibility across systems and applications for optimal data utilization.
The Future of Enterprise Data Observability
Enterprise Data Observability is rapidly gaining widespread acceptance. Gartner continues to advance Enterprise Data Observability on its hype cycle for data management, and Acceldata has been prominently featured in no fewer than 12 hype cycles, underscoring the significance and applicability of EDO across diverse disciplines and industries, including FinOps.
For those still seeking clarity on the distinction between EDO and data quality, Gartner has published a report elucidating how EDO enhances data quality and outlining the best practices that enterprises should embrace today. You can access a complimentary copy of that Gartner report.