Modern data architectures are multilayered and usually operate in cross-platform environments. The advantages are well known, but the corresponding management concerns quickly become unwieldy. To handle the complexity, data leaders are increasingly looking to adopt data observability to understand what’s happening to their data, the quality of their data, and the costs, performance, and operational impact of how data is used.
Censuswide recently surveyed more than 200 data leaders (including Chief Data Officers, VPs of Data Platforms, Data Engineers, and a variety of other titles, all of which are provided below) from across the United States. Perhaps not surprisingly, these leaders, all of whom manage sophisticated, complex data stacks, report serious concern about lack of visibility. The results highlight specific, common concerns, as well as insights into their data investment plans. These include:
Data quality issues
- 45% have experienced data pipeline failure 11-25 times in the past two years due to data quality or errors that were discovered too late.
- Of those 45%, 63% said that the customer experience their organization delivers suffered from the data pipeline failures.
- 25% claim that lack of data visibility is their top pain point related to data management.
- Over half (53%) state that their data teams spend 1-6 days per month addressing data quality issues.
- 58% indicate that dealing with issues related to data quality interferes with other job responsibilities and higher priority projects.
Data budget and investment plans
- 80% are interested in investing in tools to improve insight into their data pipelines.
- A large majority (85%) plan to employ data observability tools in the next 12 months.
To be clear, these data leaders are seeking solutions that not only improve the performance of their data but also provide better ROI on their data investments. Their goal is not simply to report better numbers to their boards of directors, but to construct systems that deliver reliable, quality data that can advance organizational goals in all areas of the business.
Let’s look more closely at the responses from this group to understand current data concerns and how they can be addressed.
Rudimentary approaches to data management
Clean, reliable data is clearly important for this group, as more than 50% are currently applying data quality solutions to impact data operations. Interestingly, a large number are using legacy master data management tools, which we know do not address issues pertinent to modern data stacks. Others are using data integration and data transformation tools as a way to get data and applications to connect. While this is normal for most organizations, these tools operate on data that is assumed to be of high quality. Without visibility into how users are engaging with the data, how it’s being processed, or what’s happening within pipelines, data teams risk relying on unreliable data.
Overconfidence due to satisfaction with the status quo
Overall, this group appears content with the level of data management they are applying. We do not, and cannot, know if they are aware of gaps in data visibility. Do they recognize that performance could be improved or that they may be overspending on data warehouses and processing?
Lacking data pipeline observability
Fewer than 70% of respondents feel that they have adequate ability to observe their users, and even fewer are confident in being able to truly observe and understand their data infrastructure. What’s especially concerning is that only 40% feel that they currently have a usable solution for understanding and observing their data pipelines. As pipelines are providing the sources and enabling the movement of data, observability would normally need to be prioritized at this layer.
Data pipeline failure is a major issue
Almost half of the data leaders in this survey reported that their data pipelines failed between 11 and 25 times in the past two years due to data quality problems or errors that were caught too late. More than 20% had that issue more than 10 times, and an alarming number (about 18%) experienced this between 26 and 100 times.
Bad data creates customer experience issues
The vast majority of respondents indicated that data pipeline failures have created customer experience issues. For those who dealt with multiple data pipeline failures, we can presume that these customer experience issues became ongoing problems that required immediate attention from many data team members.
Cost and visibility are keeping data leaders up at night
This question highlights the fact that data leaders are juggling multiple issues and are anxious about ongoing problems, many at the same time. The top concern is cost and whether the organization is getting the most out of its data workload spend. Additionally, there is concern about lack of data visibility, data quality, and the inability to adequately manage data pipelines.
Data effectiveness is a top priority
Think data is important? So do these data leaders, almost half of whom assess the quality of their data on a daily basis. That said, about 15% assess it only once per week to once every two or three weeks. Based on the engagement levels in other categories, we know that these leaders are conscientious about their data. These numbers suggest that, for lack of the right solution, not all of them are able to assess it regularly. None of the respondents indicated they assess on a continuous basis, which is what modern data observability solutions provide.
Data quality issues take time away from other data projects
More than 90% of respondents indicate that data quality issues are consuming an inordinate amount of their time. Consider that more than 50% said that their teams require 1-6 days per month to fix these issues. Another 30+% are spending one to four weeks per month on data quality issues. The time and resources required to handle this not only detract from other data issues and priorities, but also make for a costly way to manage an IT organization.
There is an opportunity cost when data quality issues take priority
As we saw in the previous question, data teams are spending considerable time handling data quality issues. But the answers to this question make it clear that the dedication to data quality issues comes at the expense of other projects, responsibilities, and priorities. As we see here, the majority of respondents are having to focus on data quality issues more than they would like to.
Data pipeline visibility is critical
As data environments mature and become more complex, there is an increasing need for visibility into their operations and functions. The data leaders in this survey are certainly aware of how and why that’s so important, and as evidenced by the responses to this question, they are willing to invest in a tool that will give them the insight and visibility they need to do their jobs. A full 80% are interested in finding such a solution.
Data observability awareness is high
This group is clearly well-versed in the options available to their data teams. More than 80% are aware of data observability as a solution for their data infrastructures. These leaders represent a growing group who recognize and understand how to optimize their data, and as we’ll see in subsequent questions, it is quickly becoming part of their core data strategy.
Data observability is being prioritized
Here again, we see that data observability is not only recognized as an important factor for data teams, but is also being prioritized in data infrastructure planning. More than 80% of respondents plan to implement a data observability solution in the next year, while fewer than 10% indicate they have no plans. Clearly, data observability will continue to be rapidly adopted and implemented by forward-thinking enterprises.
To learn more about data observability and how it can help your data operations efforts, check out Acceldata's Data Observability Cloud.