Optimize Your Snowflake Environment With These Eight Data Observability Metrics

Join our virtual, hands-on workshop, Data Observability for Snowflake | December 8 @ 10am PST

Learn how to certify and ensure that all datasets in your Snowflake instance are accurate and reliable. Register now

Snowflake offers unparalleled capabilities to engineer seamless data pipelines and scalable data applications. But more importantly, it charges you on a pay-per-usage basis. So you can scale up or down the cloud compute resources as needed. 

But this pay-per-usage pricing model can act as a double-edged sword. If you’re not careful, mistakes and oversights can quickly escalate Snowflake costs, which may then come back to bite you in higher cloud bills.

You need Acceldata’s data observability cloud to get the maximum value from your Snowflake environment. Acceldata’s advanced features go beyond the set of error-prone manual queries that extract basic Snowflake cost and usage metrics. It automatically generates granular storage, compute, admin, and query reports across all your Snowflake services.

1. Monitor Costs Across All Your Snowflake Services

Get daily or monthly aggregated Snowflake cost trends automatically in the form of a stacked bar graph. Each Snowflake service, such as compute, storage, clustering, and Snowpipe, is represented in different colors.

Get daily or monthly aggregated cost trends automatically for all your Snowflake services in the form of a stacked bar graph.

Hovering over a bar shows you a breakdown of costs across all your Snowflake services for the selected time period. For example, from this dashboard, you can see how much each Snowflake service has cost over the past 12 months.

Acceldata automatically calculates costs for each Snowflake service. Hovering over a month shows the monthly cost for each service.

Periodically monitoring this dashboard can help you keep a tab on your overall Snowflake costs without underutilizing or overutilizing your budget by too much.

Without Acceldata metrics, every time you need to view credits used by each Snowflake service, you’ll need to run a manual query similar to the code that Snowflake Solution Architect John Ryan suggests in his blog post.

To see monthly credits used by each Snowflake service, you’ll need to run a manual query similar to the above image. Source

However, each and every time you make changes to your Snowflake infrastructure or services, you’ll need to update these manual queries accordingly. Otherwise, your results may have errors. Also, by monitoring Snowflake costs manually, you may lose opportunities to proactively align usage and costs.

2. Automatically Identify Cost Anomalies That Spike Your Bills

Acceldata automatically understands what your average compute resource usage boundaries are and flags any workload spikes that fall outside of them. By alerting you to cost anomalies when they occur, Acceldata helps you avoid unnecessary spikes in your monthly Snowflake bills.

Acceldata’s Compute Observability features automatically detect cost anomalies that fall beyond your estimated average daily cost.

Without Acceldata metrics, there is no straightforward way to query Snowflake for cost anomalies as and when they occur. 

The closest workaround is to use Snowflake’s resource monitor notifications. To use this feature, you’ll need to manually enter threshold limits for each resource and set what action needs to be performed after exceeding a threshold. You can choose between the following three actions:

  1. Send notification and suspend activity after the queries are completed
  2. Send notification and immediately cancel the running queries and assigned warehouses
  3. Send notification only without any additional action

Snowflake allows you to send email notifications to all admin accounts, provided that each admin has enabled their “receive notifications” preference setting.

The limitation of this workaround is the manually set resource limits that don’t take into account how much your Snowflake environment has scaled. As a result, you’ll need to constantly update these limits to avoid interrupting routine queries.

3. Get a Granular Overview of Your Compute Resource Costs

The Acceldata dashboard also allows you to filter costs incurred for each Snowflake service. For example, you can filter out the storage and clustering costs to view only the compute costs over the last 30 days. This view can help you single out cost overruns or underruns for a particular Snowflake service. Switching the time period from daily to monthly shows you Snowflake service costs for the past 12 months.  

Filter each Snowflake service cost to identify under-budgeted or over-budgeted services. View cost trends as a daily or monthly view.

Without Acceldata’s metrics on your Snowflake environment, you’ll need to run individual queries to get a monthly or daily cost overview for each service. For example, Snowflake treats compute resources as virtual warehouse credits, so without Acceldata, you’ll run a query similar to what John Ryan suggests in his blog post.

The above query shows how you can get a breakdown of your monthly compute resources from Snowflake. Source

You’ll also need to run similar queries to extract costs of other Snowflake services such as storage, clustering, or Snowpipe.

4. Identify Which Storage Repositories Cost You the Most

Within Acceldata’s storage dashboard, you can view your monthly Snowflake storage repository costs. Set the time interval to view storage costs over the last week, month, or year. This dashboard automatically summarizes storage costs for the current month as well as the previous month. It also shows you the average monthly storage cost as a baseline comparison. You can use this information to understand how your monthly costs vary.

Drill down your monthly Snowflake bill into granular views that show which databases or data tables cost you the most.

The Acceldata dashboard also allows you to further drill down your storage costs into the top five costliest databases, data tables, and churn tables. This can help you investigate ways to further reduce your storage costs. For example, you may be able to use this information to identify a costly database with little or no utility.

Without Acceldata metrics, you have to run individual queries to extract current and previous month storage costs. To get an average cost baseline every month, you’ll need to extract storage costs over the previous 12 months and calculate their average value.

Without Acceldata metrics, you can still get an ordered list of storage credits grouped by each data warehouse. To do that, you would need to run a query similar to John Ryan’s suggestion.

To get an ordered list of storage credits by users, you’ll need to run a query similar to:

Run a query similar to the above image, to get an ordered list of storage credits. Source

However, you will need to run these manual queries periodically if you want to proactively identify potential storage cost areas.

5. Investigate Timeouts That Drain Compute Resources

Acceldata automatically flags unexpected workload spikes that cause timeouts in your Snowflake environment. Acceldata Guard Rails provides a list of users, queries, warehouse, or errors that triggered a Snowflake timeout. The list narrows down problem areas, helps you investigate root causes faster, and reduces compute resource wastage.

Acceldata’s Guard Rails list narrows down your timeout problem areas and helps you investigate root cause problems faster.

Without Acceldata, when a Snowflake service timeout occurs, you will need to manually check the STATEMENT_TIMEOUT_IN_SECONDS parameter for the account that caused a timeout.

Checking the STATEMENT_TIMEOUT_IN_SECONDS parameter can help you identify what caused a Snowflake timeout.

If you want the Snowflake service to continue, you need to set a higher value for the timeout parameter. For example, you could set the new timeout parameter value to 180,000 seconds using the following command.

alter warehouse <warehousename> set STATEMENT_TIMEOUT_IN_SECONDS=180000;

Set this parameter too high, and you risk running unnecessary workloads that can quickly consume your compute resources and result in a hefty bill. Set the timeout too low and you risk interrupting necessary workloads.

6. Get All Admin-level Insights in One Place

Acceldata’s admin dashboard shows you admin-level configuration data for your Snowflake environment. Acceldata’s interactive charts can help you visually summarize user roles, dormant users, and recent user password changes.

The admin dashboard can also help you further categorize login activities by user IP address, client type, and failed logins. This helps you weed out suspicious activity, and, where needed, it also helps you reach out to users with the right support.

Without Acceldata, there’s no straightforward way to get such admin-level insights beyond the set of basic Snowflake features. As a result, you risk remaining unaware of such user-level changes, which may in turn, affect security, performance, or both.

7. Predict Resource Utilization Before You Get the Bill

Acceldata’s capacity planning feature helps you predict resource consumption across all your Snowflake accounts and services. It helps you identify under or over-utilized services. And it can help you get more value from your annual contract.

Acceldata’s contract capacity planning feature predicts your cloud resource consumption against your contract capacity.

The contract dashboard summarizes key contract information, such as the total value of services consumed through the present time, both as a percentage value of the total credits as well as a dollar-cost value. It also shows you how many months are left until you renew your annual Snowflake contract.

In addition to these basic data points, Acceldata also shows a contract prediction chart. This chart extrapolates your usage patterns to predict how many credits you’ll end up using by the end of your contract period. Instead of learning about resource utilization after the bill is generated, use this chart to identify cost overruns or underruns before you get the bill.

From the example shown above, without Acceldata, the company will find it hard to use manual queries to gain this level of understanding. Manual queries alone can’t show they’re setting themselves up to only use 35% of their contract capacity. Based on their current usage data, manual queries also can’t help them predict a potential waste of over $8,000 dollars in unused compute resources.

8. Get Granular Insights On Your Queries

Acceldata goes beyond these basic features to categorize all your query execution times with respect to warehouse name, warehouse size and database. It also can show you a historical view of your query execution times.

This dashboard can also help you get a list of the 5 most longest running and resource expensive queries, for a selected time interval.

And it can also show you average query times categorized by each data warehouse and database.

Every Snowflake query you run puts a toll on resources. The Snowflake platform allows you to check the basic performance of a particular query. Pinal Dave shows how to analyze query performances in Snowflake.

Going into the Snowflake history of queries and selecting a particular query shows the above details. Source

Clicking on the query profile tab will show you more details about query performance. It includes details such as execution time, IO bytes scanned, and pruning partitions scanned.

Get further query performance details by clicking the profile tab of a particular query. Source

But this still doesn’t offer a straightforward way to dissect your query performance and compare them with other query performances. Without knowing which queries cost you how many resources, you can’t optimize the Snowflake queries.

With Acceldata’s query performance charts, you can identify resource-hungry queries and find ways to optimize them further.

Get Data Observability Compute Metrics for Snowflake Within Minutes

Acceldata gives you more control over your Snowflake environment. It can help you predict costs, improve performance and optimize the value you get out of the annual contract.

But more importantly, you can set up Acceldata metrics within a few minutes. Get all of these rich, dynamically updated metrics in less time than it takes to write and test a single manual query.

To get started, specify the Snowflake URL used to locate the database schema along with the right username, password, user role, and analytics pipeline you want to connect to.

After connecting to your analytics pipeline, select a Warehouse from the drop-down list, and depending on your contract costs, enter the applicable cost per credit.

You’ve now successfully set up Acceldata metrics for your Snowflake environment. Acceldata’s capabilities go far beyond the eight metrics shown in this article. Sign up for a free personalized demo, and explore all the other features that Acceldata has to offer.

Photo by Sam Moghadam Khamseh on Unsplash