ACCELDATA BLOG

Acceldata Pulse for ClickHouse Monitoring

by Gaurav Nagar | June 9, 2021

Flawless real-time data observability accelerates data ROI. As databases continue to sprawl across the enterprise ecosystem, the need for comprehensive visibility has never been greater. That visibility provides business agility, stability, and assurance of high-quality operations.

We recognize that enterprises run use-case-specific, purpose-built databases for real-time data, OLAP, and OLTP scenarios. Adding to our wide array of integrations, we are pleased to announce our deep, comprehensive observability solution for ClickHouse.

Figure 1: Acceldata Pulse Integrations

ClickHouse is a column-oriented database management system (DBMS) for online analytical processing (OLAP) of queries. In OLAP scenarios, ClickHouse supports fast, real-time querying over vast amounts of data, even at petabyte scale.

At Acceldata, we explored ClickHouse to give users a centralized, reliable, and interactive logging platform that empowers operations teams to work confidently at scale. Each log is associated with a set of contextual key-value pairs drawn from ad-hoc events, system information, and other data, which users need in order to slice and dice their data and surface patterns that improve operations.

Through our first-hand experience with ClickHouse, we developed Acceldata Pulse's observability integration so that ClickHouse customers can better manage their environments. Like all database technologies, ClickHouse provides great value when careful attention is paid to the health of the system.

The OLAP Approach

The Online Analytical Processing (OLAP) approach is designed to answer multi-dimensional analytical queries quickly. This benefits multiple facets of business intelligence, data mining, relational databases, and report generation.

The classic OLAP approach stored data in pre-aggregated form rather than as raw rows. Although this helped to reduce the amount of stored data, it came with its own disadvantages: custom reports could not be created because reports had to be defined in advance, data volume could actually grow after aggregation, different reports could become logically inconsistent with one another, and so on.

To resolve these issues, a different approach was introduced: storing unaggregated data. Processing raw data requires a high-performance system that can handle the calculations in real time. ClickHouse is a competitive solution for such analytical workloads at very large scale.
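The difference between the two storage approaches can be illustrated with a small sketch. The events, field names, and the report helper below are hypothetical and only stand in for what an OLAP engine does at much larger scale: when raw rows are kept, any grouping can be computed at query time, whereas a table that was aggregated in advance can only answer the question it was built for.

```python
from collections import defaultdict

# Hypothetical raw (unaggregated) page-view events.
events = [
    {"day": "2021-06-01", "country": "US", "browser": "Chrome", "views": 3},
    {"day": "2021-06-01", "country": "DE", "browser": "Firefox", "views": 1},
    {"day": "2021-06-02", "country": "US", "browser": "Firefox", "views": 2},
    {"day": "2021-06-02", "country": "US", "browser": "Chrome", "views": 4},
]


def report(rows, *dimensions):
    """Sum views grouped by any combination of dimensions, at query time."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["views"]
    return dict(totals)


# Classic approach: only this pre-aggregated per-day table would be stored.
per_day = report(events, "day")

# With raw rows retained, reports nobody planned for are still answerable.
per_browser = report(events, "browser")
per_country_browser = report(events, "country", "browser")

print(per_day)
print(per_browser)
```

Once only per_day is kept, the per-browser breakdown cannot be recovered from it, which is the "inability to create custom reports" drawback described above.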

ClickHouse Features

ClickHouse is an open-source DBMS for real-time generation of analytical data reports using SQL queries. This fast, fault-tolerant, scalable, and highly reliable system helps solve complex data analytics problems. Several ClickHouse features are widely appreciated:

  • Capability to store and process petabytes of data
  • SQL support
  • High performance
  • Data compression
  • Server log analysis

Monitoring ClickHouse requires fine-tuned tools to get the most out of it: to optimize resource usage, safeguard system health, and, above all, surface accurate insights.

Acceldata Pulse as ClickHouse Monitoring Tool

Acceldata Pulse provides cross-sectional visibility into enterprise analytics and AI systems across hybrid data lakes and warehouses. Pulse correlates events across the infrastructure, application, and data layers to provide a comprehensive understanding of individual components, data pipelines, and system performance, all in a single-pane-of-glass UI.

Acceldata Pulse integrates with underlying database systems through technology-specific connectors that collect data from the infrastructure, application, and data layers, and it stores that data in its domain-specific data stores.

The real-time operational data is then visualized through the proprietary Pulse Dashplots, which bring together data elements from all layers for operational monitoring. Dashplots also make it easier to gain deep insight into performance metrics such as query counts, replication status, memory usage, and merge operations.

Most importantly, the integration comes with full-lifecycle support for alerts, log integration, and automated actions for guardrail creation.

Acceldata Pulse supports comprehensive observability that covers monitoring metrics, anomaly detection, logs, and alerts.

Acceldata supports monitoring ClickHouse instances automatically, straight from the UI. The top advantages of ClickHouse monitoring with Acceldata are as follows:

  • Support for monitoring a wide range of ClickHouse-related metrics, with the ability to add more metrics based on the needs of the team.
  • An open-source, standards-based ClickHouse agent that is ready for deployment across all Linux environments, in the cloud and on-premises.
  • Out-of-the-box, built-in data visualizations for monitoring various aspects of a deployment, such as insertions, queries, and replication status.
  • Alerting based on thresholds, anomaly detection, and server health, with numerous notification integrations.
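To make the distinction between threshold alerts and anomaly detection concrete, here is a minimal sketch. It is not Pulse's implementation: the function names, the z-score rule, and the sample values are all assumptions chosen for illustration. A threshold alert compares a value with a fixed limit, while the anomaly check compares it with the recent history of the same metric.

```python
from statistics import mean, stdev


def threshold_alert(value, limit):
    """Fire when the metric exceeds a fixed, user-chosen limit."""
    return value > limit


def anomaly_alert(history, value, z_limit=3.0):
    """Fire when the value is more than z_limit sample standard deviations
    away from the mean of recent history (a simple z-score rule)."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu  # flat history: any change is unusual
    return abs(value - mu) / sigma > z_limit


# Hypothetical query counts per minute.
recent = [100, 102, 98, 101, 99]
print(threshold_alert(130, limit=120))  # True
print(anomaly_alert(recent, 130))       # True
print(anomaly_alert(recent, 101))       # False
```

A fixed threshold is easy to reason about but has to be tuned per metric; a history-based rule adapts to each metric's normal range at the cost of needing enough past samples.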

ClickHouse Monitoring Dashboard

The Acceldata ClickHouse Monitoring Dashboard opens with a predefined set of visualizations that organize key metrics intuitively. Here is a quick overview of some of the top metrics:

Total query count – clickhouse.query.count. The total number of queries run against your ClickHouse installation. It’s a key metric for assessing the overall level of activity in your ClickHouse system.

Inserted rows – clickhouse.insert.rows. The number of rows inserted in all tables, which reflects the level of activity in the database.

Inserted bytes – clickhouse.insert.bytes. The number of uncompressed bytes inserted in all tables.

Rows:

  • Merged Rows: The total number of rows read for background merges, counted before the merge takes place.
  • Inserted Rows: The number of rows inserted in all tables. This reflects your database size and the level of activity within your database.

Bytes:

  • Bytes Inserted: The number of uncompressed bytes inserted in all tables. This gives insight into the activity level and the database size.
  • Uncompressed Bytes Merged Per Sec: The number of uncompressed bytes read per second for background merges, measured before the merge takes place.
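Per-second values such as the merged-bytes rate are typically derived from cumulative counters by sampling them twice and dividing the difference by the elapsed time. The sketch below assumes the underlying counter is a running total that restarts from zero when the server restarts (as the counters in ClickHouse's system.events table do); the function name and sample values are hypothetical.

```python
def per_second_rate(prev_sample, curr_sample):
    """Return the average per-second rate between two (timestamp, counter) samples.

    Timestamps are in seconds; counter values are cumulative totals.
    """
    prev_time, prev_value = prev_sample
    curr_time, curr_value = curr_sample
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        # The counter went backwards, most likely because the server restarted
        # and the total began again from zero; count only what accrued since.
        delta = curr_value
    return delta / elapsed


# Hypothetical samples of a merged-bytes counter taken 10 seconds apart.
rate = per_second_rate((0.0, 1_000_000), (10.0, 1_500_000))
print(rate)  # 50000.0
```

The same calculation applies to inserted rows or inserted bytes when a rate is more useful than a running total.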

Network Connections:

  • TCP: The total number of connections to the TCP server. This metric reflects the load on the ClickHouse installation.
  • HTTP Value: The number of connections to the HTTP server. This metric also reflects the load on your installation.
  • Interserver Value: The number of connections from other replicas to fetch parts. This does not measure the overall system load, but it lets you analyze and optimize the performance of the installation.
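For readers who want to inspect these connection counts directly, the following sketch reads them from ClickHouse's system.metrics table over the HTTP interface. Several things here are assumptions rather than facts from this post: that the dashboard values correspond to the TCPConnection, HTTPConnection, and InterserverConnection rows, that the server listens on the default HTTP port 8123 on localhost, and that no authentication is required. The function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed mapping of the dashboard's connection values to system.metrics rows.
CONNECTION_METRICS = ("TCPConnection", "HTTPConnection", "InterserverConnection")


def build_query(names):
    """Build a SELECT over system.metrics for the given metric names."""
    for name in names:
        if not name.isalnum():
            raise ValueError(f"unexpected metric name: {name!r}")
    quoted = ", ".join(f"'{name}'" for name in names)
    return f"SELECT metric, value FROM system.metrics WHERE metric IN ({quoted})"


def parse_tsv(body):
    """Parse tab-separated 'metric<TAB>value' lines into a dict of ints."""
    result = {}
    for line in body.splitlines():
        if not line.strip():
            continue
        name, value = line.split("\t")
        result[name] = int(value)
    return result


def fetch_connection_counts(base_url="http://localhost:8123/"):
    """Query a ClickHouse server (assumed URL, no auth) and return the counts."""
    url = base_url + "?" + urlencode({"query": build_query(CONNECTION_METRICS)})
    with urlopen(url, timeout=5) as response:
        return parse_tsv(response.read().decode("utf-8"))


# Parsing a hypothetical response body, without contacting a server.
print(parse_tsv("TCPConnection\t12\nHTTPConnection\t3\nInterserverConnection\t0\n"))
```

Calling fetch_connection_counts() requires a reachable server; the parsing and query-building helpers can be exercised on their own.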

ZooKeeper metrics

ClickHouse uses Apache ZooKeeper internally. Acceldata has extensive support for ZooKeeper as well. You can monitor ZooKeeper metrics to help understand the state of your installation:

  • ZooKeeper watches – clickhouse.zk.watches. The number of watches (i.e., event subscriptions) in ZooKeeper.
  • ZooKeeper wait – clickhouse.zk.wait.time. The time spent waiting for ZooKeeper operations.
  • ZooKeeper requests – clickhouse.zk.requests. The number of requests to ZooKeeper in progress.
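The wait-time metric is easier to interpret when it is normalized by the number of operations performed in the same interval. The sketch below assumes two snapshots of cumulative counters, one for total wait time in microseconds and one for total ZooKeeper transactions (ClickHouse's system.events table exposes counters of this kind, such as ZooKeeperWaitMicroseconds and ZooKeeperTransactions). The dictionary keys and function name are hypothetical, and whether clickhouse.zk.wait.time is derived this way is not stated in this post.

```python
def avg_zk_wait_ms(prev, curr):
    """Average wait per ZooKeeper transaction, in milliseconds, between two
    snapshots of cumulative counters.

    Each snapshot is a dict with 'wait_us' (total microseconds spent waiting)
    and 'transactions' (total operations). Returns None when the interval
    contains no transactions or a counter went backwards.
    """
    ops = curr["transactions"] - prev["transactions"]
    wait_us = curr["wait_us"] - prev["wait_us"]
    if ops <= 0 or wait_us < 0:
        return None
    return wait_us / ops / 1000.0


# Hypothetical snapshots taken one minute apart.
before = {"wait_us": 2_000_000, "transactions": 1000}
after = {"wait_us": 2_600_000, "transactions": 1200}
print(avg_zk_wait_ms(before, after))  # 3.0
```

A rising average with a steady transaction count points at ZooKeeper itself slowing down, whereas a rising total wait with a flat average only means more operations are being issued.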

The Acceldata ClickHouse Dashboard provides a quick and easy way to gain deep insight into your performance metrics.

To find out more, write to us at support@acceldata.io or drop us a note via https://www.acceldata.io/contact-us/.