ZFS Telegraf Input Plugin
Use This InfluxDB Integration for FreeZFS is both a volume manager and a file system. This means it manages the physical storage on hard drives and disks, and also manages the files stored there. It uses tiered caching for reads and writes so it can find data quicker and speed up performance. This plugin collects metrics from ZFS file systems and supports both Linux and FreeBSD operating systems.
Why use a Telegraf plugin for ZFS?
The ZFS Telegraf Input Plugin gets metrics from ZFS file systems. It lets you collect statistics such as the number of cache hits and cache misses, the size of the cache, counts for when memory is low, and much more. Gathering these metrics lets you monitor the status of your ZFS file system. This gives you insight into the health of your computer. You can use Telegraf to send the data to InfluxDB or the endpoint of your choice to analyze performance over time or set up alerts for when memory gets too low.
How to monitor ZFS using the Telegraf plugin
If you’re using a Linux machine, you can specify your ZFS kstat path or leave it at the default
kstatPath = "/proc/spl/kstat/zfs"
The path does not need to be specified on FreeBSD machines.
By default this plugin collects all ZFS metrics, which are different for Linux and FreeBSD systems. For Linux the default is
kstatMetrics = ["arcstats", "zfetchstats", "vdev_cache_stats"]
and for FreeBSD the default is
kstatMetrics = ["abdstats", "arcstats", "dnodestats", "dbufcachestats", "dmu_tx", "fm", "vdev_mirror_stats", "zfetchstats", "zil"]
You can override this by setting the kstatMetrics array to the list of statistics you want to collect.
The ZFS Telegraf Input Plugin doesn’t collect zpool statistics by default, but you can set
poolMetrics = true
to change this. It also doesn’t collect data set statistics unless you set
datasetMetrics = true
On FreeBSD machines, if you have listsnapshots enabled in the pool property, Telegraf might have problems reading this output.
Key ZFS metrics to use for monitoring
Some of the important ZFS metrics that you should proactively monitor include:
ARC stats (FreeBSD and Linux)
- arcstats_allocated (FreeBSD only)
- arcstats_anon_evict_data (Linux only)
- arcstats_anon_evict_metadata (Linux only)
- arcstats_anon_evictable_data (FreeBSD only)
- arcstats_anon_evictable_metadata (FreeBSD only)
- arcstats_anon_size
- arcstats_arc_loaned_bytes (Linux only)
- arcstats_arc_meta_limit
- arcstats_arc_meta_max
- arcstats_arc_meta_min (FreeBSD only)
- arcstats_arc_meta_used
- arcstats_arc_no_grow (Linux only)
- arcstats_arc_prune (Linux only)
- arcstats_arc_tempreserve (Linux only)
- arcstats_c
- arcstats_c_max
- arcstats_c_min
- arcstats_data_size
- arcstats_deleted
- arcstats_demand_data_hits
- arcstats_demand_data_misses
- arcstats_demand_hit_predictive_prefetch (FreeBSD only)
- arcstats_demand_metadata_hits
- arcstats_demand_metadata_misses
- arcstats_duplicate_buffers
- arcstats_duplicate_buffers_size
- arcstats_duplicate_reads
- arcstats_evict_l2_cached
- arcstats_evict_l2_eligible
- arcstats_evict_l2_ineligible
- arcstats_evict_l2_skip (FreeBSD only)
- arcstats_evict_not_enough (FreeBSD only)
- arcstats_evict_skip
- arcstats_hash_chain_max
- arcstats_hash_chains
- arcstats_hash_collisions
- arcstats_hash_elements
- arcstats_hash_elements_max
- arcstats_hdr_size
- arcstats_hits
- arcstats_l2_abort_lowmem
- arcstats_l2_asize
- arcstats_l2_cdata_free_on_write
- arcstats_l2_cksum_bad
- arcstats_l2_compress_failures
- arcstats_l2_compress_successes
- arcstats_l2_compress_zeros
- arcstats_l2_evict_l1cached (FreeBSD only)
- arcstats_l2_evict_lock_retry
- arcstats_l2_evict_reading
- arcstats_l2_feeds
- arcstats_l2_free_on_write
- arcstats_l2_hdr_size
- arcstats_l2_hits
- arcstats_l2_io_error
- arcstats_l2_misses
- arcstats_l2_read_bytes
- arcstats_l2_rw_clash
- arcstats_l2_size
- arcstats_l2_write_buffer_bytes_scanned (FreeBSD only)
- arcstats_l2_write_buffer_iter (FreeBSD only)
- arcstats_l2_write_buffer_list_iter (FreeBSD only)
- arcstats_l2_write_buffer_list_null_iter (FreeBSD only)
- arcstats_l2_write_bytes
- arcstats_l2_write_full (FreeBSD only)
- arcstats_l2_write_in_l2 (FreeBSD only)
- arcstats_l2_write_io_in_progress (FreeBSD only)
- arcstats_l2_write_not_cacheable (FreeBSD only)
- arcstats_l2_write_passed_headroom (FreeBSD only)
- arcstats_l2_write_pios (FreeBSD only)
- arcstats_l2_write_spa_mismatch (FreeBSD only)
- arcstats_l2_write_trylock_fail (FreeBSD only)
- arcstats_l2_writes_done
- arcstats_l2_writes_error
- arcstats_l2_writes_hdr_miss (Linux only)
- arcstats_l2_writes_lock_retry (FreeBSD only)
- arcstats_l2_writes_sent
- arcstats_memory_direct_count (Linux only)
- arcstats_memory_indirect_count (Linux only)
- arcstats_memory_throttle_count
- arcstats_meta_size (Linux only)
- arcstats_mfu_evict_data (Linux only)
- arcstats_mfu_evict_metadata (Linux only)
- arcstats_mfu_ghost_evict_data (Linux only)
- arcstats_mfu_ghost_evict_metadata (Linux only)
- arcstats_metadata_size (FreeBSD only)
- arcstats_mfu_evictable_data (FreeBSD only)
- arcstats_mfu_evictable_metadata (FreeBSD only)
- arcstats_mfu_ghost_evictable_data (FreeBSD only)
- arcstats_mfu_ghost_evictable_metadata (FreeBSD only)
- arcstats_mfu_ghost_hits
- arcstats_mfu_ghost_size
- arcstats_mfu_hits
- arcstats_mfu_size
- arcstats_misses
- arcstats_mru_evict_data (Linux only)
- arcstats_mru_evict_metadata (Linux only)
- arcstats_mru_ghost_evict_data (Linux only)
- arcstats_mru_ghost_evict_metadata (Linux only)
- arcstats_mru_evictable_data (FreeBSD only)
- arcstats_mru_evictable_metadata (FreeBSD only)
- arcstats_mru_ghost_evictable_data (FreeBSD only)
- arcstats_mru_ghost_evictable_metadata (FreeBSD only)
- arcstats_mru_ghost_hits
- arcstats_mru_ghost_size
- arcstats_mru_hits
- arcstats_mru_size
- arcstats_mutex_miss
- arcstats_other_size
- arcstats_p
- arcstats_prefetch_data_hits
- arcstats_prefetch_data_misses
- arcstats_prefetch_metadata_hits
- arcstats_prefetch_metadata_misses
- arcstats_recycle_miss (Linux only)
- arcstats_size
- arcstats_sync_wait_for_async (FreeBSD only)
Zfetch stats (FreeBSD and Linux)
- zfetchstats_bogus_streams (Linux only)
- zfetchstats_colinear_hits (Linux only)
- zfetchstats_colinear_misses (Linux only)
- zfetchstats_hits
- zfetchstats_max_streams (FreeBSD only)
- zfetchstats_misses
- zfetchstats_reclaim_failures (Linux only)
- zfetchstats_reclaim_successes (Linux only)
- zfetchstats_streams_noresets (Linux only)
- zfetchstats_streams_resets (Linux only)
- zfetchstats_stride_hits (Linux only)
- zfetchstats_stride_misses (Linux only)
Vdev cache stats (FreeBSD)
- vdev_cache_stats_delegations
- vdev_cache_stats_hits
- vdev_cache_stats_misses
Pool metrics (optional)
On Linux (reference: kstat accumulated time and queue length statistics):
- zfs_pool
- nread (integer, bytes)
- nwritten (integer, bytes)
- reads (integer, count)
- writes (integer, count)
- wtime (integer, nanoseconds)
- wlentime (integer, queuelength * nanoseconds)
- wupdate (integer, timestamp)
- rtime (integer, nanoseconds)
- rlentime (integer, queuelength * nanoseconds)
- rupdate (integer, timestamp)
- wcnt (integer, count)
- rcnt (integer, count)
For ZFS >= 2.1.x the format has changed significantly:
- zfs_pool
- writes (integer, count)
- nwritten (integer, bytes)
- reads (integer, count)
- nread (integer, bytes)
- nunlinks (integer, count)
- nunlinked (integer, count)
On FreeBSD:
- zfs_pool
- allocated (integer, bytes)
- capacity (integer, bytes)
- dedupratio (float, ratio)
- free (integer, bytes)
- size (integer, bytes)
- fragmentation (integer, percent)
Dataset metrics (optional, only on FreeBSD)
- zfs_dataset
- avail (integer, bytes)
- used (integer, bytes)
- usedsnap (integer, bytes
- usedds (integer, bytes)
Tags
- ZFS stats (
zfs
) will have the following tag:- pools - A :: concatenated list of all ZFS pools on the machine.
- datasets - A :: concatenated list of all ZFS datasets on the machine.
- Pool metrics (
zfs_pool
) will have the following tag:- pool - with the name of the pool which the metrics are for.
- health - the health status of the pool. (FreeBSD only)
- dataset - ZFS >= 2.1.x only. (Linux only)
- Dataset metrics (
zfs_dataset
) will have the following tag:- dataset - with the name of the dataset which the metrics are for.