Cluster v1

Description

This page provides information about the cluster.

  1. Limitations of the queries, store and I/O columns
  2. Volume of records

Columns

Name                           Type
ip                             varchar
version                        varchar
slices:partial                 int2
slices:full                    int2
disks:count                    int2
disks:size                     int8
disks:total                    int8
queries:leader:main            int8
queries:worker:main            int8
queries:worker:csc             int8
store:blocks:sorted            int8
store:blocks:unsorted          int8
store:blocks:total             int8
store:rows:sorted              int8
store:rows:unsorted            int8
store:rows:total               int8
i/o:bytes read:disk            int8
i/o:bytes read:network         int8
i/o:rows:inserted              int8
i/o:rows:deleted               int8
i/o:rows:returned              int8
i/o:bytes processed:in memory  int8
i/o:bytes processed:on disk    int8

Column Descriptions

ip

The cluster IP name or number, as provided in the index page connect form.

version

The version of Redshift running on the cluster.

slices:partial

The number of partial slices. These are compute slices in Amazon terminology.

See clusters, nodes and slices.

slices:full

The number of full slices. These are data slices in Amazon terminology.

See clusters, nodes and slices.

disks:count

The total number of disks.

disks:size

The size of the disks. All disks have the same size.

disks:total

The total size of disk (count multiplied by size).

With ra3 node types, this is simply the amount of local disk, all of which is used in conjunction with RMS to store blocks. It is almost a cache: as far as I can tell, and I have investigated, it behaves like an LRU cache, except that blocks are not copied from RMS (and so would exist both in RMS and in the cache) but are moved from RMS. A block exists in only one place at a time. I suspect this is a consequence of legacy constraints in the code base.

With the other node types, there is a significant difference between the store indicated in the node specification and the actual amount of store. The actual amount of store is over-provisioned, something like 10% to 15% larger than the store indicated in the specification. There is no indication in the system tables of the cluster node type, and so it is not possible to know the specification size.

As such, the value in this column is the over-provisioned size.
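
As a rough illustration of the arithmetic above, the following Python sketch computes disks:total from disks:count and disks:size, and estimates a range for the specification size. It assumes the over-provisioning really is between 10% and 15%, which is only the approximate figure given above, so the result is an estimate and not a way to recover the real specification. The function names and the example figures are hypothetical.

```python
from typing import Tuple


def disks_total(count: int, size: int) -> int:
    """Total disk size: the number of disks multiplied by the size of each disk."""
    return count * size


def estimated_spec_range(total: int, low: float = 0.10, high: float = 0.15) -> Tuple[float, float]:
    """Rough range for the specification size, assuming the actual store is
    between `low` and `high` larger than the specification (an approximation)."""
    return total / (1.0 + high), total / (1.0 + low)


# Hypothetical figures: 4 disks of 1000 units each.
total = disks_total(4, 1000)
spec_low, spec_high = estimated_spec_range(total)
print(total, round(spec_low), round(spec_high))
```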

queries:leader:main

The most accurate obtainable count of leader node queries, both running and completed, issued by the users in this group.

Currently (2022-12-25) there appears to be a regression in stv_recents, which is the only system table holding information about running leader node queries, such that running leader node queries are not shown; only completed queries.

Additionally, certain types of query (such as select) have, as far as I know, no record in the system tables other than in stv_recents, which holds only the most recent 100 queries.

As such, the count of completed leader node queries is close to complete, but not fully so.

queries:worker:main

The count of worker node queries completed on the cluster.

queries:worker:csc

The count of worker node queries completed by a CSC cluster.

store:blocks:sorted

The number of sorted blocks.

store:blocks:unsorted

The number of unsorted blocks.

store:blocks:total

The total number of blocks.

store:rows:sorted

The number of sorted rows.

store:rows:unsorted

The number of unsorted rows.

store:rows:total

The total number of rows.
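
A minimal sketch of how the sorted proportion of the store might be derived from these columns. It assumes each total column is the sum of its sorted and unsorted columns, which the names suggest but this page does not state outright. The function name and figures are hypothetical.

```python
def sorted_fraction(sorted_count: int, unsorted_count: int) -> float:
    """Fraction of blocks (or rows) that are sorted, in the range 0.0 to 1.0."""
    total = sorted_count + unsorted_count  # assumed equal to the :total column
    if total == 0:
        return 0.0  # empty store: avoid dividing by zero
    return sorted_count / total


# Hypothetical figures: 75 sorted blocks and 25 unsorted blocks.
print(sorted_fraction(75, 25))
```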

i/o:bytes read:disk

The total number of bytes read from disk, as best I can judge from the types indicated in stl_scan.

i/o:bytes read:network

The total number of bytes read from network, as best I can judge from the types indicated in stl_scan.

Importantly, it only counts the receive side of network activity - the step is scan, after all, not broadcast, so we’re not counting bytes twice.

i/o:rows:inserted

The number of rows inserted into the table.

For tables with all distribution, this is the physical number of rows (i.e. one per node), not the logical number of rows.

i/o:rows:deleted

The number of rows deleted from the table.

For tables with all distribution, this is the physical number of rows (i.e. one per node), not the logical number of rows.
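
To illustrate the difference between physical and logical rows in the inserted and deleted counts, here is a small sketch. It assumes, as described above, that a table with all distribution stores one copy of each row per node. The function name and the numbers are hypothetical.

```python
def physical_rows(logical_rows: int, nodes: int, all_distribution: bool) -> int:
    """Rows as counted by these columns: one copy per node for all distribution,
    otherwise one copy in total."""
    return logical_rows * nodes if all_distribution else logical_rows


# Hypothetical example: inserting 1000 rows on a 4 node cluster.
print(physical_rows(1000, 4, all_distribution=True))
print(physical_rows(1000, 4, all_distribution=False))
```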

i/o:rows:returned

The number of rows returned from the leader node to the SQL client.

i/o:bytes processed:in memory

The total number of bytes processed by the stl_aggr, stl_hash, stl_save, stl_sort and stl_unique steps, when running in memory rather than on disk.

i/o:bytes processed:on disk

The total number of bytes processed by the stl_aggr, stl_hash, stl_save, stl_sort and stl_unique steps, when running on disk rather than in memory.
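
The split between the in memory and on disk columns can be sketched as follows. This is an illustration only, not the query behind this page. It assumes each step row carries a byte count and a flag saying whether the step ran on disk, along the lines of the bytes and is_diskbased columns in those system tables. The StepRow type, its field names and the sample rows are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class StepRow:
    step: str             # hypothetical label, e.g. "sort" for a row from stl_sort
    bytes_processed: int  # bytes processed by the step
    on_disk: bool         # True if the step ran on disk rather than in memory


def split_bytes(rows: Iterable[StepRow]) -> Tuple[int, int]:
    """Return (in_memory_bytes, on_disk_bytes) summed over all step rows."""
    in_memory = 0
    on_disk = 0
    for row in rows:
        if row.on_disk:
            on_disk += row.bytes_processed
        else:
            in_memory += row.bytes_processed
    return in_memory, on_disk


# Hypothetical sample rows.
rows = [
    StepRow("sort", 1000, False),
    StepRow("hash", 500, True),
    StepRow("aggr", 250, False),
]
print(split_bytes(rows))
```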