Cluster v1

Description

This page provides information about the cluster.

Columns

Name	Type
ip	varchar
version	varchar
slices:partial	int2
slices:full	int2
disks:count	int2
disks:size	int8
disks:total	int8
queries:leader:main	int8
queries:worker:main	int8
queries:worker:csc	int8
store:blocks:sorted	int8
store:blocks:unsorted	int8
store:blocks:total	int8
store:rows:sorted	int8
store:rows:unsorted	int8
store:rows:total	int8
i/o:bytes read:disk	int8
i/o:bytes read:network	int8
i/o:rows:inserted	int8
i/o:rows:deleted	int8
i/o:rows:returned	int8
i/o:bytes processed:in memory	int8
i/o:bytes processed:on disk	int8

Column Descriptions

ip

The cluster host name or IP address, as entered in the connect form on the index page.

version

The version of Redshift running on the cluster.

slices:partial

The number of partial slices. These are compute slices in Amazon terminology.

See clusters, nodes and slices.

slices:full

The number of full slices. These are data slices in Amazon terminology.

See clusters, nodes and slices.

disks:count

The total number of disks.

disks:size

The size of the disks. All disks have the same size.

disks:total

The total disk size (count multiplied by size).

With ra3 node types, this is simply the amount of local disk, all of which is used in conjunction with RMS to store blocks. The local disk is almost a cache: as far as I can tell, and I have investigated, it behaves like an LRU cache, except that blocks are not copied from RMS (which would leave them in both RMS and the cache) but are moved from RMS. A block exists in only one place at a time. I suspect this is a consequence of legacy constraints in the code base.
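The move-rather-than-copy behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not how Redshift is implemented; the class name `MoveCache`, its methods, and the block ids are all hypothetical.

```python
from collections import OrderedDict


class MoveCache:
    """Toy model: a block lives either on local disk or in RMS, never both."""

    def __init__(self, capacity, rms_blocks):
        self.capacity = capacity      # number of blocks local disk can hold
        self.local = OrderedDict()    # block id -> data, least recently used first
        self.rms = dict(rms_blocks)   # block id -> data

    def read(self, block_id):
        if block_id in self.local:
            # Already local: just mark it as most recently used.
            self.local.move_to_end(block_id)
            return self.local[block_id]
        # Move, not copy: the block leaves RMS when it arrives locally.
        data = self.rms.pop(block_id)
        self.local[block_id] = data
        if len(self.local) > self.capacity:
            # Evict the least recently used block, moving it back to RMS.
            old_id, old_data = self.local.popitem(last=False)
            self.rms[old_id] = old_data
        return data


cache = MoveCache(capacity=2, rms_blocks={1: "a", 2: "b", 3: "c"})
cache.read(1)
cache.read(2)
cache.read(3)  # local disk is full, so block 1 moves back to RMS
```

At every step the two dictionaries share no keys, which is the property that distinguishes this from a conventional LRU cache.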

With the other node types, there is a significant difference between the store indicated in the node specification and the actual amount of store. The actual amount of store is over-provisioned, something like 10% to 15% larger than the store indicated in the specification. There is no indication in the system tables of the cluster node type, and so it is not possible to know the specification size.

As such, the value in this column is the over-provisioned size.
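As a rough worked example of the arithmetic, the sketch below computes the total from count and size, then backs out a plausible range for the specification size using the 10% to 15% over-provisioning estimate. The function names and the input figures are made up for illustration, the units are whatever the columns use, and the 10% to 15% range is only the approximate figure given above.

```python
def disks_total(count, size):
    # disks:total is disks:count multiplied by disks:size.
    return count * size


def spec_size_range(actual_total, low=0.10, high=0.15):
    # If actual = spec * (1 + over), then spec = actual / (1 + over).
    # The larger over-provisioning figure gives the smaller spec estimate.
    return actual_total / (1 + high), actual_total / (1 + low)


total = disks_total(count=4, size=1000)        # hypothetical values
spec_low, spec_high = spec_size_range(total)   # estimated specification size
```

The result is only a range, since the exact over-provisioning ratio is not known.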

queries:leader:main

The most accurate obtainable count of leader node queries, both running and completed, issued on the cluster.

As of 2022-12-25, there appears to be a regression in stv_recents, which is the only system table holding information about running leader node queries: running leader node queries are not shown, only completed ones.

Additionally, certain types of query (such as select) have, as far as I know, no record in the system tables other than in stv_recents, which holds only the most recent 100 queries.

As such, the information about completed leader node queries is fairly complete, but not fully so.
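To see why a 100-row limit makes the count incomplete, consider a toy model in which a bounded buffer stands in for stv_recents. This is an illustration only; the number of issued queries is hypothetical, and the real table is of course not a Python deque.

```python
from collections import deque

RECENTS_LIMIT = 100  # stv_recents holds only the most recent 100 queries

recents = deque(maxlen=RECENTS_LIMIT)

issued = 250  # hypothetical number of queries recorded only in stv_recents
for query_id in range(issued):
    recents.append(query_id)  # the oldest entry is dropped once the buffer is full

visible = len(recents)     # what a reader of the table can still see
missed = issued - visible  # queries that can no longer be counted
```

Any query that has aged out of the buffer before the table is read is lost to the count.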

queries:worker:main

The count of worker node queries completed on the cluster.

queries:worker:csc

The count of worker node queries completed by a CSC cluster.

store:blocks:sorted

The number of sorted blocks.

store:blocks:unsorted

The number of unsorted blocks.

store:blocks:total

The total number of blocks.

store:rows:sorted

The number of sorted rows.

store:rows:unsorted

The number of unsorted rows.

store:rows:total

The total number of rows.

i/o:bytes read:disk

The total number of bytes read from disk, as best I can judge from the types indicated in stl_scan.

i/o:bytes read:network

The total number of bytes read from network, as best I can judge from the types indicated in stl_scan.

Importantly, it only counts the receive side of network activity - the step is scan, after all, not broadcast, so we’re not counting bytes twice.

i/o:rows:inserted

The number of rows inserted into tables on the cluster.

For tables with all distribution, this is the physical number of rows (i.e. one per node), not the logical number of rows.
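The difference between physical and logical rows for all distribution can be sketched as follows. The function name, the `diststyle` strings, and the figures are hypothetical, and the sketch assumes exactly one copy of each row per node, as described above.

```python
def physical_rows(logical_rows, nodes, diststyle):
    # With all distribution, every node holds its own copy of each row,
    # so the physical count is the logical count times the number of nodes.
    if diststyle == "all":
        return logical_rows * nodes
    # With other distribution styles, each row is stored once.
    return logical_rows


# Inserting 1000 logical rows on a hypothetical 4-node cluster.
all_dist = physical_rows(1000, nodes=4, diststyle="all")
key_dist = physical_rows(1000, nodes=4, diststyle="key")
```

So on a 4-node cluster, the same 1000-row insert counts as 4000 rows for a table with all distribution and 1000 rows otherwise.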

i/o:rows:deleted

The number of rows deleted from tables on the cluster.