Groups (cluster) v1

Description

Groups have two properties; the set of users who are members, and the set of privileges granted to the group.

Groups can never be the owner of an object, or issue queries, or anything like that; all there is, is which users are in the group, and which privileges are granted to the group.

However, if we take the set of users in a group and aggregate information about those users, we can quite reasonably say this is information about that group, and this is the approach I’ve taken.

  1. Limitations of the queries, store and I/O columns
  2. Volume of records

Columns

Name Type
group_id int4
group varchar
count:users int8
idle interval
queries:leader:main int8
queries:worker:main int8
queries:worker:csc int8
i/o:bytes read:disk int8
i/o:bytes read:network int8
i/o:rows:inserted int8
i/o:rows:deleted int8
i/o:rows:returned int8
i/o:bytes processed:in memory int8
i/o:bytes processed:on disk int8
related pages varchar

Column Descriptions

group_id

The group ID. This column is emitted in CSV exports only.

Group IDs are unique across all databases.

group

The group name.

count:users

The number of users in this group.

idle

Duration since the start of most recent table access in stl_delete, stl_insert and stl_scan, which is to say, the time since the table last had rows deleted, inserted, or was scanned. If empty, there are no records in any of these system tables within the time period being examined by the page.

queries:leader:main

The most accurate obtainable count of leader node queries, both running and completed, issued by the users in this group.

Currently (2022-12-25) there appears to be a regression in stv_recents, which is the only system table holding information about running leader node queries, such that running leader node queries are not shown; only completed queries.

Additionally, certain types of query (such as select) have, as far as I know, no record in the system tables other than in stv_recents, which holds only the most recent 100 queries.

As such the information about completed leader node queries, although fairly complete, is not fully complete.

queries:worker:main

The count of worker node queries completed on the cluster.

queries:worker:csc

The count of worker node queries completed by a CSC cluster.

i/o:bytes read:disk

This column then shows the total number of bytes read from disk, as best I can judge the types indicated in stl_scan.

i/o:bytes read:network

This column then shows the total number of bytes read from network, as best I can judge the types indicated in stl_scan.

Importantly, it only counts the receive side of network activity - the step is scan, after all, not broadcast, so we’re not counting bytes twice.

i/o:rows:inserted

The number of rows inserted into the table.

For tables with all distribution, this is the physical number of rows (i.e. one per node), not the logical number of rows.

i/o:rows:deleted

The number of rows deleted from the table.

For tables with all distribution, this is the physical number of rows (i.e. one per node), not the logical number of rows.

i/o:rows:returned

The number of rows returned from the leader node to the SQL client.

i/o:bytes processed:in memory

This column then shows the total number of bytes processed by the stl_aggr, stl_hash, stl_save, stl_sort and stl_unique steps, when running in memory rather than on disk.

i/o:bytes processed:on disk

This column then shows the total number of bytes processed by the stl_aggr, stl_hash, stl_save, stl_sort and stl_unique steps, when running on disk rather than in memory.