This page provides information about the databases in the cluster.
Databases in and of themselves are associated with very little information; their owner, and their MVCC method.
We can however examine the contents of each database, and produce totals for storage and processing, which provide a useful overview, which we see here.
Name | Type |
---|---|
database_id | int8 |
database | varchar |
owner_user_id | int8 |
owner | varchar |
mvcc | varchar |
idle | interval |
queries:worker:main | int8 |
queries:worker:csc | int8 |
store:blocks:sorted | int8 |
store:blocks:unsorted | int8 |
store:blocks:total | int8 |
store:rows:sorted | int8 |
store:rows:unsorted | int8 |
store:rows:total | int8 |
i/o:bytes read:disk | int8 |
i/o:bytes read:network | int8 |
i/o:rows:inserted | int8 |
i/o:rows:deleted | int8 |
i/o:rows:returned | int8 |
i/o:bytes processed:in memory | int8 |
i/o:bytes processed:on disk | int8 |
The database ID. This column is emitted in CSV exports only.
The database name.
The owner user ID. This column is emitted in CSV exports only.
The owner user name.
The method used to implement MVCC.
Redshift is derived from ParAccel (sorry ParAccel devs, I gotta say, worse product name ever), which in turn derived from Postgres 8 in 2008. The method Postgres 8 used for MVCC was state of the art at that time. Shortly after though some bright spark figured out a new and better method, which is called snapshot isolation, and that went into Postgres 9, which came out in 2009.
Redshift used the Postgres 8 method since it was released in 2013, but, in 2022, introduced snapshot isolation. Clearly Redshift lives in Cincinnati.
By default databases use the original method. You must when creating a database explicitly specify the new method. You can convert an existing database to the new method, but you have to make the database completely idle first - no connections at all.
The possible values in this column then are;
Value | Meaning |
---|---|
serializable | the original method |
snapshot | the fancy new method from 2009! |
Duration since the start of most recent table access in
stl_delete
, stl_insert
and
stl_scan
, which is to say, the time since the table last
had rows deleted, inserted, or was scanned. If empty, there are no
records in any of these system tables within the time period being
examined by the page.
The count of worker node queries completed on the cluster.
The count of worker node queries completed by a CSC cluster.
The number of sorted blocks.
The number of unsorted blocks.
The total number of blocks.
The number of sorted rows.
The number of unsorted rows.
The total number of rows.
This column then shows the total number of bytes read from disk, as
best I can judge the types indicated in stl_scan
.
This column then shows the total number of bytes read from network,
as best I can judge the types indicated in stl_scan
.
Importantly, it only counts the receive side of network
activity - the step is scan
, after all, not
broadcast
, so we’re not counting bytes twice.
The number of rows inserted into the table.
For tables with all
distribution, this is the physical
number of rows (i.e. one per node), not the logical number of rows.
The number of rows deleted from the table.
For tables with all
distribution, this is the physical
number of rows (i.e. one per node), not the logical number of rows.
The number of rows returned from the leader node to the SQL client.
This column then shows the total number of bytes processed by the
stl_aggr
, stl_hash
, stl_save
,
stl_sort
and stl_unique
steps, when running in
memory rather than on disk.
This column then shows the total number of bytes processed by the
stl_aggr
, stl_hash
, stl_save
,
stl_sort
and stl_unique
steps, when running on
disk rather than in memory.