Databases v1

Description

This page provides information about the databases in the cluster.

Databases in and of themselves are associated with very little information; their owner, and their MVCC method.

We can however examine the contents of each database, and produce totals for storage and processing, which provide a useful overview, which we see here.

Columns

Name	Type
database_id	int8
database	varchar
owner_user_id	int8
owner	varchar
mvcc	varchar
idle	interval
queries:worker:main	int8
queries:worker:csc	int8
store:blocks:sorted	int8
store:blocks:unsorted	int8
store:blocks:total	int8
store:rows:sorted	int8
store:rows:unsorted	int8
store:rows:total	int8
i/o:bytes read:disk	int8
i/o:bytes read:network	int8
i/o:rows:inserted	int8
i/o:rows:deleted	int8
i/o:rows:returned	int8
i/o:bytes processed:in memory	int8
i/o:bytes processed:on disk	int8

Column Descriptions

database_id

The database ID. This column is emitted in CSV exports only.

database

The database name.

owner_user_id

The owner user ID. This column is emitted in CSV exports only.

owner

The owner user name.

mvcc

The method used to implement MVCC.

Redshift is derived from ParAccel (sorry ParAccel devs, I gotta say, worse product name ever), which in turn derived from Postgres 8 in 2008. The method Postgres 8 used for MVCC was state of the art at that time. Shortly after though some bright spark figured out a new and better method, which is called snapshot isolation, and that went into Postgres 9, which came out in 2009.

Redshift used the Postgres 8 method since it was released in 2013, but, in 2022, introduced snapshot isolation. Clearly Redshift lives in Cincinnati.

By default databases use the original method. You must when creating a database explicitly specify the new method. You can convert an existing database to the new method, but you have to make the database completely idle first - no connections at all.

The possible values in this column then are;

Value	Meaning
serializable	the original method
snapshot	the fancy new method from 2009!

idle

Duration since the start of most recent table access in stl_delete, stl_insert and stl_scan, which is to say, the time since the table last had rows deleted, inserted, or was scanned. If empty, there are no records in any of these system tables within the time period being examined by the page.

queries:worker:main

The count of worker node queries completed on the cluster.

queries:worker:csc

The count of worker node queries completed by a CSC cluster.

store:blocks:sorted

The number of sorted blocks.

store:blocks:unsorted

The number of unsorted blocks.

store:blocks:total

The total number of blocks.

store:rows:sorted

The number of sorted rows.

store:rows:unsorted

The number of unsorted rows.

store:rows:total

The total number of rows.

i/o:bytes read:disk

This column then shows the total number of bytes read from disk, as best I can judge the types indicated in stl_scan.

i/o:bytes read:network

This column then shows the total number of bytes read from network, as best I can judge the types indicated in stl_scan.

Importantly, it only counts the receive side of network activity - the step is scan, after all, not broadcast, so we’re not counting bytes twice.

i/o:rows:inserted

The number of rows inserted into the table.

For tables with all distribution, this is the physical number of rows (i.e. one per node), not the logical number of rows.

i/o:rows:deleted

The number of rows deleted from the table.