Sessions v1

Description

This page provides information about sessions.

When a user connects to Redshift, a process is spawned, and that process is the session. A session is a process, and so it has a pid (process ID) and that is the pid you see scattered around the system tables.

A session at any one time is connected to one and only one database; to connect to Redshift, you have to specify a database in the connection string, and it is that database which the session initially connects to. There is no such thing as a session which is not connected to a database. Once connected, you can change the database you are connected to.

Sessions then have a few properties of interest; the user who has connected, which database they are currently connected to, how long the session has been in existence for, the session timeout (which appears not to be applied to certain sessions where Redshift connects to itself, something that appears to happen as part of the queries Redshift itself issues on an ongoing basis as part of running itself), and, finally, how long the session has been idle for.

The first four are all taken directly from stv_sessions, but last property, the idle time, I figure out, by finding the most recent query (of any kind - leader node or worker node) issued on the session, and taking its end time as the moment of last activity.

Note that the column starttime, which is in stv_sessions and indicates when a session began, although a timestamp, appears to be truncated to one second resolution; it never has a fractional part, and this is reflected in the duration and idle times.

(I actually also strip the fractional part from all timestamps and intervals in this page, for ease of reading, as I think the fraction part of a second is not important for sessions.)

Finally, the system table which carries session information, stv_sessions, provides only 50 bytes of the database name and the user name. Both these names in fact have a maximum length of 127 characters. As such, the joins the underlying query this page uses to obtain information will go wrong if there are multiple database and/or user names which are identical over their first 50 bytes.

(Note I say bytes, not characters - the columns have the data type char, but in fact are storing UTF-8. You get the first 50 bytes, whether or not the last UTF-8 character ends up being truncated.)

Columns

Name Type
user_id int4
user varchar
pid int4
database_id int8
database varchar
connected at timestamp
duration interval
idle interval
timeout interval

Column Descriptions

user_id

The user ID. This column is emitted in CSV exports only.

user

The user name.

Note stv_sessions stores only the first 50 bytes of the 127 possible bytes of the user name.

pid

The process ID.

When a user connects to the cluster, a process is spawned, which accepts that connection. That’s the process the ID is for. That process is also the process which issues queries, and when you need to cancel a query, you in fact cancel the process, which by being cancelled, inherently also cancels the query it is running.

database_id

The database ID. This column is emitted in CSV exports only.

database

The database name.

This is the database name the session is currently connected to (rather than always being the database specified in the initial connection).

Note stv_sessions stores only the first 50 bytes of the 127 possible bytes of the database name.

connected at

The timestamp of the time the connection was made to the database, rather than to the cluster, and as such this timestamp is set when an existing session changes database.

duration

Duration of the session in minutes.

In stv_sessions, the column starttime is a timestamp, but it has no fractional second, and when used with any other timestamp, the other timestamp loses its fractional second, too!

As such, duration has a resolution of one second. The fractional part seen in this column is the result of the division by 60, to obtain minutes.

idle

Duration since the completion of the most recent query issued by the session, in seconds.

timeout

Period of time until a connection is automatically disconnected.

Connections made by rdsdb, the root user owned by AWS, do not appear to honour the timeout, even though it is set.