Queues v1

Description

This source contains information about WLM queues.

The Redshift docs, especially in the system tables, refer to queues as “service classes”, without anywhere, as far as I know, explaining what a service class is. I just call them queues.

On this page, there is one row per queue. Each row gives the queue name, the query timeout, a boolean which indicates whether the queue, when full, will start using concurrency scaling clusters, the number of slots in use, and the total number of slots.

As you might expect, when a queue has queued queries, the number of used slots for that queue normally equals the total number of slots.
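As a rough sketch of how a row might be read, the following Python classifies a queue from three of the columns on this page. The `QueueRow` class and the `describe` function are hypothetical names invented for illustration; they are not part of Redshift or of this page's output.

```python
from dataclasses import dataclass


@dataclass
class QueueRow:
    """Hypothetical holder for three columns from one row of this page."""
    slots_in_use: int
    slots_total: int
    queries_queued: int


def describe(row: QueueRow) -> str:
    """Classify a queue by its slot usage and queued query count."""
    free = row.slots_total - row.slots_in_use
    if row.queries_queued == 0:
        return "idle" if row.slots_in_use == 0 else "running, nothing queued"
    if free == 0:
        # The normal case: queries queue because every slot is taken.
        return "full, queries queued"
    # The unusual case: something other than slot count is holding queries back.
    return "queries queued despite free slots"
```

For example, `describe(QueueRow(5, 5, 3))` gives "full, queries queued", which is the normal state for a queue with queued queries.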

Python Functions

The exception is queries which use Python functions. These have additional queuing rules, which are hard to reason about, as they set a cluster-wide maximum on the number of concurrently executing queries which use Python functions. When this limit is reached, queries which use Python functions queue, even if there are available slots in their queue. The interaction between this cluster-wide limit and the individual queues is not clear.

In any event, I have benchmarked Python functions as being 300 times slower than SQL functions, so try not to use them.

  1. Critique of AutoWLM
  2. Memory per slot
  3. Volume of records
  4. Clusters, nodes and slices

Columns

Name                  Type
queue_id              int2
queue                 varchar
slots:in use          int2
slots:total           int2
memory per slot:max   int8
memory per slot:min   int8
timeout               interval
csc                   bool
idle                  interval
queries:queued        int8
queries:worker:main   int8
queries:worker:csc    int8

Column Descriptions

queue_id

The queue ID. This column is emitted in CSV exports only.

queue

The queue name.

The default queue is played by Clint Eastwood, for it has no name. I’ve called it “default”.

slots:in use

The number of slots in use. Queues being what they are, this is usually either zero (in which case the number of queued queries is also zero) or equal to the number of slots in the queue (in which case there are queued queries).

slots:total

The number of slots in this queue.

memory per slot:max

The largest amount of memory, in megabytes, available on a slice to a slot in this queue. See memory per slot.

memory per slot:min

The smallest amount of memory, in megabytes, available on a slice to a slot in this queue. See memory per slot.

timeout

Query timeouts are, frankly, a complicated mess.

There are a number of timeout mechanisms; this column gives the timeout value which is set on a per-queue basis. When a query has been executing for longer than the timeout, it is aborted. The timeout does not cover time spent queuing, only the time taken for compilation and execution.
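To make the rule concrete, here is a minimal sketch of the per-queue timeout check as described above. The function name and its parameters are hypothetical and purely illustrative; this is not how Redshift implements the check, only a restatement of which time counts against the timeout.

```python
from datetime import timedelta


def exceeds_queue_timeout(
    queued: timedelta,
    compiling: timedelta,
    executing: timedelta,
    timeout: timedelta,
) -> bool:
    """Return True if the query would be aborted by the per-queue timeout.

    Time spent queued is accepted as an argument but deliberately ignored,
    since only compilation and execution time count against the timeout.
    """
    return compiling + executing > timeout
```

So a query which waits ten minutes in the queue, then compiles and executes in 55 seconds, is not aborted by a one-minute queue timeout.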

csc

True if concurrency scaling clusters (CSC) are enabled for this queue; false otherwise.

idle

Time since a query was last queued into this queue.

queries:queued

Number of queued queries.

Queries which use Python functions have special rules which restrict how many can execute concurrently, so it is possible for this value to be greater than zero while the number of free slots is also greater than zero.

Also, a query can be configured, when issued, to use more than one slot. When such a query cannot run because there are not enough free slots for it, the number of queued queries can likewise be greater than zero while the number of free slots is also greater than zero.
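The two cases above can be sketched as a single admission check. This is a simplified model under stated assumptions, not Redshift's actual scheduler: the function `can_start` and all of its parameters are hypothetical, and `python_query_limit` in particular stands in for a cluster-wide limit whose real value and interaction with the queues are not clear. In Redshift, the number of slots a query takes is, as far as I know, set per session with the `wlm_query_slot_count` parameter.

```python
def can_start(
    free_slots: int,
    slots_needed: int,
    uses_python_function: bool,
    python_queries_running: int,
    python_query_limit: int,
) -> bool:
    """Return True if a query could start now, under this simplified model."""
    # A multi-slot query must wait until enough slots are free at once.
    if slots_needed > free_slots:
        return False
    # Hypothetical cluster-wide cap on queries which use Python functions.
    if uses_python_function and python_queries_running >= python_query_limit:
        return False
    return True
```

In both refusal cases the query is counted as queued even though `free_slots` is greater than zero, which is how the two columns can be non-zero together.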

queries:worker:main

The count of worker node queries completed on the main cluster.

queries:worker:csc

The count of worker node queries completed by a CSC cluster.