Two critically important columns are disk:transient and disk:intermediate.
Redshift, like most databases, has a problem with “killer queries”: queries which are improperly written and generate vast numbers of rows. Killer queries bring the cluster to a halt, and they do this by hammering shared resources, disk in particular, as disk is always the bottleneck in a database system.
From observing queries as they run, what seems to happen is that a query consumes the memory available in the slot, until none is left, and then dumps all this data from memory to disk, freeing up all memory - and this process repeats until the query is complete. This data is known as transient data, and the disk:transient column indicates the size, in bytes, of transient data in use by the query. I think that when the transient data is smaller than the memory available in the slot, none of it goes to disk; but as a query comes to generate more and more output, and exceeds the memory available in the slot, more and more is written to disk. In any event, the more rows the query generates, the more transient data. Killer queries run for a long time and generate a lot of rows, so the value for transient data becomes very large.
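As a sketch of how this could be watched for: wherever disk:transient is surfaced, a check for queries with very large transient data might look like the following. The view name, column names and threshold here are all hypothetical, standing in only for the shape of the check:

```sql
-- Hypothetical monitoring view exposing per-query transient bytes;
-- "monitoring.query_disk", its columns, and the 100 GiB threshold
-- are illustrative only.
select query_id,
       transient_bytes
from   monitoring.query_disk
where  transient_bytes > 100 * 1024::bigint * 1024 * 1024  -- ~100 GiB
order by transient_bytes desc;
```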
Next, there is the not quite defined concept of “spill”. In general, this is that the work being performed by a query needs more memory than is available, and so necessarily begins to use disk. Exactly what this means in practice is not clear. I think this is not the amount of transient data which is on disk, but I am not certain of that. I see there are some steps (hash and unique, for example) which in their system tables indicate whether they ran in memory or on disk; it might be that this is, either fully or in part, a measure of this disk use. Alternatively, or additionally, it may be that for any step which cannot allocate sufficient memory, disk ends up being used.
In any event, empirically, I see killer queries can end up with large values for spill. The amount of spill is recorded in the column disk:intermediate, and it is measured in blocks (as opposed to transient, which is in bytes).
Note that the data for spill comes from stv_query_metrics. This system table I normally avoid like the plague, but it is the only source of this information, and for the purpose of detecting killer queries it is, I think, adequate; as long as you’re aware of the issues with this system table, which I will now describe, and so can interpret what you see, it’s fine.
The table is actually pretty overloaded. It contains rows about queries, rows about segments, and rows about steps.
However, when it comes to killer queries, which are long-running, I believe we will be able to obtain meaningful and useful information from this table.
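Because the table mixes query-, segment- and step-level rows, any query against it must first pick out one level. A sketch, assuming (as I understand the AWS documentation to state) that query-level rows are those with segment = -1 and step_type = -1, and assuming blocks_to_disk is the column carrying the spill figure:

```sql
-- Query-level rows only: segment = -1 and step_type = -1 (assumed
-- convention); blocks_to_disk (assumed column name) is spill, in blocks.
select query,
       blocks_to_disk
from   stv_query_metrics
where  segment   = -1
and    step_type = -1
and    blocks_to_disk > 0
order by blocks_to_disk desc;
```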
In the case of a killer query, there is a column for pid, so log in as admin and cancel the process which owns the query, as this will abort the query.
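The cancel itself is done with the CANCEL command, which takes the process ID; a minimal sketch, where the pid value is illustrative only:

```sql
-- 18764 stands in for the pid found for the killer query; cancelling
-- the owning process aborts the query. Run as an admin user.
cancel 18764;
```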