For the last two weeks or so, a few node types in a few regions have not been starting.
I’ve listed the non-starters in the table.

- Very slow disk-read-write and network, benchmarks about 3x longer than usual.
- Network benchmark back to normal (~4 seconds), from 2x this.
- Very slow disk-read-write, 13s vs the normal 3 to 4s. Processor also slow (4s vs 3s), which is very unusual.
- Very slow disk-read-write, 9.5s vs the normal 3s.
- Network slow, 6s vs the normal 4.5s.
- Disk-read-write slow, 7s vs 3.5s.
- Failed to start node type.
- Failed to start node type.
- Failed to start node type.
- Failed to start node type.
- Failed to start node type.
Redshift is composed of a single leader node and then two or more worker nodes (don’t use single node clusters).
The leader node is Postgres 8, plus a ton of new functionality to control worker nodes.
The worker nodes are completely new, nothing to do with Postgres.
By default, a query runs on, and only on, the leader node. Such a query does not go into WLM, and so is not limited by the number of WLM slots.
If a query requires resources on worker nodes (which is usually to say, it uses tables created by users), then the query goes into WLM and runs on the worker nodes.
It can sometimes be useful to identify whether a query is running on the leader node only, and this can be done with certain queries, because their output differs between the leader node and the worker nodes.
The method I’ve always used is to print a BC date. The leader node prints in a different format to worker nodes (and indeed, worker nodes cannot ingest their own output format!)
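A sketch of the trick (some_user_table is a placeholder; the exact output formats are not reproduced here, as the point is only that they differ):

```sql
-- No user tables referenced: runs on, and only on, the leader node.
SELECT '2000-01-01 BC'::date;

-- Referencing a user table forces the query into WLM and onto the
-- worker nodes, where the same BC date prints in a different format.
SELECT '2000-01-01 BC'::date FROM some_user_table LIMIT 1;
```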
A new way recently came to light, which is to do with support for the interval data type.
Leader nodes, being Postgres, support the full range of intervals.
Worker nodes, being much more minimal in implementation, support intervals only up to and including weeks.
If you issue a query using an interval of more than a week, it will run successfully on the leader node, but fail on worker nodes - it’s a second easy method to tell the two apart.
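A sketch of this second method (again, some_user_table is a placeholder):

```sql
-- Leader node only: an interval of more than a week is accepted.
SELECT interval '1 month';

-- The same interval in a query which touches a user table is sent
-- to the worker nodes, and fails there.
SELECT interval '1 month' FROM some_user_table LIMIT 1;
```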
Working on a new PDF, first in a while.
Looking at the performance impact of compute slices.
Taking a two node cluster, benchmarking it, then elastic resize to four nodes, benchmark again. Then make a four node cluster, and benchmark that.
Then see what we get - a measure of performance loss from elastic resize.
I’m benchmarking disk-read-write only right now. A comprehensive test would need to look at which steps compute slices participate in, which isn’t published, and then look at the performance loss for those steps.
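The performance-loss measure is just the relative slowdown of the elastically-resized cluster against the natively-created one; a minimal sketch, with placeholder timings rather than measurements:

```python
# Hypothetical benchmark timings in seconds (placeholders, not results).
elastic_resized_four_node_s = 5.0  # two-node cluster elastic-resized to four nodes
native_four_node_s = 4.0           # cluster created with four nodes from the start

# Performance loss from elastic resize, relative to the native cluster.
loss = (elastic_resized_four_node_s - native_four_node_s) / native_four_node_s
print(f"{loss:.0%}")
```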
This is superb.
Europe Must Urgently Prepare to Deter Russia Without Large-Scale US Support
ra3 Bring-Up Problems
Trying to write a new PDF, but running into a lot of problems getting an
ra3 cluster up. Multiple regions are unable to bring one up.
dc2 Bring-Up Problems
I just ran a cross-region version check. Failed to bring up a dc2 cluster.
Just nailed down how the function which provides late-binding view info slows down to the point where it’s effectively frozen.
PDF tomorrow (Monday).
There are in Redshift two types of view, normal and late-binding.

When the leader node creates a normal view, it checks, at creation time, pg_class for the tables and views being used by the view, and will not create the view if it refers to non-existent tables or views. Accordingly, external tables cannot be used with normal views, because external tables are not present in pg_class and so are perceived as being non-existent. To work around this, late-binding views were created, which perform no dependency checking - they do not inspect pg_class. This approach has two major and three minor issues:

1. The lack of dependency checking means blunders when creating or modifying views lead to faults which can become apparent only at a very different time, in a very different place, to a very different person, who has no idea what caused the fault.
2. As the number and complexity of late-binding views increase, there are critical performance problems when enumerating the columns in late-binding views, such that it is possible for it to become impossible to enumerate late-binding view columns.
3. A system table function is used to enumerate late-binding view columns, and access privileges for this function are incorrectly implemented, such that syslog unrestricted and access system table do not work.
4. The lack of dependency information means no records in pg_depend, so it is impossible to perform data lineage or inspect dependencies.
5. Late-binding views pollute pg_class, as they look like normal views (except for a single undocumented difference).

Late-binding views should, due to their flaws, be used for and only for their one essential purpose: when an external table must be in a view.
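A minimal sketch of the syntax, with placeholder names and assuming a Spectrum external schema called spectrum_schema:

```sql
-- A late-binding view is created by appending WITH NO SCHEMA BINDING.
-- No dependency checking is performed, so the referenced objects are
-- not looked up in pg_class at creation time.
CREATE VIEW v_events AS
SELECT event_id, event_ts
FROM spectrum_schema.events
WITH NO SCHEMA BINDING;
```

Note that everything a late-binding view references must be schema-qualified, precisely because nothing is resolved at creation time.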
New Redshift is out.
28 new system tables, 13 new system table views, 84 new system table functions.
With Redshift releases, normally there are two or three releases with no system table changes, and then a release with a bunch of changes - this is such a release, and the first since I improved the system table diff page, which finally comes into its own: it’s now easy to examine the changes.
We can also look at the official changes, here, so let’s begin with those and then look directly at what’s changed in the system tables.
Changes CURRENT_USER to no longer truncate the returned username to 64 characters.
Oooof. I didn’t know this problem existed! Glad that’s been fixed.
Adds OBJECT_TRANSFORM SQL function. For more information, see OBJECT_TRANSFORM function in the Amazon Redshift Database Developer Guide.
Ahhhhh - as we will see when we look at the system tables, this
explains the mass of new functions with names like
try_float(). BTW, the docs
don’t say it, but this is in preview. It’s not GA.
Adds the INTERVAL data type.
Oooooooooh. Nice. Good. This immediately raises the question of how intervals are sorted. They are probably 8 bytes in length, and count microseconds, so hopefully the sorting value is a direct copy of the interval value.
Hmm. IIRC, worker nodes only support intervals up to days or weeks, not longer time periods. That restriction may well still be in play, which would reduce the usefulness of this change.
Adds CONTINUE_HANDLER, which is a type of exception handler that controls the flow of a stored procedure. Using it, you can catch and handle exceptions without ending the existing statement block.
Ah, that’s good. This was a major piece of missing functionality, both in and of itself because you need it, and because Postgres et al can do this, so moving code over would require rewrites (which might be fundamental rewrites).
Adds the ability to define permissions on a scope (schema or database) in addition to individual objects. This allows users and roles to be granted a permission on all current and future objects within the scope.
Hmm. You can do this already, with default privileges. This is a second mechanism to do the same thing. I’m normally not impressed with what the devs produce for RS, so I fear the worst.
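For comparison, the existing mechanism - default privileges - looks like this (schema and user names are placeholders):

```sql
-- All tables created in schema reporting from now on by the current
-- user are readable by analyst...
ALTER DEFAULT PRIVILEGES IN SCHEMA reporting
  GRANT SELECT ON TABLES TO analyst;

-- ...but tables which already exist need a one-off grant.
GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO analyst;
```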
Changes external functions to now implicitly cast numbers with or without fractional parts to the numeric data type of the column. For int2, int4, and int8 columns, numbers with fractional digits are accepted by truncating unless the number is out of range. For float4 and float8 columns, numbers are accepted without fractional digits.
I may be wrong, but I think this is bad. This is reducing the strength of typing, which is not what you want.
Okay, now on to the system table changes, here;
There’s a whole ton of functions with names like try_int2; I guess these are for OBJECT_TRANSFORM. Curious, because that information is already in a system table (unless I’m about to find it’s been removed).
A couple of new functions for intervals, such as intervald2s. This hints that the limitation of worker nodes to days or weeks (whichever it was) has been lifted.
Two very curious functions, one being _pg_interval_type(). This implies more than one type of interval. (The official docs have not been updated, so there’s no info on this in Data Types.)
Now, new tables.
I’ve had code of my own producing this table for years. I wonder if this new table actually has all objects in it, or just some? What I do see is that FINALLY the devs have included OIDs. Historically, this has not been done - you’d just get the object names - and you’d then need to look up the OIDs yourself (which was not always possible; you might have a table name but no schema name, or the names might be truncated to shorter than their normal maximum length, so you’d potentially have a partial name only).
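The lookup which used to be necessary by hand is the usual pg_class/pg_namespace join (names here are placeholders) - and it is exactly this which fails when you have only a partial or unqualified name:

```sql
-- Resolve a table name to its OID; needs both schema and table name.
SELECT c.oid
FROM pg_class     AS c
JOIN pg_namespace AS n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relname = 'my_table';
```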
There’s a matching new table for attributes.
In Postgres, an attribute is a column. This table lists all columns of column-possessing objects, with the object identified by its OID. There’s a table which does this already, pg_attribute - but that table has normal tables and normal views only. I wonder if this new table has external tables and late-binding views and so on? In which case, why not just enhance the existing table? Why have two?
Some new tables contain information specifically about Sage (presumably SageMaker), which seems a bit odd; a database shouldn’t know specifically about external data source entities, surely?
Curious. There are some new tables to do with “undo”.
What’s that all about?
Next, modified tables;
*_aggr tables have a new column
dist_by_distinct. Intriguing. This will relate to
distribution of rows, for the aggregation, and I would guess whether or
not distribution gave every distinct value to the same slice - but I
could be wrong, because the data type is
boolean. Is this new functionality, an improvement, or now
simply publishing some state about behaviour which already
Curious little change to one of the Arcadia billing tables; Arcadia billing is about Serverless billing. The new column is force_billable_state. Sounds like something has changed a bit with Serverless billing.
An ARN column (and I think probably globally) has changed from char(100) to char(128). Officially, the maximum possible length of an ARN is 2048 characters, so, shrug. Maybe RS ARNs are always shorter.
The ARN column in stcs_burst_manager_personalization (and I think probably globally) has changed from char(120) (one hundred and twenty, not one hundred) to char(128). Ah - looks like someone noticed ARN field lengths were not consistent and they’ve all (hopefully all) been changed to 128.
stv_exec_state has a new column, which is not present in the official docs. It’s showing up in one or two other modified tables, too.
Now, new views.
There’s stl_auto_undo_events, which provides the public face to the various “undo” objects seen previously.
There’s also a cute little view to do with CREATE TABLE commands.
information_schema.columns has changed - interval_precision is no longer a cardinal_number but is now a different type, and the same is true of at least one other column. That’s going to mess up people’s SQL, I think.
stl_vacuum_detail now includes columns which - get
this - actually tell you WHICH table was vacuumed =-) WORK OF
GENIUS I SAY! :-) :-)