Notes Regarding Views

Notes

Remember here that a view is an SQL statement, and the way views work, is that in a query which uses a view, the name of the view is replaced by the SQL which is the view; and the resultant SQL is then issued. You could have written that SQL yourself, directly; views are a purely mechanical, physical manipulation of the SQL of a query.

As such, views, and their columns, have much less information associated with them than table and table columns, as view columns are not actually on disk, but are the output from a query. So there are no concepts of encoding, constraints, etc.

A partial exception to this of course are materialized views.

In Redshift a materialized view is the combination of a normal view, and a normal table.

The normal view behaves just like any view, and the normal table of course behaves just like any table - and so it does have encodings and can be sorted and so on.

The page for materialized views shows information about both the view and the table. I have not yet updated the page for materialized view columns to show information about the table columns.

Issues with Late-Binding Views

Information about late-binding views does not come from a system table, but from a function, and as such, does not scale. You can have at most a few thousand late-binding views; any more, and the function to enumerate them will grind to a halt.

As such, late-binding views are presented separately, to keep them from messing up information about normal views.

However, they’re a bad idea anyway. If you drop a table or view which other views depend upon, you want to find out when you do it, when it’s there in front of you, not at some random point later on when you’ve entirely forgotten about it.

Issues with Materialized Views

Materialized views are never be used a part of a normal system.

Their only use is when the data and table design are broken, such that joins are issuing large or huge hash joins, which are going to disk, and so the system has ground to a halt. A materialized view acts essentially as a cache - it performs the join ahead of time and stores the result. This the acts to avoid queries issuing problematic joins. However, materialized views themselves are dumb. They have no idea about sorting, and the entire point of Redshift is sorting. It is the failure to use sorting correctly which is causing large or huge hash joins in the first place, and the cure, materialized views, still have no idea about sorting - all they are doing is removing one aspect of the failure to use sorting. They are not using sorting correctly (indeed, I believe they are normally fully unsorted - only auto-vacuum can sort them, and it almost never runs), and using sorting correctly is staggeringly efficient - sorting is the entire reason for using Redshift, for bearing the many and varied costs, constraints and restrictions that must be born for sorting to be used.

If sorting is not being used, then Redshift is always the wrong choice - and so you see materialized views do not change this; what they are doing is palliative care, moving you from a state of utter and abject failure to a state merely of failure. You are not now in any way, shape or form, actually using Redshift correctly, and so you are not getting the staggering efficiency provided by sorting, and you are still using the wrong database.