Redshift Research Project

Max Ganz's Redshift Workbench

Index

Introduction
Videos and Sample Screenshots
Value Proposition
Pricing
Payment
Getting Redshift Workbench
Using Redshift Workbench
Security
Support
Long Term Future
Warranty
Documentation
Contact Me

Introduction

Redshift Workbench is an Amazon Redshift specific management web-app, distributed as an AMI.

Redshift Workbench is not cloud based; you run it locally.

It requires no external network access, no superuser, holds no privileges (so no SELECT privilege) on your tables, and you have full access to the instance running the AMI, allowing security checks and updates.

Videos and Sample Screenshots

Starting Up

Tour #1

Tour #2

Open the videos in a new tab, to view them full size.

(Note video codec/container support across browser/platform is problematic; I have provided VP9/WebM and h264/mp4, which should maybe might possibly could cover all the bases, unless it doesn't. Please contact me if the videos are not working for you.)

Views (materialized) v1

Views (normal) v1

Value Proposition

The difference between a correctly and an incorrectly operated Redshift cluster is an actual cubic kilometer of money.
Redshift operated correctly is staggeringly efficient, and so needs very little hardware for a given load.
Redshift operated incorrectly is mind-bendingly inefficient, and so needs extensive - and so expensive - hardware for a given load.

When I write here about efficiency, I mean both in terms of query performance and disk space use.

AWS Support and TAMs, in my experience, do not understand Redshift, and their advice when performance or disk space are lacking is always the same : buy more nodes.

I have never, not once, seen or heard of AWS coming to a client and saying : do [xyz] to make your cluster more efficient, so you need fewer nodes.

Support and the TAMs do not understand Redshift well enough to be able to tell you what to do, and there's an obvious financial conflict of interest for AWS in doing so.

In this matter, you have to find your own way forward.

To correctly operate a Redshift cluster you need three things;

A use case where it is possible for Redshift to operate correctly.
Your cluster admin, developers, and users all understand Redshift.
To be able to observe the state of the cluster.

This entire site is about providing understanding of Redshift.

Redshift Workbench provides observability, and paying for it is how you fund my work so you can develop understanding, and also obtain that observability.

Pricing

1% of your monthly Redshift spend, capped at 100 USD per month.

(Note, I don't like percentage based fees for services which actually are a fixed cost; the reason I have one here is because 100 USD - the fixed price cost - is too much for a small cluster, say 360 USD a month; I have to charge less than 100 USD when the cluster is small.)
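
As a worked example of the pricing rule above, here is a minimal sketch in Python. The function name and the rounding to the nearest cent are assumptions made for illustration; the page itself only states the 1% rate and the 100 USD cap.

```python
from decimal import Decimal, ROUND_HALF_UP

FEE_RATE = Decimal("0.01")  # 1% of monthly Redshift spend
FEE_CAP = Decimal("100")    # capped at 100 USD per month


def monthly_fee(monthly_redshift_spend_usd):
    """Return the monthly fee in USD.

    Rounding to the nearest cent is an assumption for this sketch,
    not something the pricing statement specifies.
    """
    spend = Decimal(str(monthly_redshift_spend_usd))
    if spend < 0:
        raise ValueError("spend cannot be negative")
    fee = min(spend * FEE_RATE, FEE_CAP)
    return fee.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(monthly_fee(360))    # the small cluster from the note above: 3.60
print(monthly_fee(10000))  # the spend at which the cap is reached: 100.00
print(monthly_fee(50000))  # above the cap: 100.00
```

So the cap only comes into play once Redshift spend reaches 10,000 USD a month; below that, the fee scales with the cluster.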

I trust you to be honest and let me know when there's a cluster change large enough to be worth taking into account.

Payment

Payment can be made in many currencies by local (international is not required) bank transfer.
The FinTech bank account I'm using accepts AUD, CAD, EUR, GBP, HUF, NZD, RON, SGD and USD, as local bank transfers.
If that's not viable, please contact me and we'll work something out.

I looked at payment processors (so I could take bank cards on the site) and they all looked to be expensive, bureaucratic, with a non-negligible rate of false alarms which freeze accounts, and with significant customer support issues. It was not an appealing value proposition.

Payments are made to my UK company.

When you want to stop using Redshift Workbench, you simply stop paying.

Getting Redshift Workbench

I have to be careful with distribution, because Redshift Workbench, to be as useful as it can be, has to be PII safe, which means no network access, and completely open to the end-user, for security checks and updates, which permits arbitrary end-user modifications.

That means I have no way to know, or control, whether or not Redshift Workbench is being used. You could take a copy, tell me you're not using it, and use it forever. I can't know.

Accordingly, to get Redshift Workbench;

You contact me, and we have a bit of a chat; I find out about you and your company, you meet me, we decide the initial duration of the period needed to get Redshift Workbench into production, the initial duration of the trial period (typically a month), and if you decide to keep using Redshift Workbench after the trial period how much to pay per month.
I share the Redshift Workbench AMI with your AWS account, in us-east-1.
You copy that AMI within us-east-1; you now have a copy which you own, in your AWS account, and which you can copy into whatever regions you wish.
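
The copy step can be done in the console, but as a rough sketch it can also be scripted. The helper below is hypothetical (its name and arguments are not part of Redshift Workbench); it assumes an EC2 client object with a copy_image method, such as the one boto3 provides, created in the destination region. The AMI ID and KMS key shown in the usage comment are placeholders.

```python
def copy_ami(ec2_client, source_image_id, source_region, name, kms_key_id=None):
    """Copy an AMI into the region the given EC2 client is bound to.

    ec2_client is assumed to behave like boto3.client("ec2", region_name=...);
    the copy is always created in the client's own region.
    """
    params = {
        "Name": name,
        "SourceImageId": source_image_id,
        "SourceRegion": source_region,
    }
    if kms_key_id is not None:
        # Encrypt the copied snapshots with your own KMS key.
        params["Encrypted"] = True
        params["KmsKeyId"] = kms_key_id
    response = ec2_client.copy_image(**params)
    return response["ImageId"]


# Usage (needs boto3 and AWS credentials; all IDs are placeholders):
#
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   own_copy = copy_ami(ec2, "ami-PLACEHOLDER", "us-east-1", "redshift-workbench")
#
#   ec2_eu = boto3.client("ec2", region_name="eu-west-1")
#   copy_ami(ec2_eu, own_copy, "us-east-1", "redshift-workbench")
```

The first call makes your own copy in us-east-1; the second copies that copy into another region, which is only possible once you own it.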

Usage of Redshift Workbench is unlimited - any number of AWS accounts, regions, AMI instances, Redshift clusters, etc.

The only constraints are that you cannot give it to other entities, and you cannot use the source code to produce or improve a competing product.

There are two builds of the AMI, one for arm64 and one for x86_64. Both are shared with you.

The latest AMI is always shared with your AWS account, so you can take it at any time. There's an announcements mailing list you can join, if you wish, which will let you know when a new version is released.

If, while the war is on, you want to start paying from the start, rather than say six months down the line once you're into production, that would be fantastic. The whole point of all of this is to get weapons into the hands of the Ukrainian military, and the war is on now.

Using Redshift Workbench

Instantiate the AMI, specifying your own KMS key, which will encrypt the disk.
Point a browser at port 443 of the AMI instance, which will load the Redshift Workbench home page.
Follow (which means copy'n'paste) the SQL commands on the home page (which create a Redshift user and grant one or two privileges).
Specify the Redshift cluster IP, port, database, username and password in the connect form, and hit connect - and you're in. That's it.

Redshift Workbench knows nothing about AWS or AWS accounts, or regions, or anything like that. It needs a Redshift user in the cluster, and it needs to know about the cluster (IP, port, database, user and password), that's all.
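
If the connect form fails, the first thing to rule out is plain network reachability from the instance to the cluster. A minimal sketch of such a check, run from the Redshift Workbench instance, follows; the function name is made up for this example, and the address and port in the usage comment are placeholders (5439 is the Redshift default port, but your cluster may use another).

```python
import socket


def can_reach(host, port, timeout_seconds=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This only tests network reachability; it says nothing about whether
    the database, username or password are correct.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Usage (placeholder cluster address):
#
#   can_reach("10.0.9.17", 5439)
```

A False result points at security groups, routing or the cluster address rather than at Redshift Workbench or the Redshift user.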

You can connect concurrently to any number of Redshift clusters, and any number of users can concurrently connect to the same Redshift cluster.

The only reasons to bear the EC2 costs of running more than one Redshift Workbench instance are to reduce latency and data transfer costs by keeping instances near or in the region(s) the Redshift cluster(s) are in - but these can be good reasons.

A single core, 2 GB memory instance (about 20 USD/month) should be fine for a single user connecting to a single cluster at a time.

Redshift Workbench stores nothing on local disk, so the 8GB minimum of a GP3 volume is plenty of space.

Security

Redshift Workbench is designed to be PII and regulatory safe - so for medical, financial, defence, you name it.

What this actually means is;

Redshift Workbench runs locally; exfiltration is impossible as there is no external network access at all.
Redshift Workbench must be able to access the Redshift cluster IP:port, and you must be able to access Redshift Workbench (on 443). That's it. Apart from that, firewall the EC2 instance completely.
No superuser in Redshift; a normal user is used.
The normal user needs read access on pg_catalog, and to be able to write temp tables; that's it.
There are no privileges, at all, on your actual tables; these tables cannot be read by Redshift Workbench.
The EC2 instance disk is encrypted with your KMS key.
You can SSH in the usual way (AWS inserts your key into the instance when it is created), run your own security software, and keep the instance up to date with security patches.
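
The network rules in the list above (browser to port 443, instance to cluster IP:port, nothing else) can be sketched as security group rules. The helper below is hypothetical and only builds the rule descriptions, in the IpPermissions shape used by the EC2 API; the CIDR ranges and port in the usage lines are placeholders. It leaves out SSH, which you would add separately if you want it.

```python
import ipaddress


def build_firewall_rules(browser_cidr, cluster_cidr, cluster_port):
    """Build inbound and outbound rules for a Redshift Workbench instance.

    Inbound: HTTPS (443) from the network your browsers are on.
    Outbound: the Redshift cluster IP:port only.
    """
    # Raises ValueError if either argument is not a valid CIDR range.
    ipaddress.ip_network(browser_cidr)
    ipaddress.ip_network(cluster_cidr)

    ingress = [{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": browser_cidr,
                      "Description": "browsers to Redshift Workbench"}],
    }]
    egress = [{
        "IpProtocol": "tcp",
        "FromPort": cluster_port,
        "ToPort": cluster_port,
        "IpRanges": [{"CidrIp": cluster_cidr,
                      "Description": "Redshift Workbench to cluster"}],
    }]
    return {"ingress": ingress, "egress": egress}


rules = build_firewall_rules("10.0.5.0/24", "10.0.9.17/32", 5439)
print(rules["ingress"][0]["FromPort"], rules["egress"][0]["FromPort"])
```

Note that a newly created security group normally carries a default allow-all outbound rule, so restricting egress to the cluster means removing that rule as well as adding the one above.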

The full security design document is here, and should make the most hawk-eyed SecOps happy.

Support

I'm currently in the EU.

Long Term Future

I am never going to sell ownership of Redshift Workbench. In my experience, whenever a company or product is bought out, it is necessarily to someone looking to make a profit from it, and then it all goes downhill.

When the time comes for me to move on - and I have no plans to do so, and expect this to be many years away - Redshift Workbench will either be given to a successor, someone with profound knowledge of Redshift, or if no successor can be found, it will be made open source.

Any successor will be bound also to hand on Redshift Workbench in the same way.

Your investment of time and effort in getting to know the product, and integrating it into your own processes and systems, will not be lost.

Warranty

I need to describe clearly the current state of Redshift Workbench, so you know what you're getting - how well tested and so on. Redshift Workbench consists of essentially two components - the underlying SQL which produces information about the Redshift system, and the front-end.

The underlying SQL has been developed over the last five years, and has been in constant use, but by me only. I have written a test framework for the SQL, which generates a fairly comprehensive fake database, but I need to populate it with actual tests, and as it is these tests can only validate static information - tables, users, privileges. The framework needs to be enhanced to allow testing of dynamic behaviour, by performing actions and checking to see those actions turn up in the queries.

For the front-end, I have written a site walker (nicer name than spider!), which walks the site. This detects any non-200 pages, any pages with error reports, and has tests to check if a page is malformed. The site walker currently does not test any of the data or page controls, such as excluding the rdsdb user, but it does let me know that all pages are displaying successfully.
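
To illustrate the kind of check a site walker performs, here is a minimal sketch. It is not the actual Redshift Workbench site walker; the names, the injected fetch function, and the "Error" marker used to spot error reports are all assumptions made for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def walk_site(start_url, fetch, error_marker="Error"):
    """Walk every same-site page reachable from start_url.

    fetch(url) must return (status_code, html_text). Returns a list of
    (url, problem) pairs for non-200 pages and pages containing
    error_marker (a hypothetical stand-in for an error report).
    """
    site = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    problems = []
    while queue:
        url = queue.popleft()
        status, body = fetch(url)
        if status != 200:
            problems.append((url, f"status {status}"))
            continue
        if error_marker in body:
            problems.append((url, "error report on page"))
        collector = LinkCollector()
        collector.feed(body)
        for href in collector.links:
            # Resolve relative links and drop #fragments before comparing.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).netloc == site and target not in seen:
                seen.add(target)
                queue.append(target)
    return problems
```

Passing fetch in as a function keeps the walker itself free of network code, so it can be exercised against a dictionary of fake pages as easily as against a running instance.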

All in all, I expect there to be bugs, but after constant use over many years I do not expect serious bugs. As with all software, though, until there is a thorough test suite, a well-proved expectation of correctness cannot be provided. However, I am happy and confident to say the SQL and the front-end are both good enough to put to use, and as they are informational only - they only read data, and then only from the system tables - there is no risk of Redshift Workbench harming the cluster.

Documentation

I always fully document my work.

Each Redshift Workbench page comes with its own documentation page, which describes the page and every column in the data table for the page.

This iframe shows the documentation root page, so you can also view the documentation directly here.

Contact Me

I'm available during EU hours on IRC.
You can email me directly.

Also, here's a contact form;

You'll receive a confirmation email. I'm in the EU, and if you send when I'm awake, you'll likely get a response pretty quickly - few hours, tops.

Note all and any information you enter is used and only used to reply to you. You are not added to mailing lists, aggregated, sold on to other companies, or anything like that.
