=encoding UTF-8 =head1 Name sqitchtutorial-vertica - A tutorial introduction to Sqitch change management on Vertica =head1 Synopsis sqitch * =head1 Description This tutorial explains how to create a sqitch-enabled Vertica project, use a VCS for deployment planning, and work with other developers to make sure changes remain in sync and in the proper order. We'll start by creating a new project from scratch, a fictional antisocial networking site called Flipr. All examples use L as the VCS and L as the storage engine, but for the most part you can substitute other VCSes and database engines in the examples as appropriate. If you'd like to manage a database with a different database engine, see one of the following other tutorials: =over =item L =item L =item L =item L =item L =item L =item L =item L =back =head2 Connection Configuration Sqitch requires ODBC to connect to the Vertica database. As such, you'll need to make sure that the Vertica ODBC driver is properly configured. At its simplest, on Unix-like systems, name the driver "Vertica" by adding this entry to C (usually found in C, C, or C): [Vertica] Description = ODBC for Vertica Driver = /opt/vertica/lib64/libverticaodbc.so And also creating a C file in the same directory that contains: [Driver] DriverManagerEncoding=UTF-16 ODBCInstLib=/usr/lib64/libodbcinst.so ErrorMessagesPath=/opt/vertica/lib64 You might also consider naming your database connection by putting an entry in C (same directory), like so (assuming that Vertica is running on your local host): [dbadmin] Description = Vertica dbadmin connection Driver = Vertica Database = dbadmin Servername = localhost UserName = dbadmin Password = password Port = 5433 Locale = en_US See the L for details. Specific links: =over =item * L =item * L)|https://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/ConnectingToHPVertica/ClientODBC/AdditionalODBCDriverConfigurationSettings.htm> =item * L =item * L =back =head1 Starting a New Project Usually the first thing to do when starting a new project is to create a source code repository. So let's do that with Git: > mkdir flipr > cd flipr > git init . Initialized empty Git repository in /flipr/.git/ > touch README.md > git add . > git commit -am 'Initialize project, add README.' If you're a Git user and want to follow along the history, the repository used in these examples is L. Now that we have a repository, let's get started with Sqitch. Every Sqitch project must have a name associated with it, and, optionally, a unique URI. We recommend including the URI, as it increases the uniqueness of object identifiers internally, and will prevent the deployment of a different project with the same name. So let's specify one when we initialize Sqitch: > sqitch init flipr --uri https://github.com/sqitchers/sqitch-vertica-intro/ --engine vertica Created sqitch.conf Created sqitch.plan Created deploy/ Created revert/ Created verify/ Let's have a look at F: > cat sqitch.conf [core] engine = vertica # plan_file = sqitch.plan # top_dir = . # [engine "vertica"] # target = db:vertica: # registry = sqitch # client = vsql Good, it picked up on the fact that we're creating changes for the Vertica engine, thanks to the C<--engine vertica> option, and saved it to the file. Furthermore, it wrote a commented-out C<[engine "vertica"]> section with all the available Vertica engine-specific settings commented out and ready to be edited as appropriate. By default, Sqitch will read F in the current directory for settings. But it will also read F<~/.sqitch/sqitch.conf> for user-specific settings. Since Vertica's C client is not in the path on my system, let's go ahead an tell it where to find the client on our computer (don't bother if you're using the L because it uses the client inside the container, not on your host machine): > sqitch config --user engine.vertica.client /opt/vertica/bin/vsql And let's also tell it who we are, since this data will be used in all of our projects: > sqitch config --user user.name 'Marge N. O’Vera' > sqitch config --user user.email 'marge@example.com' Have a look at F<~/.sqitch/sqitch.conf> and you'll see this: > cat ~/.sqitch/sqitch.conf [engine "vertica"] client = /opt/vertica/bin/vsql [user] name = Marge N. O’Vera email = marge@example.com Which means that Sqitch should be able to find C for any project, and that it will always properly identify us when planning and committing changes. Back to the repository. Have a look at the plan file, F: > cat sqitch.plan %syntax-version=1.0.0 %project=flipr %uri=https://github.com/sqitchers/sqitch-vertica-intro/ Note that it has picked up on the name and URI of the app we're building. Sqitch uses this data to manage cross-project dependencies. The C<%syntax-version> pragma is always set by Sqitch, so that it always knows how to parse the plan, even if the format changes in the future. Let's commit these changes and start creating the database changes. > git add . > git commit -am 'Initialize Sqitch configuration.' [main a42564d] Initialize Sqitch configuration. 2 files changed, 16 insertions(+), 0 deletions(-) create mode 100644 sqitch.conf create mode 100644 sqitch.plan =head1 Our First Change First, our project will need a schema. This creates a nice namespace for all of the objects that will be part of the flipr app. Run this command: > sqitch add appschema -n 'Add schema for all flipr objects.' Created deploy/appschema.sql Created revert/appschema.sql Created verify/appschema.sql Added "appschema" to sqitch.plan The L|sqitch-add> command adds a database change to the plan and writes deploy, revert, and verify scripts that represent the change. Now we edit these files. The C script's job is to create the schema. So we add this to F: CREATE SCHEMA flipr; The C script's job is to precisely revert the change to the deploy script, so we add this to F: DROP SCHEMA flipr; Now we can try deploying this change. We tell Sqitch where to send the change via a L, assuming the default C database and user and an ODBC driver named C (see L for details). If you want to first L, simply use its name in place of C: > sqitch deploy 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' Adding registry tables to db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica Deploying changes to db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica + appschema .. ok First Sqitch created registry tables used to track database changes. The structure and name of the registry varies between databases (Vertica uses a schema to namespace its registry, while SQLite and MySQL use separate databases). Next, Sqitch deploys changes. We only have one so far; the C<+> reinforces the idea that the change is being C to the database. With this change deployed, if you connect to the database, you'll be able to see the schema: > vsql -U dbadmin -c '\dn flipr' List of schemas Name | Owner | Comment -------+---------+--------- flipr | dbadmin | =head2 Trust, But Verify But that's too much work. Do you really want to do something like that after every deploy? Here's where the C script comes in. Its job is to test that the deploy did was it was supposed to. It should do so without regard to any data that might be in the database, and should throw an error if the deploy was not successful. In Vertica, the simplest way to do so for schema is probably to simply create an object in the schema. Put this SQL into F: CREATE TABLE flipr.verify__ (id int); DROP TABLE flipr.verify__; In truth, you can use I query that generates an SQL error if the schema doesn't exist. Another handy way to do that is to divide by zero if an object doesn't exist. For example, to throw an error when the C schema does not exist, you could do something like this: SELECT 1/COUNT(*) FROM v_catalog.schemata WHERE schema_name = 'flipr'; Either way, run the C script with the L|sqitch-verify> command: > sqitch verify 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' Verifying db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica * appschema .. ok Verify successful Looks good! If you want to make sure that the verify script correctly dies if the schema doesn't exist, temporarily change the schema name in the script to something that doesn't exist, something like: CREATE TABLE nonesuch.verify__ (id int); Then L|sqitch-verify> again: > sqitch verify 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' Verifying db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica * appschema .. vsql:verify/appschema.sql:5: ROLLBACK 4650: Schema "nonesuch" does not exist # Verify script "verify/appschema.sql" failed. not ok Verify Summary Report --------------------- Changes: 1 Errors: 1 Verify failed It's even nice enough to tell us what the problem is. Or, for the divide-by-zero example, change the schema name: SELECT 1/COUNT(*) FROM v_catalog.schemata WHERE schema_name = 'nonesuch'; Then the verify will look something like: > sqitch verify 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' Verifying db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica * appschema .. vsql:verify/appschema.sql:5: ERROR 2005: division by zero # Verify script "verify/appschema.sql" failed. not ok Verify Summary Report --------------------- Changes: 1 Errors: 1 Verify failed Less useful error output, but enough to alert us that something has gone wrong. Don't forget to change the schema name back before continuing! =head2 Status, Revert, Log, Repeat For purely informational purposes, we can always see how a deployment was recorded via the L|sqitch-status> command, which reads the registry tables from the database: > sqitch status 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' # On database db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica # Project: flipr # Change: f9759f0ed77964b6a3b6c7aa3b6058b4bb7db764 # Name: appschema # Deployed: 2014-09-04 15:26:28 -0700 # By: Marge N. O’Vera # Nothing to deploy (up-to-date) Let's make sure that we can revert the change: > sqitch revert 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' Revert all changes from db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica? [Yes] - appschema .. ok The L|sqitch-revert> command first prompts to make sure that we really do want to revert. This is to prevent unnecessary accidents. You can pass the C<-y> option to disable the prompt. Also, notice the C<-> before the change name in the output, which reinforces that the change is being I from the database. And now the schema should be gone: > vsql -U dbadmin -c '\dn flipr' List of schemas Name | Owner | Comment ------+-------+--------- (0 rows) And the status message should reflect as much: > sqitch status 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' # On database db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica No changes deployed Of course, since nothing is deployed, the L|sqitch-verify> command has nothing to verify: > sqitch verify 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' Verifying db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica No changes deployed However, we still have a record that the change happened, visible via the L|sqitch-log> command: > sqitch log 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' On database db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica Revert f9759f0ed77964b6a3b6c7aa3b6058b4bb7db764 Name: appschema Committer: Marge N. O’Vera Date: 2014-09-04 16:33:02 -0700 Add schema for all flipr objects. Deploy f9759f0ed77964b6a3b6c7aa3b6058b4bb7db764 Name: appschema Committer: Marge N. O’Vera Date: 2014-09-04 15:26:28 -0700 Add schema for all flipr objects. Note that the actions we took are shown in reverse chronological order, with the revert first and then the deploy. Cool. Now let's commit it. > git add . > git commit -m 'Add flipr schema.' [main 9bee4bd] Add flipr schema. 5 files changed, 197 insertions(+), 0 deletions(-) create mode 100644 deploy/appschema.sql create mode 100644 revert/appschema.sql create mode 100644 sqitch.sql create mode 100644 verify/appschema.sql And then deploy again. This time, let's use the C<--verify> option, so that the C script is applied when the change is deployed: > sqitch deploy --verify 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' Deploying changes to db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica + appschema .. ok And now the schema should be back: > vsql -U dbadmin -c '\dn flipr' List of schemas Name | Owner | Comment -------+---------+--------- flipr | dbadmin | When we look at the status, the deployment will be there: > sqitch status 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' # On database db:vertica://dbadmin:@localhost:5433/dbadmin?Driver=Vertica # Project: flipr # Change: f9759f0ed77964b6a3b6c7aa3b6058b4bb7db764 # Name: appschema # Deployed: 2014-09-04 16:37:38 -0700 # By: Marge N. O’Vera # Nothing to deploy (up-to-date) =head1 On Target I'm getting a little tired of always having to type C<'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica'>, aren't you? This L tells Sqitch how to connect to the deployment target, but we don't have to keep using the URI. We can name the target: > sqitch target add flipr_test 'db:vertica://dbadmin:password@localhost:5433/dbadmin?Driver=Vertica' The L|sqitch-target> command, inspired by L|https://git-scm.com/docs/git-remote>, allows management of one or more named deployment targets. We've just added a target named C, which means we can use the string C for the target, rather than the URI. But since we're doing so much testing, we can also tell Sqitch to deploy to the C target by default: > sqitch engine add vertica flipr_test Now we can omit the target argument altogether, unless we need to deploy to another database. Which we will, eventually, but at least our examples will be simpler from here on in, e.g.: > sqitch status # On database flipr_test # Project: flipr # Change: f9759f0ed77964b6a3b6c7aa3b6058b4bb7db764 # Name: appschema # Deployed: 2014-09-04 16:37:38 -0700 # By: Marge N. O’Vera # Nothing to deploy (up-to-date) Yay, that allows things to be a little more concise. Let's also make sure that changes are verified after deploying them: > sqitch config --bool deploy.verify true > sqitch config --bool rebase.verify true We'll see the L|sqitch-rebase> command a bit later. In the meantime, let's commit the new configuration and and make some more changes! > git commit -am 'Set default deployment target and always verify.' [main 469779a] Set default deployment target and always verify. 1 files changed, 8 insertions(+), 0 deletions(-) =head1 Deploy with Dependency Let's add another change, this time to create a table. Our app will need users, of course, so we'll create a table for them. First, add the new change: > sqitch add users --requires appschema -n 'Creates table to track our users.' Created deploy/users.sql Created revert/users.sql Created verify/users.sql Added "users [appschema]" to sqitch.plan Note that we're requiring the C change as a dependency of the new C change. Although that change has already been added to the plan and therefore should always be applied before the C change, it's a good idea to be explicit about dependencies. Now edit the scripts. When you're done, F should look like this: -- Deploy flipr:users to vertica -- requires: appschema CREATE TABLE flipr.users ( nickname VARCHAR PRIMARY KEY, password VARCHAR NOT NULL, fullname VARCHAR(256) NOT NULL, twitter VARCHAR NOT NULL, timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW() ); A few things to notice here. On the second line, the dependence on the C change has been listed. This doesn't do anything, but the default C Vertica template lists it here for your reference while editing the file. Useful, right? The table itself will be created in the C schema. This is why we need to require the C change. Now for the verify script. The simplest way to check that the table was created and has the expected columns without touching the data? Just select from the table with a false C clause. Add this to F: SELECT nickname, password, fullname, twitter, timestamp FROM flipr.users WHERE FALSE; Now for the revert script: all we have to do is drop the table. Add this to F: DROP TABLE flipr.users; Couldn't be much simpler, right? Let's deploy this bad boy: > sqitch deploy Deploying changes to flipr_test + users .. ok We know, since verification is enabled, that the table must have been created. But for the purposes of visibility, let's have a quick look: > vsql -U dbadmin -c '\d flipr.users' List of Fields by Tables Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key --------+-------+-------------+-------------+------+---------+----------+-------------+------------- flipr | users | nickname | varchar(80) | 80 | | t | t | flipr | users | password | varchar(80) | 80 | | t | f | flipr | users | "timestamp" | timestamptz | 8 | now() | t | f | We can also verify all currently deployed changes with the L|sqitch-verify> command: > sqitch verify Verifying flipr_test * appschema .. ok * users ...... ok Verify successful Now have a look at the status: > sqitch status # On database flipr_test # Project: flipr # Change: d647ac8c130a7e0b12c9049789e46afb4a4f6e53 # Name: users # Deployed: 2014-09-04 16:42:45 -0700 # By: Marge N. O’Vera # Nothing to deploy (up-to-date) Success! Let's make sure we can revert the change, as well: > sqitch revert --to @HEAD^ -y Reverting changes to appschema from flipr_test - users .. ok Note that we've used the C<--to> option to specify the change to revert to. And what do we revert to? The symbolic tag C<@HEAD>, when passed to L|sqitch-revert>, always refers to the last change deployed to the database. (For other commands, it refers to the last change in the plan.) Appending the caret (C<^>) tells Sqitch to select the change I to the last deployed change. So we revert to C, the penultimate change. The other potentially useful symbolic tag is C<@ROOT>, which refers to the first change deployed to the database (or in the plan, depending on the command). Back to the database. The C table should be gone but the C schema should still be around: > vsql -U dbadmin -c '\d flipr.users' Did not find any relation. The L|sqitch-status> command politely informs us that we have undeployed changes: > sqitch status # On database flipr_test # Project: flipr # Change: f9759f0ed77964b6a3b6c7aa3b6058b4bb7db764 # Name: appschema # Deployed: 2014-09-04 16:37:38 -0700 # By: Marge N. O’Vera # Undeployed change: * users As does the L|sqitch-verify> command: > sqitch verify Verifying flipr_test * appschema .. ok Undeployed change: * users Verify successful Note that the verify is successful, because all currently-deployed changes are verified. The list of undeployed changes (just "users" here) reminds us about the current state. Okay, let's commit and deploy again: > git add . > git commit -am 'Add users table.' [main c7c24c5] Add users table. 4 files changed, 18 insertions(+), 0 deletions(-) create mode 100644 deploy/users.sql create mode 100644 revert/users.sql create mode 100644 verify/users.sql Deploying changes to flipr_test + users .. ok Looks good. Check the status: > sqitch status # On database flipr_test # Project: flipr # Change: d647ac8c130a7e0b12c9049789e46afb4a4f6e53 # Name: users # Deployed: 2014-09-04 17:42:53 -0700 # By: Marge N. O’Vera # Nothing to deploy (up-to-date) Excellent. Let's do some more! =head1 Add Two at Once Let's add a couple more changes. Our app will need to store status messages from users. Let's call them -- and the table to store them -- "flips". And we'll also need a view that lists user names with their flips. Let's add changes for them both: > sqitch add flips -r appschema -r users -n 'Adds table for storing flips.' Created deploy/flips.sql Created revert/flips.sql Created verify/flips.sql Added "flips [appschema users]" to sqitch.plan > sqitch add userflips -r appschema -r users -r flips \ -n 'Creates the userflips view.' Created deploy/userflips.sql Created revert/userflips.sql Created verify/userflips.sql Added "userflips [appschema users flips]" to sqitch.plan Now might be a good time to have a look at the deployment plan: > cat sqitch.plan %syntax-version=1.0.0 %project=flipr %uri=https://github.com/sqitchers/sqitch-vertica-intro/ appschema 2014-09-04T18:40:34Z Marge N. O’Vera # Add schema for all flipr objects. users [appschema] 2014-09-04T23:40:15Z Marge N. O’Vera # Creates table to track our users. flips [appschema users] 2014-09-05T00:16:58Z Marge N. O’Vera # Adds table for storing flips. userflips [appschema users flips] 2014-09-05T00:18:43Z Marge N. O’Vera # Creates the userflips view. Each change appears on a single line with the name of the change, a bracketed list of dependencies, a timestamp, the name and email address of the user who planned the change, and a note. Let's write the code for the new changes. Here's what F should look like: -- Deploy flipr:flips to vertica -- requires: appschema -- requires: users CREATE TABLE flipr.flips ( id AUTO_INCREMENT PRIMARY KEY , nickname VARCHAR NOT NULL REFERENCES flipr.users(nickname), body VARCHAR(180) NOT NULL DEFAULT '', timestamp TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp() ); Here's what F might look like: -- Verify flipr:flips on vertica SELECT id, nickname, body, timestamp FROM flipr.flips WHERE FALSE; We simply take advantage of the fact that C throws an exception if the specified function does not exist. And F should look something like this: -- Revert flipr:flips from vertica DROP TABLE flipr.flips; Now for C; F might look like this: -- Deploy flipr:userflips to vertica -- requires: appschema -- requires: users -- requires: flips CREATE OR REPLACE VIEW flipr.userflips AS SELECT f.id, u.nickname, u.fullname, f.body, f.timestamp FROM flipr.users u JOIN flipr.flips f ON u.nickname = f.nickname; Use a C