This post documents the database upgrade process.
Issues encountered during initial attempts
There are permission issues on the database. Not everything is owned by the postgres super-user. In fact, many of them are owned by dan – this database is 20+ years old.
To easily resolve the ownership issue, I take the database dump (named freshports.org.dump) from production (the hostname is supernews) and load it into a development database using the –no-owner option on the pg_restore command. More on that later. The –no-owner option ensures that the restore process does not “set ownership of objects to match the original database“.
The permission issue also extends to functions. If I look at the production database, I see this:
freshports.org=# \df+ armor List of functions Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Source code | Description --------+-------+------------------+---------------------+------+------------+----------+-------+----------+-------------------+----------+-------------+------------- public | armor | text | bytea | func | immutable | unsafe | dan | invoker | | c | pg_armor | (1 row) freshports.org=#
That permission issue is also fixed by the –no-owner option. After importing in the dev database, that function looks like this:
freshports.prod.test.owners=# \df+ armor List of functions Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Source code | Description --------+-------+------------------+---------------------+------+------------+----------+----------+----------+-------------------+----------+-------------+------------- public | armor | text | bytea | func | immutable | unsafe | postgres | invoker | | c | pg_armor | (1 row) freshports.prod.test.owners=#
The current procedure
This section outlines the steps to convert the existing production database and copy it over to Amazon’s RDS.
On the production server:
pg_dump -Fc freshports.org -f freshports.org.dump
Load into dev
Load that into the development server:
sudo su -l postgres createdb -O postgres -T template0 -E SQL_ASCII freshports.prod pg_restore -j 20 --no-owner -d freshports.prod freshports.org.dump
Doing this as the postgres use ensures optimal permissions.
Estimated time: 25 minutes.
Adjust the database schema
Once we had the database loaded into the dev environment, I ran this script to update the schema to be git-ready.
-- repo table needs to know which repository tool we are using ALTER TABLE public.repo ADD COLUMN repository text NOT NULL DEFAULT 'subversion'; COMMENT ON COLUMN public.repo.repository IS 'subversion? git? cvs?'; -- Let's rename the svn_hostname column: ALTER TABLE public.repo RENAME svn_hostname TO repo_hostname; -- We don't need these unique indexes. We might do something with constraints later. drop index repo_description; drop index repo_name; drop index repo_path_to_repo; -- Add in the git data: insert into repo (name, description, repo_hostname, path_to_repo, repository) values ('ports', 'The FreeBSD Ports tree', 'github.com', '/freebsd/freebsd-ports', 'git'); insert into repo (name, description, repo_hostname, path_to_repo, repository) values ('doc', 'The FreeBSD doc tree', 'github.com', '/freebsd/doc', 'git'); insert into repo (name, description, repo_hostname, path_to_repo, repository) values ('src', 'The FreeBSD src tree', 'github.com', '/freebsd/freebsd', 'git'); -- Also, the short hash. ALTER TABLE public.commit_log ADD COLUMN commit_hash_short text; COMMENT ON COLUMN public.commit_log. commit_hash_short IS 'This is the short version of the git hash stored in svn_revision. If null/empty, it is not a git hash.'; -- we need an index on the short stuff -- Index: commit_log_commit_hash_short -- DROP INDEX public.commit_log_commit_hash_short; CREATE INDEX commit_log_commit_hash_short ON public.commit_log USING btree (commit_hash_short COLLATE pg_catalog."default" ASC NULLS LAST) TABLESPACE pg_default;
This fixes up a lingering permissions issue – the permissions in question were too generous.
select grantee, privilege_type from information_schema.role_table_grants where table_name='pg_am'; select grantee, privilege_type from information_schema.role_table_grants where table_name='pg_authid';
If rsyncer is listed, then this:
revoke all PRIVILEGES ON all tables in schema pg_catalog from rsyncer;
Extensions made location
These steps are mentioned because it came up in response to these errors:
pg_restore: from TOC entry 351; 1255 81083 FUNCTION armor(bytea) dan pg_restore: error: could not execute query: ERROR: permission denied for language c
Why is that? This is an old database, and the permissions issues I mentioned above.
This was the recommended fix. The –no-owner option is insufficient to fix this problem.
begin; create extension pgcrypto from unpackaged; create extension fuzzystrmatch from unpackaged; create extension plperl from unpackaged;
If all good, also do a commit.
datatypes, relational integrity, and functions
With the table changes, we have updates to datatypes, triggers, and functions (stored procedures). These steps update those:
psql freshports.prod begin; \i datatype.txt \i ri.txt \i sp.txt commit;
NOTICES during the datatype.txt command are expected. There should be no errors.
That is the last step.
Dump the fixed schema
Now we dump this new schema and all the data:
pg_dump -Fc freshports.prod > freshports.prod.new-schema.dump
Estimated time: 8 minutes
Copy to AWS
The dump was copied to AWS and loaded into the RDS instance.
scp freshports.prod.new-schema.dump ec2-user@aws:
Load fixed schema
On AWS, create and load the database:
createdb -h pg01.[redacted].us-east-1.rds.amazonaws.com -U postgres -E SQL_ASCII -T template0 --locale=C freshports.fixed time pg_restore -h pg01.[redacted].us-east-1.rds.amazonaws.com -U postgres -d freshports.fixed freshports.prod.new-schema.dump
Estimate time: 35 minutes
These errors on loading is expected and does not affect the outcome.
pg_restore: while PROCESSING TOC: pg_restore: from TOC entry 465; 1255 847666 FUNCTION plpgsql_call_handler() postgres pg_restore: error: could not execute query: ERROR: permission denied for language c Command was: CREATE FUNCTION public.plpgsql_call_handler() RETURNS language_handler LANGUAGE c AS '$libdir/plpgsql', 'plpgsql_call_handler'; pg_restore: error: could not execute query: ERROR: function public.plpgsql_call_handler() does not exist Command was: ALTER FUNCTION public.plpgsql_call_handler() OWNER TO postgres; pg_restore: warning: errors ignored on restore: 2
Using the new database
Now that the new database is over on AWS, you can modifying the configuration files to start using it.
The list of configuration files which need to be updated are covered by Moving devgit.freshports.org from GitHub to git.FreeBSD.org