Jan 26 2019

Ever since the new server went into production, sometime in 2017, the notification emails, which tell you what has changed, have been coming from the old server. I never did that changeover. The old server was still processing incoming commits, so the notifications were accurate. However, new accounts would not get notifications, and changes to existing accounts' notification settings were ignored. The notifications were becoming increasingly out of step with your settings.

This is far from ideal and not good at all.

What I have to do now is copy over the notification statistics.

When did this change occur?

I was not sure when the new server went into production. I looked in old blog posts and on social media, but failed to find out.

I resorted to DNS, because every zone change is recorded in Subversion thanks to nsnotifyd (more information on how I did that in this post).

Finding that date took me a while. I was searching the logs for my name and failing, because every commit for the past 17 months has been made by the dns-notify user. Eventually I found it, and I can confidently say the changeover occurred at Sun Nov 5 16:53:23 2017 UTC.

This date will be used later in this post.

What needs to be moved over

The following is on the old server.

I wasn’t sure exactly what needed to be moved, but I knew the answer might be in the report-notification.pl scripts. Looking in there, I found this snippet:

                my $dbh = FreshPorts::Database::GetDBHandle();
                if ($dbh->{Active}) {

                        $sql = "select last_sent 
                                  from report_log_latest
                                 where frequency = '$Frequency'
                                   and report_id = $ReportID";
                        if ($FreshPorts::Config::ReportDebugging) {
                                print "sql is $sql\n";
                        }

                        $sth = $dbh->prepare($sql);
                        $sth->execute ||
                                        die "Could not execute SQL $sql ... maybe invalid";

                        @row=$sth->fetchrow_array;
                        my $last_sent = $row[0];

That’s it, I need entries from the report_log_latest table. What does that table look like?

freshports.org=# \d report_log_latest
                     View "public.report_log_latest"
    Column    |           Type           | Collation | Nullable | Default 
--------------+--------------------------+-----------+----------+---------
 report_id    | integer                  |           |          | 
 frequency_id | integer                  |           |          | 
 frequency    | character(1)             |           |          | 
 last_sent    | timestamp with time zone |           |          | 

freshports.org=# 
 

Oh, it’s a view, not a table. Let’s look closer:

freshports.org=# \d+ report_log_latest
                                  View "public.report_log_latest"
    Column    |           Type           | Collation | Nullable | Default | Storage  | Description 
--------------+--------------------------+-----------+----------+---------+----------+-------------
 report_id    | integer                  |           |          |         | plain    | 
 frequency_id | integer                  |           |          |         | plain    | 
 frequency    | character(1)             |           |          |         | extended | 
 last_sent    | timestamp with time zone |           |          |         | plain    | 
View definition:
 SELECT report_log.report_id,
    report_log.frequency_id,
    report_frequency.frequency,
    max(report_log.report_date) AS last_sent
   FROM report_log,
    report_frequency
  WHERE report_log.frequency_id = report_frequency.id
  GROUP BY report_log.report_id, report_log.frequency_id, report_frequency.frequency;

freshports.org=# 
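The view itself is simple: for every (report_id, frequency) pair it keeps only the newest report_date. Here is a minimal sketch of the same logic, using an in-memory SQLite database in place of PostgreSQL; the sample rows are made up for illustration:

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL schema; sample rows are made up.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE report_frequency (id INTEGER PRIMARY KEY, frequency TEXT);
CREATE TABLE report_log (id INTEGER PRIMARY KEY, report_id INTEGER,
                         frequency_id INTEGER, report_date TEXT);
INSERT INTO report_frequency VALUES (2, 'D'), (3, 'W');
INSERT INTO report_log (report_id, frequency_id, report_date) VALUES
  (1, 2, '2018-11-08 18:10:00'),
  (1, 2, '2018-11-09 18:10:00'),   -- a newer daily run for report 1
  (1, 3, '2018-11-06 19:00:00');
-- Same shape as the report_log_latest view: newest run per report/frequency.
CREATE VIEW report_log_latest AS
  SELECT report_log.report_id, report_log.frequency_id,
         report_frequency.frequency,
         MAX(report_log.report_date) AS last_sent
    FROM report_log, report_frequency
   WHERE report_log.frequency_id = report_frequency.id
   GROUP BY report_log.report_id, report_log.frequency_id,
            report_frequency.frequency;
""")
for row in db.execute("SELECT * FROM report_log_latest ORDER BY last_sent DESC"):
    print(row)
```

The two daily rows for report 1 collapse to the single newest run, exactly as the production view does.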

How long has it been since any reports were sent out?

freshports.org=# SELECT * FROM report_log_latest ORDER BY last_sent DESC;
 report_id | frequency_id | frequency |           last_sent           
-----------+--------------+-----------+-------------------------------
         2 |            4 | F         | 2018-11-09 19:20:00.456444+00
         1 |            4 | F         | 2018-11-09 19:20:00.434312+00
         2 |            2 | D         | 2018-11-09 18:10:00.431505+00
         1 |            2 | D         | 2018-11-09 18:10:00.411285+00
         5 |            2 | D         | 2018-11-09 18:05:00.407087+00
         1 |            3 | W         | 2018-11-06 19:00:01.348784+00
         2 |            3 | W         | 2018-11-06 19:00:00.497571+00
         1 |            5 | M         | 2018-10-23 19:40:00.677857+00
         2 |            5 | M         | 2018-10-23 19:40:00.426695+00
(9 rows)

freshports.org=# 

OK, that’s about 11 weeks ago.

Let’s look at production

This is from the production server. Yes, the database name is the same; I think the database on this server should be named freshports.prod. It helps that each database lives on a different server. At home, where they are all on one server, I have:

  1. freshports.dev
  2. freshports.stage
  3. freshports.test

On production, it has been 64 weeks since we took a copy of the old database and moved it to the new server:

freshports.org=# SELECT * FROM report_log_latest ORDER BY last_sent DESC;
 report_id | frequency_id | frequency |           last_sent           
-----------+--------------+-----------+-------------------------------
         1 |            2 | D         | 2017-11-04 18:10:00.440969+00
         2 |            2 | D         | 2017-11-04 18:10:00.425991+00
         5 |            2 | D         | 2017-11-04 18:05:00.453121+00
         2 |            3 | W         | 2017-10-31 19:00:00.802006+00
         1 |            3 | W         | 2017-10-31 19:00:00.740916+00
         1 |            5 | M         | 2017-10-23 19:40:00.540706+00
         2 |            5 | M         | 2017-10-23 19:40:00.535091+00
         1 |            4 | F         | 2017-10-23 19:20:00.614792+00
         2 |            4 | F         | 2017-10-23 19:20:00.592397+00
(9 rows)
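The 64-week figure is easy to verify: the newest report on production is dated 2017-11-04 and this post is dated 2019-01-26. A quick check in Python:

```python
from datetime import date

last_prod_report = date(2017, 11, 4)   # newest report_log entry on production
post_date = date(2019, 1, 26)          # date of this post
weeks = (post_date - last_prod_report).days / 7
print(weeks)  # 64.0
```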

To retain history, I would like to copy over everything the new server is missing.

The date from above, when the DNS changed, was Sun Nov 5 16:53:23 2017 UTC.

That corresponds nicely with the production data: the newest daily report on production is from Nov 4, the day before the DNS changeover. This increases confidence in what I am about to do.

What’s missing from prod?

On the old server, what is newer than what’s on prod?

freshports.org=# SELECT COUNT(*) FROM report_log WHERE report_date >= '2017-11-04 18:10:00.440969+00';
 count 
-------
  1072
(1 row)

Let’s check the boundary condition there, and see if I can safely use > without missing any data:

freshports.org=# SELECT * FROM report_log WHERE report_date >= '2017-11-04 18:10:00.440969+00' ORDER BY report_date ASC LIMIT 10;
  id   | report_id | frequency_id |          report_date          | email_count | commit_count | port_count 
-------+-----------+--------------+-------------------------------+-------------+--------------+------------
 19073 |         1 |            2 | 2017-11-04 18:10:00.440969+00 |         123 |          396 |        435
 19074 |         5 |            2 | 2018-01-05 18:06:15.429902+00 |           1 |            2 |         88
 19076 |         1 |            2 | 2018-01-05 18:11:15.513286+00 |         459 |        18287 |      18829
 19075 |         2 |            2 | 2018-01-05 18:11:15.515059+00 |         118 |            0 |      44840
 19077 |         5 |            2 | 2018-01-06 18:06:15.501673+00 |           0 |            0 |          0
 19079 |         1 |            2 | 2018-01-06 18:11:15.68547+00  |         153 |          323 |        354
 19078 |         2 |            2 | 2018-01-06 18:11:15.699523+00 |         118 |            0 |        118
 19080 |         5 |            2 | 2018-01-07 18:05:00.545592+00 |           0 |            0 |          0
 19082 |         1 |            2 | 2018-01-07 18:10:00.820374+00 |         114 |          203 |        246
 19081 |         2 |            2 | 2018-01-07 18:10:00.830676+00 |         118 |            0 |        118
(10 rows)

freshports.org=# 

Yes. The next-oldest entry is from 2018-01-05, so no other row shares the boundary timestamp, and I can avoid duplicates simply by changing >= to >.
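Here is a toy illustration of why that works, using in-memory rows modeled on the query output above: the microsecond-precision timestamp makes the boundary row unique, so > excludes exactly the one row production already has.

```python
# Toy illustration: microsecond-precision timestamps make the boundary row
# unique, so switching >= to > drops exactly the one duplicate row.
rows = [
    (19073, "2017-11-04 18:10:00.440969"),  # already on production
    (19074, "2018-01-05 18:06:15.429902"),
    (19076, "2018-01-05 18:11:15.513286"),
]
boundary = "2017-11-04 18:10:00.440969"

# ISO-formatted timestamps compare correctly as strings.
ge = [r for r in rows if r[1] >= boundary]  # includes the duplicate
gt = [r for r in rows if r[1] > boundary]   # duplicate excluded
print(len(ge), len(gt))  # 3 2
```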

I am confident I can just dump this query to a file, transfer it to the new server, and load it up without overwriting any data.

Let’s do that.

PostgreSQL Copy

Both the production and the old server are running PostgreSQL 11.1, but a version mismatch would not matter much when using the COPY command. With pg_dump, on the other hand, you should always use the pg_dump binary from the destination version.

Let’s do this!

freshports.org=# COPY (SELECT * FROM report_log WHERE report_date >= '2017-11-04 18:10:00.440969+00' ORDER BY report_date ASC) TO '/usr/home/dan/missing-reports.copy';
ERROR:  could not open file "/usr/home/dan/missing-reports.copy" for writing: Permission denied
HINT:  COPY TO instructs the PostgreSQL server process to write a file. You may want a client-side facility such as psql's \copy.
freshports.org=#

Ahh, PostgreSQL itself: the database server process cannot write to my home directory. I have sudo privileges, so I could get around this easily:

$ touch /usr/home/dan/missing-reports.copy
$ sudo chown postgres /usr/home/dan/missing-reports.copy

After running the COPY command again, we have:

$ head -2 missing-reports.copy 
19073	1	2	2017-11-04 18:10:00.440969+00	123	396	435
19074	5	2	2018-01-05 18:06:15.429902+00	1	2	88

Not everyone can do this; it only works because I am logged in on the PostgreSQL server itself and have sudo access.

Instead, let’s try the \copy command.

freshports.org=# \copy (SELECT * FROM report_log WHERE report_date >= '2017-11-04 18:10:00.440969+00' ORDER BY report_date ASC) TO '/usr/home/dan/missing-reports2.copy';
COPY 1072

What did we get?

$ head -2 missing-reports2.copy
19073	1	2	2017-11-04 18:10:00.440969+00	123	396	435
19074	5	2	2018-01-05 18:06:15.429902+00	1	2	88
$ diff -ruN missing-reports.copy missing-reports2.copy 
$

Yes, that looks right, and the empty diff confirms \copy produced the same data as the COPY command.

Loading up the \copy data

The file has been transferred from the old server to the production server.

Over on the production server, I have taken a backup.

freshports.org=# begin;
BEGIN
freshports.org=# \copy report_log FROM '~/missing-reports2.copy';
ERROR:  duplicate key value violates unique constraint "report_log_pkey"
DETAIL:  Key (id)=(19073) already exists.
CONTEXT:  COPY report_log, line 1
freshports.org=#

What?

Oh. Look back. I used >= instead of the > I had planned to use.

The proper thing to do would be to rerun the query with >, but I am confident that deleting that one line achieves the same result. I took a copy of the local file, then used vi to delete the first line, which contained the duplicate 19073 key.
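For anyone who would rather script that edit than reach for vi, here is a sketch in Python. It uses a two-line stand-in for the real file (the sample rows are copied from the head output above), with a safety check that the first line really is the duplicate before deleting it:

```python
# Recreate a two-line stand-in for the copy file, then drop the first line
# (the duplicate key 19073), mirroring the vi edit described in the post.
path = "missing-reports2.copy"
sample = (
    "19073\t1\t2\t2017-11-04 18:10:00.440969+00\t123\t396\t435\n"
    "19074\t5\t2\t2018-01-05 18:06:15.429902+00\t1\t2\t88\n"
)
with open(path, "w") as f:
    f.write(sample)

with open(path) as f:
    lines = f.readlines()
assert lines[0].startswith("19073\t")   # make sure we delete the right row
with open(path, "w") as f:
    f.writelines(lines[1:])

with open(path) as f:
    print(f.read(), end="")   # only the 19074 row remains
```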

Let’s try that again:

freshports.org=# begin;
BEGIN
freshports.org=# \copy report_log FROM '~/missing-reports2.copy';
COPY 1071
freshports.org=# 

That is the expected number of records, 1071.

Let’s check the latest sent dates:

freshports.org=# SELECT * FROM report_log_latest ORDER BY last_sent DESC;
 report_id | frequency_id | frequency |           last_sent           
-----------+--------------+-----------+-------------------------------
         2 |            4 | F         | 2018-11-09 19:20:00.456444+00
         1 |            4 | F         | 2018-11-09 19:20:00.434312+00
         2 |            2 | D         | 2018-11-09 18:10:00.431505+00
         1 |            2 | D         | 2018-11-09 18:10:00.411285+00
         5 |            2 | D         | 2018-11-09 18:05:00.407087+00
         1 |            3 | W         | 2018-11-06 19:00:01.348784+00
         2 |            3 | W         | 2018-11-06 19:00:00.497571+00
         1 |            5 | M         | 2018-10-23 19:40:00.677857+00
         2 |            5 | M         | 2018-10-23 19:40:00.426695+00
(9 rows)

Yes, that’s precisely what we had on the old server.

One final check:

freshports.org=# SELECT count(*) FROM report_log;
 count 
-------
 19863
(1 row)

Yes, that matches the old server (I have not shown that verification here).

Time to complete this:

freshports.org=# commit;
COMMIT
freshports.org=# 

Done.

Thank you for flying Air FreshPorts.
