I originally documented this so I would have a reference while tracking down the problem. I have since solved the issue and you can jump to the bottom of this post to find the solution. What I do not know is why this was not an issue under PostgreSQL 10.6 but was under PostgreSQL 11.1.
How did the issue arise
FreshPorts deals with the FreeBSD package tree. One of the things it does is replicate the directory tree for the directories and files. Yesterday, I moved my development database from a PostgreSQL 10.6 server to a PostgreSQL 11.1 server. Now I am getting errors on commits which add a new file to the tree. I think I’ve narrowed it down to this trigger:
CREATE OR REPLACE FUNCTION element_pathname_insert() RETURNS TRIGGER AS $$ BEGIN INSERT INTO element_pathname (element_id, pathname) VALUES (NEW.id, element_pathname(NEW.id)); RETURN NEW; END $$ LANGUAGE 'plpgsql'; DROP TRIGGER IF EXISTS element_pathname_insert ON element; CREATE TRIGGER element_pathname_insert AFTER INSERT on element FOR EACH ROW EXECUTE PROCEDURE element_pathname_insert();
This trigger is invoked each time a new row is added to the element table.
This is where some background may be useful.
element
element is a self referential table, and implements a directory & file structure.
freshports.dev=# \d element Table "public.element" Column | Type | Collation | Nullable | Default ---------------------+--------------+-----------+----------+------------------------------------- id | integer | | not null | nextval('element_id_seq'::regclass) name | text | | not null | parent_id | integer | | | directory_file_flag | character(1) | | not null | status | character(1) | | not null | Indexes: "element_pkey" PRIMARY KEY, btree (id) "element_delete" btree (status) WHERE status = 'D'::bpchar "element_name" btree (name) "element_parent_id" btree (parent_id) Check constraints: "element_directory_file_flag" CHECK (directory_file_flag = 'F'::bpchar OR directory_file_flag = 'D'::bpchar) "element_status" CHECK (status = 'A'::bpchar OR status = 'D'::bpchar) Foreign-key constraints: "$1" FOREIGN KEY (parent_id) REFERENCES element(id) ON UPDATE RESTRICT ON DELETE CASCADE
For simplicity, the the Referenced by and Triggers definitions have been omitted.
Given a pathname, we can find the element id:
freshports.dev=# select Pathname_ID('/ports/head/sysutils/bacula-server/Makefile'); pathname_id ------------- 205195 (1 row)
Given an id, we can find the pathname:
freshports.dev=# select element_pathname(205195); element_pathname --------------------------------------------- /ports/head/sysutils/bacula-server/Makefile (1 row)
For historical reasons, element_pathname() is implemented with a WHILE … LOOP structure. I am sure it could be updated to use the element_pathname table.
element_pathname
The element_pathname table is a relatively recent addition, whereas element_pathname() dates back about 17 years. The table was added to make some of the more complex queries easier to construct.
The problem
The problem arises when a commit adds a new file to the tree. The code first checks to see if the pathname already exists in the tree:
freshports.dev=# select Pathname_ID('/base/releng/12.0/release/tools/gce.conf'); pathname_id ------------- (1 row) freshports.dev=#
It finds nothing, so it adds:
freshports.dev=# select Element_Add('/base/releng/12.0/release/tools/gce.conf', 'F'); ERROR: duplicate key value violates unique constraint "element_pathname_pathname" DETAIL: Key (pathname)=() already exists. CONTEXT: SQL statement "INSERT INTO element_pathname (element_id, pathname) VALUES (NEW.id, element_pathname(NEW.id))" PL/pgSQL function element_pathname_insert() line 3 at SQL statement SQL statement "insert into element(id, parent_id, directory_file_flag, name, status) values(element_id, element_parent_id, 'D', element_name, 'A')" PL/pgSQL function element_add(text,character) line 89 at SQL statement
This is a conflict in the element_pathname table.
I suspect the reason is that element_pathname(NEW.id) is returning an empty string because it cannot yet see the newly inserted value in the element table.
I have not supplied the contents of Element_Add() because it is about 100 lines of code.
This ’empty string’ theory is supported by finding this:
freshports.dev=# select * from element_pathname where pathname = ''; element_id | pathname ------------+---------- 910588 | (1 row)
What file is that?
freshports.dev=# select element_pathname(910588); element_pathname ----------------------------------------------- /base/head/bin/sh/tests/expansion/question2.0 (1 row)
Checking the logs, this new file was recently added, and I suspect, but have not confirmed, that it was the first addition to the tree after moving to PostgreSQL 11.1.
Sometimes it fails, but then it succeeds
I started doing some testing to see what happens and I created a test database for this very purpose. I tried a series of COMMIT INSERT ROLLBACK commands.
Let’s try that now:
[dan@pg03:~] $ psql FPtest psql (11.1) Type "help" for help. FPtest=# begin; BEGIN FPtest=# select Element_Add('/base/releng/12.0/release/tools/gce.conf', 'F'); ERROR: duplicate key value violates unique constraint "element_pathname_pathname" DETAIL: Key (pathname)=() already exists. CONTEXT: SQL statement "INSERT INTO element_pathname (element_id, pathname) VALUES (NEW.id, element_pathname(NEW.id))" PL/pgSQL function element_pathname_insert() line 3 at SQL statement SQL statement "insert into element(id, parent_id, directory_file_flag, name, status) values(element_id, element_parent_id, element_directory_file_flag, element_name, 'A')" PL/pgSQL function element_add(text,character) line 86 at SQL statement FPtest=# rollback; ROLLBACK
OK, let’s try it again, but with something I know will not exist.
FPtest=# begin; BEGIN FPtest=# select Element_Add('/DELETEME/1/2/3/4', 'F'); ERROR: duplicate key value violates unique constraint "element_pathname_pathname" DETAIL: Key (pathname)=() already exists. CONTEXT: SQL statement "INSERT INTO element_pathname (element_id, pathname) VALUES (NEW.id, element_pathname(NEW.id))" PL/pgSQL function element_pathname_insert() line 3 at SQL statement SQL statement "insert into element(id, parent_id, directory_file_flag, name, status) values(element_id, element_parent_id, 'D', element_name, 'A')" PL/pgSQL function element_add(text,character) line 89 at SQL statement FPtest=# rollback; ROLLBACK
Well, there is no reason for that to fail, yet it does. Let’s try this:
FPtest=# begin; BEGIN FPtest=# select Element_Add('/DELETEME/1', 'F'); element_add ------------- 910593 (1 row)
Woah, that worked. Let’s try the original, or close to it:
FPtest=# select Element_Add('/base/releng/12.0/release/tools/gce.conf', 'F'); element_add ------------- 910595 (1 row) FPtest=# rollback; ROLLBACK FPtest=# \q
That worked too. What’s up with that?
The IRC help
I got onto IRC and asked what might be up. I included this gist and it was RhodiumToad who pointed out that I must have two functions called element_pathname().
Have a look at line 5 of the stored element_pathname_insert() trigger in that gist (originally presented at the top of this post, but included here so you don’t have to scroll back):
CREATE OR REPLACE FUNCTION element_pathname_insert() RETURNS TRIGGER AS $$ BEGIN RAISE WARNING 'element_pathname_insert says: %', NEW.id; INSERT INTO element_pathname (element_id, pathname) VALUES (NEW.id, element_pathname(NEW.id)); RETURN NEW; END $$ LANGUAGE 'plpgsql';
You might notice that this version of the function has a NOTICE added to it. I was trying many things to track this down.
Looking at the output of df I did find the second function:
freshports.org=# \df Element_Pathname List of functions Schema | Name | Result data type | Argument data types | Type --------+------------------+------------------+---------------------+-------- public | element_pathname | text | integer | normal public | element_pathname | text | integer, boolean | normal (2 rows)
df+ showed me the differences, reproduced here, side by side to make it easier to spot the differences. I have added whitespace to make similar code line up.
CREATE FUNCTION public.element_pathname(integer) RETURNS text LANGUAGE plpgsql IMMUTABLE AS $_$ DECLARE element_id ALIAS FOR $1; my_parent_id int4; element_name text; pathname text; begin pathname = ''; select name, parent_id into element_name, my_parent_id from element where id = element_id; IF FOUND THEN pathname := '/' || element_name || pathname; WHILE FOUND LOOP select name, parent_id into element_name, my_parent_id from element where id = my_parent_id; IF FOUND THEN pathname = '/' || element_name || pathname; END IF; END LOOP; END IF; return pathname; END; $_$; |
CREATE FUNCTION public.element_pathname(integer, boolean) RETURNS text LANGUAGE plpgsql STABLE AS $_$ DECLARE element_id ALIAS FOR $1; leading_slash ALIAS for $2; my_parent_id int4; element_name text; pathname text; begin pathname = ''; select name, parent_id into element_name, my_parent_id from element where id = element_id; IF FOUND THEN pathname := '/' || element_name || pathname; WHILE FOUND LOOP select name, parent_id into element_name, my_parent_id from element where id = my_parent_id; IF FOUND THEN IF my_parent_id IS NULL THEN IF leading_slash THEN pathname = '/' || element_name || pathname; ELSE pathname = element_name || pathname; END IF; EXIT; ELSE pathname = '/' || element_name || pathname; END IF; END IF; END LOOP; END IF; return pathname; END; $_$; |
These two functions are very similar in both purpose and structure, the primary difference: having a leading slash or not.
SIDE NOTE: whenever I code this I always consider whether or not to just remove the leading slash at the end of the code instead of testing a condition each time through the loop. The current code is plenty fast.
What stood out when comparing the two functions?
When I looked at the two functions, I noticed the one being invoked by the trigger was declared to be IMMUTABLE, which is defined as:
An IMMUTABLE function cannot modify the database and is guaranteed to return the same results given the same arguments forever.
This function does not modify the database, but it could return different results if the path was modified. The path does not get modified by these functions. Any modification would be external to what we are doing here.
Changed to STABLE
I changed the function to STABLE and tried my tests again.
A STABLE function cannot modify the database and is guaranteed to return the same results given the same arguments for all rows within a single statement.
This is definitely true for this function.
Defined as STABLE, this problem went away.
Why 11.1?
I do not know why this function is OK under PostgreSQL 10.6, but causes a problem under 11.1 but I can speculate. I am also sure I will be corrected.
I suspect 11.1 is better at handling immutable functions and started doing optimizations which 10 did not.
Or you could take the view that it got worse.
Yes, I know the function should not have been immutable at all, but I would like to know the explanation behind the change in behavior.
Further function changes
Those two functions were similar, but can be combined. I reduced the element_pathname (integer) function to merely invoke element_pathname (integer, boolean):
CREATE FUNCTION public.element_pathname(integer) RETURNS text LANGUAGE plpgsql STABLE AS $_$ DECLARE element_id ALIAS FOR $1; BEGIN return Element_Pathname(element_id, TRUE); END; $_$;