The website has been slow for some time. I think it is because of increased disk-io.
In this post:
- FreeBSD 12.0
- PostgreSQL 11.5
- LibreNMS does the graphs via SNMP
The diskio overall
By increased, I mean it has gone from about 0.1k to about 1.6k, peaking at 140-150. This is steady and persistent.
This graph shows the jump which occurred earlier this year.
My eyes can’t tell for sure which two drives are involved, but going to the graphs for individual drives, it is clearly ada3 and ada2.
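If you'd rather confirm this from the command line than by squinting at graphs, something like this shows per-drive activity on FreeBSD (a quick sketch, not part of the original troubleshooting; the regex just matches ada2 and ada3):

# live per-drive GEOM statistics, filtered to ada2 and ada3
$ gstat -f 'ada[23]$'

# or the same idea with extended iostat output every 5 seconds
$ iostat -x -w 5 ada2 ada3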
ada2
This is for ada2:
ada3
And the same period for ada3:
zroot
zpool status confirms this is the zroot pool.
$ zpool status
  pool: main_tank
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0 in 0 days 02:43:45 with 0 errors on Wed Dec 18 06:32:53 2019
config:

	NAME        STATE     READ WRITE CKSUM
	main_tank   ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    ada0p1  ONLINE       0     0     0
	    ada1p1  ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0 in 0 days 00:04:40 with 0 errors on Thu Dec 19 03:10:19 2019
config:

	NAME        STATE     READ WRITE CKSUM
	zroot       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    ada2p3  ONLINE       0     0     0
	    ada3p3  ONLINE       0     0     0

errors: No known data errors
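zpool iostat can break the activity down by vdev, which lines up with the per-drive graphs (not part of the original investigation, just a convenient cross-check):

# per-vdev operations and bandwidth for zroot, refreshed every 5 seconds
$ zpool iostat -v zroot 5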
The filesystems
What’s on there?
[dan@x8dtu:~] $ zfs list -r zroot
NAME                                    USED  AVAIL  REFER  MOUNTPOINT
zroot                                  26.8G  78.8G    96K  /zroot
zroot/ROOT                             5.92G  78.8G    96K  none
zroot/ROOT/default                     5.92G  78.8G  4.19G  /
zroot/TESTING                           616K  78.8G    88K  /zroot/TESTING
zroot/TESTING/top                       528K  78.8G    88K  /zroot/TESTING/top
zroot/TESTING/top/test1                  88K  78.8G    88K  /zroot/TESTING/top/test1
zroot/TESTING/top/test2                  88K  78.8G    88K  /zroot/TESTING/top/test2
zroot/TESTING/top/test3                  88K  78.8G    88K  /zroot/TESTING/top/test3
zroot/TESTING/top/test4                  88K  78.8G    88K  /zroot/TESTING/top/test4
zroot/TESTING/top/test5                  88K  78.8G    88K  /zroot/TESTING/top/test5
zroot/data                             20.8G  78.8G    88K  /zroot/data
zroot/data/dan-pg                        88K  78.8G    88K  /usr/home/dan/pg
zroot/data/freshports                  4.51G  78.8G    88K  none
zroot/data/freshports/repo             4.51G  78.8G    88K  none
zroot/data/freshports/repo/PORTS-head  4.51G  78.8G  4.51G  /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db/repos/PORTS-head
zroot/data/ingress01-testing            675M  78.8G   675M  /iocage/jails/x8dtu-ingress01/root/usr/home/dan/tmp-fast
zroot/data/pg01-postgres               15.6G  78.8G  15.6G  /iocage/jails/x8dtu-pg01/root/var/db/postgres
zroot/data/pg02-postgres               9.25M  78.8G  9.25M  /iocage/jails/x8dtu-pg02/root/var/db/postgres
zroot/tmp                               120K  78.8G   120K  /tmp
zroot/usr                               192K  78.8G    96K  /usr
zroot/usr/src                            96K  78.8G    96K  /usr/src
zroot/var                              9.77M  78.8G    96K  /var
zroot/var/audit                          96K  78.8G    96K  /var/audit
zroot/var/crash                          96K  78.8G    96K  /var/crash
zroot/var/log                          8.48M  78.8G  8.48M  /var/log
zroot/var/mail                          120K  78.8G   120K  /var/mail
zroot/var/tmp                           912K  78.8G   912K  /var/tmp
[dan@x8dtu:~] $
What is likely to have the traffic?
- zroot/data/freshports/repo/PORTS-head – that’s the main FreeBSD ports repo
- zroot/data/pg01-postgres – the PostgreSQL database server (pg02 is another database server, but not used)
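To see which processes are actually generating that traffic, top has an I/O mode (a quick sketch, not something I needed here because the graphs already pointed at the database):

# per-process disk I/O, sorted by total transactions
$ top -m io -o total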
PostgreSQL backends
The average number of PostgreSQL backends has jumped from about 1 to 3. This is the PostgreSQL pg01 jail.
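LibreNMS graphs this via SNMP, but the same number can be pulled straight from PostgreSQL (a minimal sketch; run inside the pg01 jail, user and defaults assumed):

# count the current backends, grouped by state
$ psql -U postgres -c "SELECT state, count(*) FROM pg_stat_activity GROUP BY state;"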
What changed?
Two things come to mind:
- caching – I added caching on disk for some of the web pages. If anything, this should reduce the disk io.
- recordsize – I changed some parameters on both zroot/data/pg01-postgres and zroot/data/pg02-postgres
Looking back at command history, I see:
[dan@x8dtu:~] $ history | grep recordsize
     7  zfs get recordsize,redundant_metadata,primarycache,logbias zroot/data/pg01-postgres
     8  zfs get recordsize,redundant_metadata,primarycache,logbias zroot/data/pg01-postgres zroot/data/pg02-postgres
     9  sudo zfs set recordsize=8k redundant_metadata=most primarycache=metadata logbias=throughput zroot/data/pg01-postgres zroot/data/pg02-postgres
    10  zfs get recordsize,redundant_metadata,primarycache,logbias zroot/data/pg01-postgres zroot/data/pg02-postgres
   496  history | grep recordsize
These are the current settings for pg01, the current production database server:
[dan@x8dtu:~] $ zfs get recordsize,redundant_metadata,primarycache,logbias zroot/data/pg01-postgres
NAME                      PROPERTY            VALUE       SOURCE
zroot/data/pg01-postgres  recordsize          8K          local
zroot/data/pg01-postgres  redundant_metadata  most        local
zroot/data/pg01-postgres  primarycache        metadata    local
zroot/data/pg01-postgres  logbias             throughput  local
[dan@x8dtu:~] $
Looking at the parent, these are the likely settings before the change:
[dan@x8dtu:~] $ zfs get recordsize,redundant_metadata,primarycache,logbias zroot/data
NAME        PROPERTY            VALUE    SOURCE
zroot/data  recordsize          128K     default
zroot/data  redundant_metadata  all      default
zroot/data  primarycache        all      default
zroot/data  logbias             latency  default
[dan@x8dtu:~] $
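The parent and its children can also be compared in one go by letting zfs get recurse (a small convenience, not from the original session):

# show these properties for zroot/data and everything beneath it
$ zfs get -r recordsize,redundant_metadata,primarycache,logbias zroot/data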
What changed?
- recordsize – from 128K to 8K
- redundant_metadata – from all to most
- primarycache – from all to metadata
- logbias – from latency to throughput
top
An extract from top shows:
last pid: 43619;  load averages:  1.01,  1.39,  1.48    up 34+18:05:39  15:19:57
195 processes: 9 running, 186 sleeping
CPU:  1.9% user,  0.0% nice,  6.5% system,  0.0% interrupt, 91.6% idle
Mem: 338M Active, 720M Inact, 102G Wired, 85G Free
ARC: 62G Total, 48G MFU, 8620M MRU, 3603K Anon, 759M Header, 5407M Other
     47G Compressed, 117G Uncompressed, 2.47:1 Ratio
Swap: 4096M Total, 4096M Free
Lots of free RAM (85G).
Plenty of ARC in use (62G).
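The same ARC numbers are available via sysctl if you want them outside of top (kstat names as found on FreeBSD 12 with ZFS):

# current ARC size, target size, and maximum, in bytes
$ sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c kstat.zfs.misc.arcstats.c_max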
Theories
Could the change in recordsize have caused this issue? Records written before the change are still 128K, while new writes are 8K.
If yes, then shutting down the database and using zfs send | zfs recv to copy the entire dataset to a new dataset would make everything 8K.
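A rough sketch of that idea, assuming the receive side really does rewrite the records at the new recordsize (worth verifying first; a file-level copy into a dataset already set to 8K would also do it). The -new and -old names are placeholders:

# with PostgreSQL in the jail stopped:
$ sudo zfs snapshot zroot/data/pg01-postgres@rewrite
$ sudo zfs send zroot/data/pg01-postgres@rewrite | sudo zfs recv zroot/data/pg01-postgres-new
# once happy with the copy, swap the datasets and restart PostgreSQL
# (the mountpoint and other local properties would need to be set on the new dataset)
$ sudo zfs rename zroot/data/pg01-postgres zroot/data/pg01-postgres-old
$ sudo zfs rename zroot/data/pg01-postgres-new zroot/data/pg01-postgres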
Could the jump in the number of backend workers be because queries now take longer, so requests from the websites now overlap?
If the cause is one or a combination of redundant_metadata, primarycache, and logbias, changing them back and observing might show a difference.
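Reverting those three is straightforward if it comes to that; zfs inherit drops the local values back to the defaults inherited from zroot/data (a sketch, not something I have run):

# clear the locally set values so the dataset inherits from its parent again
$ sudo zfs inherit redundant_metadata zroot/data/pg01-postgres
$ sudo zfs inherit primarycache zroot/data/pg01-postgres
$ sudo zfs inherit logbias zroot/data/pg01-postgres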
Setting changes
The following commentary is based on an IRC chat with Allan Jude.
primarycache=metadata means that ZFS will cache only the ZFS metadata, not any PostgreSQL database data. The idea behind this configuration is to have no ZFS data in the ARC, thereby avoiding both PostgreSQL and ZFS caching the same data twice. However, we might be better off letting ZFS cache this data, since the dataset is only 15.6GB and it will all fit into ARC compressed.
logbias=throughput tells ZFS to optimize for throughput; synchronous writes will not go to the ZIL, which will make commits to the database take longer. Setting this back to latency is probably better for a database.
At about Fri Dec 20 15:38 UTC 2019, I ran:

zfs set primarycache=all zroot/data/pg01-postgres
To speed up the caching, I ran this command:
[dan@x8dtu:~] $ sudo zfs snapshot zroot/data/pg01-postgres@for.caching
[dan@x8dtu:~] $ sudo zfs send zroot/data/pg01-postgres@for.caching > /dev/null
[dan@x8dtu:~] $ sudo zfs destroy zroot/data/pg01-postgres@for.caching
[dan@x8dtu:~] $
That will read the entire dataset into the ARC.
PostgreSQL database settings
This is what PostgreSQL is working with:
shared_buffers = 128MB
That is spread rather thinly for a 15.6GB database.
A wise person recently said: the key to database performance with ZFS is you need exactly one of [database|filesystem] caching the database. 0 of them caching or both of them caching is bad. In years past I’ve had better luck letting the database cache than ZFS caching. But that was before compressed ARC.
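For reference, checking and raising shared_buffers is quick; the 4GB below is purely illustrative, not a value I settled on:

# from inside the pg01 jail
$ psql -U postgres -c "SHOW shared_buffers;"
# needs a restart to take effect
$ psql -U postgres -c "ALTER SYSTEM SET shared_buffers = '4GB';"
$ service postgresql restart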
20 minutes later
Now the graph is trending down. This is the past 6 hours.
Nearly two hours later
Yes, I declare this fixed. The overall diskio has dropped back to what it was before the change.
The number of PostgreSQL backends has also dropped.
28 hours later
I waited until the next day to include this graph. It clearly shows how the IO has dropped back to the same level as before. This graph covers a two-month period.
Thanks
Sorry it took so long to get to this. Glad I had Allan and Josh to help out.