A recent post on the FreeBSD Ports mailing list asked:
Hi,
I noticed a big difference between the number of ports on
freebsd.org/ports/ and on freshports.org. Currently, it’s 33348 vs.
41346.
The freebsd.org’s number equals roughly the number of lines of a current
INDEX, but how does FreshPorts count?
Best,
Moritz
In short, they are both wrong.
The FreeBSD value is based on INDEX, which includes flavors. The counts on the webpages under https://www.freebsd.org/ports/ will list some ports multiple times. See below for examples.
The FreshPorts total is wrong because it is including ports on branches.
The real number of ports is in the 28,800 range.
It is debatable whether py27-atspi and py37-atspi should be listed as separate ports. There are separate packages, yes, but they are both generated from one port: accessibility/py-atspi.
The rest of this post has background on how I reached these values.
Where is this FreshPorts count?
In the Statistics box on the right hand side of FreshPorts, you will see:

Statistics box saying Port Count 41418
Let’s see where this value comes from.
FreshPorts count
Everything in the Statistics box is generated by the backend via a periodic job. Let’s grep the code and find out where:
[dan@dev-ingress01:~/scripts] $ grep -r 'Calculated hourly' *
hourly_stats.pl: print FILE '<BR>Calculated hourly:<BR>';
If I look in there, I find: select Stats_PortCount()
Going to the sp.txt file, I find this stored procedure:
CREATE OR REPLACE FUNCTION Stats_PortCount() returns int8 AS $$
DECLARE
PortCount int8;
BEGIN
SELECT count(*)
INTO PortCount
FROM ports, element
WHERE element.status = 'A'
AND ports.element_id = element.id;
return PortCount;
END
$$ LANGUAGE 'plpgsql';
Let’s run that query:
freshports.org=# select * from Stats_PortCount();
stats_portcount
-----------------
41418
(1 row)
freshports.org=#
FreshPorts count with branches
I know why this values is so far from the FreeBSD count. Branches. Let’s look at this output where I start pulling back the port names:
freshports.org=# SELECT EP.pathname
freshports.org-# FROM ports P , element E, element_pathname EP
freshports.org-# WHERE E.status = 'A'
freshports.org-# AND P.element_id = E.id
freshports.org-# AND E.id = EP.element_id
freshports.org-# ORDER BY EP.pathname LIMIT 10;
pathname
--------------------------------------------------
/ports/branches/2016Q2/math/blitz++
/ports/branches/2016Q4/archivers/file-roller
/ports/branches/2016Q4/archivers/p7zip
/ports/branches/2016Q4/archivers/p7zip-codec-rar
/ports/branches/2016Q4/archivers/php56-bz2
/ports/branches/2016Q4/archivers/php56-phar
/ports/branches/2016Q4/archivers/php56-zip
/ports/branches/2016Q4/archivers/php56-zlib
/ports/branches/2016Q4/archivers/php70-bz2
/ports/branches/2016Q4/archivers/php70-phar
(10 rows)
freshports.org=#
FreshPorts count without branches
Let’s try the query and ignore branches.
freshports.org=# SELECT EP.pathname
freshports.org-# FROM ports P , element E, element_pathname EP
freshports.org-# WHERE E.status = 'A'
freshports.org-# AND P.element_id = E.id
freshports.org-# AND E.id = EP.element_id
freshports.org-# AND EP.pathname NOT LIKE '/ports/branches/%'
freshports.org-# ORDER BY EP.pathname desc
freshports.org-# LIMIT 10;
pathname
-----------------------------
/ports/head/x11/zenity
/ports/head/x11/yelp
/ports/head/x11/yeahconsole
/ports/head/x11/yalias
/ports/head/x11/yakuake
/ports/head/x11/yad
/ports/head/x11/xzoom
/ports/head/x11/xxkb
/ports/head/x11/xwud
/ports/head/x11/xwit
(10 rows)
freshports.org=#
That looks better.
Let’s get a count now.
freshports.org=# SELECT count(*)
freshports.org-# FROM ports P , element E, element_pathname EP
freshports.org-# WHERE E.status = 'A'
freshports.org-# AND P.element_id = E.id
freshports.org-# AND E.id = EP.element_id
freshports.org-# AND EP.pathname NOT LIKE '/ports/branches/%';
count
-------
28759
(1 row)
freshports.org=#
Well, that’s not great either.
That can’t be right
Let’s suspect the element_pathname table and remove it from the query. Instead, I will create the pathname based on a function:
freshports.org=#
freshports.org=# SELECT count(*) FROM (
freshports.org(# SELECT element_pathname(E.id) as pathname
freshports.org(# FROM ports P , element E
freshports.org(# WHERE E.status = 'A'
freshports.org(# AND P.element_id = E.id) AS tmp
freshports.org-# WHERE pathname NOT LIKE '/ports/branches/%';
count
-------
28759
(1 row)
freshports.org=#
That matches the count via the element_pathname table.
So it’s not that table skewing the results. What is it then?
Looking at category counts
Let’s compare https://www.freebsd.org/ports/categories-alpha.html with FreshPorts.
Let’s start with this query on the port_active table, which is actually a view of non-deleted ports.
FROM ports_active PA, element_pathname EP
WHERE PA.element_id = EP.element_id
AND EP.pathname NOT LIKE '/ports/branches/%' limit 10;
category | name | pathname
------------+----------------------------+--------------------------------------------
textproc | rubygem-raabro | /ports/head/textproc/rubygem-raabro
biology | pyfasta | /ports/head/biology/pyfasta
math | symmetrica | /ports/head/math/symmetrica
java | sigar | /ports/head/java/sigar
databases | phpmyadmin5 | /ports/head/databases/phpmyadmin5
devel | rubygem-rbtrace | /ports/head/devel/rubygem-rbtrace
x11 | xfce4-screenshooter-plugin | /ports/head/x11/xfce4-screenshooter-plugin
science | hdf5-18 | /ports/head/science/hdf5-18
lang | nhc98 | /ports/head/lang/nhc98
multimedia | xanim | /ports/head/multimedia/xanim
(10 rows)
Now, it’s get count by category.
freshports.dev=# SELECT PA.category, count(PA.name)
FROM ports_active PA, element_pathname EP
WHERE PA.element_id = EP.element_id
AND EP.pathname NOT LIKE '/ports/branches/%'
GROUP BY PA.category
ORDER BY PA.category;
category | count
---------------+-------
accessibility | 26
arabic | 8
archivers | 258
astro | 124
audio | 877
base | 1
benchmarks | 100
biology | 176
cad | 126
chinese | 106
comms | 213
converters | 178
databases | 1033
deskutils | 261
devel | 6875
dns | 238
editors | 263
emulators | 177
finance | 113
french | 14
ftp | 96
games | 1133
german | 21
graphics | 1128
hebrew | 7
hungarian | 7
irc | 114
japanese | 280
java | 122
korean | 39
lang | 364
mail | 709
math | 970
misc | 533
multimedia | 457
net | 1563
net-im | 176
net-mgmt | 404
net-p2p | 94
news | 67
polish | 14
ports-mgmt | 67
portuguese | 9
print | 256
russian | 32
science | 340
security | 1313
shells | 56
sysutils | 1538
textproc | 1896
ukrainian | 9
vietnamese | 16
www | 2358
x11 | 534
x11-clocks | 42
x11-drivers | 44
x11-fm | 30
x11-fonts | 250
x11-servers | 10
x11-themes | 145
x11-toolkits | 240
x11-wm | 119
(62 rows)
freshports.dev=#
Primary categories vs secondary categories
Remember that some categories are virtual, and do not appear on disk. The above count are only for primary categories, those which do appear on disk. For example, afterstep is not listed above, but you’ll find it in the FreeBSD list. The above SQL is for primary categories only. Virtual categories are covered in FreshPorts, but it’s not relevant to our search.
Also, a port exists on disk only within its primary category. There may be secondary categories, but the port should not be counted there as well. A port should only be counted once.
Picking on Hungarian
Let’s pick Hungarian, which has a small number of ports.
freshports.dev=# SELECT PA.category, PA.name, EP.pathname
freshports.dev-# FROM ports_active PA, element_pathname EP
freshports.dev-# WHERE PA.element_id = EP.element_id
freshports.dev-# AND EP.pathname NOT LIKE '/ports/branches/%'
freshports.dev-# AND PA.category = 'hungarian'
freshports.dev-# ORDER BY EP.pathname
freshports.dev-# LIMIT 10;
category | name | pathname
-----------+--------------------------+------------------------------------------------
hungarian | aspell | /ports/head/hungarian/aspell
hungarian | hunspell | /ports/head/hungarian/hunspell
hungarian | hyphen | /ports/head/hungarian/hyphen
hungarian | jdictionary-eng-hun | /ports/head/hungarian/jdictionary-eng-hun
hungarian | jdictionary-eng-hun-expr | /ports/head/hungarian/jdictionary-eng-hun-expr
hungarian | libreoffice | /ports/head/hungarian/libreoffice
hungarian | mythes | /ports/head/hungarian/mythes
(7 rows)
freshports.dev=#
Let’s compare that with what is on disk:
[dan@pkg01:~/ports/head/hungarian] $ ls -l
total 13
-rw-r--r-- 1 dan dan 332 May 5 2020 Makefile
-rw-r--r-- 1 dan dan 97 Oct 27 2019 Makefile.inc
drwxr-xr-x 2 dan dan 6 Oct 27 2019 aspell
drwxr-xr-x 2 dan dan 5 Oct 27 2019 hunspell
drwxr-xr-x 2 dan dan 5 Oct 27 2019 hyphen
drwxr-xr-x 2 dan dan 5 Oct 27 2019 jdictionary-eng-hun
drwxr-xr-x 2 dan dan 5 Oct 27 2019 jdictionary-eng-hun-expr
drwxr-xr-x 2 dan dan 4 Nov 12 15:09 libreoffice
drwxr-xr-x 2 dan dan 5 Oct 27 2019 mythes
[dan@pkg01:~/ports/head/hungarian] $ svn info
Don’t trust me. Look at subversion for ports/head/hungarian/
The FreshPorts count is correct. What is FreeBSD talking about then?
Comparing with https://www.freebsd.org/ports/hungarian.html, I see that FreeBSD is including:
Will this account for the differences? I don’t know.
The 33399 count listed at https://www.freebsd.org/ports/ (on 2020-11-18) seems close the value contained within INDEX-12 (33406).
The category totals at https://www.freebsd.org/ports/categories-grouped.html include ports listed in their secondary categories. This counts some ports more than once.
Looking at INDEX
Let’s look at INDEX-12:
[dan@pkg01:~/ports/head] $ make fetchindex
/usr/bin/env fetch -am -o /usr/home/dan/ports/head/INDEX-12.bz2 https://www.FreeBSD.org/ports/INDEX-12.bz2
/usr/home/dan/ports/head/INDEX-12.bz2 2315 kB 1436 kBps 02s
[dan@pkg01:~/ports/head] $
[dan@pkg01:~/ports/head] $ wc -l INDEX-12
33406 INDEX-12
[dan@pkg01:~/ports/head] $ grep -c jdictionary-ger-hun INDEX-12
1
OK, it’s only counted once within INDEX.
So far, we know why the port counts on the web pages differ.
Let’s pick a category which is not language related: x11-servers
This is what FreshPorts has:
freshports.org=# SELECT PA.category, PA.name, EP.pathname
FROM ports_active PA, element_pathname EP
WHERE PA.element_id = EP.element_id
AND EP.pathname NOT LIKE '/ports/branches/%'
AND PA.category = 'x11-servers'
ORDER BY EP.pathname;
category | name | pathname
-------------+-----------------+-----------------------------------------
x11-servers | Xfstt | /ports/head/x11-servers/Xfstt
x11-servers | x2vnc | /ports/head/x11-servers/x2vnc
x11-servers | x2x | /ports/head/x11-servers/x2x
x11-servers | xephyr | /ports/head/x11-servers/xephyr
x11-servers | xorg-dmx | /ports/head/x11-servers/xorg-dmx
x11-servers | xorg-nestserver | /ports/head/x11-servers/xorg-nestserver
x11-servers | xorg-server | /ports/head/x11-servers/xorg-server
x11-servers | xorg-vfbserver | /ports/head/x11-servers/xorg-vfbserver
x11-servers | xwayland | /ports/head/x11-servers/xwayland
x11-servers | xwayland-devel | /ports/head/x11-servers/xwayland-devel
(10 rows)
freshports.org=#
From disk:
[dan@pkg01:~/ports/head/x11-servers] $ ls -l
total 34
-rw-r--r-- 1 dan dan 375 Feb 14 2020 Makefile
drwxr-xr-x 3 dan dan 7 Sep 19 01:19 Xfstt
drwxr-xr-x 2 dan dan 5 Nov 9 2019 x2vnc
drwxr-xr-x 3 dan dan 6 Nov 9 2019 x2x
drwxr-xr-x 2 dan dan 4 Feb 25 2020 xephyr
drwxr-xr-x 2 dan dan 5 Feb 25 2020 xorg-dmx
drwxr-xr-x 2 dan dan 4 Feb 25 2020 xorg-nestserver
drwxr-xr-x 3 dan dan 8 Sep 19 01:19 xorg-server
drwxr-xr-x 2 dan dan 4 Feb 25 2020 xorg-vfbserver
drwxr-xr-x 2 dan dan 4 Oct 11 13:22 xwayland
drwxr-xr-x 2 dan dan 5 Nov 16 15:28 xwayland-devel
[dan@pkg01:~/ports/head/x11-servers] $
That matches.
Looking at https://www.freebsd.org/ports/x11-servers.html I find listings not found above:
- tigervnc-server net/tigervnc-server
- tigervnc-viewer net/tigervnc-viewer
- xorg-minima x11/xorg-minimal
Again, it is ports listed here, which are not actually in this category. Ports are being counted twice, at least in the web page.
This extracts the list of ports from INDEX:
[dan@pkg01:~/ports/head] $ cut -f 2 -d '|' INDEX-12 > ~/tmp/INDEX-12-list
[dan@pkg01:~/ports/head] $ head -4 ~/tmp/INDEX-12-list
/usr/ports/accessibility/accerciser
/usr/ports/accessibility/at-spi2-atk
/usr/ports/accessibility/at-spi2-core
/usr/ports/accessibility/atkmm
[dan@pkg01:~/ports/head] $
[dan@pkg01:~/ports/head] $ wc -l INDEX-12 ~/tmp/INDEX-12-list
33406 INDEX-12
33406 /usr/home/dan/tmp/INDEX-12-list
66812 total
The line count matches. Let’s get the same information out of FreshPorts, but this time, I’ll use production.
cat << EOF | psql -t freshports.org > INDEX.FreshPorts
SELECT '/usr/ports/' || PA.category || '/' || PA.name
FROM ports_active PA, element_pathname EP
WHERE PA.element_id = EP.element_id
AND EP.pathname NOT LIKE '/ports/branches/%'
ORDER BY 1;
EOF
We have 28759 entries there.
$ wc -l ~/INDEX.FreshPorts
28759 /usr/home/dan/INDEX.FreshPorts
That is far from the 33406 lines in INDEX-12.
Removing flavors from INDEX-12 list of ports
When I started comparing the output, I noticed that INDEX-12 listed accessibility/py-atspi twice. Why? Because of flavors. Here are the first two columns from INDEX-12:
py27-atspi-2.38.0|/usr/ports/accessibility/py-atspi
py37-atspi-2.38.0|/usr/ports/accessibility/py-atspi
Let’s remove duplicate lines from INDEX-12:
[dan@pkg01:~/ports/head] $ wc -l INDEX-12 ~/tmp/INDEX-12-list ~/tmp/INDEX-12-list-nodups
33406 INDEX-12
33406 /usr/home/dan/tmp/INDEX-12-list
28755 /usr/home/dan/tmp/INDEX-12-list-nodups
95567 total
[dan@pkg01:~/ports/head] $
That means 4651 lines relate directly to flavors.
That uniq output is much closer to the FreshPorts count of 28759. It is off by 4.
Comparing INDEX-12 and FreshPorts
Let’s do a diff.
All the + lines indicates a port included in FreshPorts, but not INDEX-12. I have annotated the output to indicate what my investigations found.
All the – lines indicate something not found on FreshPorts.
When you see DELETED, that means FreshPorts has marked this port was deleted.
[dan@pkg01:~/ports/head] $ diff -ruN ~/tmp/INDEX-12-list-nodups ~/INDEX.FreshPorts
--- /usr/home/dan/tmp/INDEX-12-list-nodups 2020-11-18 16:58:12.360853000 +0000
+++ /usr/home/dan/INDEX.FreshPorts 2020-11-18 17:49:49.321133000 +0000
@@ -1290,6 +1290,7 @@
/usr/ports/audio/zita-resampler
/usr/ports/audio/zrythm
/usr/ports/audio/zynaddsubfx
+/usr/ports/base/binutils NOT A PORT
/usr/ports/benchmarks/ali
/usr/ports/benchmarks/apib
/usr/ports/benchmarks/autobench
@@ -7405,7 +7406,6 @@
/usr/ports/devel/php80-sysvsem
/usr/ports/devel/php80-sysvshm
/usr/ports/devel/php80-tokenizer
-/usr/ports/devel/phpunit6 PORT MARKED AS DELETED
/usr/ports/devel/phpunit7
/usr/ports/devel/phpunit8
/usr/ports/devel/physfs
@@ -7703,6 +7703,7 @@
/usr/ports/devel/py-cachy
/usr/ports/devel/py-canonicaljson
/usr/ports/devel/py-capstone
+/usr/ports/devel/py-case NEWLY CREATED
/usr/ports/devel/py-castellan
/usr/ports/devel/py-castellan1
/usr/ports/devel/py-cbor
@@ -9873,6 +9874,7 @@
/usr/ports/devel/rubygem-rspec-support
/usr/ports/devel/rubygem-rspec_junit_formatter
/usr/ports/devel/rubygem-rubocop
+/usr/ports/devel/rubygem-rubocop-ast NOT IN INDEX-12
/usr/ports/devel/rubygem-ruby-atmos-pure
/usr/ports/devel/rubygem-ruby-bugzilla
/usr/ports/devel/rubygem-ruby-enum
@@ -14078,7 +14080,6 @@
/usr/ports/korean/hanyangfonts
/usr/ports/korean/hcode
/usr/ports/korean/hmconv
-/usr/ports/korean/hpscat PORT MARKED AS DELETED
/usr/ports/korean/hunspell
/usr/ports/korean/ibus-hangul
/usr/ports/korean/imhangul-gtk2
@@ -14439,6 +14440,7 @@
/usr/ports/lang/spidermonkey24
/usr/ports/lang/spidermonkey52
/usr/ports/lang/spidermonkey60
+/usr/ports/lang/spidermonkey68 NOT IN SUBVERSION
/usr/ports/lang/spidermonkey78
/usr/ports/lang/spl
/usr/ports/lang/squeak
@@ -14958,6 +14960,7 @@
/usr/ports/mail/py-dkimpy
/usr/ports/mail/py-email-validator
/usr/ports/mail/py-email_reply_parser
+/usr/ports/mail/py-flanker NOT IN INDEX-12
/usr/ports/mail/py-flask-mail
/usr/ports/mail/py-flufl.bounce
/usr/ports/mail/py-fuglu
[dan@pkg01:~/ports/head] $
Totals:
- NOT A PORT – base looks a category to FreshPorts so that is included
- PORT MARKED AS DELETED – FreshPorts thinks this port is deleted, but it is not
- NEWLY CREATED – this port was created today. INDEX-12 predates that
- NOT IN INDEX-12 – no idea why this is not included
- NOT IN SUBVERSION – this port is not listed in subversion.
Conclusion
FreshPorts has some errors, which I will look into.
The actual number of ports is wrong on both sites and the correct values is in the 28,800 range.