Eric Radman : a Journal

Performance Tuning a NetBSD Server

Unusable Defaults

In recent years the NetBSD developers have made consistent improvements in performance even while adding some cool features like systrace. In comparison to OpenBSD there is no contest, but the sizing in the default kernels is beyond inadequate if you have more than 256MB of RAM.

All of the modifications I'll discuss should be made by duplicating GENERIC.MP in /usr/src/sys/arch/*/conf/.

MAXUSERS, NPROC, and MAXFILES

The first option is not an option at all, but a parameter that is used to set the sizing of a number of other parameters:

maxusers        64

If you're including another config file such as GENERIC you'll first have to open that file and comment out the maxusers line so that you can set it. On i386 and amd64 the default is 32. This is not as low as it sounds because it's a number that is used to size two other parameters in the kernel source:

/usr/src/sys/sys/param.h:
   #define  NPROC    (20 + 16 * MAXUSERS)

/usr/src/sys/conf/param.c:
   #define  MAXFILES (3 * (NPROC + MAXUSERS) + 80)

So on the default i386 kernel, 532 (20 + 16 * 32) is the hard limit for the maximum number of processes (NPROC) that can be started. Apparently this formula is part of traditional UNIX history. A setting of 64, which gives a limit of 1044, is usually plenty because even on a busy mail server there are rarely more than 400 processes, BUT remember that this is a hard limit, not the soft limit for unprivileged users.

$ sysctl proc.curproc.rlimit.maxproc
proc.curproc.rlimit.maxproc.soft = 160
proc.curproc.rlimit.maxproc.hard = 532

MAXFILES is the limit on the number of files a process is allowed to have open at any given point. So the default for i386 is 1772 (3 * (532 + 32) + 80). Again, this is the hard limit, and has no effect on the default soft limits.

$ sysctl proc.curproc.rlimit.descriptors
proc.curproc.rlimit.descriptors.soft = 64
proc.curproc.rlimit.descriptors.hard = 1772
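
To see what a given maxusers value will produce before rebuilding a kernel, the two formulas above can be evaluated in the shell. This is only a sketch of the arithmetic; nproc_for and maxfiles_for are helper names made up for this example, not NetBSD tools.

```sh
#!/bin/sh
# Sketch: reproduce the kernel's sizing arithmetic for a given maxusers value.
# nproc_for and maxfiles_for are illustrative names, not part of NetBSD.

nproc_for() {
    # NPROC = 20 + 16 * MAXUSERS
    echo $((20 + 16 * $1))
}

maxfiles_for() {
    # MAXFILES = 3 * (NPROC + MAXUSERS) + 80
    nproc=$(nproc_for "$1")
    echo $((3 * (nproc + $1) + 80))
}

for mu in 32 64 128; do
    echo "maxusers=$mu NPROC=$(nproc_for "$mu") MAXFILES=$(maxfiles_for "$mu")"
done
```

With maxusers set to 32 this reproduces the 532 and 1772 shown in the sysctl output above, and 64 gives 1044 and 3404.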

CHILD_MAX and OPEN_MAX

You can find the current limits with ulimit:

$ ulimit -H -n
1772
$ ulimit -S -n
64

NetBSD doesn't seem to include a convenient mechanism for setting ulimit parameters for each rc.d service so it may be easier to just build the new kernel with some new parameters:

/usr/src/sys/sys/syslimits.h:
    #define CHILD_MAX    160
    #define OPEN_MAX     64 

Raising maxusers bumped up the hard limit from 1772, but each process can still only open 64 descriptors at once. It turns out that this is a problem for processes that use a lot of pipes, like Dovecot:

# lsof -p `head -1 /var/run/dovecot/master.pid` | wc -l
      65

It's totally up to you, but I often set these limits to four times their default values.
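
If rebuilding the kernel is not practical, one alternative is a small wrapper that raises the soft limit in the shell that starts the service. The following is a rough sketch, not something NetBSD ships: raise_nofile is a made-up helper name, and the Dovecot path in the usage comment is an assumption that will differ by install.

```sh
#!/bin/sh
# Sketch: raise the soft descriptor limit for one service before starting it.
# raise_nofile is an illustrative helper, not a NetBSD facility.

raise_nofile() {
    want=$1
    hard=$(ulimit -H -n)
    # The soft limit can never exceed the hard limit, so clamp the request.
    if [ "$hard" != "unlimited" ] && [ "$want" -gt "$hard" ]; then
        want=$hard
    fi
    ulimit -S -n "$want"
}

# Hypothetical usage: 256 is four times the default soft limit of 64.
# The daemon path below is an assumption; adjust it for your system.
# raise_nofile 256 && exec /usr/pkg/sbin/dovecot
```

The limit is inherited by child processes, so the service must be started from the same shell that called the helper.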

Memory File System

It's not commonly talked about, but all of the mainstream BSDs support a memory file system, or MFS. If used correctly it can be an awesome boost to performance. The most common use might be to mount /tmp in RAM, but it can be anything.

/dev/sd2b /tmp mfs rw,-s=2101464

The cool thing about MFS is that it can grow dynamically. The -s flag sets the maximum size, in sectors by default, but you can use a suffix of m to specify a value in megabytes.
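
Sector counts are hard to read at a glance, so here is a small sketch for converting between the two units. It assumes 512-byte sectors (2048 sectors per megabyte), and the helper names are invented for this example.

```sh
#!/bin/sh
# Sketch: convert between megabytes and sectors for the mfs -s flag.
# Assumes 512-byte sectors, i.e. 2048 sectors per megabyte.

mb_to_sectors() {
    echo $(($1 * 2048))
}

sectors_to_mb() {
    # Integer division, so any partial megabyte is dropped.
    echo $(($1 / 2048))
}

echo "1024 MB = $(mb_to_sectors 1024) sectors"
echo "2101464 sectors = about $(sectors_to_mb 2101464) MB"
```

Under that assumption, the 2101464 sectors in the fstab line above come to roughly 1026 MB.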

Soft Dependencies

For partitions that are written to frequently, like /home and /var, the softdep option makes a huge difference. I'm not 100% sure, but I believe applications that call fsync() cause their data to be flushed to disk immediately no matter what file system options are used.

/dev/wd0g /home ffs rw,softdep 1 2

Shared Memory for PostgreSQL

PostgreSQL is not the only application ever written to use shared memory, but it's the most prominent. These are the settings I've been using on a system with 2GB of RAM:

options         SHMMAXPGS=59400
options         SHMSEG=512
options         SEMMNI=512      # Maximum number of sets of IPC semaphores
options         SEMMNS=1024     # Sys-wide max number of individual IPC semaphores
options         SEMMNU=512
options         SEMMAP=512

The only option that you probably have to tweak is SHMMAXPGS because that is the parameter that sets how much shared memory can be allocated. Use ipcs -i to see the state of shared memory usage. The formula for shared memory on NetBSD looks like this:

512 kB + 8 kB * 1000 buffers = 8512 kB = 2128 pages

1000 is the default allocation in postgresql.conf, which allows max_connections to be set to a maximum of 500 (providing there are enough semaphores), but increasing shared_buffers to 10,000 or more should improve performance.
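
The same formula can be run for other values of shared_buffers to check that the result fits under SHMMAXPGS. This is a sketch that assumes 4 kB pages (which is what 8512 kB = 2128 pages implies) and uses made-up helper names.

```sh
#!/bin/sh
# Sketch: estimate the shared memory PostgreSQL needs for a given
# shared_buffers value, using the formula 512 kB + 8 kB * buffers.
# Assumes 4 kB pages; the helper names are illustrative only.

shm_kb_for() {
    echo $((512 + 8 * $1))
}

kb_to_pages() {
    # Round up to a whole number of 4 kB pages.
    echo $((($1 + 3) / 4))
}

for buffers in 1000 10000; do
    kb=$(shm_kb_for "$buffers")
    echo "shared_buffers=$buffers needs $kb kB = $(kb_to_pages "$kb") pages"
done
```

By this estimate 10,000 buffers need 20128 pages, which is comfortably below the SHMMAXPGS=59400 used above.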

SEMMNI (max semaphore sets) and SEMMNS (system wide maximum number of semaphores) need to be increased if more than 450 or so connections are required.

SPARC with Lots of RAM

This is not really a performance problem, more of a stability issue. There are all kinds of kernel structures that are statically sized on older architectures that won't work with 512MB of RAM. This is what I used in my SparcStation 20:

options         BUFPAGES=12000          # 4K pages available for buffer cache
options         NKMEMPAGES=8192         # crashes with default of 1536
options         MAXDSIZ="(1024*1024*1024)" # Bump max data size
                                        # (see <machine/vmparam.h>)
options         NVNODE=40960            # Default too low

References

Running a High-Performance Web Server for BSD

BSD Tricks: MFS

Shared memory - Wikipedia

PostgreSQL: Managing Kernel Resources

Performance Tuning PostgreSQL

Setting our sights on semaphores