Jim's Depository

this code is not written

I noticed that the Opera browser rocketed up to 38.4% of my hits. A quick dig through the logs shows that I am being drilled by bots that masquerade as Opera browsers and look to be trying to create link spam.
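A rough way to get that kind of number out of an access log is sketched below. It assumes the Apache "combined" log format, where the User-Agent is the sixth field once a line is split on double quotes; the function name and the log path in the usage line are made up for illustration.

```shell
# Percentage of requests whose User-Agent contains a given string.
# Assumes the "combined" log format: splitting a line on double quotes
# puts the request in field 2, the referer in field 4 and the
# User-Agent in field 6.
agent_share() {
    pattern=$1
    logfile=$2
    awk -F'"' -v pat="$pattern" '
        { total++ }
        index($6, pat) > 0 { hits++ }
        END {
            if (total > 0) printf "%.1f\n", 100 * hits / total
            else print "0.0"
        }
    ' "$logfile"
}
```

Something like agent_share Opera /var/log/apache2/access.log would then print the share as a percentage. It only matches on the string, so it cannot tell a real Opera from a bot pretending to be one.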

I suppose eventually they will have a human help them through the captcha and succeed. I have changed things about so untrusted users will get rel=nofollow tags on all their links. Maybe that will make them lose interest and go away.

I should probably make an RSS feed for comments while I’m at it so I notice when they get through the defenses.

While updating my systems monitoring I discovered Munin today. Munin captures a wide variety of system information and dumps it into RRD files to ultimately graph it at a central location.

The user interface doesn’t communicate problems well, but it provides the underlying data for you to answer those nagging questions that come up, like “When did our email traffic get so high?” or “Has that disk always run that hot?”

And my install notes:

  • Also install sensord, smartmontools, and ethtool when installing the munin-node package.
  • Make sure to punch a firewall rule for port 4949 from the central machine. The central machine does not want to get hung on one of the nodes.
  • Run sensors-detect after installing sensord.
  • Check your syslog after restarting sensord. A bunch of my Dells explode the daemon with a problem in the fan sensor, so I have to take lm85 out of their modules.
  • If you are a Linux user with SATA drives, then when you install smartmontools, you must edit the /etc/smartd.conf file to comment out the DEVICESCAN line and put in a specific device line with the “-d ata”. May as well add a “-m you@email.address” while you are there so it will notify you. Don’t forget to edit /etc/default/smartmontools to enable the daemon.
  • In /etc/munin/plugins/ you might wish to take out the ntp* files, unless you care about that. Look where those links go and you will see other plugins you could link in. I added veth* ones for my virtual ethernets and the smart one to get all my drive failure data.
  • Again, if on Linux with SATA drives, you must edit the /etc/munin/plugin-conf.d/munin-node file in a couple of places to tell it to use “-d ata”.
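The smartd.conf edit from the notes above can be sketched as a small filter. This is only an illustration: the function name is made up, /dev/sda is an assumed device name, and the -a directive (turn on the standard set of checks) is my addition, since the notes only call for “-d ata” and “-m”. It writes the edited file to stdout rather than changing anything in place.

```shell
# Comment out DEVICESCAN in a smartd.conf and append an explicit
# device line. Output goes to stdout; the input file is not modified.
pin_smartd_device() {
    conf=$1      # path to a smartd.conf
    device=$2    # e.g. /dev/sda (assumed device name)
    mailto=$3    # address for -m notifications
    sed 's/^DEVICESCAN/#DEVICESCAN/' "$conf"
    # -a is an assumption here, not something the notes above specify
    printf '%s -a -d ata -m %s\n' "$device" "$mailto"
}
```

Run it as pin_smartd_device /etc/smartd.conf /dev/sda you@email.address, look over the output, and only then copy it into place.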

Sample bits of /etc/munin/plugin-conf.d/munin-node:

[hddtemp_smartctl]
user root
env.drives sda
env.type_sda ata

[smart_sd*]
user root
env.smartargs -H -c -l error -l selftest -l selective -d ata

If you want to collect apache statistics with Munin you need to enable extended server status in apache.
ExtendedStatus On
<Location /server-status>
   SetHandler server-status
   Order deny,allow
   Deny from all
   Allow from
   Allow from munin-server.mydomain.com
</Location>

If your web server does not bind to localhost, you need to define the server status URL in your /etc/munin/plugin-conf.d/munin-node config file.
env.url "http://servername.mydomain.com/server-status?auto"
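Before pointing Munin at that URL it can help to check by hand that the status page answers in the machine-readable "?auto" form. A minimal sketch, assuming curl is installed; the helper name is made up, and it just picks one "Key: value" line out of whatever is piped into it.

```shell
# Print the value of one counter from mod_status "?auto" output,
# which is a series of "Key: value" lines read here from stdin.
status_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}
```

For example, curl -s "http://servername.mydomain.com/server-status?auto" | status_field "Total Accesses" should print a number. If it prints nothing, either the page is not reachable from that machine or ExtendedStatus is not on.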

If you run sendmail as your mail server, Munin has three plugins that are in the base Debian install. Link all three into your /etc/munin/plugins directory. One, sendmail_mailqueue, will work out of the box. The other two depend on sendmail stats files that do not get created in a base Debian install.
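The linking step might look like the sketch below. Only sendmail_mailqueue is named above; the other two plugin names (sendmail_mailstats and sendmail_mailtraffic) are what I believe they are called, so check your plugin directory before trusting them. Both directory arguments are assumptions about the layout too, for instance /usr/share/munin/plugins as the source.

```shell
# Symlink the sendmail plugins from the stock plugin directory into
# the active one. Plugins missing from the source directory are skipped.
link_sendmail_plugins() {
    src=$1    # where the packaged plugins live (assumed path)
    dst=$2    # the active plugin directory, e.g. /etc/munin/plugins
    # The last two names are assumptions, not taken from the text above.
    for p in sendmail_mailqueue sendmail_mailstats sendmail_mailtraffic; do
        [ -e "$src/$p" ] && ln -sf "$src/$p" "$dst/$p"
    done
    return 0
}
```

Restart munin-node afterwards so it picks up the new links.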

To enable stats logging you must manually create the stats files.

# touch /var/lib/sendmail/sendmail.st
# touch /var/lib/sendmail/sm-client.st

Once these files have been created, with sendmail write permission, sendmail will start logging to them.  Gotta love sendmail, "If you create the log file for me, I will write to it."

You can test your mail statistics file creation manually with the mailstats command.

After decades of backing up with dump I no longer do it. I suppose I got in the habit back in the days of tapes and just stayed through the disk years.

rsync is far better. I should have switched years ago.

  • Efficient incremental backups over networks, even for appending little bits to the ends of long files.
  • The backups are real directories of real files, easy to ferret about and find what you need.
  • Nifty trick to keep N days of backups without using N times the space, but still each tree looks like a snapshot.
  • Easy ssh based security.
  • You can do either a push or a pull depending on your security requirements.

The first thing to do is to read about the rsync --link-dest option. It lets you use hard links to share the contents of files across days of your backups.
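To see why that saves space, here is a small demonstration of the hard link idea on its own. It does not run rsync; it emulates with ln what rsync does for an unchanged file when a link-dest directory is given. The directory and file names are invented for the demo.

```shell
# Emulate the effect of --link-dest on an unchanged file: the new
# snapshot gets a hard link to yesterday's copy, not a second copy.
# Expects an empty scratch directory as its argument.
demo_linked_snapshot() {
    root=$1
    mkdir -p "$root/mon" "$root/tue"
    echo "unchanged data" > "$root/mon/file.txt"
    ln "$root/mon/file.txt" "$root/tue/file.txt"
    # -ef is true when both names refer to the same inode
    if [ "$root/mon/file.txt" -ef "$root/tue/file.txt" ]; then
        echo shared
    else
        echo copied
    fi
}
```

Both trees look complete, but the data is stored once. Removing the older tree later does not hurt the newer one, because the file lives on until its last link is gone.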

The second thing to do is to decide on your backup strategy. For many machines I just keep 7 days of backups because it makes things easy. There is a backup for each day of the week and they overwrite when it wraps. For other applications where I have to go further back I rename directories, much like logrotate would.

The third thing to think about is what happens if your ssh key or rsync password is compromised. If you are running backups from cron, then there will be a machine readable key on your machine somehow. This may or may not be a danger depending how you have things secured. In my setups if you could get the key you could have gotten the data anyway. (Remember that your backup archive machine needs to be at least as secure as the live machine.)

Enough talking, more sample code:

Scenario #1: Many big machines, lots of bits to push, on the same secure network. We want to go fast. All the machines have different security policies.

I use push in this situation. There is a dedicated backup machine to receive and hold the bits. Only two trusted people have access to this machine. The backup machine runs an rsync daemon with a module for each host that lets the host write backups (only in its host specific area, write only). On each host there is a root cron job with the rsync password embedded to run the backup.

Sample backup script… cron these, and offset their run times to keep contention down…

 HOST=`hostname`
 # the rsync password export goes here (RSYNC_PASSWORD); its value is not shown
 DAY=`date +%a | tr '[A-Z]' '[a-z]'`
 case $DAY in
   sun ) PDAY=sat ;;
   mon ) PDAY=sun ;;
   tue ) PDAY=mon ;;
   wed ) PDAY=tue ;;
   thu ) PDAY=wed ;;
   fri ) PDAY=thu ;;
   sat ) PDAY=fri ;;
 esac
 OPTS="-aqH --link-dest=/$PDAY/ --no-devices --no-specials --exclude=/proc/ --exclude=/sys/ --exclude=/dev/ --exclude=/tmp/"
 time rsync $OPTS / $HOST@warehouse.federated.com::$HOST/$DAY

Sample module from rsyncd.conf…

[nexus]
    auth users = nexus
    secrets file = /etc/rsyncd.secrets
    use chroot = yes
    path = /warehouse/nexus
    numeric ids = yes
    list = no
    read only = no
    write only = yes
    uid = 0
    gid = 0
    hosts allow =
    hosts deny = *

Scenario #2: Offsite backup of Virtual Private Server

I have a machine that lives in a hosting facility. I have broken my first rule of service providers. They are not close enough for me to pop over and wrap my hands around someone if there is a problem, so I content myself with a full backup and the ability to be up and running at a new provider in 60 minutes if needed. I don’t want any credentials sitting on a machine at the hosting facility, so I do a pull in this situation. I also use ssh to protect my data in transit, but use the rsync daemon and modules on the far side to get better control, for instance to make it read only.

Cron job on my backup server (pardon the dbclient related noise; that machine has to run dropbear instead of a more common ssh. But do notice that I have a --rsh to force a tunnel and a :: to use daemon mode and modules.)


function saveone () {
  TAG=$1
  SRC=$2

  rm -rf vhosts/$TAG.9
  [ -d vhosts/$TAG.8 ] && mv vhosts/$TAG.8 vhosts/$TAG.9
  [ -d vhosts/$TAG.7 ] && mv vhosts/$TAG.7 vhosts/$TAG.8
  [ -d vhosts/$TAG.6 ] && mv vhosts/$TAG.6 vhosts/$TAG.7
  [ -d vhosts/$TAG.5 ] && mv vhosts/$TAG.5 vhosts/$TAG.6
  [ -d vhosts/$TAG.4 ] && mv vhosts/$TAG.4 vhosts/$TAG.5
  [ -d vhosts/$TAG.3 ] && mv vhosts/$TAG.3 vhosts/$TAG.4
  [ -d vhosts/$TAG.2 ] && mv vhosts/$TAG.2 vhosts/$TAG.3
  [ -d vhosts/$TAG.1 ] && mv vhosts/$TAG.1 vhosts/$TAG.2
  [ -d vhosts/$TAG.0 ] && mv vhosts/$TAG.0 vhosts/$TAG.1
  [ -d vhosts/$TAG ] && mv vhosts/$TAG vhosts/$TAG.0

  RSYNC_PASSWORD=a8e261e7bac90138087f770caa5fea5b
  export RSYNC_PASSWORD
  OPTS="-aqHz --bwlimit=400 --exclude lost+found --exclude /tmp --exclude /var/tmp --exclude /proc --exclude /sys --no-devices --no-specials --delete"
  rsync $OPTS --rsh "dbclient -l root -i .ssh/id_archivist.db" --link-dest=/home/archivist/vhosts/$TAG.0/ $SRC /home/archivist/vhosts/$TAG/ > ~/$TAG.log
}

saveone studt-net rhth.lunarware.com::rhth
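The chain of mv commands at the top of saveone can also be written as a loop. This is a sketch of the same rotation, not what actually runs on my backup server; the function name is made up and it takes the snapshot base path (for example vhosts/studt-net) as its only argument.

```shell
# Rotate BASE, BASE.0 ... BASE.9 by one step, dropping the oldest.
# Same effect as an unrolled chain of "[ -d x ] && mv x y" lines.
rotate_snapshots() {
    base=$1
    rm -rf "$base.9"
    i=8
    # Walk from oldest to newest so nothing is overwritten.
    while [ "$i" -ge 0 ]; do
        if [ -d "$base.$i" ]; then
            mv "$base.$i" "$base.$((i + 1))"
        fi
        i=$((i - 1))
    done
    # Finally the live tree becomes snapshot 0.
    if [ -d "$base" ]; then
        mv "$base" "$base.0"
    fi
}
```

The loop makes it easier to change how many generations are kept, at the cost of being a little less obvious than ten plain lines.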

~root/.ssh/authorized_keys on the virtual private server (Look at the bit in front of the ssh-dss, it restricts what that key can do, in particular it makes it only able to run the rsync daemon.)

--server --daemon ." ssh-dss adfeadfaefasdfefeI_DELETED_MY_KEY_HEREadfasdfefadfae backups


[machine]
    auth users = archivist
    secrets file = /etc/rsyncd.secrets
    path = /
    numeric ids = yes
    list = no
    read only = yes
    write only = no
    uid = 0
    gid = 0

There you have it. Reasonably safe backups. There is room for improvement. For instance, rather than coming straight into root with the restricted command, it could come in on a different account and use “super” to run the command, and it should check the source IP and only work from the backup machine.
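The source IP check could probably be done in authorized_keys itself. The sketch below assumes the virtual private server runs OpenSSH's sshd, which accepts a from= option next to command=. The helper name, the 192.0.2.10 address and the no-pty and no-port-forwarding extras are illustrative assumptions, and the forced command is the usual way to start a one-shot rsync daemon over ssh rather than a copy of my real entry.

```shell
# Build an authorized_keys entry that only accepts the key from one
# address and forces the rsync daemon command whatever the client asks.
restricted_key_line() {
    from=$1       # address of the backup machine (placeholder)
    keytype=$2    # e.g. ssh-dss
    key=$3        # the public key blob
    printf 'from="%s",command="rsync --server --daemon .",no-pty,no-port-forwarding %s %s backups\n' \
        "$from" "$keytype" "$key"
}
```

Calling restricted_key_line 192.0.2.10 ssh-dss AAAA... prints one line to paste into authorized_keys. A connection with that key from any other address is then refused before the forced command ever runs.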


Eww, nasty double spacing of the code segments. I'll have to think about how to fix that. Safari put each line into its own div for some reason.

I’m entering into Virtual Hosting. Many of the pages and services I’ve run out of spare gear at FSG are now moved or moving to a virtual host. I chose VPSLink for their $7/mo tiny server, then while moving my domain name pointers discovered that Gandi.net is starting a virtual hosting business with more RAM and disk for about the same price.

Living in a tiny slice of a virtual host is certainly different from living in a leftover 2GHz/1GB/500GB PC. Fortunately I remember when 64M of RAM was huge, so I think I’ll get by just fine.

  • postfix/dovecot in TLS, lighttpd+fastcgi+php5 all fits nicely.
  • TLS incurs a fair bit of RAM use.
  • CPU use is odd. I don’t have visibility to know if I am CPU bound. I don’t think I am, but I can’t see from inside my Xen box.
  • The machine is noticeably sluggish at things like an initial rsync backup offsite. Slow disk? Slow network? I’m not sure.
  • The total annual bill will cost less than the electricity to keep a PC powered up. Happy Earth Day!
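On the CPU question, one thing that might give a hint from inside the guest is the "steal" counter in /proc/stat, which on Linux kernels that report it counts time the hypervisor gave to someone else. Whether a given VPS kernel fills it in is an assumption, and I have not checked it on this box. A minimal sketch with a made-up function name:

```shell
# Share of CPU time accounted as "steal" since boot, in percent.
# Reads the aggregate "cpu" line of /proc/stat, or of a file in the
# same format given as the first argument.
steal_percent() {
    awk '$1 == "cpu" {
        # fields 2..9: user nice system idle iowait irq softirq steal
        total = 0
        for (i = 2; i <= 9; i++) total += $i
        if (total > 0) printf "%.1f\n", 100 * $9 / total
        else print "0.0"
    }' "${1:-/proc/stat}"
}
```

A value that stays near zero says little either way, but a large one would suggest the sluggishness comes from the neighbors and not from my own load.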

I guess if you find the site gone and just this at archive.org you’ll know I’ve had a virtual hosting disaster.  I don’t foresee a problem. I don’t know what the hosting company does if a machine fails, but I keep a full backup every night in my closet so I’ll be ok.