Jim's Depository

this code is not yet written
 

I generally forget some nice packages when I toss up a new Debian machine and then spend too much time trying to remember which ones they are. Now I keep them listed here. Maybe you will like them too.

  • dpkg-dev-el  (for debian/changelog files, stupid picky date syntax)
  • dfu-programmer (to reflash Atmel microcontrollers)
  • iproute (always forget it)
  • psmisc (killall)
  • sysstat (iostat)
  • strace (sometimes you have to know)
  • tcpdump
  • mtr-tiny (mtr, but without all the X dependencies)
  • ifstat
  • iftop
  • iotop
  • htop, atop (not committed to these yet, but they seem useful)
  • attr (see and set file’s extended attributes)
  • emacs-goodies-el (markdown mode)
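
To pull the whole list onto a fresh machine in one go, something like the following should do. It is a sketch: the package names are copied from the list above and may have been renamed or dropped in later Debian releases, and the command is only printed so it can be checked before being run as root.

```shell
# Package names copied from the list above; availability varies by release.
PACKAGES="dpkg-dev-el dfu-programmer iproute psmisc sysstat strace tcpdump \
mtr-tiny ifstat iftop iotop htop atop attr emacs-goodies-el"

# Print the install command; remove the echo and run as root to install.
echo apt-get install $PACKAGES
```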

The attachment is an Etch build of dfu-programmer. I can’t test it, but it probably works on Etch.

Attachments

hello

Can you tell me how to install dfu-programmer on Debian Etch?
I cannot find the package with apt.

greetings remo
I just built dfu-programmer for etch with these steps, and attached it to the parent article (I guess I should implement attachments for comments.)
  1. Go to the Debian packages page for the Lenny package.
  2. Download the original source and debian diffs.
  3. Unpack the original sources
    tar -zxvf dfu-programmer_0.4.3.orig.tar.gz
  4. Apply debian patches... 
    zcat dfu-programmer_0.4.3-1.diff.gz | patch -p0
  5. Hop in
    cd dfu-programmer-0.4.3
  6. Build packages
    fakeroot ./debian/rules binary-arch
  7. Fail... fix x flags on rules
    chmod +x debian/rules
  8. Fail... install libusb-dev
  9. Success! I now have a dfu-programmer_0.4.3-1_i386.deb in the parent directory.
  10. I can't test it, my only Etch machines are remote servers, but it installs ok and I expect it works.
I probably should have modified the version number to mark it as an Etch build, but this works for private use.
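
Reordered so that the two failures never happen, the steps above collapse into something like this. It is a sketch only: the function name build_dfu_etch is made up for illustration, and it assumes both files from step 2 are already in the current directory and that libusb-dev and fakeroot are installed.

```shell
# Hypothetical helper: rebuild the Lenny dfu-programmer source on Etch.
# Assumes the .orig.tar.gz and .diff.gz are in the current directory and
# that libusb-dev and fakeroot are already installed.
build_dfu_etch() {
    (   # subshell, so the cd does not leak into the caller
        tar -zxf dfu-programmer_0.4.3.orig.tar.gz &&
        zcat dfu-programmer_0.4.3-1.diff.gz | patch -p0 &&
        cd dfu-programmer-0.4.3 &&
        chmod +x debian/rules &&    # step 7: rules was not executable
        fakeroot ./debian/rules binary-arch
    )
}
```

Call build_dfu_etch from the download directory; the .deb should land there, next to the unpacked source tree.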

macvlan is used to give a second MAC address to a network adapter and see it as a new device at the higher levels. It is useful to pretend you are multiple machines, as the container people do, or in my case, to implement your own TCP stack without interference from the kernel IP code.

My ugly scars thus far:

  • There is nearly no documentation.
  • What there is, is inaccurate.
  • Just because your iproute ip command doesn’t have any help or documentation for “link add”, doesn’t mean it won’t work. Lenny’s will.
  • You create a new interface thusly (note eth0, missing in most docs):
    ip link add link eth0 address 00:19:d1:29:d2:58 macvlan0 type macvlan
  • You can leave out the macvlan0 and it will allocate one.
  • The Debian Lenny kernels do not have macvlan turned on.
  • make-kpkg is broken in Lenny. Badly broken. But if you specify the architecture and revision on the command line and turn off Xen support in your config you can get it to work.
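
Before rebuilding a kernel it is worth checking whether the one you already have carries macvlan. A small sketch, assuming a Debian-style config file at /boot/config-$(uname -r); the function name is made up, while CONFIG_MACVLAN is the kernel's own option name.

```shell
# Succeeds if the given kernel config has macvlan built in (y) or as a module (m).
has_macvlan() {
    grep -Eq '^CONFIG_MACVLAN=(y|m)$' "$1"
}

# Example: check the running kernel, if its config file is present.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    if has_macvlan "$cfg"; then
        echo "macvlan available"
    else
        echo "macvlan not enabled"
    fi
fi
```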

Not macvlan’s fault, but if you are working on a user space TCP stack and you are wondering why it seems to be sending RST packets… make sure you aren’t accidentally sharing the link with the kernel.

The Road to Bankruptcy is a short blog of Darren’s house rebuilding woes. I highly recommend it for anyone who has ever owned a house.

Background: Family with four nine-year-old children in a small house with a fatally flawed foundation decides that somehow tearing down the house and constructing a new one is a good idea.

I’ve been working on a musical instrument tuner for the iPhone. Sadly, I was not one of the lucky 500 first round of developers accepted by Apple, and have been excluded from the market.

At some point, when Apple decides my $100 bill is good enough to let me into the market, my application would be starting far back in the pack, behind whichever one of these first, blessed three gains the title of “the tuner you use on an iPhone”.

Given that handicap, I’ll probably not release the application, but it was interesting writing it. Among the things I discovered:

  • Pitch is subjective for some sounds. There are notes on the concertina I was using for some of the testing for which the human testers disagreed about the octave of the note. (And my algorithms were also having trouble making a decision.)
  • For many instruments, the fundamental is not strongly present in the spectrum. Going in, I had assumed it would be large, if not the largest component.
Oh cruel Apple. 12 hours after releasing the App Store they processed my application and sent me a key to test on my iPhone.

Decisions.

femtoblogger has reached that odd state for software. It works well enough that I am happy using it. There are rough edges, but not rough enough that I will fix them.

The only thing I have changed recently is to add a meta robots tag to suggest the aggregate pages, like the front page and archive months, not be indexed. That should help keep the clicks on target. I already had robot tags to deter indexing of all the non-content pages.

There remain two rough points:

  1. The WYSIWYG editing: This is still a bit awkward. Sometimes I get stuck in bold and have to pop into HTML mode to get out. Pasting in code ends up double spaced. I could make lots of elaborate workarounds, but I consider these to be browser bugs and hope they will shake themselves out over time. I could also switch to one of the giant WYSIWYG javascript editors, but then I wouldn’t be very femto, would I?
  2. I keep having a nagging desire to have images. I could do it now by attaching the image and making my own IMG tag, but I’m too lazy for that. I’ve been holding off coding proper image support (with resizing for display and full resolution click through) on the grounds that if I’m too lazy to type my own image tag then surely I shouldn’t spend a couple hours making full image support.

I suppose since femtoblogger has become stable it is time to move it into a public subversion repository.

A Debian administrator might want to install…

  • debsums  - check installed files for tampering, not complete, but a good start.
  • rkhunter - look for root kits.
  • chkrootkit - look for root kits.

Think about running these regularly to catch your basic root kitter.

You could cron them, but I prefer to run them manually, since I know I’d pull the cron entry if I rooted you.

I suppose you could do a forced reinstall before running for a little extra comfort.

I think a better tool would be one that used a central repository with a copy of each package and called on the observed machine to generate on the fly signatures of files with a random seed.
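
The seeded signature might look like this on the observed machine. This is only a sketch of the concept, not an existing tool: the function name and the choice of SHA-256 over a seed-plus-contents stream are assumptions. The central checker would pick a fresh random seed per run, compute the same digest from its own pristine copy of the package, and compare.

```shell
# Print a digest of "seed + file contents". Because the seed changes on every
# run, a precomputed table of known-good checksums is of no use to an intruder.
seeded_sig() {
    # $1: seed chosen by the central checker, $2: file to fingerprint
    { printf '%s' "$1"; cat "$2"; } | sha256sum | cut -d' ' -f1
}
```

For example, seeded_sig "$seed" /bin/ls on the observed machine should match the same call run against the repository's copy of /bin/ls.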

A truly nasty rooter could still thwart that by faking things in either the C runtime library or the appropriate system calls.

I write from the end of June, 2008 having just completed a quarterly spam analysis and adjustment. Following is a brief description of the mail community, the incoming mail stream, how I process it, and the results.

The Mail Community

  • 150 people, mostly engineers in a software company
  • old addresses, average age 5 years+
  • many “first name” addresses

The Incoming Mail Stream

  • We are running about 500 to 1000 incoming emails per hour.
  • 95% of the incoming email is spam, 5% is real.

The Process

  • No mail is destroyed or rejected for spamminess; it is marked with a header and the mail clients shuttle it off to a junk folder, just in case.
  • All mail first passes through bogofilter. This can definitively mark a message as real mail or spam or it may be unsure and pass the message on to more expensive filters. 90% of the real mail is discovered at this point, as is 85% of the spam. I have a broad ‘unsure’ area to reduce false positives.
  • Only the 15% or so of the mail that bogofilter was not sure about will proceed to the following filters.
  • The second filter is dcc, the distributed checksum clearinghouse. This sends a fuzzy checksum to a central server and checks how many copies of the message have been seen so far. If it has been seen too many times then I consider it spam. This successfully discovers about 50% of the remaining spam with a quick round trip of a UDP packet.
  • clamav is used to detect viruses and mark them as spam so the mail clients will sequester them. This only marks a couple of messages out of 1000 incoming, but dcc marks many viruses so I don’t see the total size of my virus stream.
  • If a message is still uncharacterized it goes on to spamassassin. This discovers 90% of the remaining spam. That leaves about 0.3% of the total spam sneaking past my filters to offend the users. Spamassassin is configured to do the network checks, but not to use its bayesian filter, since bogofilter already does something similar.
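
The three-way decision in the first stage can be sketched from bogofilter's exit status, which its man page gives as 0 for spam, 1 for non-spam, 2 for unsure and 3 for errors. The function name first_pass and the BOGOFILTER override variable are made up for illustration; this is not the actual delivery script.

```shell
# Classify one message with bogofilter and print the verdict.
# Exit status per the bogofilter man page: 0 spam, 1 non-spam, 2 unsure, 3 error.
first_pass() {
    # $1: file holding a single message.
    # BOGOFILTER is a hypothetical override, handy for testing with a stub.
    "${BOGOFILTER:-bogofilter}" < "$1" > /dev/null
    case $? in
        0) echo spam ;;
        1) echo ham ;;
        2) echo unsure ;;   # hand on to dcc, clamav, spamassassin
        *) echo error ;;
    esac
}
```

Only messages for which this prints "unsure" would go on to the slower filters.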

The Results

  • 99.7% of the spam is detected and tagged.
  • <0.0% false positives. (I haven’t found one.)
  • CPU consumption small enough to be unmeasurable.
  • Mail which gets as far as spamassassin will take a 4 to 10 second delay while it processes. The other tests are fast enough to not be noticed.

Maintenance

The bogofilter works best if it is trained regularly to follow spam trends. I have in the past manually sorted thousands of messages into good and bad piles for training, but that is mind numbing. For ongoing training I do the following:

  • Anything that just barely got tagged as spam by bogofilter (scored above 85% but below 90%) is used as spam to train bogofilter. This tracks spam techniques as they drift out of my target sights without warping my spam stats by reporting 10,000 copies of the same message.
  • Anything that gets past bogofilter, but is subsequently caught by dcc or spamassassin is trained into bogofilter as spam. This catches new trends in spam.
  • Periodically I spot check the real mail, pick out any spam that squeaked through, and train it into the bogofilter to keep up with trends in our real mail.
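
The "barely tagged" test in the first rule could be sketched like this. It assumes messages carry bogofilter's usual X-Bogosity header with a spamicity=0.nnnnnn field and that the 85% and 90% figures are spamicity values; the function name is made up.

```shell
# Succeeds when the first X-Bogosity header's spamicity is above 0.85 and below 0.90.
barely_spam() {
    # $1: file holding a single, already-filtered message
    sed -n 's/^X-Bogosity:.*spamicity=\([0-9.]*\).*/\1/p' "$1" | head -n 1 |
        awk '{ ok = ($1 > 0.85 && $1 < 0.90) } END { exit (ok ? 0 : 1) }'
}
```

A message that passes would then be registered as spam with bogofilter -s < message.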

Results

The end result is I spend dozens of man hours per year to stop 250,000 spam. I’d just hire Google to front-end filter our mail for $3/address/year, but the security policy won’t allow that.

An extra note on bogofilter:

Bogofilter is built with a single user in mind. I'm sure it works better when it has a single user's mail to think about and can rely on the human to tag the false positives and negatives.

In a 150 user common filter you can rely on exactly 0 of them to report their miscategorized spam. If you try to force them to comply you will find that 10% of them do it backwards and pollute your statistics so badly you have to erase everything and start again.

That said, it works quite well and is speedy and doesn't rely on external network servers so it makes a good first line of defense.
Going forward:

I will have to drop dcc. Their licensing is no longer free enough to be distributed by Debian. That will slow more messages, but in practice anything dcc catches is also caught by spamassassin.

I'd like to add an adaptive whitelist out front to prevent false positives and give me a stream of known good messages for training the bogofilter. I haven't found one I like yet, but I keep looking. Maybe I'll have to write it.

This morning was 53°F in the cabin. When I awoke I didn’t need to crawl out into the cold to open the curtains and check the local weather because I could use my iPhone to access my server in St. Louis, that displays data from my server in Reston that de-NATs the server in Wisconsin so I can download live video from the webcam looking out from the front of the cabin. 

A different sort of geek might have built a heater.

I have a server which contains a bunch of virtual machines. These machines are continually harassed by script kiddies. I use Fail2ban to keep the trolling to a minimum. 

  • Each virtual machine sends its syslog activity to the physical server, using something like this in its syslog.conf…  *.* @some.host.com
  • The physical server saves all the syslog activity from the virtual machines, safe from tampering. (/etc/default/syslogd needs a -r)
  • fail2ban runs on the physical server and drops bans into the FORWARD chain to protect the inner machines.
  • The syslog port needs to be protected to only take traffic from trusted machines. This ought to block anything from the machine’s two physical ethernets but let through the virtual ones:
    /sbin/iptables -I INPUT -p udp --dport 514 -m physdev --physdev-in eth0 -j REJECT
    /sbin/iptables -I INPUT -p udp --dport 514 -m physdev --physdev-in eth1 -j REJECT
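
For the -r mentioned above, the receiving side's setting is a one-liner. This assumes the stock sysklogd package, whose init script reads its defaults file as shell; other syslog daemons are configured differently.

```shell
# Defaults file for syslogd on the physical server.
# -r makes syslogd accept messages from the network (UDP port 514).
SYSLOGD="-r"
```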

Things that needed changing…

In /etc/fail2ban/action.d/iptables.conf, the actionstart and actionstop also need to hook the fail2ban chain into the FORWARD chain:

# Option:  fwstart
# Notes.:  command executed once at the start of Fail2Ban.
# Values:  CMD
#
actionstart = iptables -N fail2ban-<name>
              iptables -A fail2ban-<name> -j RETURN
              iptables -I INPUT -p <protocol> --dport <port> -j fail2ban-<name>
              iptables -I FORWARD -p <protocol> --dport <port> -j fail2ban-<name>

# Option:  fwend
# Notes.:  command executed once at the end of Fail2Ban
# Values:  CMD
#
actionstop = iptables -D INPUT -p <protocol> --dport <port> -j fail2ban-<name>
             iptables -D FORWARD -p <protocol> --dport <port> -j fail2ban-<name>
             iptables -F fail2ban-<name>
             iptables -X fail2ban-<name>

An interesting observation when using a single fail2ban for multiple machines: it catches horizontal sweeps much sooner. Today I noticed it catch someone who was making one try at root on each of my machines. The merged auth.log files tripped my 10 hour ban after one attempt on each of three machines.

Things you will want to know if you have to replace your OpenVPN certificates because, say, you got caught in the Debian key entropy problem.

  • Don’t forget to also run build-key-server.
  • Don’t forget to copy keys/server.* and ca.crt up to /etc/openvpn if that is where you keep them.
  • Each Windows client with old keys is going to chew up 30 slots in your server until it gets new keys. The Windows clients retry every two seconds, but each attempt takes 60 seconds to time out on the server side, so 60 / 2 = 30 attempts are pending at any moment. If you have many users, you don’t have enough slots.

I had to resort to grepping syslog and dropping firewall blocks on people trying old certificates. I used another script watching my http logs to unblock people who had created new certificates. “TLS Error: TLS key negotiation failed to occur within 60 seconds” is a good bit to select IPs for blocking.
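
The grepping half can be sketched as follows. It assumes the server's syslog lines carry OpenVPN's usual "IP:port" prefix directly before the message text; the function names are made up, and the iptables commands are printed rather than run so they can be reviewed first.

```shell
# Print each distinct client IP that hit the 60 second TLS timeout.
stale_clients() {
    # $1: syslog file to scan
    grep 'TLS Error: TLS key negotiation failed to occur within 60 seconds' "$1" |
        grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]+ TLS Error' |
        cut -d: -f1 | sort -u
}

# Print one blocking rule per stale client (pipe to sh as root to apply).
block_cmds() {
    stale_clients "$1" | while read -r ip; do
        echo "iptables -I INPUT -s $ip -j DROP"
    done
}
```

Running block_cmds /var/log/syslog lists the rules; the unblocking side, driven by the http logs, would delete the same rules with -D.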

You know you have too many clients connected if you see “MULTI: new incoming connection would exceed maximum number of clients“ in the syslog.
