Jim's Depository

this code is not yet written
 

I made a video of my cat chasing the shadow of her tail. It got posted to a social media site. I have now contributed to the internet's supply of kitten videos.

If you find yourself making ajax requests for HTML pages, then using jQuery to massage them or extract data before displaying it, you should be aware that you may be loading the <img> resources referenced by those pages. This can be a performance killer or an information leak.

Consider…

function parseSomeDudesPage( pageText) {
    var page = $(pageText);   // <<<<------   This is the problem
    // ... do a bunch of stuff with page ...
}

...
   $.get('http://somedudeswebsite.com/blah/blah.html', parseSomeDudesPage);
...

The line marked as the problem will parse the text into DOM nodes belonging to your main document. The browser may decide to start fetching the images contained in pageText (Safari 8.0.2 does). Depending on how you fetched the document, you could be getting unwanted image fetches, broken links, or security domain violations for pulling http: resources into your https: page.

I think the solution is to keep the nodes out of your main document by creating a new one. After reading this stackoverflow comment, and not caring about old browsers in my application, I went with…

function parseSomeDudesPage( pageText) {
    var page = (new DOMParser).parseFromString( pageText, 'text/html');
    // ... do a bunch of stuff with page, which is a separate document
    // and doesn't trigger <img> loading ...
}

Since you have a new document, you can't simply move nodes from page into your main document (you'd need document.importNode first), but I tend to create fresh nodes anyway, so I don't miss the ability.
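Putting the pieces together, here is a sketch of the whole pattern. The element id, the extracted selector, and the final markup are made up for illustration:

```javascript
// Parse the fetched text into a detached document, read what you need,
// then build fresh nodes in the main document. Nothing here causes the
// browser to fetch <img> resources from the foreign page.
function parseSomeDudesPage( pageText) {
    var page = (new DOMParser).parseFromString( pageText, 'text/html');

    // querying the detached document is safe
    var headline = page.querySelector('h1');
    var text = headline ? headline.textContent : '(no headline)';

    // create a fresh node rather than moving one across documents
    var h = document.createElement('h2');
    h.textContent = text;
    document.getElementById('summary').appendChild(h);
}

$.get('http://somedudeswebsite.com/blah/blah.html', parseSomeDudesPage);
```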

A quick warning: when you upgrade your Debian installation to Jessie, you may find that grub2 can no longer install itself, leaving you with an unbootable system. (Been there, done that, managed to save myself.)

Preconditions:

  • Your disks have been MBR partitioned so far in the past that there isn’t a megabyte hiding between the MBR and your first partition.

  • Your /boot is sitting on something which requires more than a tiny amount of code to access. In my case ext3 on an LVM partition.

When grub2 tries to update you will get messages about core.img being too big. At this point you may be screwed.

Am I safe?

If you are MBR partitioned, check where your first partition starts. If it starts at sector 2048, you have plenty of space. If it starts at 63, you are in trouble.

If you are GUID partitioned, make sure you have a 1MB BIOS boot partition for the boot loader.
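The MBR check can be scripted. A minimal sketch for a Linux box; the helper name is mine, and you would feed it the Start value that fdisk reports for your first partition:

```shell
# grub_has_room: given the first partition's starting sector, report
# whether there is room between the MBR and the partition for grub2
# to embed its core.img. (Helper name is made up for illustration.)
grub_has_room() {
    if [ "$1" -ge 2048 ]; then
        echo "plenty of space"
    else
        echo "trouble"
    fi
}

# Find the real number with something like:
#   fdisk -l /dev/sda     # read the Start column of the first partition
grub_has_room 2048    # the modern 1MiB-aligned default
grub_has_room 63      # the old cylinder-aligned default
```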

How I recovered:

Since I was using LVM I plowed all the data off of my /dev/sda spindle using pvmove (many times); that involved de-mirroring things since I only had three spindles in the machine. Then I could repartition the drive. I went to GUID partitioning, where you explicitly make a boot loader partition for the use of grub2.

Then I could create an LVM physical volume on the remaining space of /dev/sda and shuffle all my data back into mirrors. The process was more complicated since I went through and reformatted each of the drives in case one blows and the BIOS chooses another, but all told it is possible to perform without a reboot.
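In LVM terms the dance looks roughly like this. The volume group, device, and logical volume names are stand-ins, every step is destructive, and I am sketching from memory, so treat it as an outline rather than a script:

```shell
# Push every allocated extent off the doomed spindle
# (repeat for each physical volume living on it)
pvmove /dev/sda2

# Drop the now-empty physical volume out of the volume group
vgreduce myvg /dev/sda2
pvremove /dev/sda2

# Repartition the drive as GPT; in gdisk, make a small partition of
# type ef02 (BIOS boot partition) for grub2 to live in. 1MB is plenty.
gdisk /dev/sda

# Hand the big new partition back to LVM and rebuild the mirrors
pvcreate /dev/sda2
vgextend myvg /dev/sda2
lvconvert --type raid1 -m 1 myvg/root
```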

I have a web site which displays a remote camera view. I like to update the <img> once a minute or so without forcing a page reload and all of its flicker. I’ve done many horrible things over the years, mostly related to adding a query parameter to the end of the URL and accepting that I will trash the cache and perform redundant downloads when the image is not changing.

The <img> tag is 21 years old; surely if it can drink in the USA it has had a .refresh() method added. A brief check of the specifications: no.

stackoverflow is littered with questions and solutions about how to reload an <img>, most of them involve cache busting serial numbers in the query parameters.

Notable exceptions:

  • Some answers recommend using a hash tag serial, since this should leave the URL to the server unchanged. This works in some browsers, but doesn’t cause a reload in others. Sometimes there is a combination of HTTP Cache-control headers that will cause a reload, sometimes not. Too touchy to use.

  • You can use an <iframe> containing just the URI of your image to force a reload using an actual, supported mechanism! When that completes you can bludgeon your <img> tag enough to use the new version. See Method #4. I have dispensed with replacing the <img> element and just set the src to null and back; as of January 2015 this keeps Safari and Chrome from flickering.

The <iframe> method is my new favorite. It requires Javascript, but I’m not sure how you would be deciding to refresh without Javascript, so that isn’t a loss. From Safari, it does cause two requests for the resource, but the second will be a 304 if you are using any sane caching mechanisms, and the first will only be a 200 if the image really changed. (We are in 2015 and I don’t worry so much about the extra request since the servers are all SPDY and it’s not like it’s a new TCP connection or blocking anything else.)
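Here is roughly what my version looks like. A sketch, assuming an <img id="camera"> somewhere on the page; the id and the function name are made up:

```javascript
// Force an <img> to refresh via a hidden <iframe>. The iframe load is a
// real navigation, so it honors normal HTTP caching; when it finishes we
// set the img src to empty and back so the browser picks up the new copy.
function refreshImage( img) {
    var iframe = document.createElement('iframe');
    iframe.style.display = 'none';
    iframe.onload = function() {
        var src = img.src;
        img.src = '';       // bludgeon the <img> without replacing it,
        img.src = src;      // which avoids the flicker in Safari and Chrome
        iframe.parentNode.removeChild(iframe);
    };
    iframe.src = img.src;   // same URI, so they share a cache entry
    document.body.appendChild(iframe);
}

// update the camera view once a minute
setInterval( function() {
    refreshImage( document.getElementById('camera'));
}, 60 * 1000);
```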

Note to W3C: add a .reload() method which tells an <img> to reconsider its caching information and make a new request if required. I suppose a force boolean argument wouldn’t be too much. But nothing more complicated.

After too many years of femtoblogger succumbing to PHP language drift, I’ve been idly reimplementing it in Go.

I’ve been through several iterations wherein I tried to force sophisticated abstractions onto Go, with rather poor success. I finally realized that Go deliberately eschews abstraction. I gave up abstracting and just wrote out the code in as many minor variations as required to do the job, and it was a much more pleasant experience.

In the end I’m looking at about 2,000 lines of Go (not counting imports I didn’t write) which has most of the functionality of the 3,000 lines of PHP. So, within a power of two, they appear to be a wash.

femtoblogger has gained some important ground:

  1. It is statically type checked. Typos and errors won’t be lurking around to explode at run time.

  2. I have an executable which will continue to run until I break it. No more will the web site disintegrate when I update PHP on the server because the language drifted or a Debian packager changed a setting.

  3. I feel better about its security. PHP always made me slightly nervous.

  4. I’ve ditched WYSIWYG editing for markdown. The HTML folk have had plenty of time to make editing work, it’s not my problem if they can’t get their act together without thousands of lines of Javascript repair code.

So here’s to another 7 years of femtoblogger. Who knows what I’ll rewrite it in when 2021 comes around.

I purchased a pcduino v3 for a project. I specifically selected it for its Allwinner A20 processor with its dual Cortex A7 cores and SATA interface.

The vendor image I downloaded had a Linux 3.4.79+ kernel which failed to work reliably with the SATA port and would randomly freeze the computer with no debug output, say every 10 minutes or so under heavy load. As is typical with ARM SoC units, the kernel source is a bit of a mess. In this case you have to get the kernel from sunxi, the community back port of Allwinner’s Android kernel. There is some sun7i support in mainline Linux 3.17. Do not be deceived: it is not enough to work. I built 3.4.103++ from sunxi, which is a bit of a trial to get configured; there are a fair number of dependencies that you will only find when the compile breaks.

The 3.4.103++ kernel is working reliably for me with SATA, though my X display has broken. I’ll look into that later.

If there is a bootable SD card in the computer, it will boot from there. If not, it will fall back to its onboard NAND. I’ve elected to not touch the NAND and use it as my recovery method. Creating a bootable SD card requires a non-obvious trick. There is boot loader code which must be at block 8 and block 20 of the card. This means you cannot use a GUID partitioned card. The GUID partition table is in blocks 2 through 33.

You will need to build these boot loaders; a stern googling for u-boot-sunxi-with-spl.bin will find all you need to know about it.
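Writing it to the card is the usual sunxi recipe: the combined SPL/u-boot image gets dd'd in at the 8KB mark, below your first partition. To keep this sketch harmless I write into a scratch image file with a dummy payload; substitute your real card device and the real u-boot-sunxi-with-spl.bin:

```shell
# Stand-ins for the real card and boot loader binary
dd if=/dev/zero of=card.img bs=1M count=1 2>/dev/null
printf 'SPL!' > u-boot-sunxi-with-spl.bin      # dummy payload

# The recipe itself: boot loader code lands 8KB into the raw device,
# which is why a GUID partition table (blocks 2 through 33) is ruled out.
dd if=u-boot-sunxi-with-spl.bin of=card.img bs=1024 seek=8 conv=notrunc 2>/dev/null
```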

In partition 1 you will need to copy in uEnv.txt, script.bin, and a uImage file. The uEnv.txt tells u-boot where to find your kernel and how to load it; script.bin holds the board's hardware configuration; the uImage is your kernel.

Some other things I learned along the way:

  1. Linux no longer supports root=LABEL=MYDISK on the kernel command line. It looks like it supports partition UUIDs, but without a GUID labeled disk that isn’t going anywhere. There is a NT label fallback, but I couldn’t get it to work. I’ve had to resort to direct device names and the nondeterminism that brings in the face of a drive failure.

  2. There is no meaningful documentation for the A20 chip except for a bunch of register names.

  3. If there is a CPU temperature monitor in the A20, no one knows how to use it. There is one in the AXP209 PMU which may get exposed in /sys/ if you are clever or fortunate. (I added a heatsink and fan while fighting the system hang. It didn’t help, so CPU temperature is probably not an issue. Still, I’m leaving the heatsink on.)

  4. The power connection to the board is unfortunate. They use a micro USB for power and ask for a 2 amp power source. The warning sign here is that micro USB connectors are only spec’d for 1.8 amps. Depending on which of my 6 foot USB cables I use, I either get about 4.8V at 500mA, or 4.6V at 500mA. That is an uncomfortable voltage loss. I’d really have appreciated a couple of through holes where I could solder on a real header of some sort, especially since they suggest this board is suitable for 24x7 continuous use in devices. The schematic is large and disjoint, but I don’t think there is a good spot to pick this up.

  5. You can monitor your incoming voltage and current use with:

    cat /sys/devices/platform/sunxi-i2c.0/i2c-0/0-0034/axp20-supplyer.28/power_supply/ac/voltage_now
    cat /sys/devices/platform/sunxi-i2c.0/i2c-0/0-0034/axp20-supplyer.28/power_supply/ac/current_now
    
  6. For an idea of computer speed, a kernel compile using both cores takes about 50 minutes of wall clock time. My Core i7 (4 cores + hyperthreading) does it in under 2 minutes. If you plan to do kernel work, cross compile.

  7. You don’t need that initramfs that is in the default kernel config. If you are trying to install Debian it will even mess you up. Leave it out when you build the kernel.

  8. You can get STL files for a pcduino v3 case. It is a little tight around the micro USB power connector and accessing the microSD card is about hopeless.

  9. There are mounting holes, but be careful, there isn’t much clearance around them. It would be easy to make contact between a screw head and a component. I made tiny o-rings by slicing some insulation from #8 AWG wire to stand the screw head high enough to not touch components. Nylon screws with tiny heads would be a great idea.
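On the cross compiling advice in item 6: the sunxi 3.4 tree builds like any other kernel. Something along these lines works, assuming your distribution's arm cross compiler uses the arm-linux-gnueabihf- prefix, you have u-boot-tools installed for the uImage target, and your tree really does ship a sun7i_defconfig (check arch/arm/configs):

```shell
# From the sunxi kernel source tree, on a fast x86 box
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- sun7i_defconfig
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j8 uImage modules
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- \
     INSTALL_MOD_PATH=/path/to/staging modules_install
```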

I was unhappy with my options for a template system on a current project, so I have created a new one which may be useful to you.

The template language is inspired by Terence Parr’s Enforcing Strict Model-View Separation in Template Engines, in which he argues that Turing complete template languages are a mistake. Logic is best left to the model and controller, with the template engine acting as a view, simply converting from the model to the desired representation.

Having used View0 for a fair bit of source code generation, I have to agree.

View0 syntax leans toward meaningful words rather than cryptic symbols for the sake of non-programmers who might need to edit templates. There are only five directives.

You can find View0 and its manual at https://bitbucket.org/jimstudt/view0.

Being mostly a crufty old C coder at this point, I don't use much new stuff. But I have to say the source code analysis tools are pretty nifty.
When I hit the analyzer button in Xcode to run clang's analyzer, sometimes it finds things like this for me… (A leaked buffer in a diagnostic message for one of those "can't possibly happen" error checks. Just the kind of place I can get sloppy about memory ownership.)

[image: analyzer.png, the analyzer's step-by-step explanation of the leaked buffer]

I particularly like how it explains itself, how it doesn't show off (it had to know that bgetstrn() returns allocated memory without retaining ownership; it figured that out somehow, but it doesn't feel compelled to tell me about it), and how it has never given me a false positive in my code.
It has three false positives lurking in the Lua runtime, but they are in some pretty crazy code.

Attachments

analyzer.png 46187 bytes

You can use OpenCL on your Intel based Linux machine, but buried in the fine print you will find that it only runs on the CPU; the 300 million transistors and 25% of your processor die area that make up your GPU are completely useless.

I feel a little silly for making sure my servers had HD4000 GPUs in case the day came when I needed an OpenCL boost. The day came. I’m not getting it.


I have a need to survive ISP outages, but am not large enough to have real things like BGP and serious internet connections, so I am using a telco and a cable company with a few static IPs on each.

There are various tutorials on the internet for how to cope with this, but they seem to primarily involve using iptables MARK to stain packets and then use the iproute2 functions to route them. I dislike conjoining these two tools. I am using source routing to keep everything straight, though there is a gotcha involving SNAT that needs attention when a link goes down.

Requirements:

  • Support more than one ISP.
  • Use the right source IP on each link to not trip packet spoof detection.
  • Survive a link going down and back up.
  • Lots of IPv4 private address machines on the inside need to be NATed on the way out.
  • IPv6 is mandatory. Using 6rd while my ISPs recover from being blindsided by that 20 year old RFC for IPv6.
  • Some servers live outside the firewall/router, I won’t speak any more of them, but they are there.

Non-requirement:

  • Load balancing. This can be addressed with your outgoing rules, but given the disparity in quality, there is little point in using the U-verse link for outbound connections if the Charter one is up. The inbound connections will still use it.
  • A single point of failure router is fine with me.

Strategy Overview:

  • Use a VLAN switch so I don’t need a flock of switches and multiple ethernet ports on the router and “outside the firewall” boxes. 
    Not required, but when you see decimal points in my ethernet device names, those are the VLAN ids.
  • Use source routing so that all packets go out the interface that matches their source IP address.
  • Use iptables SNAT to let the local machines out. Choose their SNAT address based on the outgoing interface. Let the routing rules do the routing.
  • Use conntrack to forget cached SNAT mappings when a link goes down or comes up. This is important!

VLAN Switch, 802.1Q Is Your Friend

Go read about IEEE 802.1Q if you are not familiar. With this you need only one switch. You can have as many virtual LANs as you like and configure on a port by port basis which LANs appear on that port. If you have gear that doesn’t do 802.1Q you can set a single VLAN to show up there and work fine without any changes to that device. You will pay more for a “smart switch” with 802.1Q support, but you are going to save on the number of switches, cabling, and ethernet cards. (e.g. in January 2013 I paid $220 for a 24 port gigabit 802.1Q switch.)

You will have to configure your switch. My NetGear switch is configured through a web interface apparently written by a maniacal sociopath, but it can be made to do the job.

The Source Routing

We are going to need two auxiliary routing tables to hold rules for when we know we have a U-verse address or a Charter address. These are going to get names which means we add lines to /etc/iproute2/rt_tables, (which is just a file mapping numbers to names)…

echo "200 att" >> /etc/iproute2/rt_tables
echo "201 charter" >> /etc/iproute2/rt_tables

When an interface comes up, we are going to add an ip routing rule to force packets with a Charter source address to look in that charter routing table and go out the right interface, likewise for AT&T… (Notice the “throw” rules. Some people duplicate their main table here, but I’d never keep that in sync, so I defer to the main table instead.)

# This is what makes source routing happen
ip rule add from 99.178.257.57/29 table att

# get a fresh start on the routing table
ip route flush table att
ip route add default via 99.178.257.62 dev eth2.4 table att

# get the RFC1918 private networks out, they don't want to go out this interface
# the "throw" will make them go back to your regular routing tables.
ip route add throw 10.0.0.0/8 table att
ip route add throw 172.16.0.0/12 table att
ip route add throw 192.168.0.0/16 table att

The SNAT For Our Private Addresses

Nothing new here, yet…

iptables -t nat -A POSTROUTING -o eth2.4 -s 172.16.0.0/12 -j SNAT --to-source 99.178.257.57
iptables -t nat -A POSTROUTING -o eth2.4 -s 192.168.0.0/16 -j SNAT --to-source 99.178.257.57
iptables -t nat -A POSTROUTING -o eth2.4 -s 10.0.0.0/8 -j SNAT --to-source 99.178.257.57

But wait! Now we have a problem. iptables connection tracking is going to learn these SNAT rules and, for instance, if you have a ping running, it will happily keep trying the dead interface after you take one down. The fix I’m using is to clear the SNAT connection tracking information when an interface goes up or down. I use this in my /etc/network/interfaces stanzas (install conntrack first)…

# We need to make NAT'd addresses choose a new path
# e.g. ICMP echo will be stuck on a dead interface if it was using this one
up conntrack -D --src-nat
down conntrack -D --src-nat

Choosing the Best Interface for Outgoing Traffic

You will want to use a metric on your default routes in order to choose the best one. (Alternatively you can get into load balancing, but my asymmetry is too high to care about that.)

I do this by not using the gateway declaration in my iface stanzas, but just doing an up command instead…

#gateway 99.178.257.62 --- but we want an explicit metric, so we do it this way
up ip route add default via 99.178.257.62 dev eth2.4 metric 1 || true

… that is my shunned AT&T connection. I use a metric of zero on the Charter line so traffic prefers it, but will use AT&T if Charter goes down.

Now IPv6

IPv6 gets the same treatment, except you don’t have to screw with SNAT and conntrack, unless you really want to. Also, you will need some “-6” keystrokes. It helps to remember that those routing tables for att and charter are really four tables, two for IPv4 and two for IPv6.

I’ll just show you my Charter 6rd stanza, you can work it out from there.

iface charter6rd inet6 v4tunnel

# Force 6rd gateway to be on the Charter interface
pre-up ip route add 68.114.165.1 via 96.35.289.49 || true

# 2nd 32bits of this is my IPv4 address 
      address 2602:0100:6023:gd32::1
      netmask 32
      remote 68.114.165.1
      endpoint 68.114.165.1
      local 96.35.289.50
      ttl 64
      up ip -6 rule add from 2602:100:6023:gd32::/59 table charter || true
      down ip -6 rule del from 2602:100:6023:gd32::/59 table charter || true
      up ip -6 route add default dev charter6rd table charter
      post-down ip route del 68.114.165.1 via 96.35.289.49
      up   ip -6 route add 2000::/3 dev charter6rd metric 5
      down ip -6 route flush dev charter6rd

What Is Wrong With This Strategy

When one of the ISPs is broken, I need to bring down their interface, otherwise traffic will happily still try to use it. There may be automated ways to do this, but I’m a simple barbarian and given the rarity of the events, I just use a little cron job that if it can’t see some portion of the internet out a particular interface, brings that interface down for a little while. I suppose playing with the default route metrics would be nicer, but like I said, simple barbarian. (I do have a nagging suspicion that if I were smarter about the load balancing it would “just work”. But I’m not.)
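For the curious, the watchdog amounts to something like this. A sketch, with the probe host, timings, and script name all assumptions; cron would run it every few minutes with the interface as its argument:

```shell
#!/bin/sh
# Bring a flaky uplink down for a while when it can't reach the internet.

PROBE=8.8.8.8      # some host out there expected to answer pings

# True when no ping answers come back over the given interface.
# -I pins the probe to that link so a healthy default route can't mask it.
link_is_dead() {
    ! ping -c 3 -W 2 -I "$1" "$PROBE" >/dev/null 2>&1
}

# Take the link down for five minutes, then let it try again.
restart_link() {
    ifdown "$1"
    sleep 300
    ifup "$1"
}

# e.g. in /etc/cron.d:  */5 * * * * root /usr/local/sbin/check-link eth2.4
if [ -n "$1" ]; then
    link_is_dead "$1" && restart_link "$1"
fi
```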
