Harnessing mass fraud

The plague of reality television has spawned some unexpected phenomena over its decade-long life. Most interestingly, shows that allowed public voting demonstrated that people had both the desire and the means to organically organize to rig elections on a massive scale. Once again, the internet demonstrates its core competency, connecting strangers in weird ways, in this case through nexus sites like Vote for the Worst.

The question that’s been bugging me this morning: how to harness this ability? Any thoughts?

Encapsulating

Although I’ve used Duover for backups until now, I’ve decided to stop using it for two reasons. The first is that it seems to be floundering with the release of Leopard, making backups incredibly slowly, and generally flaking out. As an example, a daily backup from my kitchen machine took about 20 minutes under Tiger but, even with the latest Duover update, was taking over four days under Leopard. Not at all useful. Secondly, Time Machine is just really useful and cool.

So, I’ve just installed a 1TB Time Capsule backup device into my home network. It’s been a real breeze to set up, even for an Apple product. It simply just works. Even though I was pretty sure it would do what I wanted it to, I had a nagging suspicion that my network setup might trip it up, but this turned out to be groundless. My home uses two different wireless networks: one using 802.11g to serve the older machines, and one using 802.11n to serve the newer machines at the best speed. (Hardware that runs 802.11n can also support 802.11g simultaneously, but doing so really slows down the 802.11n portion.) The additional speed of the 802.11n network makes a huge difference when streaming HD video to the Apple TV (though the g network can handle DVD-level video just fine). My setup works basically like this:

Network diagram

I wasn’t 100% sure the kitchen machine (“Nexus”) would be able to see the backup service, but it works fine, just as a good network service should. As long as the machine and the device are on the same LAN, it appears to make no difference how it actually gets there, just as you’d expect. (That initial backup sure is slow, though.)

The end of the format war

Another hideously stupid format war that did absolutely nothing to help consumers is over: Blu-ray wins. Oh, it will be a while before HD-DVD figures this out, but it’s done. You’ll find a number of people saying this lately, largely because of the recent defection of Warner Brothers to the Blu-ray format. This, and things like Apple most likely including Blu-ray drives in the next generation of its machines, are important of course, but I’m calling the war for Blu-ray for a different reason: porn.

Early on, I guessed that HD-DVD would win the format war for the same reason that VHS beat the superior Betamax: it better met the needs of porn producers (cheaper to make, longer running lengths, etc.). And, indeed, HD-DVD appears to dominate pornography in the United States. After some exhaustive research (link not work safe) on my part, I can confirm that finding porn titles on HD-DVD is much easier than finding them on Blu-ray. For some time, Sony was actively preventing porn from being available on Blu-ray, but not any more, starting with Debbie Does Dallas Again. So, it appears that porn is at least being produced in both formats at the moment, even though HD-DVD is still (by far) more common in porn.

This, however, does not matter. Whatever role porn did or didn’t play in the VHS vs. Betamax war, it will turn out not to have much impact on the HD-DVD vs. Blu-ray war, which is the main reason I think this war is now over. There are two things that are different this time around that make this so.

First, porn now has many more outlets than it did in the VHS days. Back then, if you wanted porn, your only choices were theaters or videotape. These days, with rule 34 in full effect on the internet, porn is now almost impossible to avoid. Those looking for high-definition porn are much more likely to find it in some downloadable format than either of the DVD formats.

This pales next to the other reason, though. After even more painstaking research, I have now realized the truth: high resolution porn breaks the illusion. The quality is too good, turning porn into a festival of pimples, surgical scars and razor burn. Some companies are adapting to this, but these are likely to be companies that actually care about production quality, which most porn makers do not.

Taking porn out of the equation knocks a big…leg out from under HD-DVD. While it will still take far too long for HD-DVD to die, it’s now safe to stick a fork in it for your purchasing choices.

Update: The death spiral is happening faster than I thought it would. Glad to see that companies that care about this war seem to think it is as stupid and useless as I do.

Another counterexample to open source

As reported by Fake Steve Jobs, an article recently penned by Jaron Lanier makes an argument in favor of closed source development. This is not necessarily an anti-open source stance, as Lanier claims it has its place, but…

…a politically correct dogma holds that open source is automatically the best path to creativity and innovation, and that claim is not borne out by the facts.

Why are so many of the more sophisticated examples of code in the online world—like the page-rank algorithms in the top search engines or like Adobe’s Flash—the results of proprietary development? Why did the adored iPhone come out of what many regard as the most closed, tyrannically managed software-development shop on Earth? An honest empiricist must conclude that while the open approach has been able to create lovely, polished copies, it hasn’t been so good at creating notable originals. Even though the open-source movement has a stinging countercultural rhetoric, it has in practice been a conservative force.

A couple of years ago, my friend MV mentioned another, more encapsulated example of how closed source can build better solutions. I haven’t seen it mentioned much, so will repeat it here.

The example comes from the very early days of the graphical user interface. Once you start to build a system that has “windows” that can move around, you have to contend with the idea that these windows overlap. Even if each window is a rectangle, it doesn’t take many windows to make some complicated shapes. Even two windows can do so. Consider this image of an early Mac desktop from the Apple Museum:

Mac desktop

The “System Folder” window is easy enough to represent, but how do you describe the shape of the visible portion of the “Mac System Software” window? What about the visible portion of the gray background? It’s just a collection of intersecting rectangles, but think about it for a second: how would you describe such shapes to a computer? Oh, and you only have 128K of memory and an 8MHz, 16-bit processor. When building a GUI, you have to deal with this issue at some level. For example, something is painting the desktop background; how does it know not to paint over the windows? (For those who know a bit about graphics, double buffering doesn’t help you here, because a) you don’t have the memory and b) it is too slow on chips like this.)

The general concept for describing such shapes became known as “regions”. There are a number of different ways to implement regions. It was clear that Xerox PARC had one when the Apple team famously visited. It wasn’t at all clear what that implementation was, however, as it was closed source. Lacking access to Xerox’s methods, engineer Bill Atkinson took a look at the problem, figured out how they must have done it, and coded his version into the drawing system that would become QuickDraw.

It turns out, however, that Atkinson’s region code wasn’t really anything like Xerox’s code. It was much better. Better, in fact, than most other systems that came along, particularly the implementation used later by Windows. In an anecdote about Atkinson and regions, Andy Hertzfeld says that Apple “considered QuickDraw’s speed and deftness at region handling to be the most significant ‘crown jewel’ in Apple’s entire arsenal.”

This brilliant system (now supplanted by code that takes better advantage of modern hardware, particularly video cards) probably wouldn’t have happened if the Xerox code had been open source. Atkinson most likely would have started with their solution and refined it, and a bit of genius would have never been born.

Backing up a 1&1 root server

I have a simple need. I want to use rsync to copy various directories on a root server from 1&1 to my Mac. I had set all this up before, but a couple of days ago, the root server refused to reboot. After a lot of tinkering (and swearing) using the recovery systems and FAQs supplied by 1&1, I couldn’t fix it (also, see “An Aside” below), so I re-imaged the entire machine. This completely rewrites the box, so the backup must be set up all over again. Setting up an rsync backup turns out to be more difficult than it needs to be, requiring a number of additional steps. Naturally, I didn’t write these down last time, so I had to rediscover both the problems and the solutions again. This time, I’m recording them here so that a) I can find them when I need them again and b) they might, on the off chance, help somebody else.

I should mention that I am not a Linux guru by any stretch of the imagination. What follows will probably bore real Linux geeks to tears. Or, maybe just make them chuckle at my incompetence. Some or all of the following could be wrong or a bad idea. If so, please leave some comments (that’s the third reason I’m posting this).

The Problem

Using rsync to back up a machine is much faster than using, say, SFTP. This is because rsync only copies things that have changed since the last time you ran it. So, the initial backup pulls everything, but after that, each time it runs, only the small number of altered files are transmitted over the network. This requires rsync software on both the source and destination machines, as they communicate to decide whether a file needs to be transmitted, and this is where my problem is.
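As a rough illustration (the real script I use appears later; the username, host, and paths here are just placeholders), a run looks something like this:

rsync --recursive -t --stats -e 'ssh' user@yourdomain.com:httpdocs/ "/Volumes/Backup Disk/test/"

Run it once and everything gets transferred; run it again right away and the stats show almost nothing being sent.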

I re-imaged the server using a 1&1-supplied image that installs a Fedora distribution, all set up to use Plesk and the other standard bits that 1&1 provides. Unfortunately, rsync is not one of those bits, for reasons that are not clear. It is possible that 1&1 wants you to buy backup services from them instead. So, you need to install rsync on the server.

The “Fedora way” of installing software is to use a tool called yum, which “automatically computes dependencies and figures out what things should occur to install packages”. This tool is included in the 1&1 Fedora image, so the following command (as root) should do the trick:

yum install rsync

And here is where things get ugly. While the yum program is present, its configuration is messed up:

# yum install rsync
Setting up Install Process
Setting up repositories
core                      100% |=========================| 1.1 kB    00:00
http://update.onlinehome-server.info/distribution/fedora/linux/core/updates/6/x86_64/repodata/repomd.xml: [Errno 12] Timeout: 
Trying other mirror.
Error: Cannot open/read repomd.xml file for repository: updates-released

At this point, I could just manually install rsync and call it a day, but I really would like yum to be working for some other reasons. The error indicates a URL timeout, which means that yum is likely trying to contact a site that isn’t actually there. So, an obvious thing to try is changing yum to point to a different server.

Yum uses two main configuration concepts: a /etc/yum.conf file and a number of files in a /etc/yum.repos.d directory. A quick grep onlinehome /etc/yum.* shows that the bad URL is in /etc/yum.conf. Looking at all the other repos in /etc/yum.repos.d, it isn’t clear that the two repos in /etc/yum.conf are even needed. They appear to be 1&1 specific, pointing to a server that 1&1 doesn’t seem to be paying attention to. Certainly, for rsync, they are probably not needed. So, let’s try telling yum to ignore those two. According to the config file, they are named “base” and “updates-released”. A look at the man page suggests trying this:

yum clean all
yum --disablerepo=updates-released,base install rsync

This seems to work like a charm. So, the source for the backup should be ready to go.

There may also be another solution. The update server that 1&1 uses is inside the same firewall as the root server, so it can’t be seen from the internet. It also appears that, by default, the root server can only reach the update server using ftp, not http. This is why yum times out when trying to connect to it. It could be that altering the config to use ftp URLs would work.
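I haven’t tested this, but if the grep above is right about where the URLs live, the change would be something like the following (it keeps a backup of the original file; the exact hostname in your image may differ):

sed -i.bak 's|http://update.onlinehome-server.info|ftp://update.onlinehome-server.info|g' /etc/yum.conf

If that works, the two 1&1 repos could stay enabled instead of being skipped.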

I have no idea why the install images provided by 1&1 are configured in such a way that they don’t actually function. I’ve sent them mail about this, but they have not replied.

The Destination

The destination of the backup information is a disk on my Mac, which is running Leopard. Rsync comes with Mac OS X, so it should already be ready to go. I have set up a “webbackup” script, tailored to the specific sites I want backed up, and I was running this at noon each day as a cron job.

Or, I was. Until I installed Leopard.

Unbeknownst to me, doing an “upgrade” install of Leopard empties out the crontabs of all users, stashing copies in /var/cron/tabs. This deactivates your jobs without warning. This means that the backups I thought I had hadn’t actually been updated in several weeks. Fortunately, I managed to suck down a copy of my web folders and the MySQL data folder the hard way (see “An Aside”, below) before re-imaging everything.
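If the same thing happens to you, and your old crontab really is sitting in /var/cron/tabs, putting it back is quick (you need root to read that directory; substitute your own short username):

sudo cat /var/cron/tabs/yourusername | crontab -
crontab -l

The second command just confirms that the jobs are back.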

Anyway, my backup script looks similar to the following:

#!/bin/bash

# The full path to the place on the local machine that will hold the backup
DEST="/Volumes/Backup Disk"

# The full path to the place on the local machine that will hold database backups
SQLDEST=${DEST}/sql

# The name of the target domain. This name is used both to connect to the
# target server, as well as the name of a directory created in ${DEST} to
# hold the backup data.
DOMAIN=yourdomain.com

# The username and server that are used to log into the target machine.
USER=user@${DOMAIN}

# The path to the directory on the target that will get echoed to the local.
# If you use a relative path here (no starting slash), it will be relative to 
# the home directory on the target machine. So, if you leave this empty,
# if will suck down the whole target directory. You can also use absolute
# paths.
USERPATH=

mkdir "${DEST}"
mkdir "${DEST}/${DOMAIN}"
mkdir "${SQLDEST}"

/usr/bin/rsync --recursive -t --delete --ignore-errors -e 'ssh' ${USER}:${USERPATH} "${DEST}/${DOMAIN}/"

# For each database, do the following: dump it to a temporary file, then only
# replace the previous dump if the new one is non-empty (so a failed dump
# doesn't clobber a good one)
MYSQLDUMP="mysqldump --opt --quote-names -u dbuser --password=dbpassword database_name"
/usr/bin/ssh ${USER} "${MYSQLDUMP}" > "${SQLDEST}/tmp.sql"
if [ -s "${SQLDEST}/tmp.sql" ]; then
   mv "${SQLDEST}/tmp.sql" "${SQLDEST}/database_name.sql"
fi

Being intended for use on Macs, this script should work even for file paths containing spaces. It would use a lot fewer quote characters if you didn’t need to worry about that. You should be able to adjust this script to add additional database backups, extra domains, etc.
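For instance, backing up a second database is just a matter of repeating the last block of the script with different names (the database name here is a placeholder):

# Second database
MYSQLDUMP="mysqldump --opt --quote-names -u dbuser --password=dbpassword other_database"
/usr/bin/ssh ${USER} "${MYSQLDUMP}" > "${SQLDEST}/tmp.sql"
if [ -s "${SQLDEST}/tmp.sql" ]; then
   mv "${SQLDEST}/tmp.sql" "${SQLDEST}/other_database.sql"
fi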

This script uses ssh, as does rsync. So, when you run it, most likely you will get asked to enter your password several times. This is irritating and, if the idea is to have this happen automatically, problematic.

It is possible to set up ssh such that keys can be shared between the target and local machines, using them for validation instead of a password. This is less secure, because any ssh connection from the same local user to that user/target combination will automatically connect without a password. If you are away from your machine while logged in, this can be a bad breach.

I create a special “backup” user on my Mac to do this kind of thing. This user has limited rights to the rest of the machine, and serves only the purpose of backing up stuff. Since I am almost never logged in as this user, it minimizes the threat of me accidentally leaving the machine logged in as “backup”.
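Roughly, the key setup looks like this, done on the Mac while logged in as that backup user (this assumes the target allows key-based logins; the key type and file names are just the defaults):

# Generate a key pair on the Mac (accept the default file, empty passphrase)
ssh-keygen -t rsa

# Append the public key to the authorized keys on the target
cat ~/.ssh/id_rsa.pub | ssh user@yourdomain.com 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'

After this, an ssh user@yourdomain.com from that account should connect without prompting for a password.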

Once this is done, try running the webbackup script from the local machine by hand a few times. Once it works the way you want, put the script somewhere (referred to here as /path/to/webbackup). To add it into cron, you need to add an entry to your crontab. Log into the local machine account you will use for backing up and get the crontab into editable mode using crontab -e. The entry I use runs the backup every day at noon and looks something like this:

0 12 * * * /path/to/webbackup >> /path/to/webbackuplog.txt 2>&1

An Aside

I mentioned at the start that I tried using 1&1’s recovery tools. These boot the system from a minimal image, rather than the OS on the box, allowing you to rummage around the box. This allowed me to suck the data that hadn’t been backed up off the machine before I re-imaged it, which saved my butt. Doing this requires that you manually mount the machine’s disks. They provide instructions on how to do this but, as of the writing of this post, they are out of date. Their servers now use a RAID in mirrored mode (a.k.a. RAID 1), which can’t be mounted following their instructions. Following their document, your mount commands return errors saying the device is “already mounted or /mnt busy”. This error message is even semi-true. What seems to be happening is that the RAID is marking these drives as “in use”, but the whole RAID is not mounted. So, you need to mount the RAID. This is similar to their instructions, but uses different devices. A forum entry suggested the command cat /proc/mdstat to display the names of the RAID devices. In my case, these were /dev/md1 and the like. It turned out that these were set up with similar mappings to those described in the 1&1 instructions, so similar mount commands worked. The file systems were also autodetected, which helps:

rescue:~> mkdir /mnt/raid
rescue:~> mount /dev/md1 /mnt/raid
rescue:~> mount /dev/hda5 /mnt/raid/usr 
rescue:~> mount /dev/hda6 /mnt/raid/home 
rescue:~> mount /dev/hda7 /mnt/raid/var
rescue:~> ls /mnt/raid/var/www/vhosts

Once you have these drives mounted, you should be able to use scp to suck the data you need off the machine, at the very least. Ideally, you should also be able to alter whatever files caused the problem that necessitated the recovery tools. In my case, the problem seemed to occur out of the blue, not from messing with config files or something, so after a few attempts to figure out what was going on, I just nuked the system.
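From the Mac side, that pull looked roughly like this (the server address and destination path are placeholders for my actual ones, and this assumes the rescue system lets you connect as root over ssh):

mkdir -p "/Volumes/Backup Disk/emergency"
scp -r root@your-server-ip:/mnt/raid/var/www/vhosts "/Volumes/Backup Disk/emergency/"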

iCrushYourHead

When a coworker was recently comparing experiences with the iPod Touch, he mentioned how grateful he was to the Kids in the Hall, as emulating them had trained his finger muscles to use the device’s zoom feature. For the uninitiated, zooming in on the iPod Touch involves touching two fingers to the screen and spreading them apart, while zooming out involves pinching them together. This is basically the same motion used in a series of Kids sketches involving a semi-crazy guy saying “I’m crushing your head” while making a similar motion from a distance, like this attempt to crush the head of a statue of Buddha:

Crushing Buddha's head

All this head crushing and its relation to the iPod Touch UI got me thinking about the perfect game for the device: iCrushYourHead. The idea would be that pictures of people would randomly drift across the screen at various speeds and, using the “pinching” UI gesture, you would have to crush their heads as they passed. The crushing would be animated with a cheesy accordion-fold type effect. As the game progressed, they would drift by faster and faster, and more would be on screen at once. Naturally, you’d need to be able to import photos of people who deserve head crushing.

The same coworker, hearing this idea, suggested another game, this one for kids. It is a port of those kids’ books where the page is divided into three sections, one for the head, one for the body, and one for the legs, and the kid mixes and matches the parts to make fun combinations. It would make use of another cool UI trick on the Touch, the sliding scroll gesture where you can kind of “throw” a section of the screen and it will scroll with a sort of natural deceleration. It’s hard to explain if you haven’t used the iPod Touch, but it is very natural. In the case of this game, the sections would act a bit like the wheels of a slot machine that you could spin independently with varying degrees of force. The number of choices for the body parts could also be vastly larger than a physical book would allow.

Feel free to send me royalty checks if you build these games.