wiredfool

Archive for the 'Programming' Category

Bash/unix file renaming

Of course, it’s possible but it’s not quite as easy as saying mv *.txt *.html.

So that I remember this:
for i in *.txt; do mv -- "$i" "${i%.txt}.html"; done

references: here and here
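The same idea wrapped up as a reusable function, as a minimal sketch. The name rename_ext and its arguments are made up for illustration. It only touches the trailing extension, copes with spaces in file names, and does nothing when no file matches.

```shell
# rename_ext DIR OLD NEW: rename every DIR/*.OLD to DIR/*.NEW
# (rename_ext is a hypothetical helper name, not a standard tool)
rename_ext() {
    dir=$1
    old=$2
    new=$3
    for f in "$dir"/*."$old"; do
        # If nothing matched, the glob is left as a literal string; skip it.
        [ -e "$f" ] || continue
        # Strip only the trailing .OLD suffix, then append .NEW.
        mv -- "$f" "${f%."$old"}.$new"
    done
}
```

Usage would be something like: rename_ext . txt html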

No comments

Phantom Signups

I’ve seen a dramatic uptick in signups to this blog, from what look like junk email addresses at gmail, .ru, and other places. I’m a little confused, since there aren’t even attempts to spam me. The only thing I can think is that someone is building up a stash of WordPress logins for the next time there’s a SQL injection attack that can be performed by a logged-in user.

What I’d really like to do is add a field to the signup page, simply asking: Why?

But then, some robot would probably try to convince me that it’s human.

No comments

Crypto

Somehow, on top of all the other things that happened on vacation, something close to my worst sysadmin nightmare came up. A break in OpenSSL/SSH. It’s complicated, mission critical, and it can’t be kept away from the users, at least in the SSL case. I’d rate this a 4/5 in panic level. (a 5 would be a remote root hole in one of these services.)

Oh, wait, I haven’t talked about the vacation. Flying with a sick kid is not fun. Nor with 2. Staying in a hotel with 2 sick kids and 2 sick parents is even less fun. But it did get better after a few days.

Then, the random number generator in Debian’s OpenSSL package was found to be a little weak, so that the keys generated were extremely predictable. Trivially, even. Which means that any key generated using OpenSSL on those systems is suspect, any DSA key used on one of the systems is suspect, and everything needs to be updated quickly without locking myself out.

I did have a couple of things in my favor: while I was using DSA keys, they were generated on OS X, so they weren’t instantly bad. And I use IP address filtering on ssh where I can and fail2ban where I can’t, so attackers get either 0 or 5 chances to get in before their packets are dropped. Out of all of this, I think that there were 3 keys that didn’t need to be replaced, because they were PuTTY-generated RSA keys.

Issue #1: I’ve got enough different machines and images that I wanted to use something a little faster than one at a time to do the updating. It turns out that Capistrano is a good way to do that, but it doesn’t work on the stock OS X Tiger install, nor on my Ubuntu 6.06 machine. Eventually I figured out that it does work from MacPorts, but you have to compile a bunch of stuff, which is a little slow on a G4/1.2GHz.

Issue #2: For some reason, there’s one essential package, libssl0.9.8, that doesn’t update well on Debian without a terminal. It has a prompt for which services to restart, and will hang there if run from Capistrano. So, I had to log into all the servers and images to do the actual update.

At least Capistrano sped up the new key deployment, and will probably speed up things in the future, but for this operation, I don’t think that I netted any time savings.

No comments

Obscure gpg options

I can never remember this when I’m looking for it.

If you have a signing key (not the primary) and an encryption key (the default key) for gpg, then when you need to sign a new key, the command is:

gpg --default-key ###### --sign-key ######

No comments

Mail.app bugs

With the new WebKit that comes with Safari 3.1, I’ve noticed one really annoying Mail.app regression, and not the one that Daring Fireball found.

When pasting ASCII text into a plain text email, all formatting is lost, where by formatting I mean that whitespace is collapsed.

Quite often I paste in the results of SQL queries, and they’re typically aligned in columns with a monospaced font. It always worked until the upgrade; now I get everything on one line with different amounts of whitespace.

Pain in the ass really.

No comments

Wordpress 2.5

The upgrade instructions for this version are pretty much the same as any other version.

So, what that means is that you install it, then wait 1-4 days, then install an urgent security patch?

I’m probably going to give it a shot tonight. Dunno if I’ll regret it in the morning though.

No comments

Debian Integrity Checking

I needed to check a Debian install for corruption recently when a newly installed and configured server started crashing under heavy IO activity. These days, when Debian stable is kicking out kernel oopses, it’s most likely bad hardware, and if random weird stuff is happening, look at the memory first. And sure enough, they found bad memory in the server. Since bad memory can hose a lot of things, I wanted to check the installed packages to make sure that there wasn’t any latent corruption.

I’d done this before on Red Hat systems with an rpm command (rpm -Va verifies all installed packages against the manifest md5 sums), but dpkg doesn’t have an equivalent command.

But there is a package, debsums, that does exactly what I needed.

sudo apt-get install debsums
sudo debsums -l

Should show a listing of all the packages where there’s no hash on file.

debsums --generate=nocheck -sp /var/cache/apt/archives

Generates sums from the installed packages that are still in the cache.

Run sudo debsums -l again. It shouldn’t list any packages this time. If there are still some, you may need to redownload them with the command: sudo apt-get --reinstall -d install $(debsums -l), and then rerun the generate step.

Then: sudo debsums -ca should give a listing of all the installed files that differ from the checksums recorded for their packages.

** Warning, as the man page notes, this is more a check for corruption and not a substitute for a malicious activity checker like tripwire.

No comments

New DB Server

As will become somewhat obvious over the next few posts, I’ve been setting up new database servers, and have crossed a sort of a threshold. IO bandwidth completely outstrips the ability of a single core to do anything interesting or useful with it. I’ve never had a database server that was not IO bound, and to be fair, this one probably isn’t once you factor in all the cores. I’ve also never had an 8 core system before.

This is a dual quad-core server with low-power chips from Intel. The individual cores aren’t speed demons, but for the most part I’m looking at lots of parallel operations, so I’d rather have the parallelism than the individual core speed. The drive system, though, is an LSI MegaRAID with 512MB of battery-backed cache and 6 15K SAS drives in a RAID 10 configuration. It’s good for better than 200 MB/sec continuous reads or writes, and about 130 MB/sec per-character writes. 1000 seeks/second. (And 450 MB/sec reads on a 12-drive RAID 10, and 200 MB/sec single-character writes.) Fast. Nice for DB loads. Nice for just about anything that needs speed more than space.

However, the parallelism rears its ugly head. This is the wide finder problem that Tim Bray has been talking about.

Gzip only runs on a single core, and while its throughput varies, it’s in the range of 4MB/second/core. If we could use all the cores, we still couldn’t keep up with the drives, but at least we’d be able to zip large backups reasonably fast.
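One possible workaround, as a rough sketch rather than anything tested on this server: split the file into pieces, gzip the pieces in parallel, and concatenate the results. This relies on the fact that several gzip members joined end to end still form a valid gzip stream. The function name parallel_gzip, the 64m piece size, and the default of 8 jobs are all assumptions for illustration, and xargs -P is a GNU/BSD extension rather than POSIX.

```shell
# parallel_gzip SRC DST [CHUNK] [NJOBS]
# Hypothetical helper: split SRC into CHUNK-sized pieces, gzip the pieces
# NJOBS at a time, then concatenate the compressed pieces into DST.
parallel_gzip() {
    src=$1
    dst=$2
    chunk=${3:-64m}    # piece size, in split -b syntax (assumed default)
    njobs=${4:-8}      # assumed core count; adjust for the machine
    work=$(mktemp -d) || return 1

    # Pieces are named part.aa, part.ab, ... so a glob lists them in order.
    split -b "$chunk" "$src" "$work/part." || return 1

    # Keep up to $njobs gzip processes running at once.
    find "$work" -name 'part.*' -print0 | xargs -0 -n 1 -P "$njobs" gzip || return 1

    # gzip replaced each piece with piece.gz; join them in the same order.
    cat "$work"/part.*.gz > "$dst" || return 1
    rm -rf "$work"
}
```

The result decompresses with plain gunzip. The cost is extra temporary disk space for the pieces and a slightly worse compression ratio, since each piece is compressed independently.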

Ssh, one core. Though I’m seeing 45 MB/second/core there, so it should be nearly possible to saturate the drive system and the dual gig-ethernet with the combined processor power. One of the use cases on the server is to blat the whole database from one machine to the other using rsync when reestablishing the secondary for failover. 4x faster would be nice.

Database dumps. Database restores. Processor constrained again. Even on the old machine, a 3-year-old dual Opteron, dumps are processor constrained, at least when writing to a network or reading a TOAST table. This is annoying, since one non-negotiable bit of downtime is the time to dump from the old machine and load on the new one. It turns out that we can speed things up here to some extent by dumping each table on its own, with several in parallel. My tests are showing that for a database that’s dominated by 3 tables, the time to dump and reload becomes dominated by the time to dump and reload the largest table, rather than the entire database. Here that’s about a 25% savings, as there’s one massive table, one simply large, and quite a few under a hundred megs.
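A minimal sketch of the per-table approach, not the exact script used here: one pg_dump per table, all started in the background, then a wait for the lot. The function name dump_tables, the database name mydb, and the output directory are placeholders. Each pg_dump takes its own snapshot, so the result is only consistent if nothing is writing to the database, which fits the downtime scenario.

```shell
# Placeholders, override as needed: DB, OUTDIR and DUMP are assumptions.
DB=${DB:-mydb}
OUTDIR=${OUTDIR:-/tmp/dumps}
DUMP=${DUMP:-pg_dump}

# dump_tables TABLE...: dump each named table to OUTDIR/TABLE.sql in parallel.
dump_tables() {
    mkdir -p "$OUTDIR" || return 1
    pids=""
    for t in "$@"; do
        # -t selects a single table, -f names the output file.
        "$DUMP" -t "$t" -f "$OUTDIR/$t.sql" "$DB" &
        pids="$pids $!"
    done
    # Wait for every dump and remember whether any of them failed.
    status=0
    for p in $pids; do
        wait "$p" || status=1
    done
    return "$status"
}
```

Something like dump_tables massive_table large_table would then run for about as long as the slowest single table takes. The table names are made up.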

Recreating the indexes is another place where we redline one core, and hardly stress the drives. I think that I can probably manually split the index generation into separate threads, but I’d have to be careful to not split individual tables into different files.

These aren’t ordinary operations for a running database, but they are operations that tend to happen during down time when everyone’s looking at the clock and getting nervous. And it’s only going to get worse on the hardware side, since there’s no indication that anyone’s going to be rolling back to big honkin single cores. And the data’s certainly not getting smaller. So, it’s time for the software to get a little smarter.

No comments

Postgresql

If you happen to be upgrading postgresql through a dump/reload cycle and get lots of errors like “psql:db.dump:26268: invalid command \N”

It’s really simple. You probably haven’t loaded a schema yet, either by doing a data-only dump when you meant to do a full dump or by just forgetting to load the schema.

Also, at least in version 8.2, there’s a point in the schema dump where all the tables are in, but none of the constraints or indexes. That’s a real good place to split the file and do the data load.

No comments

EC2 Link Dump

These have been useful getting an ec2 instance running with some specially chosen software:

No comments
