All about the system administration and application development behind a local linux-based company
I just finished a presentation at the Niagara Frontier LUG on Xen virtualization and how it applies to a managed hosting infrastructure. Here is a copy of the presentation for anyone who’s interested (sorry, it’s MS powerpoint. Yes, I do appreciate the irony).
It goes over some of the pros and cons of virtualization in general, the different types of virtualization, strategies for achieving levels of availability, and finally, some Xen-specific configuration options and tips.
Edit: Mark asked me to clarify the benchmark parameters, so here’s an excerpt of an email that I sent to another reader about this.
as an FYI, the benchmarks were done with:
core 2 duo 1.8Ghz
single 7200RPM 2.5″ drive

The image was done as a .img file on the same partition as the DomU, and the
partition was a separate DOS partition on the same drive. The only VMWare test
I ran used an image file. All of the OS installs were cloned from the same directory tree
(stay tuned for a future post on converting a Xen image to vmware)

The Moodle benchmarks were a snapshot of a production moodle install,
where I copied the install to my test system, logged in and clicked a
few things, and replayed the session logs through jmeter several hundred
times with five concurrent threads.

The images benchmark is transfer speed of randomly selecting 1 of 2500
10kb-40kb images. 8 concurrent users and 1500 iterations.

The Mysql benchmark is the results from running all the tests in the
mysql benchmark suite.

Of note, the image file generally performed a little bit better than the
raw partition. This is counter to what the Xen documentation and common
sense would say, and I think a lot of it has to do with my pretty
limited tests. The image file was ending up in memory cache, whereas
the block device wasn’t. I doubt the same comparative performance would
play out in a production system where a lot more’s going on.
I often run into mail delivery issues (user n can’t send email to domain n.com) that are difficult to troubleshoot with just the info in the logs. One thing that sometimes yields useful info is telnetting directly to port 25 of the offending mail server. That way, if it’s rejecting messages you can see the exact reject message, and if it’s violating the protocol in some creative way you can see that too.
I just ran into a succinct protocol description at http://helpdesk.islandnet.com/pep/smtp.php
Here’s their transcript of a successful SMTP session. Be sure to notice the extra line break between the end of the headers and the beginning of the body.
telnet mail.islandnet.com 25
220 Islandnet.com ESMTP server ready
helo a.b.c
250 mail.islandnet.com Hello x [YOUR_IP_ADDRESS]
mail from:
250is syntactically correct
rcpt to:
250verified
data
354 Enter message, ending with “.” on a line by itself
From: Bugs Bunny
To: Daffy Duck
Subject: Loony Toons!

Hi there!
.
250 OK id=1778te-0009TT-00
quit
221 mail.islandnet.com closing connection
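If you do this often, the same conversation can be scripted. Here is a rough Python sketch using the standard smtplib module. It walks through HELO, MAIL FROM and RCPT TO, records each reply, and stops before DATA so nothing is actually sent. The function names, the default HELO name and the hostnames in the usage note are my own placeholders, not anything from the islandnet page.

```python
import smtplib


def classify_reply(code):
    """Map an SMTP reply code to its class, based on the first digit."""
    classes = {
        2: "success",
        3: "intermediate",       # e.g. 354 after DATA: server wants more input
        4: "temporary failure",  # worth retrying later
        5: "permanent failure",  # the server is rejecting the request
    }
    return classes.get(code // 100, "unknown")


def probe(host, sender, recipient, helo_name="a.b.c", timeout=10):
    """Send HELO / MAIL FROM / RCPT TO by hand and collect each reply.

    Stops before DATA, so no message is delivered.
    """
    replies = []
    smtp = smtplib.SMTP(host, 25, timeout=timeout)
    try:
        steps = (
            ("helo", lambda: smtp.helo(helo_name)),
            ("mail from", lambda: smtp.mail(sender)),
            ("rcpt to", lambda: smtp.rcpt(recipient)),
        )
        for name, call in steps:
            code, message = call()
            text = message.decode(errors="replace")
            replies.append((name, code, text, classify_reply(code)))
            if code >= 400:
                break  # the reject text is what we came for
    finally:
        smtp.close()
    return replies
```

Calling something like probe("mail.example.com", "me@example.org", "user@example.com") and printing the result shows the exact reject message from whichever step failed.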
Well, it’s all over with. I hope that everyone enjoyed the wild patching frenzy. Make sure you update your kernel to the CentOS-supported one at your earliest convenience. I’ll remove all the kernels except for the ones from the 53.1.4 tree soon to avoid confusion. Many thanks to everyone in the Redhat and CentOS teams for their quick and diligent responses.
From Centos-Announce
The following updated files have been uploaded and are currently
syncing to the mirrors:
i386:
kernel-2.6.18-53.1.13.el5.i686.rpm
kernel-devel-2.6.18-53.1.13.el5.i686.rpm
kernel-doc-2.6.18-53.1.13.el5.noarch.rpm
kernel-headers-2.6.18-53.1.13.el5.i386.rpm
kernel-PAE-2.6.18-53.1.13.el5.i686.rpm
kernel-PAE-devel-2.6.18-53.1.13.el5.i686.rpm
kernel-xen-2.6.18-53.1.13.el5.i686.rpm
kernel-xen-devel-2.6.18-53.1.13.el5.i686.rpm
Redhat has released updated RPMs for RHEL 5.1 uncharacteristically quickly, in recognition of the seriousness and internet coverage of the issue: RHSA-2008-0129. I expect we’ll see a release from centos soon as well. Of note, this release does not fix the nfs issues that were present in 2.6.18-53.1.6.
At the suggestion of a Centos mailing list member, I’ll be posting RPMs from the 2.6.18-53.1.4 release soon, for people who need to run the earlier version because of NFS issues.
I’ve built the following RPMs for RHEL 5 that fix the vmsplice() exploit in RHEL machines. They are built off of the 2.6.18-53.1.6.el kernel, with the upstream patch from kernel.org.
I’ve tested them on i686 and x86_64 machines, however be aware that they have not undergone extensive QA, so I’m not responsible if they blow up your machine. That said, I’m pretty confident that no one will have any problems with them, as they are literally a one-line difference.
Update: Reminder to install these with rpm -ivh and not rpm -Uvh. Otherwise you’ll remove your old kernels, which you may need to fall back to.
i686:
i686-PAE:
x86_64:
Source:
Xen, and several other RPMs are available at: erek.blumenthals.com/vmsplicekernels. Note that the PAE and Xen kernels are entirely untested.
As this kernel is an odd-numbered release, yum should pick up the official upstream patch as soon as it’s available, but in case they do their numbering differently or pick the same release number that I did, it’d be a good idea to double check that yum picks up the latest.
Let me know any experiences with this, especially any confirmations that it’s safe with PAE or Xen.
See http://www.milw0rm.com/exploits/5092 for proof of concept code.
I’ve verified this to work:
[erek@centosmachine src]$ uname -a
[erek@centosmachine src]$ ./exploit
-----------------------------------
Linux vmsplice Local Root Exploit
By qaaz
-----------------------------------
[+] mmap: 0x0 .. 0x1000
[+] page: 0x0
[+] page: 0x20
[+] mmap: 0x4000 .. 0x5000
[+] page: 0x4000
[+] page: 0x4020
[+] mmap: 0x1000 .. 0x2000
[+] page: 0x1000
[+] mmap: 0xb7fad000 .. 0xb7fdf000
[+] root
[root@centos5machine src]# whoami
root
[root@centos5machine src]#
Ubuntu, Centos 5, and most Fedoras seem to be vulnerable. Centos 4 is not. I’m recompiling Centos 5 and FC 3 kernel RPMs with the appropriate patches, and will post them here in an hour or two. These use the upstream kernel patch, and I’ll know soon whether it conflicts with any of the RHEL-specific code. I doubt it does, as it’s a one-line patch.
And that’s the sound of 1000 admins running home from their Sunday afternoons to patch their boxes, and the sound of 1000 cell phones going off as their bosses read about this.
Update: Compiler is still going, and I’m heading out. I’ll post the rpms in the morning.
Some open source applications (ex. Request Tracker) have excellent practices when it comes to their database schemas, but a surprising number of them do not.
This probably stems from many open source authors being programmers first and DBAs second, so the procedural logic is closer to their design decisions than the database schemas. The other day, when I was reading the installation manual for a popular web calendar system, I came across the following:
Next, create the database user account that will be used to access the database.
mysql --user=root mysql
mysql> GRANT ALL PRIVILEGES ON *.* TO webcalendar@localhost IDENTIFIED BY 'webcal01' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;
mysql> QUIT
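Giving a single web application root-level MySQL access (ALL PRIVILEGES on every database, plus the right to grant them to others) is a grave mistake, since the application only needs privileges on its own database. This shows a real lack of understanding of database administration.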
So, with this example fresh in my head, I’m going to go through some practices that I’ve found to be helpful for maintaining extensible database schemas. As usual, the examples will be tailored to MySQL >4.1 and PHP 5, but there’s no reason they couldn’t be adapted to other languages. I’ve optimized these examples for robustness and auditability, without a lot of regard for disk usage. Frankly disk space is quite cheap, and throwing away information is rarely the best way to go.
This is in no particular order:
If you’re building a CMS, when a user modifies a page, you can either add a new row for the modified page, or you can UPDATE the current row. The difference is that inserting a second row makes rolling back to the old page easy, and also allows you to generate a log straight from the database ("User 'y' modified page 'z'. Diff: …"). Of course, this makes the SELECTs a little bit uglier:
SELECT pagename, content, title FROM pages
WHERE pagename = "foo" ORDER BY timestamp DESC LIMIT 1
versus
SELECT pagename, content, title FROM pages
WHERE pagename = "foo"
However, the first option is auditable and rollback-ready, and with a few simple modifications could allow administrator approvals, etc. The second option requires the user who edits the page to get it right the first time.
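To make the insert-only approach concrete, here is a small sketch in Python, with an in-memory SQLite database standing in for MySQL. The table layout and function names are made up for illustration, and I order by an auto-incrementing id rather than a timestamp so that two edits in the same second can’t tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " pagename TEXT NOT NULL,"
    " title TEXT, content TEXT, editor TEXT)"
)


def save_page(pagename, title, content, editor):
    # Every edit is a new row; nothing is ever overwritten.
    conn.execute(
        "INSERT INTO pages (pagename, title, content, editor) VALUES (?, ?, ?, ?)",
        (pagename, title, content, editor),
    )


def current_page(pagename):
    # The newest revision is the live one.
    return conn.execute(
        "SELECT title, content FROM pages WHERE pagename = ? ORDER BY id DESC LIMIT 1",
        (pagename,),
    ).fetchone()


def rollback_page(pagename, editor):
    # Rolling back is just re-saving the previous revision as a new row,
    # so the bad edit stays in the history for auditing.
    rows = conn.execute(
        "SELECT title, content FROM pages WHERE pagename = ? ORDER BY id DESC LIMIT 2",
        (pagename,),
    ).fetchall()
    if len(rows) == 2:
        save_page(pagename, rows[1][0], rows[1][1], editor)


save_page("foo", "Foo", "first draft", "alice")
save_page("foo", "Foo", "oops, broken edit", "bob")
rollback_page("foo", "admin")
```

After the rollback the table holds three rows for the page, and the live one has the content of the first draft again.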
Similarly, if your goal is to remove a user, you can either DELETE that user’s row, or you can add a boolean Disabled field, and UPDATE them to Disabled instead. This way you don’t have to be nearly as careful with referential integrity, as the user’s primary key still exists, and thus any tables that reference that id will stay consistent.
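A quick sketch of the Disabled-flag idea, again in Python with SQLite as a stand-in. The users and posts tables are hypothetical, and the point is only that the referencing rows never become orphans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY, name TEXT,"
    " disabled INTEGER NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id),"
    " body TEXT)"
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO posts (user_id, body) VALUES (1, 'hello')")


def disable_user(user_id):
    # "Removing" a user only flips the flag; the primary key survives.
    conn.execute("UPDATE users SET disabled = 1 WHERE id = ?", (user_id,))


def active_users():
    return [row[0] for row in conn.execute("SELECT name FROM users WHERE disabled = 0")]


disable_user(1)

# Posts whose author no longer exists. With the flag approach this stays at zero.
orphans = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE user_id NOT IN (SELECT id FROM users)"
).fetchone()[0]
```

The cost is that every query listing users has to remember the disabled = 0 filter.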
If you add new tables and fields as your functionality grows, rather than modifying the older ones, you can roll back the application to the previous version, and it will still work as it always did, since all of the data it expects will be in the form it was. Over time this can result in somewhat bloated database schemas, but good documentation and cleanup can mitigate this.
SELECT * from foo;
may break your application when you add extra fields to the table, or if the order of the fields changes, if your code reads the columns by position. However:
SELECT field1, field2 from foo;
will continue to work even as the database schema evolves.
An alternate way of accomplishing the same design goal is to use a wildcard in the SQL but access the fields by name in your application code. For example:
$result = $mysqli->query('SELECT * from foo;');
$foo = $result->fetch_assoc();
do_stuff_with_fields1_and_2($foo['field1'], $foo['field2']);
This strategy results in even easier application extension, as you can immediately use new fields in your application logic without having to modify your SQL queries. The disadvantage is that you may be pulling more information out of your database than you plan to use, expending extra memory and loading down the DBMS.
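The same trade-off shows up in any language. Here is a rough Python sketch, with SQLite standing in for MySQL and the table and field names borrowed from the example above. Reading the row by name keeps working after a column is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be indexed by column name
conn.execute("CREATE TABLE foo (field1 TEXT, field2 TEXT)")
conn.execute("INSERT INTO foo VALUES ('a', 'b')")


def read_fields():
    # Wildcard in the SQL, explicit names in the application code.
    row = conn.execute("SELECT * FROM foo").fetchone()
    return row["field1"], row["field2"]


before = read_fields()

# The schema grows: the wildcard now returns three columns instead of two.
conn.execute("ALTER TABLE foo ADD COLUMN field3 TEXT")

after = read_fields()
```

Code that unpacked the row positionally into exactly two variables would have broken at the ALTER TABLE.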
Here are two examples of PHP code that perform the same function (Error handling not included):
$searchstringescaped = $mysqli->real_escape_string($searchstring);
// "mytable" is a placeholder table name
$result = $mysqli->query("SELECT foo, bar, bak FROM mytable WHERE bak LIKE '$searchstringescaped'");
while(list($foo, $bar, $bak) = $result->fetch_row()) {
    do_stuff_with_bar_and_bak($bar, $bak);
}
Or, with prepared statements:
// "mytable" is a placeholder table name
$stmt = $mysqli->prepare("SELECT foo, bar, bak FROM mytable WHERE bak LIKE ?");
$stmt->bind_param('s', $searchstring);
$stmt->execute();
$stmt->bind_result($foo, $bar, $bak);
while($stmt->fetch()) {
    do_stuff_with_bar_and_bak($bar, $bak);
}
While the first may have fewer lines, the second clearly separates the query logic from the parameters, which removes the risk of SQL injection through the bound value and of double escaping.
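For comparison, here is what the bound-parameter version looks like as a Python sketch, with SQLite standing in for MySQL. The items table and its rows are invented for the example. A classic injection string is passed in as the search term and is simply treated as text to match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (foo TEXT, bar TEXT, bak TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("1", "x", "apple"), ("2", "y", "banana")],
)


def search(pattern):
    # The query text is fixed; the pattern travels separately as a parameter.
    return conn.execute(
        "SELECT foo, bar, bak FROM items WHERE bak LIKE ?",
        (pattern,),
    ).fetchall()


# If this were pasted into the SQL string it would match every row.
hostile = "' OR '1'='1"
```

Here search("app%") finds the apple row, while search(hostile) finds nothing, because no row contains that literal text.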
(This requires InnoDB.)
Any database statement can fail. It can be because of programmer error, hardware failure, or any other host of issues. When executing a group of statements, it’s almost always better for all of them to fail than for only some of them to fail. For example, suppose that you want to transfer credits from one user’s account to another. The naive solution would be:
UPDATE users SET credits = credits + 1000 WHERE id = 1;
UPDATE users SET credits = credits - 1000 WHERE id = 2;
However, this code has two problems. First of all, if there were a power failure at precisely the right time, it would result in both users having the thousand credits. Second, for a short amount of time, the credits would be in both accounts, and a third client viewing the accounts between the two updates would see inconsistent data. The proper way to do this would be:
START TRANSACTION;
UPDATE users SET credits = credits + 1000 WHERE id = 1;
UPDATE users SET credits = credits - 1000 WHERE id = 2;
COMMIT;
Or, if you’re using PHP 5’s Mysqli extension, you can use the extension’s interface to transactions like:
$mysqli->autocommit(false);
//Run some mysqli queries
$mysqli->commit();
Finally, if something in the business logic causes you to change your mind about running the transaction, you can call a rollback, which will cancel the current transaction. Otherwise, commit it:
$mysqli->autocommit(false);
$mysqli->query('UPDATE users SET credits = credits + 1000 WHERE id = 1;');
$mysqli->query('UPDATE users SET credits = credits - 1000 WHERE id = 2;');
$result = $mysqli->query('SELECT credits from users WHERE id = 2');
list($giverbalance) = $result->fetch_row();
if($giverbalance < 0) {
    //Ooops, the giver now has a negative balance.
    $mysqli->rollback();
} else {
    $mysqli->commit();
}
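Here is the same all-or-nothing transfer as a runnable Python sketch, with SQLite standing in for MySQL/InnoDB. The starting balances and the transfer function are my own invention. Either both updates are committed, or neither is.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN / COMMIT / ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, credits INTEGER NOT NULL)")
conn.execute("INSERT INTO users VALUES (1, 0), (2, 500)")


def transfer(giver, receiver, amount):
    """Move credits between users; returns True if the transfer was committed."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE users SET credits = credits + ? WHERE id = ?", (amount, receiver)
        )
        conn.execute(
            "UPDATE users SET credits = credits - ? WHERE id = ?", (amount, giver)
        )
        (balance,) = conn.execute(
            "SELECT credits FROM users WHERE id = ?", (giver,)
        ).fetchone()
        if balance < 0:
            raise ValueError("giver would end up with a negative balance")
        conn.execute("COMMIT")
        return True
    except Exception:
        # Undo both updates, whatever went wrong.
        conn.execute("ROLLBACK")
        return False


def balances():
    return dict(conn.execute("SELECT id, credits FROM users"))
```

With user 2 holding 500 credits, transfer(2, 1, 1000) is rolled back and leaves both balances untouched, while transfer(2, 1, 200) goes through.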
I hope that this helps, in a small way, to further the practices of database developers, and I’d love to hear any more suggestions or comments people have on the topic.
Update (Feb 02 2007): Fixed several grammatical errors
There’s a Linux worm currently spreading rapidly that exploits web servers. Finjan estimates that about 10,000 servers are affected. Nobody has confirmed how it’s getting root access, but once it is in, it installs a dynamic Apache module that randomly sends JavaScript code to clients. The JavaScript attempts to install Rbot, a malware suite, on computers that access the sites, using a host of exploits including ones found in Quicktime, Yahoo Messenger, and Windows Media Player.
An immediate way to test whether you’re affected is to try to create a directory with an entirely numeric name. If you get a file-not-found error, or the directory isn’t actually created, you’re infected. This is a bug in the rootkit, and there are some reports coming in that it’s already been fixed by the attackers. A more robust way to check for the exploit is to run the following command:
tcpdump -nAs 2048 src port 80 | grep "[a-zA-Z]\{5\}\.js'"
and if you see some lines printed, it means that your server is sending infected javascript files. If your web server is particularly low traffic, you may want to run:
ab -c 10 -n 100 http://www.yourdomain.com/somefile.html
This will generate some traffic on your web server, so that there are some requests for tcpdump to pick up on.
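If you’d rather check saved captures or log files than watch tcpdump live, the same pattern translates directly to Python. This is only a sketch, and the sample lines and the script name in them are made up.

```python
import re

# Same pattern as the grep above: five letters, ".js", then a single quote.
SUSPICIOUS = re.compile(r"[a-zA-Z]{5}\.js'")


def suspicious_lines(lines):
    """Return the lines that contain a five-letter .js name followed by a quote."""
    return [line for line in lines if SUSPICIOUS.search(line)]


sample = [
    "<script src='/static/app.js'></script>",
    "<script language='javascript' src='/qxmvt.js'></script>",  # made-up name
    "GET /index.html HTTP/1.1",
]

hits = suspicious_lines(sample)
```

Like the grep, this also matches legitimate names that end in five or more letters (jquery.js' matches on "query.js'"), so any hit still needs a look by hand.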
I’ll post more if I hear any news about the nature of the underlying vulnerability. In the meantime here’s some further reading:
I’ve recently discovered clusterssh, a tool that opens up many xterm sessions and binds them all to one keyboard input. I use it for updating my servers or reconfiguring them all in the same way. For example, since most of my boxes run the same OS version, they all need package updates at the same time, so after I get a flurry of “Update available” emails I have a quick look at an errata site to see what problems I’m fixing, and then I fire up clusterssh and run (for example) sudo yum update. Here’s a quick screenshot to show it in action:
[Screenshot: clusterssh in action]
It works with clusters of servers, so you’ll have a configuration file like (~/.csshrc):
clusters = web-servers all-servers special-servers
all-servers = erek@libra.blumenthals.com erek@aquarius.blumenthals.com erek@scorpio.blumenthals.com erek@webserver1.blumenthals.com erek@webserver2.blumenthals.com
web-servers = erek@webserver1.blumenthals.com erek@webserver2.blumenthals.com
special-servers = erek@libra.blumenthals.com erek@aquarius.blumenthals.com erek@scorpio.blumenthals.com
Obviously, it works best if you have ssh keys set up to all of your servers. Then you can just run cssh
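Once the cluster lists get long it’s easy to add a box to one cluster and forget the catch-all one. Here is a small Python sketch that reads lines in the format shown above and reports hosts missing from all-servers. It assumes simple "name = host host ..." lines and is not how clusterssh itself parses the file.

```python
def parse_clusters(text):
    """Parse 'name = value ...' lines into {cluster_name: [hosts]}."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.split()
    # The "clusters" line names which of the other entries are clusters.
    names = entries.pop("clusters", [])
    return {name: entries.get(name, []) for name in names}


def missing_from_all(clusters, catch_all="all-servers"):
    """Hosts that appear in some cluster but not in the catch-all one."""
    everything = set(clusters.get(catch_all, []))
    seen = {host for hosts in clusters.values() for host in hosts}
    return sorted(seen - everything)


SAMPLE = """\
clusters = web-servers all-servers
all-servers = erek@webserver1.blumenthals.com
web-servers = erek@webserver1.blumenthals.com erek@webserver2.blumenthals.com
"""

clusters = parse_clusters(SAMPLE)
forgotten = missing_from_all(clusters)
```

In the deliberately broken sample, webserver2 is in web-servers but was left out of all-servers, so it is the one host reported.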
By now it’s been reported all over the internet that Sun is buying MySQL AB for approximately $1 billion. This is either the largest or smallest news of the new year. As of 2005, MySQL’s revenue was $40 million, and according to Sun’s press release, they have about 400 employees. To me, that doesn’t seem to justify the $1 billion price tag, so Sun must be buying them for reasons other than straight return potential.
On the cynical side, they could be looking to build it up into a second Oracle, with a small free version and a purchase required for any of the newer enterprise features. I doubt this is the case, as Sun should recognize that much of MySQL’s appeal comes from its image as an open, free program with rock-solid support available. However, I wouldn’t be surprised if we see MySQL Enterprise’s base prices (currently $600-$4000/server/year or $40,000/year for a site license) go up a bit to help pay off Sun’s investment and close the gap with Oracle (~$40,000/CPU).
First of all, quite a few online discussions have centered around “Why not postgres?” I think that this is the easiest question to answer: Regardless of the technical merits of each system, MySQL is currently the web application leader. Furthermore, Postgres isn’t a company, so it can’t be bought. Sure, Sun could announce that they are putting serious development resources behind postgres and offering paid support options, or they could go the EnterpriseDB route and fork postgres into a commercial and commercially supported product, but neither of those gives them the same control of a stable and highly adopted database product that owning MySQL AB does. More importantly, neither gives them access to MySQL’s impressive customer portfolio.
By buying MySQL, Sun is showing that they’re interested in becoming more like Oracle and IBM, the enterprise consulting companies. Since many of the larger and data-intensive technology companies (Baidu, Google, Facebook, Ticketmaster, Dun & Bradstreet) use MySQL, Sun is buying its way into the who’s-who of technology, and into just the companies that wouldn’t have previously been Sun customers (none have a need for Java, they tend to use whitebox clusters, etc.). Now that Sun owns MySQL, there’ll be the opportunity to sell complete solutions to enterprises based upon the LAMP stack.
Several concerns come to mind: Since Sun has a vested interest in Solaris and their own hardware, optimizations and new features may come to Solaris and Sun storage systems before they make it to the other platforms, or we may end up with a whole set of features that only work with Solaris and/or on Sun hardware. Also, MySQL AB was a known quantity. We could count on them to provide a steady stream of new features for both the paid and unpaid versions, reasonably responsive unpaid support via mailing lists, and excellent (if verbose) documentation. With Sun, we can expect all these things and more to be provided to paying customers, but it remains to be seen what will happen to the community version.
I’m actually happy to see Sun emerging as one more of the behemoth consulting companies, as that was starting to become somewhat of a duopoly, but I just hope it doesn’t come at the expense of fixing something that wasn’t broken (MySQL AB).