Archive for March, 2006

The Virtues Of Wiki

Sunday, March 5th, 2006

For the longest time, I just didn’t “get” the Wiki phenomenon. I knew that supposedly it was a simplified syntax for content creation and had something to do with simpler collaborative works… but I just didn’t see the real value of it.

However, with the recent move to using Trac for project management, I’ve been starting to get a lot of mileage out of the Wiki portion of the application and I’m really liking it! I see now the two huge benefits of using a Wiki, even though they should have been obvious from the beginning.

First, it’s fast and easy. I know I just said that above, but the implication is huge. It removes the demotivating barrier to updating content that is simply “this is going to take a bit”. It becomes so easy to modify a page, there’s just no reason not to.

Second, collaboration is easy. Once again I realize that I said that above, but the key is that it’s not just easy… it’s trivial to allow multiple people to edit the same document without much risk of losing content or progress (built-in historical diffs are standard fare). No worries about shared SVN access, just live updates of whatever is needed.

This has been so amazingly productive, that I’ve realized that I really need to move all of my websites to this format. I still want to use WordPress for blogging, but all of my “project” pages should be in a wiki.

The only catch? Well, the traditional CamelCase for making wiki links is a little ugly, but more importantly the URIs for content generated by a wiki are also a little ugly. That is probably the most important part for me. When you generate a lot of content that isn’t updated often, sensible URIs are critical. I’m going to be looking over MoinMoin now to see if it has the features that I want for my website: the ability to include external pages and customizable URIs. We shall see.

Spring Break Tickets Purchased

Saturday, March 4th, 2006

I have been postponing purchasing tickets for Spring Break in the hope that Dave would be joining me. Unfortunately the funds didn’t come through and so I’ll be making the trip to revel in the St. Patrick’s Day festivities in Boston solo again.

At first tonight I had a little scare, when my previously scheduled itinerary spiked by over a hundred dollars. However, thanks to some itinerary adjustments I was able to get the round trip down to an acceptable rate (albeit at the price of some very early morning flights).

Without any further ado, here’s the flight plan:

  • Outgoing - Thursday, March 16, 2006
    • United Airlines 772
      • Depart: 8:02am from Kansas City, MO (MCI)
      • Arrive: 9:29am in Chicago, IL (ORD)
    • United Airlines 532
      • Depart: 10:55am from Chicago, IL (ORD)
      • Arrive: 2:21pm in Boston, MA (BOS)
  • Return - Thursday, March 23, 2006
    • United Airlines 533
      • Depart: 10:00am from Boston, MA (BOS)
      • Arrive: 11:44am in Chicago, IL (ORD)
    • United Airlines 5806 (United Express/Skywest)
      • Depart: 12:40pm from Chicago, IL (ORD)
      • Arrive: 2:13pm in Kansas City, MO (MCI)

Not a bad layout overall. This will be the first time I haven’t tried to squeeze two weekends into the trip, but unfortunately it made a huge difference in my ticket price (over two hundred dollars!). So I’ll be returning early this year to have a final weekend in Manhattan. (Maybe I’ll get back in time for the Copus show, added bonus!)

I’m bummed that I’ll be taking the trip by myself, but really it was a bit of a security blanket now that Leary is gone. However, now that I’ve resolved to go anyway I’m excited. Especially to see my friends that I met through Leary all these years and spend some time in the People’s Republik on my laptop drinking Woo Woos. :)

My Mom Buys a “Real” Computer

Saturday, March 4th, 2006

Yesterday was a pretty proud moment for me when I typed in the purchasing information on Apple.com for an Apple Certified Refurbished 12-inch iBook G4 for my Mom.  The specifications on the laptop are nice.  It features a 1 GHz G4 processor, 256 megs of RAM, a 30 GB hard drive, the combo drive, and more.

Not only is this a nice laptop that is going to serve my Mom’s needs fantastically, this is also the first time that my mom has set aside a significant amount of money for a computer.  Also, this is the culmination of a recent trend that has my Mom “coming up to speed” technologically.  Of course, this is a delight for me, because this means it becomes that much easier to keep in touch with my Mom than it has ever been since I’ve moved out.

With most of my friends, I take for granted the ability to SMS (text message) them whenever I feel like it.  For a little over a year now, my Dad has been up to par with phone technology and I’ve been able to text message him as well.  Adding my mom to the mix just makes things even more satisfying.

Besides the communication angle though, there’s something else I’m feeling that’s kind of unexpected.  Since around middle school, I’ve been spending a large portion of my time at a computer.  The personal computer has evolved from half toy, half frustration machine into the modern day Swiss Army Knife.  How well people take advantage of computers in everyday life is a very common difference between the most successful in our society and those trailing behind.  Today, nearly the whole of human knowledge has been posted online and indexed by Google.  Today, information relevant to day-to-day life that was previously unattainable by the average person is literally at every connected person’s fingertips.  In about six days, my Mom is going to have all of those tools and all of that information available to her as well.

Really, this purchase is an investment in a toolkit that can easily make a significant improvement in quality of life.  It also is going to present a lot of common ground between my Mom and me that we can discuss as I help her best utilize that toolkit and get done whatever it is that she wants to get done.  I’m really looking forward to it.

How To Fix it When DreamHost Brings Down the CPU Usage Hammer

Saturday, March 4th, 2006

Last night I found an interesting email in my Gmail account. It seems that DreamHost had moved me from my usual server and put me on an “evaluation” server. Apparently I had been using too many CPU minutes on the shared server. And while this is certainly a very reasonable resource for DreamHost to be monitoring, I am kind of bummed that I didn’t get a warning message of some kind before I was moved. (Although, it’s certainly possible that I did get a warning message and missed it somehow…)

Step One: Make Some Guesses

So I’ve been moved to a “limbo” server and directed to a FAQ laying out exactly what’s going on. I’m certainly not looking to be any sort of troublemaker on the server, and to the best of my knowledge I don’t have anything running on my website/shell that would generate very much CPU time, but I immediately have some contenders.

My first guess (naturally) is the most recent addition to my website. I’ve installed two Trac instances for handling the Bunker Management System project and the Late-Night at Nichols: LAN Party project. They run in FastCGI mode using SQLite, so they were my first suspects.

My second suspect was the meeting monitoring software that I wrote for Mr. Plumb. Though that has been running “as-is” for a couple months now and I wouldn’t expect any change in CPU usage.

Step Two: Check the CPU Usage Logs

Unfortunately I wasn’t able to handle the situation immediately, because I needed to wait for a while for the resource consumption logs to be created for my user so that I could analyze what’s going on. But this morning when I hopped on to check things out, I was surprised by the results. The file at ~/logs/resources/bradshaw.sa.analyzed.0 had what I was looking for:

Process               CPU seconds      user   machine   count  average
php.cgi                 7415.0300   99.696%   30.896%    8961    0.827
scrape.py                 11.3800    0.153%    0.047%      53    0.215
calendar_copy.p            3.5800    0.048%    0.015%      79    0.045
trac.fcgi                  2.8200    0.038%    0.012%      13    0.217
trac.fcgi                  2.6100    0.035%    0.011%      33    0.079
notify.py                  1.9100    0.026%    0.008%      53    0.036
wget                       0.1200    0.002%    0.001%      13    0.009
bkms.fcgi                  0.0900    0.001%    0.000%      13    0.007
bash                       0.0400    0.001%    0.000%       1    0.040
scrape.py                  0.0200    0.000%    0.000%     159    0.000
sshd                       0.0200    0.000%    0.000%       1    0.020
ls                         0.0200    0.000%    0.000%       7    0.003
notify.py                  0.0100    0.000%    0.000%     159    0.000
----------------------------------------------------------------------
Total:                  7437.6500  100.000%   30.990%    9545
Average per day:        7437.6500    1 days
CPU percentage assumes 24000 cpu seconds per day total.

The php.cgi line at the top is clearly the one causing the issue with too much CPU usage. This isn’t exactly a solved case, however, only a great lead. First, this knocked out my original guesses of Mr. Plumb’s notifier and the Trac installations. The lines for scrape.py and notify.py are the meeting notifier software, and summed they only come to 13.32 seconds. The trac.fcgi lines handle the Trac installations, and summed they come to 5.43 seconds. Much less than I expected!

But that problem line, the obvious over-user of precious cycles, is php.cgi. That doesn’t point to a specific application; instead, it tells me that some PHP pages executing in CGI mode are using a large number of CPU cycles. Odd, I don’t have many of those.
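As a quick sanity check on those sums, here is a minimal sketch that groups the report rows by process name and adds up the CPU-seconds column. The rows are copied from the report above; the function name and the parsing approach are just one way to do it, not anything DreamHost provides.

```python
from collections import defaultdict

# Rows copied from the resource report above: process name, then CPU seconds.
REPORT = """\
php.cgi                 7415.0300   99.696%   30.896%    8961    0.827
scrape.py                 11.3800    0.153%    0.047%      53    0.215
trac.fcgi                  2.8200    0.038%    0.012%      13    0.217
trac.fcgi                  2.6100    0.035%    0.011%      33    0.079
notify.py                  1.9100    0.026%    0.008%      53    0.036
scrape.py                  0.0200    0.000%    0.000%     159    0.000
notify.py                  0.0100    0.000%    0.000%     159    0.000
"""


def cpu_seconds_by_process(report_text):
    """Sum the CPU-seconds column (second field) for each process name."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        totals[fields[0]] += float(fields[1])
    return dict(totals)


totals = cpu_seconds_by_process(REPORT)
notifier = totals["scrape.py"] + totals["notify.py"]  # meeting notifier scripts
trac = totals["trac.fcgi"]                            # both Trac instances
print(f"notifier: {notifier:.2f}s  trac: {trac:.2f}s  php.cgi: {totals['php.cgi']:.2f}s")
```

Grouping by name matters here because the same script shows up on more than one row of the report.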

Step Three: If We Don’t Know What the Problem Is by Now, Make New Guesses

Well, since my assumptions failed me, it was time to try and think of any PHP CGI pages that I have running that do anything significant. This was a little harder for me, because almost all (all?) of my personal projects run in Python, not PHP. So my list of possible culprits ended up being pretty short: WordPress, Gallery2, and PHP-Calendar.

But if I’ve already checked the CPU usage logs, how will I make any progress? I figured I had two options. First, the CPU Minutes FAQ on the DreamHost Wiki lays out the methodology to make all of one’s PHP not run as CGI and move its usage into Apache instead. (Hopefully by process of elimination it would reveal the answer.)

Step Four: Check the Web Server Logs

That’s a little much for me at this point, so I opted for option two: check the web server logs to see what part of the site is getting hit most. Here is where I found my golden goose.

4378 26.43% Mar/ 4/06 12:45 AM /ical/day.php
3316 22.93% Mar/ 4/06 12:46 AM /ical/week.php
3164 28.51% Mar/ 4/06 12:46 AM /ical/month.php
950 2.34% Mar/ 4/06 12:44 AM /ical/print.php

Those are the top four lines of my Request Report. Bah! For some reason my PHP-Calendar installation is getting jackhammered like downtown Boston. Now to figure out why.

Checking the Host report shows this important line:

12259 74.07% 66.249.65.11

Jinkies! 74% of my traffic is requested by one IP address! But who the heck is it? My guesses were either myself, myself at work, or a search engine. Survey says?

[bradshaw@limbo-spunky3 resources] $ host 66.249.65.11
Name: crawl-66-249-65-11.googlebot.com
Address: 66.249.65.11

It’s Google. I’ve heard of this before. It turns out that virtually all calendaring applications have an infinite set of links (there is always another day, week, or month to click through to) that are easy for Googlebot to follow. So my PHP-Calendar is a labyrinth for Google’s web crawler, and is probably hogging CPU resources at the same time.
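Both reports used in this step (requests per page and requests per host) boil down to counting fields in the access log. Here is a rough sketch of that tally, assuming logs in Apache common log format. The sample lines, IP addresses, and function names are invented for illustration and are not taken from my real logs.

```python
from collections import Counter

# Invented sample lines in Apache common log format, for illustration only.
SAMPLE_LOG = """\
192.0.2.10 - - [04/Mar/2006:00:44:01 -0800] "GET /ical/day.php?date=20060304 HTTP/1.1" 200 5120
192.0.2.10 - - [04/Mar/2006:00:44:03 -0800] "GET /ical/day.php?date=20060305 HTTP/1.1" 200 5098
192.0.2.10 - - [04/Mar/2006:00:44:05 -0800] "GET /ical/week.php?date=20060304 HTTP/1.1" 200 9120
198.51.100.7 - - [04/Mar/2006:00:45:10 -0800] "GET /index.php HTTP/1.1" 200 20480
"""


def parse_line(line):
    """Return (host, path without query string), or None for a malformed line."""
    try:
        host = line.split()[0]
        request = line.split('"')[1]             # e.g. 'GET /ical/day.php?x=1 HTTP/1.1'
        path = request.split()[1].split("?")[0]  # drop the query string
    except IndexError:
        return None
    return host, path


def tally(log_text):
    """Count requests per path and per client host."""
    paths, hosts = Counter(), Counter()
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        host, path = parsed
        hosts[host] += 1
        paths[path] += 1
    return paths, hosts


paths, hosts = tally(SAMPLE_LOG)
top_host, top_hits = hosts.most_common(1)[0]
share = 100 * top_hits / sum(hosts.values())
print(paths.most_common(2))
print(f"{top_host} made {share:.0f}% of all requests")
```

Stripping the query string before counting is the important part for a calendar: every date gets its own URL, so without that step the crawler's hits would be spread over thousands of distinct entries instead of piling up on a few scripts.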

Conclusion(ish) and Solution (probably)

Notice earlier I said “probably”. This is a classic case of correlation, and it’s very important to remember (in all facets of one’s life) correlation does not equal causation. But rationally I’m pretty sure that this is the problem.

So the solution? Well the correct solution is to actually have an appropriate robots.txt file to let Google know that my /ical folder really isn’t that interesting. However, I’m sort of lazy at the moment and haven’t been using that application effectively for a while now. So I just did a little chmod 000 ical and called it good.
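For the record, the less lazy fix would look something like the sketch below. The robots.txt rules are my assumption of what would be appropriate for the /ical folder (I have not deployed them), and the check uses Python's standard urllib.robotparser module to confirm that a well-behaved crawler would skip the calendar while still being allowed everywhere else.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: ask every crawler to stay out of /ical/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ical/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The calendar pages should be off limits, the rest of the site should not.
calendar_allowed = parser.can_fetch("Googlebot", "/ical/day.php")
index_allowed = parser.can_fetch("Googlebot", "/index.php")
print("calendar crawlable:", calendar_allowed)
print("index crawlable:", index_allowed)
```

Unlike chmod 000, this keeps the calendar usable for human visitors, but it only works for crawlers that choose to honor robots.txt.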

Hopefully here shortly my CPU usage will come well under the 50 to 60 minutes that DreamHost allocates for their shared servers. If not, I might have to get a little more involved tracking down PHP applications that might be causing issues. But for now, I think it’s handled.
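To put the numbers side by side, here is the arithmetic as a small sketch. The totals come from the report above; treating 60 minutes as the ceiling is my reading of the "50 to 60 minutes" figure, not an official DreamHost number.

```python
TOTAL_CPU_SECONDS = 7437.65     # "Total" row of the resource report
DAILY_BUDGET_SECONDS = 24000    # from the report footer
ASSUMED_LIMIT_MINUTES = 60      # upper end of the "50 to 60 minutes" range

cpu_minutes = TOTAL_CPU_SECONDS / 60
machine_share = 100 * TOTAL_CPU_SECONDS / DAILY_BUDGET_SECONDS
times_over = cpu_minutes / ASSUMED_LIMIT_MINUTES

print(f"{cpu_minutes:.1f} CPU minutes in one day")
print(f"{machine_share:.2f}% of the machine's daily budget")
print(f"{times_over:.1f}x the assumed limit")
```

The machine share works out to the same 30.99% shown on the report's Total row, which is a nice check that the footer's 24000 seconds is the figure the percentages are based on.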

Great Advice for the AWOL

Thursday, March 2nd, 2006

There have been multiple occasions where I’ve wrestled with going AWOL on classes and/or work.  It’s an idiosyncrasy I’ve longed to shed, but it keeps cropping back up on me now and then.

Linked from Planet GNOME, I’ve just read a great advisory article on just that occurrence.  I’m both relieved to hear that it’s a somewhat “common” problem and intrigued to hear some solid advice on the topic.