eriksmartt.com/blog

    A simple shell script for grabbing the current temperature from the command line

    In case someone else needs it, here’s a simple shell script I use for grabbing the current weather conditions by U.S. zipcode using the Yahoo APIs:

    function weather {
      zipcode=$1
      if [ -n "$zipcode" ]; then
        lynx -dump "http://weather.yahooapis.com/forecastrss?p=$zipcode" | grep -i condition | awk -F' ' '{print $4 $5 $6}' | awk -F'<' '{print $1}' | sed 's/,/ /'
      else
        echo 'USAGE: weather <zipcode>'
      fi
    }
    

    The script uses lynx to grab the Yahoo RSS feed, piping the output to grep, which extracts the line containing the word ‘condition’. awk then pulls out some specific fields (delimited by spaces, then later by ‘<’), and sed converts the commas to spaces for prettier output. Obviously, the whole thing is fairly fragile, and changes to the RSS format (which happened maybe a month back) break the code. Checking the weather in Austin with the command: `weather 78701`, currently outputs: “Fair 67F”.
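
    Since the grep/awk pipeline depends on the exact layout of one line, a less fragile variant is to parse the feed as XML and read the condition from attributes. The following is only a sketch: the sample feed, the `yweather` namespace URI, and the attribute names (`text`, `temp`, `temperature`) are assumptions made for illustration, and `current_conditions` is a hypothetical helper, not part of the original script.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the yweather elements (illustrative).
YWEATHER_NS = "http://xml.weather.yahoo.com/ns/rss/1.0"

# Hand-written sample standing in for the downloaded feed (not real data).
SAMPLE_FEED = """<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0">
  <channel>
    <yweather:units temperature="F"/>
    <item>
      <yweather:condition text="Fair" temp="67"/>
    </item>
  </channel>
</rss>"""


def current_conditions(feed_xml):
    """Return a string such as 'Fair 67F', or None if no condition element exists."""
    root = ET.fromstring(feed_xml)
    condition = root.find("channel/item/{%s}condition" % YWEATHER_NS)
    if condition is None:
        return None
    units = root.find("channel/{%s}units" % YWEATHER_NS)
    unit = units.get("temperature", "") if units is not None else ""
    return "%s %s%s" % (condition.get("text"), condition.get("temp"), unit)


print(current_conditions(SAMPLE_FEED))
```

    Reading attributes by name means that reordering or adding fields in the feed would not break the lookup, which is the main weakness of the field-position approach.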




    Django “lorem ipsum” generator (and a new contrib.webdesign module)

    The Django Web Framework project just added a new contrib.webdesign module with an amazingly simple, but incredibly handy first feature: a lorem ipsum generator. The idea is that a project’s base templates can include generated lorem ipsum for testing layout and page flow, but inheriting templates can override the generated text once real content is available.

    The lorem tag is used like this (via the contrib.webdesign docs):

    • {% lorem %} will output the common “lorem ipsum” paragraph.
    • {% lorem 3 p %} will output the common “lorem ipsum” paragraph and two random paragraphs each wrapped in HTML <p> tags.
    • {% lorem 2 w random %} will output two random Latin words.

    In practice, you might do this:

    templates/template.html:

    
    {% load webdesign %}
    {# Block names must be unique within a template, so the <title> block gets its own name (head_title is an illustrative choice) #}
    <html>
      <head>
        <title>{% block head_title %}{% lorem 5 w %}{% endblock %}</title>
      </head>
      <body>
        <div class="article">
          <div class="article_title">{% block article_title %}{% lorem 5 w %}{% endblock %}</div>
          <div class="article_body">{% block article_body %}{% lorem 4 p %}{% endblock %}</div>
        </div>
      </body>
    </html>
    

    And then inherit when you’re ready:

    templates/article.html:

    
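    {% extends "template.html" %}
    
    {# An if tag wrapped around blocks is not evaluated in a child template, so the check goes inside each block #}
    {% block article_title %}{% if article %}{{ article.title }}{% else %}{{ block.super }}{% endif %}{% endblock %}
    {% block article_body %}{% if article %}{{ article.body }}{% else %}{{ block.super }}{% endif %}{% endblock %}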
    

    I used to paste lorem ipsum text directly into the main template (wrapped in block tags for overriding), but this new tag lets you skip the copy/paste routine. Very nice!




    Triggering a browser’s “Save As…” dialog using a custom Content-Type header

    My previous post, “Passing JSON via the X-JSON HTTP header with Django and Prototype”, contained an example of writing custom HTTP headers from a Django-based web application. Continuing with that theme, here’s another header trick that I use in one of my apps to force the browser’s “Save As…” dialog box when viewing a particular URL.

    The feature that I wanted was the ability to generate an XML file based on an HTTP GET request, but to have the browser open a “Save As…” dialog instead of attempting to render it (as would normally happen with XML in a modern browser). The solution is to exploit the fact that web browsers don’t try to render MIME types they don’t recognize. A sample implementation (written in Python for the Django Web Framework) follows:

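    from datetime import datetime
    
    from django.http import HttpResponse
    
    def save_as_xml(request):
        current_time = datetime.now()
    
        response = HttpResponse('PUT THE XML HERE')
        # A made-up MIME type keeps the browser from rendering the response
        response['Content-Type'] = 'application/x-generated-xml-backup'
        # Suggest a dated filename for the "Save As..." dialog
        response['Content-Disposition'] = 'attachment; filename=export.%s.xml' % current_time.strftime('%Y-%m-%d')
    
        return response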
    

    Setting the Content-Type header to a made-up type ensures that the browser will not attempt to render the file. The Content-disposition header provides the mechanism for suggesting the filename of the content to be saved on the viewer’s system. In this case, I’m using the standard `datetime` module to insert the date into the suggested filename.
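
    To make the two header values easy to check outside of Django, the same logic can be pulled into a plain function. This is a minimal sketch: `save_as_headers` and its `prefix`/`today` parameters are hypothetical names introduced here, not part of the original view.

```python
from datetime import date


def save_as_headers(prefix="export", today=None):
    """Build the two headers that trigger a "Save As..." dialog.

    `prefix` and `today` are illustrative parameters; passing `today`
    explicitly makes the dated filename predictable in tests.
    """
    today = today or date.today()
    filename = "%s.%s.xml" % (prefix, today.strftime("%Y-%m-%d"))
    return {
        # Made-up type, so the browser has no renderer for it.
        "Content-Type": "application/x-generated-xml-backup",
        # Suggested filename for the saved file.
        "Content-Disposition": "attachment; filename=%s" % filename,
    }


headers = save_as_headers(today=date(2007, 1, 15))
for name, value in sorted(headers.items()):
    print("%s: %s" % (name, value))
```

    In a view, each key/value pair would then be copied onto the response object the same way the sample above sets them one at a time.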




    Passing JSON via the X-JSON HTTP header with Django and Prototype

    One of the demo sites I was working on this week needed to pass a small amount of JSON back with its page results. There are a few ways to do this (I’d suggest the post “Loading Content with JSON” as a starting point if you’re looking for ideas), but for simplicity, I decided to take advantage of the automatic X-JSON HTTP header parsing feature in Prototype 1.5.0. (The Ajax.Request docs address this capability.)

    The sample code below demonstrates the use of the X-JSON header with a simple “sticky notes” web app. On the client side, the JavaScript is quite simple. The second argument to the onSuccess callback handler will be automatically initialized using the data in the X-JSON header:

    function display_note(id) {
        new Ajax.Request('/api/note/' + id + '/', {
            method: 'get',
            // `results` is populated by Prototype from the X-JSON header
            onSuccess: function(transport, results) {
                alert("Note(" + results['id'] + ") `" + results['title'] + "`: " + results['body']);
            }
        });
    }
    

    To handle this request, I’m using Django on the server with the following URL pattern:

    (r'^api/note/(?P<id>\d+)/$', 'views.get_note')
    

    The `get_note` method implementation looks like this: [NOTE: For production use, you'll want some exception handling, but I removed the error handling to simplify the example.]

    import cjson
    from django.http import HttpResponse
    # Note is the app's own model class; import it from wherever your models live.
    
    def get_note(request, id):
        # Fetch the Note from the DB:
        note = Note.objects.get(pk=id)
        # Create the response object (with some dummy text for now):
        response = HttpResponse('Check the X-JSON header.')
        # Manually set the X-JSON header using the JSON generated from the Note record:
        response['X-JSON'] = cjson.encode(note.__dict__)
        # Return the response object:
        return response
    

    If you’d like to use this technique on your own sites, there are a couple of points to remember:

    1. You can’t return an empty HTTP response, even when an X-JSON header is present. If the response body is empty, the browser will hang waiting for content to arrive.
    2. The X-JSON header should only be used for small payloads. Don’t stuff more than 8 KB into your headers. If you’re sending more than that, move the JSON to the body of the response.
    3. The cjson and simplejson encoders don’t handle Django DateTime fields. For objects with DateTime fields, write an alternate method for converting the object into a dictionary before passing it to the json encoder.
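
    The third point can be handled with a small conversion step before encoding. The sketch below uses the standard library’s `json` module in place of cjson or simplejson, and a plain dictionary in place of a model instance; `to_jsonable_dict` and the sample field names are hypothetical.

```python
import json
from datetime import date, datetime


def to_jsonable_dict(fields):
    """Copy a dict of field values into a form a JSON encoder accepts.

    Date and datetime values become ISO 8601 strings, and keys starting
    with an underscore (internal bookkeeping attributes) are dropped.
    """
    result = {}
    for key, value in fields.items():
        if key.startswith("_"):
            continue
        if isinstance(value, (datetime, date)):
            value = value.isoformat()
        result[key] = value
    return result


# Stand-in for something like note.__dict__ (illustrative values only).
note_fields = {
    "id": 7,
    "title": "Groceries",
    "body": "Buy milk",
    "created": datetime(2007, 2, 1, 9, 30),
    "_cache": object(),
}

payload = json.dumps(to_jsonable_dict(note_fields), sort_keys=True)
print(payload)
```

    Checking `len(payload)` before deciding between the header and the response body is one way to respect the size limit from the second point.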




    Moving my blog from WordPress to Django; Part 2: Migrating the data

    In Part 1 of this series, I described some of the motivation, and the components being used to build a new blog for myself. In this (lengthy) post, I’ll address the solution I used to move my content archives from WordPress to the new app.

    Installing new blog software is generally easy, but if you have legacy content that you need to preserve, the ability to move content between systems becomes critical. Fortunately, it’s quite common for popular software to provide import/export features; having good tools to migrate content reduces switching costs, making it easy to try new software without fear of content lock-in. Unfortunately, with a home-grown blog platform, these tools need to be written from scratch.

    For my soon-to-be-launched Django-based blog, importing content from my WordPress installation was an early priority — there’s only so much testing you can do with lorem ipsum posts. In tackling this content migration, I considered the following four options:

    1. Support the legacy database schema.
    2. Export and import at the database level (i.e., SQL dump, some text file munching, and SQL imports).
    3. Write an adapter layer to pull from the existing database and insert into the new database.
    4. Export the content into a neutral format, and import from that format.

    Regardless of the approach taken, I also added one important requirement: The import solution had to be so easy (and easily repeatable) that I would never hesitate to make a change to the database models when needed. Naturally, it’s nice to freeze the model once you have a stable release, but during development, even the database model should be open to agile iteration. I’ve worked on systems where every model change meant writing accompanying SQL scripts to alter the tables, and while effective, it wastes time, and I wanted the option to simply export, wipe the database clean, and re-import whenever needed. (And preferably by simply running a single script.)

    I finally settled on option #4, to export into a neutral format (XML) and write an importer for that format; however, I did briefly consider each of the above options:

    1) Supporting the legacy (WordPress) database schema sounds nice on the surface. This would allow the two systems to share the same database (thus eliminating the need to migrate content at all), while making it extremely easy to run the systems side-by-side (perhaps even balancing traffic between the two to test the deployment). The downside, though, is that the custom application would need to maintain the data relationships that WordPress was relying on. It’s certainly doable, but on further investigation, I found that I didn’t actually like everything about the WordPress schema; there was a bit too much de-normalized data that I didn’t want to keep around.

    2) Exporting and Importing at the database level would essentially involve a mysqldump, some sed/grep/perl magic, and a SQL import into a new database. This would get the job done, but could very well lead to endless hours of tweaking regex patterns; and the end result would basically be throw-away code.

    3) Writing an adapter layer was actually the most tempting at first. I knew that Django contained a tool for generating model definitions based on an existing database schema. If this worked for the WordPress database, then all I would need to do is write a thin layer to fetch content from one model and stick it into another. Sure enough, the `inspectdb` tool did do a good job, and I got so far as having routines for pulling posts and comments before realizing that this also wasn’t as reusable a solution as I wanted. Complicating matters was the need to do all this magic in a single database, since the Multiple Database Support branch of Django is still in development/testing.

    With the above options scratched off the list, I went in search of a means to export directly from WordPress into a neutral format. With a little googling, I found some posts about an export/import feature that might be “in development” in the WordPress tree, but I found no documentation on the feature. Fortunately, a few more searches turned up the “WordPress XML Export” plugin, which sounded like an effort to backport the exporting feature to early versions of WordPress. After first installing the XML Export plugin, I found that it didn’t actually work with the version of WordPress on my server, but a quick look through the source code revealed a hardcoded version check that was easy enough to modify. With that change made, the plugin has run like a champ ever since.

    The XML Export plugin outputs the full contents of a WordPress blog into a WXR file (WordPress eXtended RSS), which is an RSS 2.0 file, extended with a wordpress export namespace so that it can include extra metadata and comments.

    With the content archives now in a massive RSS file, the next task was to write an importer. To parse the XML, I decided to use ElementTree for its simplicity in getting the job done. Pulling the file into ElementTree is a one-liner (where wordpress_xml_file is a file object):

    tree = ET.parse(wordpress_xml_file)

    The entries can be easily iterated:

    for item in tree.findall("channel/item"):

    Extracting the basic elements was also straightforward (I stored them in a dictionary):


    results['link'] = item.find("link").text
    results['pubDate'] = item.find("pubDate").text
    results['summary'] = item.find("description").text
    results['body'] = item.find("{http://purl.org/rss/1.0/modules/content/}encoded").text
    results['post_date'] = item.find("{http://wordpress.org/export/1.0/}post_date").text
    results['post_date_gmt'] = item.find("{http://wordpress.org/export/1.0/}post_date_gmt").text

    Extracting the Categories/Tags was only slightly more work:

    
    results['categories'] = []
    
    categories = item.findall("category")
    
    for c in categories:
        results['categories'].append(c.text)
    

    Pulling the comments was the only messy part of the process. The list of comments is easy enough to fetch…

    comments = item.findall("{http://wordpress.org/export/1.0/}comment")

    …but extracting the actual comment text is a little more work because some comments may contain child nodes. For example, a comment containing a hyperlink, bold tag, or any other HTML will be truncated if you simply use the `.text` attribute. To crawl the comment text and child tags, I used the `getiterator()` method, concatenating the `.text` attributes to assemble the full comment text. While doing this, I also decided to filter out any HTML tags from the comments, which made the process fairly simple:

    
    tmp_comment_list = []
    
    comment_tag = comment.find("{http://wordpress.org/export/1.0/}comment_content")
    
    for comment_tag_child in comment_tag.getiterator():
        tmp_comment_text = comment_tag_child.text
        if tmp_comment_text: tmp_comment_list.append(tmp_comment_text)
    
    the_comment['body'] = ' '.join(tmp_comment_list)
    
    results['comments'].append(the_comment)
    

    Writing an importer for the WXR/RSS 2.0 format not only solves the problem at hand, but also lays the groundwork for a reusable RSS importer. IMO, this potential reuse adds value to the solution (as opposed to one-off SQL munching or custom adapter layers), which makes it worth any additional work that might have gone into it. With a little refactoring, the same system could also be extended to support the Movable Type Import Format, making the software very easy to set up and evaluate.
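
    The extraction steps described in this post can be sketched end to end against a tiny hand-written WXR sample. This is an illustration under assumptions, not the importer itself: the sample XML, `parse_items`, and the result keys are made up for the example, and it uses `itertext()` from current versions of ElementTree (where `getiterator()` has been replaced by `iter()`), which also picks up text that follows a child tag.

```python
import xml.etree.ElementTree as ET

WP_NS = "{http://wordpress.org/export/1.0/}"

# Hand-written sample standing in for a real export file (illustrative).
SAMPLE_WXR = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.0/">
  <channel>
    <item>
      <link>http://example.com/hello/</link>
      <category>python</category>
      <category>django</category>
      <wp:comment>
        <wp:comment_content>Nice <b>post</b>, thanks!</wp:comment_content>
      </wp:comment>
    </item>
  </channel>
</rss>"""


def parse_items(wxr_text):
    """Return one dict per <item>, with link, categories, and comment bodies."""
    root = ET.fromstring(wxr_text)
    items = []
    for item in root.findall("channel/item"):
        result = {
            "link": item.find("link").text,
            "categories": [c.text for c in item.findall("category")],
            "comments": [],
        }
        for comment in item.findall(WP_NS + "comment"):
            content = comment.find(WP_NS + "comment_content")
            # itertext() yields every text fragment in document order,
            # which drops the tags but keeps the words inside and after them.
            body = "".join(content.itertext())
            result["comments"].append({"body": body})
        items.append(result)
    return items


for entry in parse_items(SAMPLE_WXR):
    print(entry)
```

    Joining with an empty string preserves the original spacing of the comment; joining with a space, as in the snippet above, can introduce doubled spaces around inline tags.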

    In Part 3, I’ll skip some of the development details and jump into the server issues, with a focus on why the new blog hasn’t launched yet. The answer lies heavily in the challenge of running a Python-based application server in shared hosting environments. The common lack of mod_python, the RAM hit, etc., all add to the complexity in adopting Django.




    Resetting a Django environment

    For one of my Django-based projects, I decided to set up an automated functional-testing system using Selenium to add content to the Admin tool and verify that it works in the site. In order to use this in a “continuous-integration”-like manner, I needed a way to automate the tear-down, initialization, and setup of a fresh installation of the app.

    I use a few more tricks to get this all working, but I wanted to share a couple of scripts I wrote to handle the database re-initialization. I gather from some of the Django discussions that similar functionality may be working its way into the mainline already, but for the time being, here’s what I’m doing.

    I broke the process into two scripts, not because it’s the best thing to do, but because doing the first part as a shell script made sense, and doing the second part in Python was easier.

    This first script takes a brute-force approach to pulling the database settings from the project’s settings.py file, and uses them to delete the existing database and create a new one by driving the command-line ‘mysqladmin’ tool. (There’s also some voodoo done elsewhere which results in the script using a different database name if it’s in the testing environment, but that’s for another post.)

    
    #!/bin/bash
    
    # Extract the user/passwd from the settings file
    username=`grep DATABASE_USER settings.py | awk -F\' '{print $2}'`
    password=`grep DATABASE_PASSWORD settings.py | awk -F\' '{print $2}'`
    database=`grep DATABASE_NAME settings.py | awk -F\' '{print $2}'`
    
    echo 'Clearing the database...'
    echo 'y' | mysqladmin --host=localhost --user=$username --password=$password drop $database
    mysqladmin --host=localhost --user=$username --password=$password create $database
    
    echo 'Setting up the database and test account...'
    ./dbinit.py
    
    echo 'Done.'
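
    The grep/awk extraction at the top of that script can also be written as a single regular expression in Python. This is a hedged sketch rather than a drop-in replacement: `read_db_settings` is a hypothetical helper, the sample settings text is invented, and it assumes each setting is a simple quoted string on one line.

```python
import re

# Matches lines such as: DATABASE_USER = 'blog'
SETTING_RE = re.compile(
    r"""^\s*(DATABASE_(?:USER|PASSWORD|NAME))\s*=\s*(['"])(.*?)\2""",
    re.MULTILINE,
)


def read_db_settings(settings_text):
    """Return the three database settings found in the text of a settings file."""
    return {m.group(1): m.group(3) for m in SETTING_RE.finditer(settings_text)}


# Invented sample standing in for the contents of settings.py.
SAMPLE_SETTINGS = """
DATABASE_ENGINE = 'mysql'
DATABASE_NAME = 'blog_dev'
DATABASE_USER = 'blog'
DATABASE_PASSWORD = 's3cret'
"""

settings = read_db_settings(SAMPLE_SETTINGS)
print(sorted(settings.items()))
```

    Unlike the awk version, this accepts either quote style and ignores other lines that merely mention one of the setting names in a comment.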
    

    This second script (named ‘dbinit.py’, and called from the script above) uses pexpect (an Expect-like module for Python) to drive the ‘syncdb’ function of Django’s manage.py tool. When using pexpect, the thing to remember is that `expect()` treats its argument as a regular expression matched against the child process’s output, so a literal prompt has to match exactly, and characters such as `?` and parentheses need escaping. I got hung up on this at first, which is why you’ll see me using the cruder “.*” pattern below:

    
    #!/usr/bin/python
    
    import sys
    import pexpect
    
    child = pexpect.spawn('python manage.py syncdb')
    child.logfile = sys.stdout
    
    #child.expect('Would you like to create one now.*:')
    child.expect('.*:')
    child.sendline('yes')
    
    child.expect('Username.*:')
    child.sendline('SOMEUSERNAME')
    
    child.expect('E-mail address:')
    child.sendline('SOMEUSERNAME@foo.com')
    
    child.expect('Password:')
    child.sendline('NOTSOSECRETPASSWORD')
    
    child.expect('Password.*:')
    child.sendline('NOTSOSECRETPASSWORD')
    
    child.expect(pexpect.EOF)
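
    Because pexpect compiles string patterns as regular expressions, a prompt containing `?` or parentheses will not match itself unless it is escaped (pexpect’s `expect_exact()` is the other way around this). The standard `re` module is enough to show the effect; the prompt text below is an assumed sample of what syncdb prints and may differ between Django versions.

```python
import re

# Assumed sample of the child process output (illustrative).
output = "Would you like to create one now? (yes/no): "
prompt = "Would you like to create one now? (yes/no):"

# Used as a regular expression, "?" and "(...)" act as operators,
# so the unescaped prompt fails to match its own text.
raw_match = re.search(prompt, output)

# re.escape() turns the prompt back into a literal pattern.
escaped_match = re.search(re.escape(prompt), output)

print("raw pattern matched:", raw_match is not None)
print("escaped pattern matched:", escaped_match is not None)
```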
    

    With these scripts in place, not only have I been able to set up an automated testing solution, but I also use them in early development when I’m still fleshing out a data model. This approach allows me to quickly reinitialize an environment, although you should use it with caution since it also deletes all content from the database.




    OS X: Opening man pages in Preview

    Admittedly, this is perhaps more of an interesting trick than a needed feature; however, if you’ve ever wanted to print man pages or simply read them in a nice, anti-aliased document view instead of within the Terminal, here’s a tip you might like. The following bash script (and credit goes 100% to my friend Victor, who is sans-blog) will format and open man pages in Preview:

    
    #!/bin/bash
    
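    cmd=$1
    if [ -z "$cmd" ]; then
        me=`basename "$0"`
        echo "Usage: $me command_name"
        exit 1
    fi
    
    # Bail out early if there is no man page for the command
    man "$cmd" > /dev/null 2>&1
    if [ $? -ne 0 ]; then
        echo "No man page for $cmd"
        exit 1
    fi
    
    # Render the man page as PostScript and hand it to Preview
    man -t "$cmd" | open -f -a /Applications/Preview.app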
    

    On my box, I called the script ‘manpreview’ and dropped it in ~/bin/ for easy access. Once you `chmod u+x` it (and have ~/bin/ in your path), you’ll be able to do fun things like `manpreview tcpdump` for some extended reading.




    Aptana Web IDE

    Earlier this week I was looking for a nice HTML editor for Eclipse to help ease life when using PyDev with a Django project. I didn’t have much luck, other than finding a few syntax coloring tools that were HTML-aware. That changed today when I found Aptana: The Web IDE. It’s a free, open source IDE for HTML, JavaScript, and CSS, built on Eclipse (available as a stand-alone application or an Eclipse plugin), that offers target-browser-aware code assist and syntax checking. The site includes some great screencasts to demo the product (and an interesting use of a .tv domain name).

    Though it’s officially unsupported on Eclipse 3.2 (they only support 3.1), it seems to work just fine in my environment.

    (Via eHub)

    [Minor update: Aptana ran fine on my OS X machine, but crashes hard on my AMD64 Ubuntu Dapper box running Eclipse 3.2.]




     
