Entries Tagged 'development' ↓

Authenticating Requests for the Phanfare API in Python

Phanfare implements authenticated requests in a way that feels a bit clunky and, because it isn’t well documented, is a bit tricky to get working.

The initial Authenticate call returns a cookie string in its XML response, which you are supposed to send back in a Cookie HTTP header with each subsequent request:

Cookie: phanfare2=COOKIE_STRING; domain=.phanfare.com

Being a Python library, pyphanfare uses urllib’s URLopener utility class for the HTTP calls. In order to send a cookie with a URLopener originated request you’ll want to do the following:

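from urllib import URLopener  # Python 2

# COOKIE_STRING is the value returned by the Authenticate call
cookie_str = 'phanfare2=COOKIE_STRING; domain=.phanfare.com'
opener = URLopener()
opener.addheader('Cookie', cookie_str)
xml_str = opener.open(URL).read()  # URL is the API endpoint being requested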
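URLopener is deprecated on Python 3, so here is a rough sketch of the same idea using urllib.request.build_opener instead. This is not what pyphanfare does; make_authenticated_opener is a hypothetical helper name, and the cookie format is simply copied from the header shown above.

```python
import urllib.request


def make_authenticated_opener(cookie_value):
    """Return an opener that sends the Phanfare session cookie on every request.

    cookie_value is assumed to be the cookie string returned by the
    Authenticate call; this helper is illustrative, not part of pyphanfare.
    """
    cookie_str = 'phanfare2=%s; domain=.phanfare.com' % cookie_value
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs applied to each request
    # made through this opener.
    opener.addheaders.append(('Cookie', cookie_str))
    return opener
```

Subsequent calls would then go through the returned opener, for example opener.open(url).read(), where url is whichever API endpoint is being requested.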

Phanfare API Versions

This past weekend I got a bit frustrated while working on pyphanfare: as I was referencing the API documentation, I clicked a link at the bottom of the page labeled “API” and got a different document describing a slightly different API with a completely different authentication scheme.

Phanfare API 1.0 Phanfare API 2.0

Comparing the two API documents, I could not tell whether they were in fact two distinct API versions or whether one was obsolete and, if so, which one. After posting some messages on the forums and sending a few emails, I got a response confirming that there are two distinct API versions.

The text on the 1.0 API page was also updated to make this clearer, which is a great response time for a documentation update:

Picture 1.png

When Phanfare released its overhaul, Phanfare 2.0, it was accompanied by a different API. Accounts that have been upgraded need to use the new API; those that haven’t been upgraded must use the old one. That said, the 1.0 API will be shut down in June 2008, so I assume that 1.0 accounts will be upgraded then as well.

So, it looks like I need to take a step back on pyphanfare and reexamine my approach. I will ditch the effort to support the 1.0 API and focus on getting the 2.0 version out and usable.

Link: Shell Scripting Tips

David Pashley has written a good post with tips for making shell scripts more bulletproof.

I use shell scripting from time to time, especially when bootstrapping EC2 instances, but I write it more like a DOS batch script and don’t really take advantage of the wealth of flow control and error checking doodads that exist in the Bash environment.

I especially liked the exit-if-anything-breaks trick. I tend to write long automation scripts in which, once something fails, the remaining statements will either fail too, be irrelevant, or can’t be trusted to have completed successfully. You get this behavior by putting a single line at the top of your Bash script:

set -e

That’s it! Check out the link to the post for more scripting goodness.

Semi Automated Code Quality Checks

I am nearing completion of my first milestone release for pyphanfare, where the definition of done is a working method for each API method published by Phanfare. There will no doubt be optimizations and improvements to this small Python library after that point, but I want to get it functional first.

To help keep me in line with PEP 8 and good testable code, I have come up with the following strategy, and I am really liking it, so I thought I’d share.

I use pylint and doctest to validate my code prior to each commit (haven’t figured out how to wire up a pre-commit hook in git yet, or else the title of this post would be Fully Automated instead of Semi Automated). I have put these commands into this little script:

#!/bin/bash
# pre_commit.sh
pylint -f colorized -i y --ignore=tests pyphanfare
python run_tests.py
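For anyone who hasn't used doctest, here is a minimal sketch of the kind of check a runner such as run_tests.py might perform. The contents of run_tests.py aren't shown in this post, so build_cookie and run_doctests are hypothetical names used only for illustration.

```python
import doctest


def build_cookie(value):
    """Return the Cookie header value for a session token.

    >>> build_cookie('abc')
    'phanfare2=abc; domain=.phanfare.com'
    """
    return 'phanfare2=%s; domain=.phanfare.com' % value


def run_doctests(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # The function is made available to its own examples through globs.
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, '<sketch>', 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted
```

A failed count of zero means every example in the docstring produced exactly the output written next to it.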

Now prior to each commit, I run:

. pre_commit.sh

and verify that all tests passed and that I have a score of 10.0/10 in pylint. That score is not completely honest, as I cheat where I deem appropriate and add comments in the code to disable certain messages that crop up but that I don’t feel the need to fix:

# pylint: disable-msg=MSG_ID
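The pre-commit hook mentioned earlier could be sketched as a small Python script. This is only a sketch under a few assumptions: the file would be saved as .git/hooks/pre-commit and made executable, git aborts the commit when the hook exits non-zero, and the commands mirror pre_commit.sh. run_checks and CHECKS are made-up names.

```python
#!/usr/bin/env python
"""Hypothetical pre-commit hook: run each check, stop at the first failure."""
import subprocess
import sys

# Assumed commands, mirroring pre_commit.sh; adjust to the project layout.
CHECKS = [
    ['pylint', '--ignore=tests', 'pyphanfare'],
    [sys.executable, 'run_tests.py'],
]


def run_checks(commands):
    """Run commands in order; return the first non-zero exit code, else 0."""
    for command in commands:
        try:
            code = subprocess.call(command)
        except OSError:
            # The program is not installed or not on PATH.
            sys.stderr.write('Could not run: %s\n' % ' '.join(command))
            return 1
        if code != 0:
            return code
    return 0

# As a hook, the file would end with: sys.exit(run_checks(CHECKS))
```

Note that pylint exits non-zero whenever it reports any message, so a hook like this would block commits until the code is clean or the offending messages are disabled.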

I have a Lighthouse project tracker set up that is integrated with GitHub, so please stop by and let me know of anything you’d like to see out of this library, or if you’d like to help out.

Relocated Projects: Google Code to GitHub & Lighthouse

A while back I made a move to GitHub from Google Code only to switch back when discovering I didn’t have a way to track work items (among other things).

Well, I have changed my mind again, primarily because I really want to start using git. So, I have cleaned up old stale projects that I started and know I’ll never work on, and migrated the three that I am still interested in developing:

  • pyphanfare - a python wrapper around Phanfare’s API
  • pyvimeo - a python wrapper around Vimeo’s API
  • pycalais - a python wrapper around Reuters Calais API

All three projects on GitHub are integrated into my Lighthouse account which has recently been upgraded to support Open Source projects free of charge:

Altman Software on Lighthouse

Anyone interested in joining me on any of these projects? I’d love the help!

Keeping ContentTypes and Permissions Updated without syncdb

In managing releases to a production Django project, I find myself using python manage.py sqlall APPNAME to generate the SQL needed for any new models and put it in a script, and then writing the ALTER or DROP statements by hand to go into the same SQL scripts.

One thing that not running syncdb on production leaves out is the creation of the content type and permission records (if you are using django.contrib.auth and django.contrib.contenttypes). It was recommended to me on #django that I look at each app’s management.py module to see where it hooks into the post_syncdb signal.

In doing so I discovered some methods that I could just call from python and keep both my permissions and content types up to date:

from django.core.management import setup_environ
try:
    import settings
except ImportError:
    import sys
    sys.stderr.write("Couldn't find the settings.py module.\n")
    sys.exit(1)

setup_environ(settings)

# Add any missing content types
from django.contrib.contenttypes.management \
    import create_all_contenttypes 
create_all_contenttypes()

# Add any missing permissions
from django.contrib.auth.management import create_permissions
from django.db.models import get_apps
for app in get_apps():
    create_permissions(app, None, 2)

Enjoy.

Try / Except Performance in Python: A Simple Test

I was discussing the wisdom of using try/except throughout Python code today with someone and there were a couple of points that I felt would be quick and easy to verify or debunk.

I wrote a quick little script to time a set of functions covering the case where you don’t know whether an element exists in a dictionary before referencing it. One set of functions looks up a key that doesn’t exist and the other a key that does. The numbers are interesting:

The case where the key does not exist:
1,000 iterations:
with_try (1.562 ms)
with_try_exc (2.166 ms)
without_try (0.233 ms)
without_try_not (0.201 ms)

100,000 iterations:
with_try (168.793 ms)
with_try_exc (223.589 ms)
without_try (24.877 ms)
without_try_not (20.992 ms)

1,000,000 iterations:
with_try (1571.420 ms)
with_try_exc (2228.899 ms)
without_try (250.723 ms)
without_try_not (219.819 ms)


The case where the key does exist:
1,000 iterations:
exists_with_try (0.154 ms)
exists_with_try_exc (0.141 ms)
exists_without_try (0.216 ms)
exists_without_try_not (0.220 ms)

100,000 iterations:
exists_with_try (15.647 ms)
exists_with_try_exc (15.165 ms)
exists_without_try (22.302 ms)
exists_without_try_not (23.364 ms)

1,000,000 iterations:
exists_with_try (158.330 ms)
exists_with_try_exc (158.038 ms)
exists_without_try (233.005 ms)
exists_without_try_not (237.813 ms)

From these results, I think it is fair to draw a few conclusions:

  1. If there is a high likelihood that the element doesn’t exist, then you are better off checking for it with has_key.
  2. If you are not going to do anything with the exception when it is raised, then you are better off not binding it in the except clause: the bare except was consistently faster than except Exception, e.
  3. If it is likely that the element does exist, then there is a slight advantage to using a try/except block instead of has_key.

There is obviously a lot missing from this analysis, and it doesn’t provide any earth-shattering revelation: using exception handling logic where you expect to encounter many exceptions is going to be more expensive. It just seemed like a good excuse to write a little Python. :)

For those interested, here is the script I wrote to arrive at the numbers above:

import time

def time_me(function):
    def wrap(*arg):
        start = time.time()
        r = function(*arg)
        end = time.time()
        print "%s (%0.3f ms)" % (function.func_name, (end-start)*1000)
        return r
    return wrap


# Not Existing
@time_me
def with_try(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        try:
            get = d['notexist']
        except:
            pass

@time_me
def with_try_exc(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        try:
            get = d['notexist']
        except Exception, e:
            pass

@time_me
def without_try(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        if d.has_key('notexist'):
            pass
        else:
            pass

@time_me
def without_try_not(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        if not d.has_key('notexist'):
            pass
        else:
            pass



# Existing
@time_me
def exists_with_try(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        try:
            get = d['somekey']
        except:
            pass

@time_me
def exists_with_try_exc(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        try:
            get = d['somekey']
        except Exception, e:
            pass

@time_me
def exists_without_try(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        if d.has_key('somekey'):
            pass
        else:
            pass

@time_me
def exists_without_try_not(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        if not d.has_key('somekey'):
            pass
        else:
            pass


print "The case where the key does not exist:"
print "1,000 iterations:"
with_try(1000)
with_try_exc(1000)
without_try(1000)
without_try_not(1000)

print "\n100,000 iterations:"
with_try(100000)
with_try_exc(100000)
without_try(100000)
without_try_not(100000)

print "\n1,000,000 iterations:"
with_try(1000000)
with_try_exc(1000000)
without_try(1000000)
without_try_not(1000000)

print "\n\nThe case where the key does exist:"
print "1,000 iterations:"
exists_with_try(1000)
exists_with_try_exc(1000)
exists_without_try(1000)
exists_without_try_not(1000)

print "\n100,000 iterations:"
exists_with_try(100000)
exists_with_try_exc(100000)
exists_without_try(100000)
exists_without_try_not(100000)

print "\n1,000,000 iterations:"
exists_with_try(1000000)
exists_with_try_exc(1000000)
exists_without_try(1000000)
exists_without_try_not(1000000)
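The script above is Python 2 (print statements, has_key, func_name). On Python 3, where has_key no longer exists, a comparable measurement could be sketched with timeit and the in operator. This is a rough equivalent rather than the script that produced the numbers above; lookup_with_try, lookup_with_check, and benchmark are names made up for this sketch, and the timings will vary by machine and interpreter.

```python
import timeit

DATA = {'somekey': 123}


def lookup_with_try(d, key):
    """Return d[key], or None if the key is missing, using try/except."""
    try:
        return d[key]
    except KeyError:
        return None


def lookup_with_check(d, key):
    """Return d[key], or None if the key is missing, checking membership first."""
    if key in d:
        return d[key]
    return None


def benchmark(iterations=100000):
    """Time both lookups for a present and a missing key, in milliseconds."""
    results = {}
    for func in (lookup_with_try, lookup_with_check):
        for key in ('somekey', 'notexist'):
            seconds = timeit.timeit(lambda: func(DATA, key), number=iterations)
            results[(func.__name__, key)] = seconds * 1000
    return results


if __name__ == '__main__':
    for (name, key), ms in sorted(benchmark().items()):
        print('%s[%r]: %.3f ms' % (name, key, ms))
```

Catching KeyError specifically, rather than using a bare except, keeps unrelated errors from being silently swallowed during the measurement.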

Update: I recently changed my blog to process posts as Markdown, so this edit just converts the raw post to Markdown syntax.