Entries Tagged 'python' ↓
July 9th, 2008 — development, django, python
In my continued experimentation with the newforms-admin branch of Django, I wanted to figure out how to order fields of an inline. Looking at the documentation for inlines I saw there was not an ordering field.
I had thought that there was but it turns out I was just mistaken in the object hierarchy. InlineModelAdmin inherits from BaseModelAdmin the same as ModelAdmin — I was thinking that InlineModelAdmin inherited from ModelAdmin.
Therefore, I had to determine a way to accomplish this via some spelunking through the code. At first, I thought I’d just create my own template by inheriting or copying the tabular.html template. After looking at it, I determined that I didn’t want to figure out how to do a regroup on inlineadminformset or if even that was the proper way to try to order things in that template.
After a few minutes in the code I realized that I could just subclass BaseInlineFormset and specify the fields that I wanted to order by (in this example I will use start_time and end_time):
from django.contrib import admin
from django.newforms.models import BaseInlineFormset
class MyOrderedFormset(BaseInlineFormset):
    def get_queryset(self):
        # Order the inline rows by start time, then end time.
        qs = super(MyOrderedFormset, self).get_queryset()
        return qs.order_by('start_time', 'end_time')
class MyOrderedInline(admin.TabularInline):
    model = MyModel
    extra = 1
    formset = MyOrderedFormset
That’s it. The inlines in that section are now ordered by start_time and end_time.
July 3rd, 2008 — development, django, python
Introduction
Managing database changes in a team environment working on a Django project can be complicated. I would imagine that there is no one-size-fits-all solution; the right approach depends on team size and configuration, production database size, etc.
What I will outline here is a solution I developed for a small team that I work with on a Django-based web application. So far it has worked really well for streamlining database changes so we can stay in sync, deploy easily, and apply small tweaks to the database (indexing, data manipulation, etc.) in a semi-automated way.
The idea is based partly on Rails database migrations (though not as sleek and well put together) and on a database versioning system used by a decent-sized team I previously worked on, using MS SQL in a corporate environment. In one sense it is not nearly as well put together as either of these two solutions, but at the same time, it is pretty much hands off and has been working well for us for a number of months now.
Why not just use syncdb?
As nice as python manage.py syncdb is when the project is getting started and is still in its early phases, it becomes less and less useful, mostly in handling schema changes. Since syncdb only creates a table for a model if it doesn’t already exist in the database, we found ourselves outputting SQL with python manage.py sql APP_NAME, making ad hoc ALTER TABLE scripts from the CREATE statements, passing them around to each other, and then trying to remember the order in which to apply them in production.
Furthermore, syncdb left us without a way to alter data, add indexes, remove indexes, etc. Granted, that wasn’t its intent, and I am certainly not trying to beat up on syncdb; it does what it does well.
The Solution
In order to address this, it seemed the most natural thing was to have some structured and ordered way to write SQL scripts that were applied in a uniform manner so that it was repeatable. This way we could run on our development databases, on test databases, and in deployment on production all the same way.
First step to achieve this was to version the database. This is accomplished with a simple table that will keep track of what it has executed along with some other metadata primarily for reference purposes.
CREATE TABLE `versions` (
    `version` VARCHAR(200) NOT NULL,
    `date_created` DATETIME NOT NULL,
    `sql_executed` LONGTEXT NULL,
    `svn_version` INT NULL
);
In order to populate this table and keep versions consistent, the naming scheme of the sql scripts should follow something that lends itself to easy sorting:
YYYYMMDD-##.sql
where the pound signs are a zero-padded integer counting the changes for the day. In a team environment, it is probable that two people might be working on a database change at the same time and would therefore both use 01. Whoever commits first keeps the 01; the other developer gets a conflict message and needs to rename their script to 02.
These scripts live in a db/ folder in our project. The name of this folder is not important, but it is important to keep all the sql scripts in a folder not mixed in with other files.
To tie it all together there is a simple Python script that gets the list of applied changes from the database, gets a list of files in the current directory in order, filters out the ones that have been applied, and then processes the remaining scripts. In processing the scripts, it opens each script and splits it into individual statements (splitting on the semicolon). After execution of an entire script file is complete, the versions table is updated to reflect the applied version.
Running python upgrade.py by itself will simply print to standard output all the statements it plans to execute, so that they can be reviewed. Running python upgrade.py --execute will actually execute the scripts.
If an error occurs while executing a statement, processing stops immediately.
You can find upgrade.py in Django Snippet 849.
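The real script lives in the snippet; what follows is only a minimal sketch of the flow described above, not the snippet's code. It assumes an in-memory SQLite database instead of MySQL, keeps the scripts in a dictionary instead of reading them from the db/ folder, and uses hypothetical names (pending_scripts, split_statements, upgrade, and the song table). It also stores the full filename as the version, which is an assumption.

```python
import sqlite3
from datetime import datetime


def pending_scripts(applied, filenames):
    """Return the .sql files not yet applied, sorted by filename.

    YYYYMMDD-##.sql names sort chronologically as plain strings.
    """
    candidates = sorted(f for f in filenames if f.endswith(".sql"))
    return [f for f in candidates if f not in applied]


def split_statements(sql_text):
    """Naively split a script on semicolons, dropping empty fragments.

    This breaks on semicolons inside string literals.
    """
    return [s.strip() for s in sql_text.split(";") if s.strip()]


def upgrade(conn, scripts, execute=False):
    """Plan (and optionally apply) pending scripts.

    `scripts` maps filename -> SQL text. With execute=False nothing is
    run; the planned statements are only returned for review.
    """
    applied = set(row[0] for row in conn.execute("SELECT version FROM versions"))
    planned = []
    for name in pending_scripts(applied, scripts.keys()):
        statements = split_statements(scripts[name])
        planned.extend(statements)
        if not execute:
            continue
        for statement in statements:
            conn.execute(statement)  # an exception here stops all processing
        # Record the version only after the whole script has run.
        conn.execute(
            "INSERT INTO versions (version, date_created, sql_executed) "
            "VALUES (?, ?, ?)",
            (name, datetime.now().isoformat(), scripts[name]),
        )
        conn.commit()
    return planned


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE versions (version TEXT NOT NULL, date_created TEXT NOT NULL, "
    "sql_executed TEXT NULL, svn_version INTEGER NULL)"
)
scripts = {
    "20080703-02.sql": "ALTER TABLE song ADD COLUMN title TEXT;",
    "20080703-01.sql": "CREATE TABLE song (id INTEGER PRIMARY KEY);",
}
dry_run = upgrade(conn, scripts)                   # review only
applied_now = upgrade(conn, scripts, execute=True)  # actually apply
second_run = upgrade(conn, scripts, execute=True)   # nothing left to do
```

Recording the version after the last statement means a script that fails halfway is not marked as applied, so it is picked up again on the next run.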
Improvements
There are lots of improvements to this script that I’d like to add if I get around to it. The one I’d like to see most is support for “rollbacks”. Comments and suggestions are most welcome!
June 29th, 2008 — development, django, python
Introduction
I’ve recently (past week or so) been digging into the newforms-admin branch of Django. I am really looking forward to this code line getting merged into trunk as there is a ton of great work in this branch. I especially like the separation of the admin code from the models.
Well, I’ll get to my point.
The Problem
I needed to be able to edit child records in the admin that were children through the use of GenericRelations. I was told about a solution on Google Code called django-genericadmin, but it seemed dead and I could not get it to work.
The thing that seemed the closest to working was a patch on ticket 4667 by Honza Kral and a variation on this patch in the Django Snippet 765. Neither of these worked for r7771 of branches/newforms-admin.
The Solution
I put the 765 snippet in a file called generic.py in the root of the Django project I was working on and fiddled with it until I got it working. I ended up needing to change an import, fix some variable names, and change the argument list order in one of the __init__ methods. I also added the can_order and can_delete options to the GenericInlineModelAdmin class.
After I got it working isolated in generic.py, I then added the code to the django/contrib/admin/options.py file and created a new patch (after testing locally of course). This patch has since been added to ticket 4667. I also posted the full version that I had running in generic.py in snippet 832 in case someone wants to use it without patching their newforms-admin branch.
Example of How to Use
To get a sense for how to make use of this functionality, imagine you have a songs app that you use to model and record details about songs. You have built it generically so you can reuse it wisely in a number of different projects, and one project in particular needs to reference song data.
You are building a home media inventory system and decide you want to catalog both your albums and your DVD collection. Because you are a fanatic about soundtracks, you also want to record which songs exist on both your albums and your DVDs.
So you have the following models (abbreviated for clarity):
# Media Collection App
class Album(models.Model):
    ...
class DVD(models.Model):
    ...
# Song App
class Song(models.Model):
    ...
Since you want to enable your Song model to relate to two other models generically, you decide to use Generic Relations like so:
from django.contrib.contenttypes.models import ContentType
from django.contrib.contenttypes import generic
class Song(models.Model):
    ...
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = generic.GenericForeignKey()
and then on the models that you want to have 0 to Many child records that are songs:
from myproject.song.models import Song
from django.contrib.contenttypes import generic
class Album(models.Model):
    ...
    songs = generic.GenericRelation(Song)
class DVD(models.Model):
    ...
    songs = generic.GenericRelation(Song)
That is all you need to do if you don’t care about editing the data in the admin. Since editing in the admin is the point of this article, here is what you need to do in the admin.py of the app that contains Album and DVD:
...
from myproject.song.models import Song
from myproject.generic import GenericTabularInline
### OR ###
### from django.contrib.admin.options import GenericTabularInline
class SongInline(GenericTabularInline):
    model = Song
    extra = 2
    ct_field_name = 'content_type'
    id_field_name = 'object_id'
class AlbumOptions(admin.ModelAdmin):
    model = Album
    inlines = (SongInline,)
class DVDOptions(admin.ModelAdmin):
    model = DVD
    inlines = (SongInline,)
admin.site.register(Album, AlbumOptions)
admin.site.register(DVD, DVDOptions)
Now you should be able to add/remove songs when editing an Album or DVD in your admin.
Update: Fixed typo in code sample above in setting the GenericForeignKey() — thanks FunkyBob!
April 27th, 2008 — development, pyphanfare, python
Wikipedia defines Duck Typing as:
In computer programming, duck typing is a style of dynamic typing in which an object’s current set of methods and properties determines the valid semantics, rather than its inheritance from a particular class. The name of the concept refers to the duck test, attributed to James Whitcomb Riley, which may be phrased as follows:
“If it walks like a duck and quacks like a duck, I would call it a duck.”
Coming from a strong background in C#, it took a while for this feature of Python to seem useful. I was used to creating interfaces and a proper inheritance chain implementing them, and/or defining abstract base classes, all to have a good set of mock objects to use in unit tests.
I really love the ability to quickly whip out a duck typed class in Python to make simple mock objects so that I am testing my code in a unit test and not boundary objects (e.g. urllib). I don’t need to test whether my request is properly handled by the server at the urllib endpoint and that the response is properly read and processed. I need to instead make sure that my request conforms to a published specification and that the expected result/response from the server is processed properly. This also allows my tests to run offline.
Here is an example of some mock objects that I use in pyphanfare:
# pyphanfare/tests/mocks.py
from pyphanfare.tests.testdata import data
class URLopener:
    def open(self, url):
        params = url.split('?')[1].split('&')
        for param in params:
            tokens = param.split('=')
            if tokens[0].strip() == 'method':
                return URLhandle(tokens[1].strip())
    def addheader(self, *args):
        self.headers = args
class URLhandle:
    def __init__(self, method):
        self.method = method
    def read(self):
        return data[self.method]
pyphanfare.tests.testdata is just a Python module that contains test data in the form of a dictionary. It is keyed by the name of the API method being called, and each value is the XML expected in the response.
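To show how mocks like these get used, here is a small self-contained sketch. It is not pyphanfare's actual test code: the call_api function, the example.invalid URL, and the canned XML are all hypothetical, and the mocks are repeated in compact form so the example runs on its own.

```python
import unittest

# Hypothetical canned responses, keyed by API method name.
data = {"Authenticate": "<response><cookie>abc123</cookie></response>"}


class URLhandle:
    def __init__(self, method):
        self.method = method

    def read(self):
        return data[self.method]


class URLopener:
    """Stands in for urllib's URLopener; it only needs the same methods."""

    def open(self, url):
        for param in url.split("?")[1].split("&"):
            key, _, value = param.partition("=")
            if key.strip() == "method":
                return URLhandle(value.strip())

    def addheader(self, *args):
        self.headers = args


def call_api(opener, method):
    """Hypothetical client code: it never checks the opener's class."""
    url = "http://example.invalid/api?method=%s&format=xml" % method
    return opener.open(url).read()


class CallApiTest(unittest.TestCase):
    def test_authenticate_returns_canned_xml(self):
        xml = call_api(URLopener(), "Authenticate")
        self.assertTrue("<cookie>abc123</cookie>" in xml)


# Works offline: the mock opener answers from the dictionary.
result = call_api(URLopener(), "Authenticate")
```

Because call_api only relies on open() and read() being present, the mock and the real opener are interchangeable without any shared base class.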
April 27th, 2008 — development, pyphanfare, python
Phanfare has decided to implement authenticated requests in a way that feels a bit clunky in design and a bit tricky to figure out how to get working as it isn’t well documented.
On the initial Authenticate request call, they send back a cookie string in XML that you are supposed to use in sending an HTTP Header, Cookie, with each subsequent request.
Cookie: phanfare2=COOKIE_STRING; domain=.phanfare.com
Being a Python library, pyphanfare uses urllib’s URLopener utility class for the HTTP calls. In order to send a cookie with a URLopener originated request you’ll want to do the following:
from urllib import URLopener
cookie_str = 'phanfare2=COOKIE_STRING; domain=.phanfare.com'
opener = URLopener()
opener.addheader('Cookie', cookie_str)
xml_str = opener.open(URL).read()  # URL is the API request URL
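For readers on Python 3, where URLopener has long been deprecated, roughly the same idea can be sketched with urllib.request.Request. This is an assumption-laden illustration rather than what pyphanfare does: the URL and method name are hypothetical, and the request is only built, never sent.

```python
from urllib.request import Request

cookie_str = "phanfare2=COOKIE_STRING; domain=.phanfare.com"

# Hypothetical endpoint; substitute the real API request URL.
url = "http://example.invalid/api?method=SomeMethod"

# Attach the cookie as a plain HTTP header on the request object.
request = Request(url, headers={"Cookie": cookie_str})

# Nothing has been sent yet; urllib.request.urlopen(request) would send it.
sent_cookie = request.get_header("Cookie")
```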
April 23rd, 2008 — development, pyphanfare, python
This past weekend I found myself getting a bit frustrated while working on pyphanfare. As I was referencing the documentation for the API, I clicked on a link at the bottom of the page that said “API” and got a different document describing a slightly different API with a completely different authentication scheme.
Phanfare API 1.0
Phanfare API 2.0
Comparing the two API documents, I could not tell whether they were in fact two distinct API versions or whether one was obsolete and, if so, which one. After posting some messages on the forums and sending a few emails, I got a response informing me that there are two distinct API versions.
Also the text on the 1.0 API was updated to make it clearer — great response time to updating the documentation:
When Phanfare released its overhaul, Phanfare 2.0, it was accompanied by a different API. Accounts that have been upgraded need to use the new API; sites that haven’t been upgraded must use the old API. That being said, the API for 1.0 will be shut down in June 2008, so I assume that 1.0 accounts will be upgraded then as well.
So, it looks like I need to take a step back on pyphanfare and reexamine my approach. I will ditch the effort to support the 1.0 API and focus on getting the 2.0 version out and usable.
April 11th, 2008 — development, django, python
In managing releases to a production django project, I find myself using python manage.py sqlall APPNAME to put the sql needed in a script for any new models, and then writing the ALTER or DROP statements by hand to also go into the sql scripts.
One thing that gets left out by not running syncdb on production is the creation of the content type and permission records (if using django.contrib.auth and django.contrib.contenttypes). It was recommended to me on #django that I look at each of these apps’ management.py modules to see where they hook into the post_syncdb signal.
In doing so I discovered some methods that I could just call from python and keep both my permissions and content types up to date:
from django.core.management import setup_environ
try:
    import settings
except ImportError:
    import sys
    sys.stderr.write("Couldn't find the settings.py module.")
    sys.exit(1)
setup_environ(settings)
# Add any missing content types
from django.contrib.contenttypes.management \
    import create_all_contenttypes
create_all_contenttypes()
# Add any missing permissions
from django.contrib.auth.management import create_permissions
from django.db.models import get_apps
for app in get_apps():
    create_permissions(app, None, 2)
Enjoy.
January 17th, 2008 — development, python
I was discussing the wisdom of using try/except throughout Python code today with someone and there were a couple of points that I felt would be quick and easy to verify or debunk.
I wrote a quick little script to time a set of functions testing the use case of not knowing whether an element in a dictionary exists prior to referencing it. There is a set where it doesn’t exist and a set where it does. The numbers are interesting:
The case where the key does not exist:
1,000 iterations:
with_try (1.562 ms)
with_try_exc (2.166 ms)
without_try (0.233 ms)
without_try_not (0.201 ms)
100,000 iterations:
with_try (168.793 ms)
with_try_exc (223.589 ms)
without_try (24.877 ms)
without_try_not (20.992 ms)
1,000,000 iterations:
with_try (1571.420 ms)
with_try_exc (2228.899 ms)
without_try (250.723 ms)
without_try_not (219.819 ms)
The case where the key does exist:
1,000 iterations:
exists_with_try (0.154 ms)
exists_with_try_exc (0.141 ms)
exists_without_try (0.216 ms)
exists_without_try_not (0.220 ms)
100,000 iterations:
exists_with_try (15.647 ms)
exists_with_try_exc (15.165 ms)
exists_without_try (22.302 ms)
exists_without_try_not (23.364 ms)
1,000,000 iterations:
exists_with_try (158.330 ms)
exists_with_try_exc (158.038 ms)
exists_without_try (233.005 ms)
exists_without_try_not (237.813 ms)
From these results, I think it is fair to quickly determine a number of conclusions:
- If there is a high likelihood that the element doesn’t exist, then you are better off checking for it with has_key.
- If you are not going to do anything with the exception when it is raised, then you are better off not naming one in the except clause.
- If it is likely that the element does exist, then there is an advantage to using a try/except block instead of has_key; however, the advantage is very slight.
There is obviously a lot missing from this analysis and it really doesn’t provide any earth-shattering revelation: using exception handling logic where you expect to encounter many exceptions is going to be more expensive. It just seemed like a good excuse to write a little Python.
For those interested here is the script I wrote to arrive at the numbers above:
import time
def time_me(function):
    def wrap(*arg):
        start = time.time()
        r = function(*arg)
        end = time.time()
        print "%s (%0.3f ms)" % (function.func_name, (end-start)*1000)
        return r
    return wrap
# Not Existing
@time_me
def with_try(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        try:
            get = d['notexist']
        except:
            pass
@time_me
def with_try_exc(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        try:
            get = d['notexist']
        except Exception, e:
            pass
@time_me
def without_try(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        if d.has_key('notexist'):
            pass
        else:
            pass
@time_me
def without_try_not(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        if not d.has_key('notexist'):
            pass
        else:
            pass
# Existing
@time_me
def exists_with_try(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        try:
            get = d['somekey']
        except:
            pass
@time_me
def exists_with_try_exc(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        try:
            get = d['somekey']
        except Exception, e:
            pass
@time_me
def exists_without_try(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        if d.has_key('somekey'):
            pass
        else:
            pass
@time_me
def exists_without_try_not(iterations):
    d = {'somekey': 123}
    for i in range(0, iterations):
        if not d.has_key('somekey'):
            pass
        else:
            pass
print "The case where the key does not exist:"
print "1,000 iterations:"
with_try(1000)
with_try_exc(1000)
without_try(1000)
without_try_not(1000)
print "\n100,000 iterations:"
with_try(100000)
with_try_exc(100000)
without_try(100000)
without_try_not(100000)
print "\n1,000,000 iterations:"
with_try(1000000)
with_try_exc(1000000)
without_try(1000000)
without_try_not(1000000)
print "\n\nThe case where the key does exist:"
print "1,000 iterations:"
exists_with_try(1000)
exists_with_try_exc(1000)
exists_without_try(1000)
exists_without_try_not(1000)
print "\n100,000 iterations:"
exists_with_try(100000)
exists_with_try_exc(100000)
exists_without_try(100000)
exists_without_try_not(100000)
print "\n1,000,000 iterations:"
exists_with_try(1000000)
exists_with_try_exc(1000000)
exists_without_try(1000000)
exists_without_try_not(1000000)
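The script above is Python 2 (print statements, has_key, and "except Exception, e" are all gone in Python 3). As a rough sketch of how the same comparison might be rerun today, the version below swaps in the standard timeit module for the hand-rolled decorator and the in operator for has_key. The function names are hypothetical, and no particular timings are implied; the numbers will differ from those above.

```python
import timeit

d = {"somekey": 123}


def lookup_with_try(key):
    """Ask forgiveness: index directly and catch the miss."""
    try:
        return d[key]
    except KeyError:
        return None


def lookup_with_check(key):
    """Ask permission: test membership before indexing."""
    if key in d:
        return d[key]
    return None


def time_ms(func, key, iterations=100000):
    """Total time in milliseconds for `iterations` calls of func(key)."""
    return timeit.timeit(lambda: func(key), number=iterations) * 1000


timings = {}
for label, key in (("missing", "notexist"), ("present", "somekey")):
    for func in (lookup_with_try, lookup_with_check):
        elapsed = time_ms(func, key)
        timings[(label, func.__name__)] = elapsed
        print("%-8s %-18s %0.3f ms" % (label, func.__name__, elapsed))
```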
Update: Recently I updated my blog to process posts as if they were written in markdown. Just updating the raw post to markdown syntax.