Class based views

We discussed these at a recent Vancouver Django meetup (May 2009), and it was also a pattern I used on a recent project.

Traditionally in Django a URL points to a function. So something like this is probably what you are familiar with:

from django.conf.urls.defaults import patterns

urlpatterns = patterns('',
    (r'^hello-function/$', 'recipe_9.views.hello_function')
)

And a view that looks like this:

from django.http import HttpResponse
        
def hello_function(request):
    return HttpResponse("Hello world!")

As it turns out, Django only needs something that is callable, so this can be a class. Let’s just show the difference. The URLs:

from django.conf.urls.defaults import patterns
from views import hello_class

urlpatterns = patterns('',
    (r'^hello-class/$', hello_class()),
)

The class:

from django.http import HttpResponse

class hello_class:
    def __call__(self, request):
        return HttpResponse("Hello world!")

Oh and let’s not forget some tests to prove this works:

from django.test import TestCase                
from django.test.client import Client

class tests(TestCase):
    def testFunction(self):
        clt = Client()
        res = clt.get("/hello-function/")
        assert res.status_code == 200
        assert res.content == 'Hello world!'
        
    def testClass(self):
        clt = Client()
        res = clt.get("/hello-class/")
        assert res.status_code == 200
        assert res.content == 'Hello world!'

What’s the advantage of this? Well, since it’s a class you get all the advantages of a class: doing things in __init__, subclassing, overriding __call__ and so on. Let’s take an example. An extension of the Kenyan project for me recently was a similar project in another country. Similar but, of course, not the same. There were differences in the text and in how certain situations are handled. So I made all the views point to classes (at this point I will add that I also altered our URLResolver). Anyway, all the requests in these projects do the same thing:

  • parse the user input into a form
  • validate the user input
  • process it
  • return a response

This logic of processing the request in this order has been pushed into a class. All my views inherit from it. So now the code for country X only deals with the specific parts for that country. When I add country Z to the mix, I just need to change the specific parts for that country.
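To make the idea concrete, here is a minimal sketch in plain Python rather than the code from the project. The class names (BaseView, KenyaView, CountryXView), the dict standing in for a request and the dict returned as a response are all assumptions for illustration; in a real Django view you would return an HttpResponse.

```python
class BaseView(object):
    """Hypothetical base class: runs the four steps in a fixed order."""

    def __call__(self, request):
        data = self.parse(request)
        errors = self.validate(data)
        if errors:
            return self.respond({"ok": False, "errors": errors})
        return self.respond({"ok": True, "result": self.process(data)})

    def parse(self, request):
        # Assumption: the request is just a dict of user input.
        return dict(request)

    def validate(self, data):
        # Return a list of error strings; empty means valid.
        return [] if data.get("text") else ["text is required"]

    def process(self, data):
        # Each country-specific view supplies its own processing.
        raise NotImplementedError

    def respond(self, payload):
        # Stand-in for building an HttpResponse.
        return payload


class KenyaView(BaseView):
    def process(self, data):
        return "KE: %s" % data["text"]


class CountryXView(BaseView):
    def validate(self, data):
        # Reuse the shared checks, then add a country-specific one.
        errors = BaseView.validate(self, data)
        if len(data.get("text", "")) > 140:
            errors.append("text too long")
        return errors

    def process(self, data):
        return "X: %s" % data["text"]
```

Only process (and, where needed, validate) changes per country; the order of the steps lives in one place.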

Whilst this may sound a bit specialised, many applications can use this. For example, one of the first applications I have takes every request and then checks the account information on that request (it doesn’t use sessions, for other reasons). So the first line of every view is a call to go and get that information. Making it part of a template request processor sort of works, but you have to do work to dig that information back out.

In the end, making it a class-based view adds only about 3 lines of code but makes your app much more reusable and adaptable in the future.

Useful Django APIs

Two useful Django APIs that I’ve found today.

Model to Dict

A useful little fella hiding away in the forms module. Ever want a dictionary of your model and its fields? Then this one will give you that. Useful for mapping a model into a big string. For example, in a recent project we are sending a message about a facility. Pass a facility straight into a string and you’ll get:

>>> "%(id)s" % facility
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: 'Facility' object is unsubscriptable

But add in a model_to_dict call and you’ll get back a dictionary suitable for mapping:

>>> from django.forms.models import model_to_dict
>>> "%(id)s" % model_to_dict(facility)
'1'
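
The same idea can be tried without Django. This is only a sketch: Facility here is a plain class, not a model, and to_dict is a hypothetical helper that plays a similar role to model_to_dict by turning named attributes into a dictionary.

```python
class Facility(object):
    # Plain-Python stand-in for a model (an assumption, not a Django class).
    def __init__(self, id, name):
        self.id = id
        self.name = name


def to_dict(obj, fields):
    # Hypothetical helper: build a dict from the named attributes.
    return dict((name, getattr(obj, name)) for name in fields)


facility = Facility(1, "Example clinic")

# %-formatting with named placeholders needs a mapping, which the dict provides.
message = "Facility %(id)s: %(name)s" % to_dict(facility, ["id", "name"])
```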

Finding models

This one requires a few assumptions. Suppose you want to find a model called Log in your project. You don’t care what the model is or what application it is in, just that there is something called Log. We would presume that Log has all the appropriate API and makes sense in the project.

In my scenario, I need to find the table name of Log so that I can run some custom, raw SQL on it. I did not want to hardcode the table name, so I needed to find Log to access its db_table. The AppCache class in django.db.models.loading provides some very useful stuff.

from django.db.models.loading import AppCache

class _models:
    def __init__(self):
        app = AppCache()
        for m in app.get_models():
            setattr(self, m.__name__, m)

models = _models()

This creates an instance, models, that contains a pointer to each model definition. I called this resolve.py, which sounded like a reasonable name. So now I can do:

>>> from resolve import models
>>> models.Log
<class 'apps.sms.models.base.Log'>

And I’ve got the Log, regardless of where it came from.
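
One thing to watch with looking models up by name alone is that two applications could each define a model with the same name, and the later one would silently win. Here is a sketch of the same registry idea in plain Python that refuses duplicates. ModelRegistry and the example classes are hypothetical; in a Django project the list of classes would come from something like the get_models() call above.

```python
class ModelRegistry(object):
    """Hypothetical registry: look classes up by name, refusing duplicates."""

    def __init__(self, classes):
        self._by_name = {}
        for cls in classes:
            name = cls.__name__
            if name in self._by_name:
                raise ValueError("two models are both called %r" % name)
            self._by_name[name] = cls

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._by_name[name]
        except KeyError:
            raise AttributeError(name)


class Log(object):
    pass


class Car(object):
    pass


registry = ModelRegistry([Log, Car])
found = registry.Log  # the Log class itself
```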

Caching and signals

Adding in cached objects on signals.

Was reading this great talk on Django and performance from EuroDjangoCon. There are some good points in there; one I wanted to play with quickly was the caching framework and signals.

I’m a big fan of signals and the idea of setting and deleting the cache as objects change was just something I had to quickly play with. Nothing too revolutionary here, but it simply sets and deletes the cache entry when the object changes.

from django.db import models
from django.core.cache import cache
from django.db.models.signals import post_save, post_delete

def _key(model, id):
    return "%s.%s" % (model.__name__, id)

class Car(models.Model):
    make = models.CharField(max_length=255)
    
    def key(self):
        return _key(self.__class__, self.pk)

def create_cache(sender, **kw):
    cache.set(kw["instance"].key(), kw["instance"])

def delete_cache(sender, **kw):
    cache.delete(kw["instance"].key())

post_save.connect(create_cache, sender=Car)
post_delete.connect(delete_cache, sender=Car)

So each time the object is saved or deleted, the cache gets updated (and, thanks to being signals, this is simple to reuse for any class). And here’s a test case.

from django.test import TestCase
from django.core.cache import cache
from models import Car

class CacheTest(TestCase):
    def setUp(self):
        car = Car()
        car.make = "Avensis"
        car.save()
        
        self.car = car
        self.id = car.pk
        
    def testCar(self):
        key = self.car.key()
         
        assert cache.has_key(key)
        assert cache.get(key).make == "Avensis"
        
        self.car.make = "Auris"
        assert cache.get(key).make == "Avensis"
        self.car.save()
        assert cache.get(key).make == "Auris"
        
        self.car.delete()
        assert not cache.has_key(key)

As an improvement (and indeed the next recipe in my book) it does this through an ObjectManager so that you can just call:

Car.objects.cache(id)

And get the cached object back, setting the object in the cache if it’s not already there. There are some problems, including possible race conditions, as slide 30 of the presentation does point out. But it’s an interesting start and definitely something I need to play with a bit more.
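
The "get it from the cache, load and store it on a miss" step can be sketched in plain Python. This is not the manager from the book: DictCache is a tiny in-memory stand-in for a cache backend, and CachingLookup and its loader argument are hypothetical names.

```python
class DictCache(object):
    """Tiny in-memory stand-in for a cache backend (an assumption)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class CachingLookup(object):
    """Hypothetical helper: return the cached object, loading it on a miss."""

    def __init__(self, cache, loader, prefix):
        self.cache = cache
        self.loader = loader  # callable that fetches the object by primary key
        self.prefix = prefix
        self.loads = 0        # how many times we had to go to the loader

    def cached(self, pk):
        key = "%s.%s" % (self.prefix, pk)
        obj = self.cache.get(key)
        if obj is None:
            # Cache miss: load the object and remember it for next time.
            obj = self.loader(pk)
            self.loads += 1
            self.cache.set(key, obj)
        return obj


database = {1: "Avensis"}
lookup = CachingLookup(DictCache(), database.__getitem__, "Car")
first = lookup.cached(1)   # loaded from the "database"
second = lookup.cached(1)  # served from the cache
```

The gap between the get and the set is where a race can creep in: another process may change the object after it was loaded but before it was cached.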

Django, Malnutrition, SMS and Kenya

How I ended up doing one of the most interesting and important projects I’ve been involved with.

A while back I got home from dinner to find a quick Skype chat from a friend, Nate Aune that basically went “you free to go to Kenya to do some Django - oh and you have to leave on Sunday” (that was in about 4 days time). The response was “is this for real?”. But as it turns out it was and about 6 days later I was leaving on a plane to Kenya, to work on a Django project.

The person behind this project is Matt Berg, who works at Columbia University on the Millennium Villages project. This particular project was already in the works and was based on excellent work from Schuyler Erle; I was to provide support on site and add in some new features.

The goal of this particular Django project is to help health care workers resolve cases of malnutrition (and as it turned out malaria and other diseases) faster and help more children get treatment faster. To do this, the project uses cell phones.

So when a health care worker using the project finds a child with malnutrition, they enter the details into their cell phone and send a text message to a phone number. For all you twitterers out there, you will realize we hit the same downside: those details ideally need to be under 140 chars.

The setup is a modem listening to that number, which feeds into a Ruby daemon called spomsky that is part of the RapidSMS project. That pushes a request over to a RapidSMS project written in Django, and that’s where it gets relevant to this blog.

RapidSMS is a cool project that “is UNICEF’s open source platform for data collection, logistics coordination and communication allowing any mobile phone to interact with the web”. You can get the source here: http://rapidsms.org/.

Once the message comes in, it hits a Django process. Unlike most Django web applications, there are two processes running. The first is a RapidSMS process called route, which connects to spomsky and listens for texts coming in. The second is the standard Django runserver, which provides a web interface for managers to see what is going on.

So the message is evaluated, for example: does this child have one of the levels of malnutrition that warrant attention? If so, we send a message back to the health care worker asking for the child to be sent to the clinic. It will also route messages to the clinic and to the local health care co-ordinator.

This routing is done using a very clever decorator that allows you to assign what text message goes to what method, eg:

    @keyword(r'n(?:ote)? \+(\d+) (.+)')
    @authenticated
    def note_case(self, message, ref_id, note):
        [...]

Sending out a message is as simple in RapidSMS as doing:

    message.respond("Thanks")
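
To show roughly how a decorator like this can route text to methods, here is a small sketch in plain Python. It is not RapidSMS’s implementation: the keyword decorator, App and CaseApp below are stand-ins written for illustration, and authentication and the message object are left out. The regular expression is the one from the snippet above.

```python
import re


def keyword(regex):
    """Attach a compiled pattern to the function it decorates."""
    compiled = re.compile(regex, re.IGNORECASE)

    def decorator(func):
        func.pattern = compiled
        return func
    return decorator


class App(object):
    """Hypothetical SMS app: finds handlers by their pattern attribute."""

    def handle(self, text):
        for name in dir(self):
            method = getattr(self, name)
            pattern = getattr(method, "pattern", None)
            if pattern is None:
                continue
            match = pattern.match(text)
            if match:
                # The regex groups become the handler's arguments.
                return method(*match.groups())
        return "Unknown command"


class CaseApp(App):
    @keyword(r'n(?:ote)? \+(\d+) (.+)')
    def note_case(self, ref_id, note):
        return "Note added to case +%s: %s" % (ref_id, note)


reply = CaseApp().handle("note +12 has malnutrition and needs care now")
```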

In this example, you can send in a text message that reads (for example):

note +12 has malnutrition and needs care now

Where +12 is the child’s ID. The + was chosen to signify an ID, because on the particular phones used, the + is easy to get to without hitting too many buttons. Real messages would be more complicated and involve use of a MUAC measurement for example:

muac +26 7.5 2150 1.4 n

That rather cryptic text would give you back (new lines added for readability):

MUAC> SAM Patient requires OTP care. 
  +26 MADISON, M, F/4 (Sally). 
  MUAC 75 mm, 2.1 kg, 140 cm

These responses would go to health care facilitators, co-ordinators or clinics as relevant. Not only does this give people the notification to follow up and make sure that the children get treatment, it also allows the data to be recorded.

On the back end in the office, there’s that second Django process which serves out a front end, that allows them to search, find and get data on the users. As that UI evolves it will let the users look up that data by clinic, district and so on.

This might sound simple, but for all those incoming messages, we wrote a lot of tests. I’ve done some mission critical code before, but knowing that I could miss a child with a serious medical condition, made me even more paranoid than normal. RapidSMS includes the really nice ability to write tests in a simple way.

For example, here’s a test of an incoming malaria report for a child, sent from phone number 7654321 (incoming messages are signified by the >). The response is then sent back to the reporter and also to a supervisor at 7654322 (outgoing messages are signified by the <):

7654321 > mrdt +26 y n f
7654321 < MRDT> Child +26, MADISON, Molly, F/4 has MALARIA. 
    Child is 4. Please provide 2 tabs of Coartem (ACT) 
    twice a day for 3 days
7654322 < MRDT> Child +26, MADISON, Molly, F/4 (None) has MALARIA. 
    CHW: @jdoe 7654321
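
To get a feel for how a script in that shape can drive a test, here is a rough sketch of a parser in plain Python. It is not how RapidSMS reads its test scripts; parse_script is a hypothetical function, the sample message is shortened, and the rule that indented lines continue the previous message is my assumption.

```python
def parse_script(script):
    """Split a test script into (phone, direction, text) steps."""
    steps = []
    for line in script.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and steps:
            # Assumption: an indented line continues the previous message.
            phone, direction, text = steps[-1]
            steps[-1] = (phone, direction, text + " " + line.strip())
            continue
        phone, direction, text = line.split(None, 2)
        steps.append((phone, direction, text.strip()))
    return steps


SCRIPT = (
    "7654321 > mrdt +26 y n f\n"
    "7654321 < MRDT> Child +26 has MALARIA.\n"
    "    CHW: @jdoe 7654321\n"
)

steps = parse_script(SCRIPT)
incoming = [s for s in steps if s[1] == ">"]  # messages to feed in
expected = [s for s in steps if s[1] == "<"]  # replies to check for
```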

And that’s it. For about 6 days I worked with Matt adding in new features, new forms and new tools for the health care workers. On one day I got the chance to go out into the field and watch the health care workers get trained. It was great to see them pick the system up quickly. The first few times, sending messages took a little bit of time, but after a while they really got the hang of it and seemed to appreciate it.

A brief aside here: the use of cell phones for this is absolutely brilliant. In Kenya there are lots and lots of cell phones. I won’t go as far as to say everyone has one, but lots do. One image that struck me was a farmer clearing a field and taking a call on his cell phone at the same time. Africa skipped the whole “stringing copper wires from poles” step that Western societies took. In Canada on every street corner there’s a Starbucks. In Nairobi it’s a cell phone store or pre-paid top up.

And that’s it. A few days later I was on a plane back over to Canada (well, 4 planes), leaving some people with the task of driving adoption and training users in the field on how to use the project.

Of all the projects I’ve done, it’s got the potential for being the most important - and that’s really cool. And the real potential of RapidSMS spreads way beyond just this project. The ability to set up a sophisticated distributed system quickly using SMS has so many possible applications. It could be great for inventory management, disaster management, anything. With RapidSMS and Django it becomes a real possibility to set such a system up quickly. Of course these sorts of systems have existed in the past, but I doubt they have been set up as quickly and easily as RapidSMS allows.

A big thanks by the way to Matt and Jess and other Millennium Villages people for looking after this muzungu while I was in Africa. Also thanks to Schuyler for doing lots of great work and helping out on RapidSMS, spomsky and git.

Note: The personal details of getting to Kenya and back can be found over on my personal blog; the focus here is on the technical side. Further, if you are looking for details on the efficiency of malnutrition programs or this program in general, I don’t have them either. I don’t know anything about that, Django stuff only.

Turning off Django signals

While I’m on the subject, how to turn off signals that you don’t want on.

The first and most obvious signal to turn off is the one asking you for a username and password on syncdb, gosh that’s annoying. You can do this by calling: 

python manage.py syncdb --noinput

But if you want to do this in Python you can:

from django.contrib.auth import models as auth_app
from django.contrib.auth.management import create_superuser
from django.db.models.signals import post_syncdb
post_syncdb.disconnect(create_superuser, 
       sender=auth_app, 
       dispatch_uid = "django.contrib.auth.management.create_superuser")

The part that was annoying was figuring out just how to get it to disconnect. To find that dispatch_uid I had to go and read the source, which was not onerous at all. My previous post was about signals vs save. To all that I’ll add an annoyance with signals: if you load in fixtures, the signals get called. This is annoying because I wanted to do the following:

  • load in a set of users from fixtures
  • load in a set of profiles from fixtures

Unfortunately all the profiles failed to load, because the profiles had already been created when the users were imported. To solve this in my test or reset scripts I need to turn off the profile-creating signal:

from django.db.models.signals import post_save
from django.contrib.auth.models import User
from users.models import create_profile # this is where my signal is registered
post_save.disconnect(create_profile, sender=User)

I will chalk that up as a definite disadvantage of signals; perhaps there’s a more cunning way around that.
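
One possibly more cunning option is to wrap the disconnect and reconnect in a context manager, so the handler is only off while the fixtures load. The sketch below shows the pattern in plain Python: Signal is a minimal stand-in, not Django’s implementation, and muted is a hypothetical helper. With Django you would call disconnect and connect on the real signal, passing the same sender.

```python
from contextlib import contextmanager


class Signal(object):
    """Minimal stand-in for a signal; not Django's implementation."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        if receiver not in self.receivers:
            self.receivers.append(receiver)

    def disconnect(self, receiver):
        # Report whether the receiver was actually connected.
        if receiver in self.receivers:
            self.receivers.remove(receiver)
            return True
        return False

    def send(self, **kw):
        for receiver in list(self.receivers):
            receiver(**kw)


@contextmanager
def muted(signal, receiver):
    """Hypothetical helper: disconnect a receiver for the enclosed block."""
    was_connected = signal.disconnect(receiver)
    try:
        yield
    finally:
        # Reconnect even if loading the fixtures raised an error.
        if was_connected:
            signal.connect(receiver)


fake_post_save = Signal()
created = []


def create_profile(instance, **kw):
    created.append(instance)


fake_post_save.connect(create_profile)

with muted(fake_post_save, create_profile):
    fake_post_save.send(instance="fixture user")  # handler is off here

fake_post_save.send(instance="real user")         # handler is back on
```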

Signal vs overriding save

I’m a big fan of signals and wonder why people are so averse to using them, preferring to override save instead.

Probably every day on the Django IRC channel, someone asks about how to set a user on a model or about setting the time automatically. Fortunately thanks to this excellent article, we are all familiar with the former. The latter is easy to do as well, without having to resort to using the now deprecated (maybe) auto_now and auto_now_add.

Here’s how to do it overriding the save:

from django.db import models
from datetime import datetime

class Todo(models.Model):
    text = models.CharField(max_length=255)
    completed = models.BooleanField(default=False)
    timestamp = models.DateTimeField()
    
    def save(self, *args, **kw):
        if not self.id:
            self.timestamp = datetime.now()
        super(Todo, self).save(*args, **kw)

And here’s how to do it with a signal:

from django.db import models
from datetime import datetime

class Todo(models.Model):
    text = models.CharField(max_length=255)
    completed = models.BooleanField(default=False)
    timestamp = models.DateTimeField()
 
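def add_date(sender, instance, **kw):
    # Note: unlike the save() override above, this runs on every save, so it
    # also resets the timestamp on updates; check instance.pk to set it once.
    instance.timestamp = datetime.now()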

models.signals.pre_save.connect(add_date, sender=Todo)

Setting up a signal is surprisingly easy and one line less of code. Is it more expensive from a CPU point of view? Not really. Here’s a script that ran the two models:

from chapter_2.recipe_3.models import Todo
from time import time

start = time()
for x in range(0, 10000):
    todo = Todo()
    todo.text = "Get some milk %s" % x
    todo.save()
    
diff = time() - start
print "Time: %s" % diff

Running this with signals:

Time: 6.97824287415

Running this with save:

Time: 6.53473687172

I re-ran it a few times and never got a significant difference between the two. Sometimes the signals were faster. I think I can be reasonably sure that signals (in this simple scenario) do not add significant performance cost. So what’s the problem with signals?

The advantages:

  • The signal can be applied to any number of models, without having to repeat the save override all over the place. Admittedly, in the above example, if you made a base class and overrode save there, you wouldn’t have the problem.
  • Can be applied to models you don’t necessarily have access to or want to modify.
  • Overriding methods make me uncomfortable. The save is pretty obvious, but it’s easy to end up with a model that has a bunch of magic methods. Renaming one of them will cause grief as you are no longer overriding the correct one. And once I forgot to call super in the save, but that’s just my own incompetence.

The disadvantage:

  • You can end up with a lot of signals and forget which signal occurs when (it is easier to lose track of them than of lots of saves in the model inheritance)

Since creating a new signal and assigning it to just one specific model is very easy, the argument “use save when it’s specific to the model” seems moot. It’s just as easy to make a signal as to override save.

Where to put signals?

  • If it’s a signal related solely to one model, I recommend putting it on the model
  • If it’s a signal that is used in multiple places, I recommend a signals.py module (just make sure to add it to __init__.py so that it’s imported)

Content mirror and this site

How this site is configured and some more details on content mirror.

I’ve been asked for this a couple of times, so here’s some details on how Content Mirror works on this site.

This site’s main editing interface for editors (myself) is done in Plone. Plone mirrors its content into a Postgres database, which is synced to a Django site, so you are reading this in something served by Django.

Server configuration

My Plone site is running in a Virtual Machine on my Mac. It is in fact an out of the box Plone 3.x, that’s easy to install and set up. To that installation I added in Content Mirror and psycopg. For this I’m using Postgres, it would work with any database, but I much prefer Postgres.

Next I created a Postgres database locally. I did this locally because I wanted to develop the site and wanted to install and play with everything locally. Developing locally is fast and simple.

So I’ve got Plone and Postgres installed locally with Content Mirror pushing content to Postgres. With that up and running I tried it out by adding in a few documents, publishing them and checking that everything worked.

So here’s how the “About” page of Djangozen looks on the Plone side; you’ll notice it is vanilla, out of the box Plone:

Plone view of About

Building a front end

Once I’ve got everything going into the database, I can start to add in a front end. For this I chose Plango and then started modifying the heck out of it. Plango is here and is a front end in Django. It has a simple URL resolver that simply grabs the request, then looks that up in the database. So /about becomes a lookup in the database for a piece of content with no parent and an id of about.
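
As a rough illustration of that kind of lookup (not Plango’s actual code), here is a sketch in plain Python. The rows, their column names and the resolve function are all made up for the example; a real version would query the mirrored tables instead of a list.

```python
# Hypothetical content rows: each has a primary key, a parent and an id.
ROWS = [
    {"pk": 1, "parent": None, "id": "about", "title": "About"},
    {"pk": 2, "parent": None, "id": "blog", "title": "Blog"},
    {"pk": 3, "parent": 2, "id": "first-post", "title": "First post"},
]


def resolve(path, rows=ROWS):
    """Walk the path one segment at a time, matching on (parent, id)."""
    parent = None
    found = None
    for segment in [s for s in path.split("/") if s]:
        found = None
        for row in rows:
            if row["parent"] == parent and row["id"] == segment:
                found = row
                break
        if found is None:
            return None  # no such content: the view would return a 404
        parent = found["pk"]
    return found


about = resolve("/about")
post = resolve("/blog/first-post/")
```

Note that this does one lookup per path segment, so deeply nested content costs several lookups per request.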

There’s certainly some opportunity for optimization in those URL look ups, but we’ll talk about that later. The nice thing about having that Django front end is that it’s a snap to add in comments, simple openid login and so on without doing any Plone development at all.

The plugins section is added to and edited by end users, so that’s a Django app (one that’s embarrassingly simple).

Deployment

Deployment of this site was actually very easy. I have a Webfaction account I use for various non-critical things. It’s cheap and easy to set up - I’m running 4 sites on it at the moment for under $10 a month.

The Django code went into SVN. Then I made a Django app on my server, copying down the code I’d developed locally. Then I added in a Postgres database.

Moving data up is pretty simple: I stopped the local Postgres and made an ssh tunnel. Restart Plone and it’s now writing to the remote Postgres over the ssh tunnel. My script looks something like this:

sudo /etc/init.d/postgresql-8.3 stop
ssh -L 5432:server:5432 andymckay@server
sudo /etc/init.d/postgresql-8.3 start
 

The first time the connection is made, just run the bulk deployment script and the content is uploaded. After that all content will be incrementally deployed.

Gotchas

  • There’s currently a bug in Content Mirror that means the ATEvent content type has none of the extra fields (such as startdate, enddate, location…). Only I seem to get this issue at the moment however.
  • Repeatedly changing which Postgres database the Plone site talks to can cause Plone to give errors, so try not to flip the database too often via the ssh tunnel.
  • If you have a lot of content that is being added to the site on the front end, you lose a lot of the Plone advantage. For example I want people to be able to add events to Djangozen and have a review step… there isn’t a backwards step, so I can’t review that content in Plone.
  • Got to watch out for review states for things like navigation, RSS feeds, search and so on, if you want only published content to be viewed.
  • Things will be written to the database without Django knowing about it. So, for example, Plango has a full text search. A cron job runs every 5 minutes, looks for things that have been updated and adds or removes them from the full text search, as needed. A database trigger would work well, but I found Webfaction doesn’t allow them.
  • You need to fight Kupu to get it not to resolve UIDs in HTML, so that if you embed an image in a document (such as this one) you can unresolve it.
  • Not everything in Plone is going to be translated, but that’s part of the cost.
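
The cron-driven search update mentioned in the list above can be sketched like this in plain Python. It is only an illustration, not the job that runs on this site: sync_index, the row fields (pk, modified, state, text) and the dict used as an index are hypothetical, and rows deleted outright would need separate handling.

```python
def sync_index(index, rows, last_run):
    """Bring index (a dict of pk -> text) up to date with changed rows.

    Returns the newest modification time seen, to pass in as last_run
    on the next run.
    """
    newest = last_run
    for row in rows:
        if row["modified"] <= last_run:
            continue  # unchanged since the previous run
        if row["state"] == "published":
            index[row["pk"]] = row["text"]
        else:
            # No longer published: make sure it can't be found by search.
            index.pop(row["pk"], None)
        newest = max(newest, row["modified"])
    return newest


index = {2: "old draft"}
rows = [
    {"pk": 1, "modified": 5, "state": "published", "text": "hello"},
    {"pk": 2, "modified": 6, "state": "private", "text": "old draft"},
    {"pk": 3, "modified": 1, "state": "published", "text": "stale"},
]
last_run = sync_index(index, rows, 2)
```

Checking the review state here also covers the earlier gotcha about only showing published content.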

Summary

For Djangozen it’s probably overkill. Although writing this blog post in Kupu, adding images etc has been quite a pleasure compared to the Django admin interface. But now that most of the above gotchas have been resolved, it’s quite simple and quick to set this up and end up with a good site in no time.

Update: this doesn’t happen any more, it’s in tumblr now.

Keeping user profiles in sync

Django provides a user profile as a way to extend a user’s information. This isn’t created automatically for you when you create a new User, but it is pretty simple to do this.

The easiest way to do this is to use signals. By listening to the post_save signal on users, you can create a user profile on a save.

When the user object is deleted, the profile will be deleted too, because the delete cascades. To use this, don’t forget to register the UserProfile class in AUTH_PROFILE_MODULE (see settings at the end):

from django.db import models
from django.db.models.signals import post_save
from django.contrib.auth.models import User
from django.core.exceptions import ObjectDoesNotExist

class UserProfile(models.Model):
    user = models.ForeignKey(User, unique=True)

def create_profile(sender, **kw):
    user = kw["instance"]
    if kw["created"]:
        up = UserProfile(user=user)
        up.save()

post_save.connect(create_profile, sender=User)

And here’s a test to demonstrate it working.

from django.test import TestCase
from models import UserProfile
from django.contrib.auth.models import User

class UserTest(TestCase):
    def testUser(self):
        u = User.objects.create(
             username='admin',
             email='andy@clearwind.ca')

        u.save()
        
        assert u.get_profile()
        assert User.objects.count() == 1
        assert UserProfile.objects.count() == 1
        
        u.delete()
        
        assert User.objects.count() == 0
        assert UserProfile.objects.count() == 0

Since this got asked on the django IRC channel I figured it had enough reason to justify a blog post.

Django at PyCon

A wrap up of the happenings at PyCon.

Sadly I didn’t get to PyCon this year. In fact I haven’t ever been to PyCon, just what used to be the “Python Conference” which I think has been overtaken by PyCon.

Looking at the chat and IRC logs makes me realise how many people I know there and wish I’d been able to go. Next year. After being around the Zope, then Plone, then Django communities, it’s amazing how many people I knew and how many people those communities touched. 

It also seemed this year that Django was a big component, perhaps due to the uptake in its popularity over the last year. Some of the blog posts and talks I found:

  • James Bennett blogs on an ORM panel which included Django
  • Alex Gaynor’s PyCon wrap up
  • Jacob gave a talk on “Real World Django” (which on the live monitoring point I would like to add in Arecibo)

For the sheer humour and gall of standing up in front of an audience and letting anyone on IRC type, you have to watch this talk. It’s not really about Django and is a bit off towards the end, but Ian Bicking is worth listening to. And the IRC chat is great.

The most important thing though was of course:

Site launched

Djangozen is now live in some sort of state, hopefully working.

There’s nothing more boring than that first “oh this site is live” post. Usually because the normal developer person has gone into marketing mode and really thinks other people care about the site. I know you care. How could you not, with new Django sites going live every day?

Well here’s another. 

It covers things I want to know about Django above and beyond what’s on the main website. There are lots of other sites out there that cover this stuff, but I wanted to start off with two things:

  • Plugins. There’s a great site out there called djangoplugables; it has loads of products on it. However, there are a few features I wanted. You can’t add plugins or releases to it (I think it only indexes Google Code), there’s no categorisation and no RSS feed. Once I’ve got tagging working, those features are all here.
  • Clear Django. This is a book I’m starting to write that is a recipe style online book. It hasn’t got far as of the time of writing but it will all be online.

Of course all this is just empty words, with sites coming and going all the time, the real test is to come back in 6 months and see if I’m still here and doing this. Then you can really start to take notice.

In the meantime please bear with me as I finish off the features, clean up the typos and clean up the user interface.