South’s Design
I had a lot of interesting chats with people about South last week at EuroDjangoCon, and several eyebrows were raised at my both parsing models and then storing their definitions as dicts.
While this is a design that is partially born of backwards compatibility, it was mostly a conscious design choice I had to make, and so I'd like to now describe why I made it.
Firstly, for those who aren't aware, South's magic autodetection and auto-writing-migration code is based off of a file called 'modelsparser.py', which does what it says on the tin; it opens models.py, parses it, and extracts definitions directly.
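As a rough illustration of what "parse models.py and extract definitions" means, here is a minimal sketch built on Python's standard `ast` module. It is not modelsparser.py itself; the `SOURCE` string, the `Author` model and the `extract_field_definitions` helper are all made up for the example, and it assumes Python 3.8+ for `ast.get_source_segment`.

```python
import ast

# A hypothetical models.py, held in a string so the example is self-contained.
SOURCE = '''
class Author(models.Model):
    name = models.CharField(max_length=100)
    born = models.DateField(null=True)
'''


def extract_field_definitions(source):
    """Return {class_name: {attribute_name: source text of the call}}."""
    tree = ast.parse(source)
    found = {}
    for node in tree.body:
        if not isinstance(node, ast.ClassDef):
            continue
        fields = {}
        for stmt in node.body:
            # Only simple "name = SomeCall(...)" assignments are considered.
            if (isinstance(stmt, ast.Assign)
                    and len(stmt.targets) == 1
                    and isinstance(stmt.targets[0], ast.Name)
                    and isinstance(stmt.value, ast.Call)):
                text = ast.get_source_segment(source, stmt.value)
                fields[stmt.targets[0].id] = text
        found[node.name] = fields
    return found


definitions = extract_field_definitions(SOURCE)
print(definitions["Author"]["name"])
```

The point of the sketch is that nothing is imported or executed: the field definition is recovered purely as text, which is why custom fields need no cooperation from their authors.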
Now, my first response to this is also 'ew', but there's a reason for it: custom fields. If it weren't for those, introspection of the existing fields would work just fine, given a long list of special cases (I in fact have a file called modelsinspector.py sitting around that does just that - if you're interested, it's in the 'noparsing' branch on BitBucket. It doesn't do much, but it proves the concept).
However, with custom fields in the game, you have two options: hope they follow Django conventions and try to introspect common options, or put the onus on custom field authors to add a hook that makes it spit out a definition. While I would obviously choose the second one (possibly combined with some of the first), it simply doesn't provide the drop-in compatibility I want South to provide.
Still, as Alex Gaynor and several others have surmised, I could just pickle everything and stick that at the bottom. No introspection needed, and none of the weird namespace errors that crop up now because the models dictionary is simply a collection of eval-able strings.
However, I don't like pickle (the Python kind; the delicious food kind, on the other hand, is more than welcome in my household). The security issues don't matter here - after all, we're passing raw strings to eval instead - but it has two major features I don't like. Firstly, it's not at all human-readable - and while I don't imagine people editing their migrations by hand, I like to be able to see what is going on, and I like the fact that South still doesn't depend on you using startmigration - a design choice I'm very proud of.
Secondly, I've been burned by pickles in the past. If you remove a module that a pickled thing uses, pickle will crash and burn, and if you've not done them right, you won't know quite where the error happened. (For the purposes of argument, of course, I'm assuming I'd do it right, so this isn't really a point. Still...)
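The "remove a module and pickle crashes" problem is easy to reproduce. The sketch below fakes it with a throwaway module built at runtime; the `legacy_fields` module and `ColourField` class are invented for the demonstration and have nothing to do with South or Django.

```python
import pickle
import sys
import types

# Build a throwaway module containing one class, and register it so that
# pickle can find the class by its module path.
module = types.ModuleType("legacy_fields")
exec(
    "class ColourField:\n"
    "    def __init__(self, max_length):\n"
    "        self.max_length = max_length\n",
    module.__dict__,
)
sys.modules["legacy_fields"] = module

# Pickle stores a reference to legacy_fields.ColourField, not the class itself.
frozen = pickle.dumps(module.ColourField(max_length=7))

# Simulate the module being deleted from the project later on.
del sys.modules["legacy_fields"]

error = None
try:
    pickle.loads(frozen)
except ImportError as exc:
    # The failure surfaces inside pickle's import machinery, far away from
    # whatever code originally produced the frozen data.
    error = exc

print(type(error).__name__, error)
```

The frozen bytes are also a fair example of the readability complaint: printing `frozen` shows the module and class names buried in binary opcodes rather than anything you would want to read in a migration file.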
There's yet another option (see, my job isn't easy) - custom-pickling, which Alex suggested last Wednesday; it consists of roughly freezing the __dict__ of the field. This has some advantages - it's more human-readable - but you get errors similar to those in my current approach (where datetime.datetime wasn't unfreezing if people didn't have datetime imported).
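A minimal sketch of that idea, under stated assumptions: `FakeField`, `freeze` and `unfreeze` are hypothetical stand-ins written for this example (this is not Alex's proposal verbatim, nor anything in South), and each attribute is stored as its `repr()` so the result stays readable.

```python
import datetime


class FakeField:
    """Stand-in for a model field; not a real Django class."""

    def __init__(self, default=None, max_length=None):
        self.default = default
        self.max_length = max_length


def freeze(field):
    """Freeze the field's __dict__ as {attribute name: repr of its value}."""
    return {name: repr(value) for name, value in vars(field).items()}


def unfreeze(cls, frozen, namespace):
    """Rebuild an instance by eval-ing each stored repr in the given namespace."""
    field = cls.__new__(cls)  # bypass __init__, we restore attributes directly
    for name, text in frozen.items():
        setattr(field, name, eval(text, namespace))
    return field


frozen = freeze(FakeField(default=datetime.datetime(2009, 5, 1), max_length=10))
print(frozen)

try:
    unfreeze(FakeField, frozen, {})  # namespace without datetime
except NameError as exc:
    print("unfreeze failed:", exc)

thawed = unfreeze(FakeField, frozen, {"datetime": datetime})
print(thawed.default)
```

The frozen dictionary is readable, but the first `unfreeze` call fails for exactly the reason described above: the stored text mentions `datetime.datetime`, and that name has to exist wherever the text is evaluated. Note also that what gets frozen is the field's internal attributes, not the arguments it was constructed with.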
My main gripe, with both this and pickle, is that the internals of fields aren't guaranteed to be consistent between releases of Django, just the external API, and so South uses a method that lets it use the external API - in my mind, this is less hassle, since that's the thing that's really not going to change very much (and even if it does, you can easily see what South is trying to call, and run that in a shell yourself to see where the problem lies).
So, that's why models are frozen as a massive lump of dictionaries and triples. As with many things, it probably wouldn't be necessary if models had been done slightly differently (so there was a method that always spat out a nice, portable way to reincarnate a field), but with no reason for that previously, I would have been thoroughly impressed by the core developers' foresight if it existed.
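To show what "dictionaries and triples" can look like, here is a small sketch in which each field is stored as a (class name, positional arguments, keyword arguments) triple of eval-able strings and rebuilt by calling the class's normal constructor. The exact layout, the `FROZEN_MODELS` data, the `thaw_field` helper and the stand-in `CharField` and `ForeignKey` classes are assumptions made for illustration; South's real frozen format may differ in detail.

```python
class CharField:
    """Stand-in for a field class; only its constructor arguments matter here."""

    def __init__(self, max_length=None, null=False):
        self.max_length = max_length
        self.null = null


class ForeignKey:
    """Stand-in for a relation field."""

    def __init__(self, to, null=False):
        self.to = to
        self.null = null


# Hypothetical frozen definitions: every argument is kept as a string of
# Python source, so the whole structure is plain, readable data.
FROZEN_MODELS = {
    "blog.post": {
        "title": ("CharField", [], {"max_length": "100"}),
        "author": ("ForeignKey", ["'auth.User'"], {"null": "True"}),
    },
}


def thaw_field(triple, namespace):
    """Rebuild a field by calling its constructor with the eval-ed arguments."""
    class_name, args, kwargs = triple
    field_class = namespace[class_name]
    real_args = [eval(text, namespace) for text in args]
    real_kwargs = {key: eval(text, namespace) for key, text in kwargs.items()}
    return field_class(*real_args, **real_kwargs)


NAMESPACE = {"CharField": CharField, "ForeignKey": ForeignKey}
title = thaw_field(FROZEN_MODELS["blog.post"]["title"], NAMESPACE)
author = thaw_field(FROZEN_MODELS["blog.post"]["author"], NAMESPACE)
print(title.max_length, author.to, author.null)
```

Because `thaw_field` only ever goes through the constructor, it depends on the field's external API rather than its internals, and any single triple can be pasted into a shell and called by hand when something goes wrong.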
As for parsing versus introspection... I'm still divided. Parsing has always felt, and still feels, wrong, but introspection might just be less reliable. If you're interested, go peek at the branch I mentioned above, and see what you make of it.
And now, I will stop blathering and return to my small hole at the base of the mighty Django Mountain, where I am attempting to fit square pegs into round holes. Good day.

comments
South is very useful.
Thanks for giving us this wonderful tool.
Indeed, thanks very much for South.
There's another use case that introspection handles easily and parsing borks: fields that magically add other fields to the model; i.e. a MarkupField that adds a _html field, or some of the db-translation options out there.
Sorry for the double-post. I also meant to say that I hear the argument about internal field API vs public API, but I'm not sure it's really much of a concern. django-evolution stores internal field attributes, and that has occasionally caused problems with updates to Django that changed Field internals (i.e. the move to making .unique a property), but they were minor and easily fixable. For my use cases it never caused anything like the hassle that South's parsing approach has.
That's true, Carl, of course, and one of the downsides - django-multilingual is but the latest in a series of such incompatibilities. As I said, I'm investigating introspection, and I have no qualms about changing if it ends up being cleaner.
Could you pickle the models and then include a human readable version as a comment for reference?
We could, Ben, but that offends my sense of non-repetition, as well as negating the editability aspect of it.
Still, it makes pickles more digestible, I agree. I'm focusing on trying to move away from parsing first, though; that's the bigger of my two elephants in the room, I think.
Have you considered using jsonpickle to freeze models?
http://jsonpickle.googlecode.com/svn/docs/index.html
If you don't like everything being stored in a string, you could adapt it to generate dictionaries and tuples.
But maybe outputting to a DSL would be even better than that? At least then, you'd be able to catch syntax errors in hand-hacked model definitions...
Sorry if this is a bad place to post a question, but I found this page looking for info about how South handles custom field types...
Specifically I'd like to use django-tagging, and I saw it uses a new 'TagField' type. Any idea if this will work okay with South?
Paul, you should really use the mailing list or the IRC channel!
Still, for django-tagging in particular, mercurial tip now has introspection rules for it built-in. There are some pages in the docs on custom fields, too.