The missing documentation for django.utils.datastructures

By : Thejaswi Puthraya

Note

django.utils.datastructures is intentionally left undocumented by the Django core developers because it is an internal API and is liable to change without notice. This module is not covered by Django's backwards-compatibility policy. You have been sufficiently warned!

With the note out of the way, let's look at the interesting data structures in this file. You may ask why we should learn about them if we shouldn't be using them. Reading code is one of the best ways to learn, and this file has some beautiful code.

MergeDict is the first of the data structures in the file. It provides a dictionary-like interface but looks keys up across the multiple dictionaries passed to it at initialization.

Here's an example:

>>> md = MergeDict({"foo": "bar", "moo": "cow"}, {"abc": "def"})
>>> md["foo"]
'bar'
>>> md["abc"]
'def'
>>> md.get("abc")
'def'
>>> md["xyz"]
KeyError:
>>> md.items()
[('foo', 'bar'), ('moo', 'cow'), ('abc', 'def')]
>>> md.keys()
['foo', 'moo', 'abc']
>>> md.values()
['bar', 'cow', 'def']

Within Django, MergeDict is used when attaching values to a form widget and in request.REQUEST.
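The lookup behaviour shown above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Django's actual code: SimpleMergeDict is a hypothetical name, and the sketch assumes that the first dictionary containing a key takes precedence.

```python
class SimpleMergeDict:
    """Hypothetical stand-in for MergeDict: search several dicts in order."""

    def __init__(self, *dicts):
        self.dicts = dicts

    def __getitem__(self, key):
        # Return the value from the first dictionary that contains the key.
        for d in self.dicts:
            try:
                return d[key]
            except KeyError:
                pass
        raise KeyError(key)

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def __contains__(self, key):
        return any(key in d for d in self.dicts)


md = SimpleMergeDict({"foo": "bar", "moo": "cow"}, {"abc": "def", "foo": "shadowed"})
print(md["foo"])                 # bar (assumed: the first dictionary wins)
print(md["abc"])                 # def
print(md.get("xyz", "missing"))  # missing
```

Note that nothing is copied: the wrapper only keeps references to the original dictionaries, so later changes to them are visible through it.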

The built-in dictionary does not maintain the order of the items but the SortedDict is a subclass of the built-in dictionary that maintains the keys in exactly the same order they were inserted.

Here's an example:

>>> dd = {"foo": "bar", "moo": "cow", "abc": "def"}
>>> dd
{'abc': 'def', 'foo': 'bar', 'moo': 'cow'}
>>> sd = SortedDict((("foo", "bar"), ("moo", "cow"), ("abc", "def")))
>>> sd
{'foo': 'bar', 'moo': 'cow', 'abc': 'def'}
>>> dd["xyz"] = "pqr"
>>> dd
{'abc': 'def', 'foo': 'bar', 'moo': 'cow', 'xyz': 'pqr'}
>>> dd["lmn"] = "ghi"
>>> dd
{'abc': 'def', 'foo': 'bar', 'lmn': 'ghi', 'moo': 'cow', 'xyz': 'pqr'}
>>> sd["xyz"] = "pqr"
>>> sd
{'foo': 'bar', 'moo': 'cow', 'abc': 'def', 'xyz': 'pqr'}
>>> sd["lmn"] = "ghi"
>>> sd
{'foo': 'bar', 'moo': 'cow', 'abc': 'def', 'xyz': 'pqr', 'lmn': 'ghi'}

SortedDict is fairly widely used inside Django, generally to build a hierarchy (such as a model and its parents) or to preserve the order of form fields while iterating over them.

Python 2.7 introduced a similar data structure in the collections module, called OrderedDict, which mimics SortedDict.
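For comparison, here is a short example of the same insertion-order behaviour using collections.OrderedDict from the standard library. It mirrors the SortedDict session above, with the same keys and values, and needs no Django install.

```python
from collections import OrderedDict

od = OrderedDict((("foo", "bar"), ("moo", "cow"), ("abc", "def")))
od["xyz"] = "pqr"
od["lmn"] = "ghi"

# Keys come back in the order they were inserted.
print(list(od.keys()))  # ['foo', 'moo', 'abc', 'xyz', 'lmn']

# Re-assigning an existing key changes its value but not its position.
od["foo"] = "baz"
print(list(od.items())[0])  # ('foo', 'baz')
```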

MultiValueDict is a dictionary subclass that can handle multiple values assigned to a key.

Here's an example:

>>> dd = {"abc": "def", "foo": ["bar1", "bar2"]}
>>> dd["foo"]
['bar1', 'bar2']
>>> mvd = MultiValueDict({"abc": "def", "foo": ["bar1", "bar2"]})
>>> mvd["foo"]
'bar2'
>>> mvd.getlist("foo")
['bar1', 'bar2']
>>> mvd.getlist('blah')
[]
>>> mvd.getlist('abc')
'def'
>>> mvd.setlist('xyz', ['pqr', 'ghi'])
>>> mvd
<MultiValueDict: {'xyz': ['pqr', 'ghi'], 'abc': 'def', 'foo': ['bar1', 'bar2']}>
>>> mvd.appendlist('xyz', 'ijk')
>>> mvd
<MultiValueDict: {'xyz': ['pqr', 'ghi', 'ijk'], 'abc': 'def', 'foo': ['bar1', 'bar2']}>
>>> mvd.update({'xyz': 'lmn'})
>>> mvd
<MultiValueDict: {'xyz': ['pqr', 'ghi', 'ijk', 'lmn'], 'abc': 'def', 'foo': ['bar1', 'bar2']}>

Django uses MultiValueDict to bind data to request.POST and files to request.FILES, and when parsing GET parameters.
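The core idea, storing a list per key while plain item access returns only the last value, can be sketched as follows. SimpleMultiValueDict is a hypothetical name used for illustration; it assumes every value is stored as a list and leaves out most of what the real class does.

```python
class SimpleMultiValueDict(dict):
    """Hypothetical sketch: each key maps to a list of values."""

    def __getitem__(self, key):
        values = super().__getitem__(key)
        # Plain lookup yields the last value; an empty list yields [].
        return values[-1] if values else []

    def getlist(self, key):
        # Return a copy of the stored list, or [] for an unknown key.
        try:
            return list(super().__getitem__(key))
        except KeyError:
            return []

    def setlist(self, key, values):
        super().__setitem__(key, list(values))

    def appendlist(self, key, value):
        # Create the list on first use, then append to it.
        if key not in self:
            super().__setitem__(key, [])
        super().__getitem__(key).append(value)


mvd = SimpleMultiValueDict({"foo": ["bar1", "bar2"]})
mvd.setlist("xyz", ["pqr", "ghi"])
mvd.appendlist("xyz", "ijk")

print(mvd["foo"])           # bar2
print(mvd.getlist("foo"))   # ['bar1', 'bar2']
print(mvd.getlist("blah"))  # []
print(mvd.getlist("xyz"))   # ['pqr', 'ghi', 'ijk']
```

This shape suits form data, where a key such as a multi-select field may legitimately arrive several times in one request.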

ImmutableList is an immutable data structure that raises an error on any attempt to mutate it.

Here's an example:

>>> il = ImmutableList(['foo', 'bar', 'abc'])
>>> il += 'lmn'
AttributeError: ImmutableList object is immutable.
>>> il = ImmutableList(['foo', 'bar', 'abc'], warning='Custom warning')
>>> il[1] = 123
AttributeError: Custom warning

The ImmutableList is used in request.upload_handlers to prevent modification after the request.POST or request.FILES have been accessed.
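One way to get this behaviour is to subclass tuple and make every list-style mutator raise. The sketch below is an illustration only: SimpleImmutableList and _complain are hypothetical names, and the set of blocked methods is an assumption rather than a copy of Django's list.

```python
class SimpleImmutableList(tuple):
    """Hypothetical sketch: a tuple whose list-style mutators raise an error."""

    def __new__(cls, iterable=(), warning="SimpleImmutableList object is immutable."):
        self = super().__new__(cls, iterable)
        # Tuple subclasses may carry instance attributes, so keep the message here.
        self.warning = warning
        return self

    def _complain(self, *args, **kwargs):
        raise AttributeError(self.warning)

    # Route every mutating operation to the same error.
    __setitem__ = __delitem__ = __iadd__ = _complain
    append = extend = insert = pop = remove = sort = reverse = _complain


il = SimpleImmutableList(["foo", "bar", "abc"], warning="Custom warning")

try:
    il[1] = 123
except AttributeError as exc:
    print(exc)  # Custom warning

print(il[0], len(il))  # foo 3
```

Reads such as indexing, iteration and len() come from tuple unchanged, so only the mutating calls are affected.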

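DictWrapper is a subclass of the built-in dictionary that treats keys beginning with a given prefix specially. It takes a dictionary, a function and a prefix as arguments. If the key being looked up begins with the prefix, the prefix is stripped, the remaining key is looked up, and the value is passed through the function before it is returned. Keys without the prefix are returned untouched.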

Here's an example:

>>> dw = DictWrapper({'foo': 'bar', 'moo': 'cow'}, lambda x: x, 'abc_')
>>> dw['foo']
'bar'
>>> dw['abc_foo']
'bar'
>>> dw['xyz_foo']
KeyError: 'xyz_foo'
>>> def post_process_value(value):
...     return "The value is " + value
>>> dw = DictWrapper({'foo': 'bar', 'moo': 'cow'}, post_process_value, 'abc_')
>>> dw['foo']
'bar'
>>> dw['abc_foo']
'The value is bar'

The DictWrapper is used in quoting names for SQL queries with the key prefix.
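To make the quoting use case concrete, here is a minimal sketch of the prefix lookup combined with %-style string formatting. SimplePrefixDict, the "qn_" prefix, the quote helper and the table name are all hypothetical and chosen for illustration; this is not Django's implementation.

```python
class SimplePrefixDict(dict):
    """Hypothetical sketch: post-process values whose key carries a prefix."""

    def __init__(self, data, func, prefix):
        super().__init__(data)
        self.func = func
        self.prefix = prefix

    def __getitem__(self, key):
        # A prefixed key is stripped, looked up, and passed through func.
        if key.startswith(self.prefix):
            value = super().__getitem__(key[len(self.prefix):])
            return self.func(value)
        # An unprefixed key behaves like a normal dictionary lookup.
        return super().__getitem__(key)


def quote(name):
    # Assumed quoting rule for the example: wrap the name in double quotes.
    return '"%s"' % name


dw = SimplePrefixDict({"table": "auth_user"}, quote, "qn_")

print(dw["table"])     # auth_user
print(dw["qn_table"])  # "auth_user"

# %-formatting with a mapping goes through __getitem__, so the prefix
# decides per placeholder whether the value gets quoted.
sql = "SELECT * FROM %(qn_table)s" % dw
print(sql)  # SELECT * FROM "auth_user"
```

The appeal of this design is that the template author chooses between the raw and the processed value simply by how the placeholder is spelled.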

I hope you enjoyed learning about these hidden gems and how Django works under the hood, but do keep the note at the top in mind.



Topics : django internals

Comments

Jacob Kaplan-Moss

This is great! Would you be OK with me adapting this to put in the official Django documentation? I'd credit you, of course. We've been wanting to include this sort of internals documentation, but have never really had the time.

Randle Taylor 1st Nov., 2012

Nice post! I wish I knew about SortedDict before today. I have often wanted to use collections.OrderedDict in my Django apps but avoided it to maintain compatibility with Python versions < 2.7.

Thejaswi Puthraya

Hello Jacob, I've got a green signal from Shabda regarding the usage of this in the documentation. I was under the impression that these were intentionally undocumented.

I would be more than happy to document other internals as well when I dive into the source code.

Samat

A heads up: Python 3.3 includes collections.ChainMap, which is pretty similar to Django's MergeDict.

Thejaswi Puthraya

Samat, thanks for the heads up. The Collections module is an underdog!

Grant Jenks

For the SortedDict type, consider using one of Python's many dictionary implementations that maintains the keys in sorted order. These implementations support faster get/set/iter operations. The sortedcontainers module (http://www.grantjenks.com/docs/sortedcontainers/) is a pure-Python and fast-as-C implementation that's fully tested and documented. There's also a performance comparison (http://www.grantjenks.com/docs/sortedcontainers/performance.html) that benchmarks several popular options against one another.

© Agiliq, 2009-2012