Enhanced python-cjson

NOTE: This enhanced version is released under the same LGPL licence as the original module. Please do not contact the original author (Dan Pascu) regarding this version. Send your bug reports to python@cx.hu. Thank you.

How to install on Linux
How to install on Windows
Example (not as complicated as it seems)
Throughput measurement, comparison with simplejson


How to install on Linux

Make sure to install the following packages:

binutils gcc libc6-dev linux-kernel-headers python-dev

These are Debian package names. Install the equivalent packages on other distributions.

You can compile and install python-cjson by issuing

python setup.py install
as root or with sudo. There should be no errors. The cjson.so library file will be copied into the site-packages directory. The library file may contain debug symbols, depending on your default compiler options. You can strip the library to save a bit of memory:
strip cjson.so
On Debian, the library is installed into Python's main site-packages directory and not under /usr/local. If you prefer, you can move the library file to the corresponding site-packages directory under /usr/local. You can test the library by running testjson.py. The build directory can be safely deleted after installation.
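
If you are not sure which site-packages directory the library ended up in, a quick check from Python can help. This is only a minimal sketch: the helper name locate_module is made up for illustration and is not part of python-cjson.

```python
def locate_module(name):
    """Return the file a module was loaded from, or None if it cannot be imported."""
    try:
        module = __import__(name)
    except ImportError:
        return None
    # Modules compiled into the interpreter have no __file__ attribute.
    return getattr(module, '__file__', '(built-in)')

path = locate_module('cjson')
if path is None:
    print('cjson is not importable; check the site-packages location')
else:
    print('cjson loaded from %s' % path)
```

If the reported path is not the one you expected, an older copy of the library is probably shadowing the new one.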

How to install on Windows

Please send me an email if you need binary releases for older Python versions (2.3 and 2.4) on Windows. Do not forget to include your exact Python version. The minimum required Python version is 2.3.

To compile C extension modules for Python 2.4 or 2.5 with free tools you can use Giovanni Bajo's GCC 4.1.2 MINGW installer (I haven't tried it) or an older method using the official MinGW installer (this worked for me):

For Python 2.4 and 2.5 on Windows:

1. Install MinGW from http://sourceforge.net/projects/mingw/
2. Add C:\MinGW\bin to the system PATH (use the System applet from the Control panel)
3. Build your extension with the --compiler=mingw32 argument:

python setup.py build --compiler=mingw32
or put a distutils.cfg file in the C:\python\lib\distutils directory (or wherever you installed Python) containing the following entries:
[build]
compiler = mingw32
After that you can install extension modules as usual (without the --compiler flag):
python setup.py install

Example: (not as complicated as it seems)

import re
import cjson
import datetime

# Encoding Date objects:
def dateEncoder(d):
    assert isinstance(d, datetime.date)
    return 'new Date(Date.UTC(%d,%d,%d))'%(d.year, d.month, d.day)

json=cjson.encode([1,datetime.date(2007,1,2),2], extension=dateEncoder)
assert json=='[1, new Date(Date.UTC(2007,1,2)), 2]'

# Decoding Date objects:
# Assumed pattern for the Date literal produced by dateEncoder.
re_date=re.compile(r'new\s+Date\(Date\.UTC\((\d+),\s*(\d+),\s*(\d+)\)\)')

def dateDecoder(json,idx):
    json=json[idx:] # look only at the part starting at the given index
    m=re_date.match(json)
    if not m: raise ValueError('cannot parse JSON string as Date object: %s'%json)
    dt=datetime.date(*[int(g) for g in m.groups()])
    return (dt,m.end()) # must return (object, character_count) tuple

data=cjson.decode('[1, new Date(Date.UTC(2007,1,2)), 2]', extension=dateDecoder)
assert data==[1,datetime.date(2007,1,2),2]

Download example.py

Note the extension keyword arguments.
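
A decoder extension can be tried on its own, without cjson, which makes the (object, character_count) contract easier to see. The sketch below is a hypothetical stand-alone variant of the decoder in the example above: the name parse_date_literal and the regular expression are assumptions made for illustration, not part of python-cjson.

```python
import datetime
import re

# Assumed pattern for the literal emitted by the encoder in the example.
DATE_LITERAL = re.compile(r'new\s+Date\(Date\.UTC\((\d+),\s*(\d+),\s*(\d+)\)\)')

def parse_date_literal(text, idx):
    """Parse a Date literal starting at text[idx].

    Returns (date, consumed), where consumed is the number of
    characters the literal occupies, counted from idx.
    """
    m = DATE_LITERAL.match(text, idx)
    if not m:
        raise ValueError('cannot parse JSON string as Date object: %s' % text[idx:])
    year, month, day = [int(g) for g in m.groups()]
    return (datetime.date(year, month, day), m.end() - idx)

source = '[1, new Date(Date.UTC(2007,1,2)), 2]'
value, consumed = parse_date_literal(source, 4)  # the literal starts at index 4
rest = source[4 + consumed:]                     # what is left to parse
```

The character count tells the caller where to resume parsing, so it has to cover the whole literal, including the closing parentheses.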

Comparison of python-cjson-1.0.3x5 and simplejson 1.7.1

Test environment:

Throughput measurements:

simplejson 1.7.1

Test data: tuples in dicts in a list, 603887 bytes as JSON string
Encoder throughput: ~747 kbyte/s
Decoder throughput: ~272 kbyte/s
Test script modifications required to measure simplejson instead of cjson: simplejson is imported, then cjson.encode calls are replaced by simplejson.dumps and cjson.decode calls by simplejson.loads.
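
The test script itself is not reproduced here, so the following is only a rough sketch of how such a measurement could be set up. It uses the standard-library json module as a stand-in so that it runs anywhere; the function names, the shape and size of the generated data, and the choice of 1 kbyte = 1024 bytes are all assumptions. To measure the packages compared here, pass cjson.encode and cjson.decode, or simplejson.dumps and simplejson.loads, to measure().

```python
import json
from timeit import default_timer

def make_test_data(rows=2000):
    # Tuples in dicts in a list (hypothetical data, not the original test set).
    return [{'id': i, 'point': (i, i * 0.5), 'tags': ('a', 'b', str(i))}
            for i in range(rows)]

def throughput_kbytes(func, arg, nbytes, repeat=5):
    # Kilobytes of JSON text processed per second, best of `repeat` runs.
    best = None
    for _ in range(repeat):
        start = default_timer()
        func(arg)
        elapsed = default_timer() - start
        if best is None or elapsed < best:
            best = elapsed
    # Guard against a zero reading on timers with coarse resolution.
    return nbytes / 1024.0 / max(best, 1e-9)

def measure(encode, decode, data):
    text = encode(data)
    nbytes = len(text)
    return {
        'bytes': nbytes,
        'encode_kbyte_s': throughput_kbytes(encode, data, nbytes),
        'decode_kbyte_s': throughput_kbytes(decode, text, nbytes),
    }

result = measure(json.dumps, json.loads, make_test_data())
print('%(bytes)d bytes, encoder ~%(encode_kbyte_s).0f kbyte/s, '
      'decoder ~%(decode_kbyte_s).0f kbyte/s' % result)
```

Taking the best of several runs reduces noise from other processes; the absolute numbers depend on the machine and the data.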

NOTE: The simplejson page states that 1.7.1 contains optional C code to speed up encoding. This was not used in the current test. A comparison including the speedup component will come soon. Stay tuned...

python-cjson 1.0.3x5 - compiler: gcc 3.4.2 mingw-special (MinGW 3.81)

Test data: tuples in dicts in a list, 603886 bytes as JSON string
Encoder throughput: ~9199 kbyte/s
Decoder throughput: ~9215 kbyte/s

python-cjson 1.0.3x5 - compiler: C compiler from Microsoft Visual C++ Toolkit 2003

Test data: tuples in dicts in a list, 603886 bytes as JSON string
Encoder throughput: ~9199 kbyte/s
Decoder throughput: ~8776 kbyte/s

It's interesting that the free gcc compiler builds a slightly faster decoder than the MS compiler. The stripped pyd (DLL) files are 17k for the MS compiler and 22k for gcc, so the MS build uses less memory (and possibly less code cache). The decoder throughput may differ because of some loop-unrolling optimization or similar, but I did not verify this.

Please remember that python-cjson requires a C compiler, whereas simplejson uses only the standard library. Simplejson could be a better choice if you want portability, but I recommend python-cjson for performance-critical applications, such as servers and frequent data conversion tasks.

NOTE: The results above may not reflect the real-world performance of these packages, but they show a clear difference. Python-cjson is more than 10x faster in encoding and more than 30x faster in decoding, at least with this data. I found similar results with other realistic data sets. Using extension functions with python-cjson and passing a lot of non-standard JSON data can affect average performance, but it does not slow down the processing of standard JSON data.
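
The speedup factors follow directly from the throughput figures listed above. The short calculation below just divides them; the variable and function names are made up for illustration.

```python
# Throughput figures (kbyte/s) copied from the measurements above.
simplejson_171 = {'encode': 747.0, 'decode': 272.0}
cjson_gcc = {'encode': 9199.0, 'decode': 9215.0}
cjson_msvc = {'encode': 9199.0, 'decode': 8776.0}

def speedup(fast, slow):
    # Ratio of throughputs for each operation.
    return dict((op, fast[op] / slow[op]) for op in ('encode', 'decode'))

gcc_ratio = speedup(cjson_gcc, simplejson_171)
msvc_ratio = speedup(cjson_msvc, simplejson_171)

print('gcc build:  encode %.1fx, decode %.1fx' % (gcc_ratio['encode'], gcc_ratio['decode']))
print('MSVC build: encode %.1fx, decode %.1fx' % (msvc_ratio['encode'], msvc_ratio['decode']))
```

This gives roughly 12x for encoding and between 32x and 34x for decoding, depending on the compiler.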

I've done performance testing on a dual Xeon 2.8GHz server with Debian Linux, and got excellent results:

Test data: tuples in dicts in a list, 603886 bytes as JSON string
Encoder throughput: ~9545 kbyte/s
Decoder throughput: ~16556 kbyte/s
