Upgrade to Pro — share decks privately, control downloads, hide ads and more …

PyCon Ireland 2013 - Denormalizing the Django ORM

PyCon Ireland 2013 - Denormalizing the Django ORM

(via http://python.ie/pycon/2013/talks/denormalizing_the_django_orm/)
Speaker: Sean Blanchfield (http://twitter.com/seanblanchfield)

An ORM can make database programming a pleasure. Unfortunately, they can also land you in a lot of trouble when you go live and find out that you can handle about 1 request per second. In this talk, I share our experience porting Utopia-Game.com to Django. We fully normalized a huge database schema, and discovered too late that each page load was generating hundreds of queries. Unable to refactor all the code in production, we dusted off our meta class magic skills, and extended the Django ORM to create the DenormalizedModel, which walked and talked like a normal Django Model class, but which sucked all the class's foreign key relationships in to one big BLOB field.

In this talk I'll walk through the problem and our solution, and share the code with the audience.

PyCon Ireland

October 12, 2013
Tweet

More Decks by PyCon Ireland

Other Decks in Technology

Transcript

  1. BACKGROUND Acquired Utopia (web game) from Jolt Online Gaming. Refactored

    20,000 lines of poorly written Delphi code into Python/Django. Database consisted of thousands of flat files in Windows filesystem. Begin R := PL(P,L)+AmtAdjust; AmtAdjust := 0; If R > 0.5 then R := 0.5; R2 := LD[L].Eff[Id].Amt/100; If L <> Lib then R2 := R2*Utilized(P); R3 := R2 * (1 - R * 2); If R3 < 0 then R3 := 0; R2 := (R2 + R3) / 2; R := R * R2 * 100; R2 := LD[L].Eff[Id].Max/100; If R2 > 0.01 then If R > R2 then R := R2; End; End;
  2. THE PROBLEM We fully normalized the database. 34 fully-normalized tables.

    Ended up with 400+ DB queries per page. 2+ seconds processing time per page on dev machines.
  3. CACHING? Data in a game is hot. 20%-50% of requests

    are POSTs. Started caching common data in request object, so it's retrieved just once per request. Still had 100+ DB queries per page.
  4. SOLUTION Denormalize data to structure it around access pattern -

    i.e., a player record. NoSQL still immature - chose to throw it into a BLOB in MySQL Now faced with refactoring 20K lines of Python on a deadline :-| IDEA: Make the BLOB field behave like the ORM.
  5. METACLASS MAGIC The BLOB contained a dict of nested lists

    of dicts - data that was previously in related tables. Can generate ORM-like attributes on Django models that retrieve and set the contents of the BLOB.
  6. EXAMPLE USAGE import denormalize document_schema = { 'num_wins': 0, 'dragons':

    [ { 'target': 0, 'dragon_type': '' } ] } BaseClass = denormalize.make_denormalized_base('KingdomBase', document_schema) class Kingom(BaseClass): #class Kingdom(models.Model) # Usual Django model stuff name = models.CharField(max_length=64)
  7. RESULT: LISTS OF DICTS ACT LIKE RELATED OBJECTS. k =

    Kingdom.objects.get(pk=1) print k.name print k.num_wins k.dragons.objects.create(target=2, dragon_type='blue') print k.dragons.objects.all() d = k.dragons.objects.get(target=2) d.dragon_type = 'green' d.save() k.dragons.filter(dragon_type='green').delete()
  8. BUT HOW? Generated base class wires up dictionary element to

    proxy objects. Proxies for ints, strings etc are trivial. Nested dicts are wired up to a proxy object that acts like a Django model object. Lists are wired up to a proxy object that acts like a Django RelatedManager.
  9. BENEFITS 1. About 3 day's development 2. 100+ queries per

    page reduced to 3 to 10 3. Greatly reduced scope for deadlock 4. Allows seamless mixture of relational and denormalized design within Django ORM. 5. Remained flexible through later development.