A Python and XML Blog


Mapping is Bad

Posted in Uncategorized by pythonxml on September 5, 2007

Lately there has been a good deal of discussion regarding Object Relational Mappers for Python. The discussion (from what I can tell) stemmed from people layering things like Elixir on top of SQLAlchemy and ending up with more problems than they started with. I think this issue is almost identical to the one surrounding the different marshalling libraries for XML. The real issue is a perceived deficiency in some core data model compared to the programming model. There is a concern that having to think in two different conceptual paradigms will greatly slow development. Another aspect is the maintenance of a system with different languages and models sprinkled around. While you can run into some tough issues when constantly switching between models, normalizing each model to the programming paradigm is extremely costly.

When thinking about this issue, I always come back to casting. Before I understood generics in C#, I was using general-purpose collections where I constantly had to keep the type of each object in mind and cast it back. Even though I eventually realized my issues were solvable, it became clear that having to reshape information every time you want to use it is time consuming. One thing someone can do is create a translator or interface that makes transitions between types easier. This is exactly the path an ORM takes, as does an XML marshalling library. Each aims to solve the problem of translating some database or XML into the model of the programming language, which is more often than not an object-oriented model.
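
To make the translator idea concrete, here is a minimal sketch of hand-rolled marshalling using only the standard library. The sample document, the `Book` class, and the `marshal_books` function are all made up for illustration; real marshalling libraries do far more than this, but the shape of the work is the same: every element gets reshaped into an object before you can use it.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample document; the element names are invented for this sketch.
LIBRARY_XML = """<library>
  <book id="b1"><title>Dive Into Python</title><year>2004</year></book>
  <book id="b2"><title>Python in a Nutshell</title><year>2006</year></book>
</library>"""


@dataclass
class Book:
    """The object-oriented model the XML is translated into."""
    id: str
    title: str
    year: int


def marshal_books(xml_text):
    """Translate every <book> element into a Book instance."""
    root = ET.fromstring(xml_text)
    return [
        Book(
            id=book.get("id"),
            title=book.findtext("title"),
            year=int(book.findtext("year")),  # text has to be cast by hand
        )
        for book in root.findall("book")
    ]


books = marshal_books(LIBRARY_XML)
print(books[0].title)
```

Note that any change to the document's structure means a matching change to both the class and the translation function.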

Another path is to look at the problem from a build management standpoint. Instead of thinking about how to constantly translate data into one paradigm, try programming according to the data. This is the natural pattern for XML when you use tools such as XPath and XSLT. For example, in the XML-as-data case, you can just query the document via XPath for some value, or run an XSLT on the file to change it into what you need. At that point the problem is no longer how to deal with the data, but how to deal with a build system that includes potentially many different models. The question comes closer to "should I name my folder “xslt” or “transforms”, and how do I resolve those file locations?"
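
As a rough sketch of what querying the data on its own terms looks like, the snippet below uses the limited XPath subset supported by the standard library's `xml.etree.ElementTree`. The catalog document and its element names are invented for the example, and full XPath or XSLT would need a third-party library, which is left out here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are invented for this sketch.
CATALOG_XML = """<catalog>
  <book id="b1"><title>Dive Into Python</title><year>2004</year></book>
  <book id="b2"><title>Python in a Nutshell</title><year>2006</year></book>
</catalog>"""

catalog = ET.fromstring(CATALOG_XML)

# Ask the document for exactly the value you need; no intermediate classes.
first_title = catalog.findtext(".//book[@id='b1']/title")

# Select the titles of every book whose <year> child has the text "2006".
titles_2006 = [t.text for t in catalog.findall("./book[year='2006']/title")]

print(first_title)
print(titles_2006)
```

There is no model to keep in sync here. If the document grows new elements, the existing queries keep working as long as the paths they name still exist.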

This second case is not a trivial issue of course, but it is one that has been solved. In Python, we have package resource tools such as eggs, easy_install, and setuptools. C# also has compiled transforms and the ability to save an XSLT as a DLL, which means you could work with it from the GAC or use it via traditional Visual Studio project inclusion. None of this is perfect, but it changes the problem space: instead of translating between data types, the issue becomes working with files. That problem has been around for quite a long time, so there are plenty of existing solutions that can fit within any application.

That said, tools such as SQLAlchemy and Amara provide a great way to get the simple stuff done quickly while staying out of the way when things get complicated. The cost is a slightly less optimized API, but overall the benefits are huge because the maintenance question is already answered by the fuller-featured API. It can be tough to switch models all the time, but with the web being what it is and developers already feeling pressure to understand more languages and paradigms, accepting the challenge seems like a good first step toward eventually finding better solutions to constant transitions.
