Often we face a problem where and how to store our domain objects. This decision may be of great importance later. In this article, I’ll present you an option of storing serialized domain objects in a database. We’ll look at XML object serialization in a relational database. We’ll get to know the pros and cons of such an approach.
When we may want to store a serialized object in the database?
The most popular approach of storing objects in relational databases is a table with a column per property.
Let’s assume we have an object like this:
Standard approach would be to create a table with 5 columns:
All looks clear and simple. Yet, what if we have a complex object with dozens of properties which are not always primitive types but custom types? If we don’t use ORM code-first approach to generate database tables the process of creating tables, columns and mapping will be time-consuming.
There might be an idea to create a table with one xml column and store serialize object there. This might look like this:
When would you consider using that approach?
Objects to store are big so it’s time-consuming to maintain database schema and code mapping
Limited time for development
Objects rarely (or never) need to be modified
Interaction with the database doesn’t need any mapping so code is significantly reduced.
Reading an object might look like this:
Object obj = Deserialize(xmlFromDatabase);
string xmlToSave = Serialize(obj);
I’m involved in an enterprise project that uses such an approach for some big and complex domain objects. I’d like to share with you what are downsides of such an approach that we face on during application maintenance.
Worse performance when interacting with the database
Reading and deserializing from the XML column will always take more time than reading from separate database columns. Loading a collection of objects might become a slow operation with this storage model. So, you’re limited to loading 1 object at a time. We want modern applications to be as fast as possible. That means that we would accept such a storage model only in back-office applications. When there is no risk that the client will leave the page because it is too slow.
No option to change serialized object class or property names
Let’s assume we have a property called Number on a class named Order. At some point, it turned out that the more appropriate name for this property would be Identifier. With a serialized object, we can’t rename this property as we’d lost old values stored in a database.
It’s even worse if we want to change the type of property. If we change non nullable type to nullable, int to string, boolean to enum, it’d result in serialization runtime error and object can’t be deserialized at all.
Such constraints make the development process harder as you cannot adjust your model to constantly changing business requirements. This might get especially problematic in big enterprise applications when appropriate names are important. If we use Domain-Driven Design the rule of ubiquitous language can easily be violated.
Difficult data fixing
Often, we have some data in the database that we want to modify (for example data fix scenario). To update serialized data, we need to write queries that change xml. This makes the task much more time consuming than it’d be if we could use the standard T-SQL Update statement. A simple query to update xml property might look like below:
replace value of (/Property1/text())
with "some value"
WHERE SOME CONDITION
It gets even more complicated if we want to modify xml collections.
Difficult search on a serialized column
Let’s say we have some production data already. There might appear a business need to search by data stored in a serialized column. So, we have 2 options, either we use XML query that might have a bad performance or we write a SQL script to migrate existing serialized data into a new column. A simple task might become troublesome with this storage model.
A full object must always be sent/received from the application
With this storage model, the easiest way to return the data to the client is to return full xml. There are downsides of this approach. If we don’t need all properties, we end up with worse performance. Moreover, we expose all the properties to the client even though we might not want to do this. It’s also an issue the other way around. The easiest way to update is that an application sends a full object to a database.
If we need to avoid these downsides there is an option to select or update required data by XPath queries. This requires additional, not always straightforward work tough.
We might decide to store some properties in a separate column (for example to enable faster database search queries). To avoid duplication, we should get rid of this property from serialized xml. This, however, requires additional effort and might not always be done. We end up with storing the same data in 2 places: specific column + xml. The first disadvantage is that we have an unnecessary database space allocated. Second is the need for keeping these 2 values in sync during updates. It’s easy to imagine someone making a data fix and updating only 1 place. We’d end up having inconsistent data.
As you can see there are many disadvantages by storing a serialized object in the relational database. Especially when application code constantly changes. Therefore, choosing it as a solution must be considered very carefully. Future requirements (that we don’t know yet) might be hard to achieve if we’ve some production data serialized in a database.
It might be a valuable option tough for data storage tables when the client only reads the content and performance is not of great importance (like reporting).
We should choose other solutions for such storage models. Some of them are JSON documents storage (available now in MSSQL) or NoSql databases. These give us better flexibility and performance than the model that we discussed.
We process cookies and make them available to Google Analytics (a service provided by Google, Inc.) to improve the performance of the website, to learn your preferences about using it and to tailor it to your needs. The data will be anonymised before being transmitted. If you do not agree to this, you may disable cookies in your browser. If you do not change your browser settings, you accept the fact that it saves cookies.