We have seen
about the normalization and different levels of normalization in previous post.
A normalized database design will store the information in more number smaller
and separate tables. When you need to retrieve particular information from a
normalized database you need to perform join operation, when you have more
number of relationships among tables for particular information, you need to
perform more join operation; it may further reduce the retrieval speed of the
database. Here comes the denormalization, we are violating the normalization
selectively in order to reduce the data retrieval time of a database. But at
the same time over denormalization also leads to redundancy and inconsistent
data, so it is data modeler’s response that, after denormalization the database
should not become inconsistent.
In this post
let me explain you about denormalization technique with an example. Let’s
consider the user model from the previous post, reproduced below for easy
understanding.
USER MODEL IN NORMALIZED FORM |
Technique 1: moving all super type data fields into each subtype.
In this
technique we move all the data fields from the super type into each subtype,
here relationships also moved from super-type to subtype. The following figure
displays the new updated model.
MOVING SUPERTYPE DATA FIELDS INTO EACH SUBTYPE |
In the above
model, you can see that the supertype User entity removed and the data fields
of the USER entity moved to each subtypes COMPANY and PERSON. This technique
enforces only the business rules containing the subtypes since the supertype is
removed from the model. When applying this technique of denormalization you
should keep in mind that, this model does not allow you add new subtypes in
future without altering the data model.
Technique 2:
moving all the data fields of each subtype into supertype.
In this
technique, all the data fields of the each subtype moved into supertype and the
subtypes are removed from the model. After applying this technique to our User
model, the new updated model will look as follows:
MOVING SUBTYPE DATA FIELDS INTO SUPERTYPE |
Note that in
the above mentioned model, we have created a new data field User Type in USER entity,
since we merged data fields of both PERSON and COMPANY so we need something to
distinguish each of these properties. The data field User Type contains the
values such as “Person” and “Company”. In this model we can enforce the rules
of the supertype USER but we cannot enforce for subtypes PERSON and COMPANY
since we removed both of these from our model.
Technique 3: denormalization with directly accessing parent entity.
In this
technique we apply normal database denormalization techniques but we are
allowing access to the parent entity directly. For example, consider our USER
model, if business people need report including following details:
- What is the total number of users by City?
- What is the total number of users by State?
- What is the total number of users by Country?
- What is the total number of users born in the year 2015?
- What is the total number of users registered in the month of January?
In a normalized database, we need to use join operation to answer the mentioned question from the child entity to parent entity. In order to solve this performance issue, we can model the one-to-many side of the relationship as follows. For understanding better, I applied the both the technique 2 and 3 in this model.
DENORMALIZATION WITH DIRECT ACCESS TO PARENT ENTITY |
After
applying this technique we got the above displayed updated model, in which you
can see that three entities such as CITY, STATE and COUNTRY grouped into an
entity called GEOGRAPHIC LOCATION. In that entity we created a new data field
called Geo Level Code, which is used to distinguish the level of geographic
location, which contains the following values such as “city”, “state”, “country”.
Similarly the data field calendar level contains the values “Date”, “Month” and
“Year”.
Technique 4: Same data fields in the same entity:
Before the
explaining about this technique, let us ask our self, what if user having only
two Phone numbers? Do we really need to model phone number as a separate entity?
In order to answer these questions, we should know our requirement very clearly.
Since we have the requirements where the USER have only two phone numbers, but
what if he needs to add many number of phone numbers in future? Also what if
the business needs to track the history of phone numbers used by a User? In
those cases having normalized separate entities are good options, since we can
add requirements any time in future without altering the data model.
When a user
have only two phone numbers and no additional phone number facilities allowed
in future, we can add the phone number data field in the USER entity itself,
instead of creating a separate entity. So after applying technique 2, 3 and 4
our new denormalized data model will look as follows:
SAME DATA FIELDS IN THE SAME ENTITY |
Technique 5:
copying data field from an entity into one or more entity into one or more entity:
Recall the
technique three where we grouped all the geographic location entities into an
entity called GEO LOCATION and created a Geo Level Code and similarly created
CALENDAR entity and Calendar level. There
the data model provides us option to select the level of geographic location
and calendar and we removed relationship among the separate entity and created
a new entity. Now let’s assume, your business require only Country level data
most the time and needs state level and city level very rarely, in that case,
in order to retrieve the data we need navigate to Geo location table very
frequently. Technique five help us to make this process very less time
consuming and increase the speed of retrieval by simply copying the most
frequently needed Country data field to User entity, while remaining data
fields and relationship untouched. Here the relationships are not removed only
the data fields are copied. Similarly, for example, business needs only month
level data, by applying this rule our updated denormalized data model may look
as follows:
COPYING DATA FIELDS |
Note that,
in our example, we have applied more than one technique to our model at a time,
but you should remember that, these techniques are not step by step process;
you can apply the required technique as per the requirements. It completely
depends on your business requirement. I hope this post gives you a basic
understanding of denormalization technique. If you have any doubts or if you
needs to share your thoughts kindly write down at the comments section.
Awesome article! I want people to know just how good this information is in your article. It’s interesting, compelling content. Your views are much like my own concerning this subject. бонус
ReplyDeleteI just have to share my testimony on this Forum.. The feeling of being loved takes away so much burden from our shoulders. I had all this but I made a big mistake when I cheated on my wife with another woman and my wife left me for over 4 months after she found out.. I was lonely, sad and devastated. Luckily I was directed to a very powerful spell caster Dr Emu who helped me cast a spell of reconciliation on our Relationship and he brought back my wife and now she loves me far more than ever.. I'm so happy with life now. Thank you so much Dr Emu, kindly Contact Dr Emu Today and get any kind of help you want.. Via Email emutemple@gmail.com or Call/WhatsApp cell number +2347012841542 Website (https://emutemple.wordpress.com/)
ReplyDelete