Hierarchical versus relational database


That was my obvious first Google search when preparing this post. The very first hit was an interesting article by Narendra Naiudu from Pune, India, posted in 2005, explaining the shortcomings of the hierarchical model. I hope to outrank him the next time I run that search.

I have seen it happen many times: when things get tricky, you end up compromising and trying to model the world as a hierarchy.

In the mainframe era of the 1970s, IMS was the model to structure the world – in hierarchies. Two decades later, when OLAP ran into sizing problems calculating sums for all possible levels, hierarchy was again needed as a compromise. And now, 20 years on, the same thing is happening again: to manage “Big data”, this narrow model is called for once more. When you run into performance problems with computers, this is the rescue. “The world is hierarchical” is easy to believe if you are naive, and sometimes it might be an acceptable compromise.

Having worked with DB design for decades, I would say it is a sacrifice. Far too many models fail. The classic example: OK, you have one boss, so this could be modelled with “H”. But what about Mr X, who also works in a second project? He has a boss for his ordinary job, but for 25% of his time he has another boss as well. Surely sales districts are hierarchical? They are, except for Mr Y, whose nephew owns a shop outside his district, so he is of course responsible for that customer anyhow. But as Mr Naiudu explains, the layout of a disk and its directories can be modelled perfectly as a hierarchy, and with that I agree totally. You love how your files are organized, don’t you?
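To make the Mr X case concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The table and column names (person, reports_to, share_pct) are my own invention for illustration, and the 75% for his ordinary job is simply assumed as the remainder of the 25%. A strict hierarchy would give each person a single boss column; the relational version moves the relationship into its own table, so a second boss is just one more row.

```python
import sqlite3

# In-memory database, nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- A hierarchical design would put one boss_id column on person.
-- Here the relationship is a table, so a person may have several bosses.
CREATE TABLE reports_to (
    person_id INTEGER NOT NULL REFERENCES person(id),
    boss_id   INTEGER NOT NULL REFERENCES person(id),
    share_pct INTEGER NOT NULL CHECK (share_pct BETWEEN 1 AND 100),
    PRIMARY KEY (person_id, boss_id)
);
""")

conn.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)",
    [(1, "Line manager"), (2, "Project manager"), (3, "Mr X")],
)
# Mr X: ordinary job (assumed 75%) plus the 25% project from the text.
conn.executemany(
    "INSERT INTO reports_to (person_id, boss_id, share_pct) VALUES (?, ?, ?)",
    [(3, 1, 75), (3, 2, 25)],
)

def bosses_of(name):
    """Return (boss name, share in percent) for every boss of `name`."""
    return conn.execute(
        """
        SELECT b.name, r.share_pct
        FROM reports_to AS r
        JOIN person AS p ON p.id = r.person_id
        JOIN person AS b ON b.id = r.boss_id
        WHERE p.name = ?
        ORDER BY r.share_pct DESC
        """,
        (name,),
    ).fetchall()

mr_x_bosses = bosses_of("Mr X")
print(mr_x_bosses)  # [('Line manager', 75), ('Project manager', 25)]
```

With a single boss column you would have to pick one of the two managers and lose the other, which is exactly the compromise described above.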

FAT was a compromise and we still have to live with it! Do you know where you have stored your email in its folders? For most of what you observe in the world, the hierarchical approach means many compromises. OK, “bill of materials” fits well, but what else? The military might also be an exception, where “the map applies if reality does not match”. A son can only have one father, you say, or can he? In other cases, the compromises you have to accept will sooner or later leave your database incapable of keeping up.

Don’t get trapped! Be aware of the shortcomings when the model is “H”!

The crowd yelling “NoSQL” is loud and vocal.

The reason for using simplified methods for Big data is basically technical (i.e. performance and development costs). It is much easier to program for parallelism, loosely couple processors in a hierarchy, and ignore basics like integrity, usefulness and confidence. Who really needs those?