I’m trying to explain the normal forms to someone and to be honest the fact that I can’t think of an easy way to explain them probably says a lot about how well I understand them.
I have a pretty good idea what they are, but can anyone put them in very simple English? All of the web sites and books I’ve found talk about recursive determination from the primary key… and stuff like that.
The scary thing is I’ve been an Oracle DBA for almost eight years and it’s not a question I can answer off the top of my head. Here’s what I found in a reference.
Basically you’re removing redundancies that can be inferred by relationships. My rule of thumb it if you can’t understand the schema it’s under normalized. If the relationships start slowing performance to the point of unusability it’s over normalized.
It’s nice to know I’m not the only one having trouble putting this in laymen’s language. The best I’ve come up with so far is to give a few examples with tables to show what the advantages of the forms are.
This is the way I work it myself. And I want to stress the fact that a database can be overnormalized. One of the problems I constantly have with one of my co-workers is that he’s taken a lot of classes, but hasn’t had to design a whole lot of practical databases.
So in summary:
If you have repeating data in one table, normalize to cut those fields out with a one to many relationship.
One to one relationships should be thought about carefully. If you ever have to put all the data back together again i.e. you’re parsing a text message or something it’s a huge pain.
Keep in mind what end results are desired. Then normalize to fit those requirments. If the user is going to want to see queries or reports based off of data that you’ve normalized down to 4 nested children, you’re going to be hating life trying to get a simple way of displaying it.
There is a sentence that concisely expresses the first three normal forms. I learned this when I first learned about normalization. The sentence is “The data should relate to the key, the whole key, and nothing but the key”
I doubt that the fullest possible normalization is ever optimal. Depending on the situation. Eliminating all redundancy means that to accomplish any meaningful information retrieval you always need to join a bunch of tables in your query.
A little bit of a hijack, but It seems to me that normalization is being quickly moved away from. When I first learned the DBA stuff it seemed like it wasn’t considered a proper RDB if it wasn’t in Third normal form. But now Oracle has nested tables and Varrays that they seem to be promoting heavily. Which are a direct violation of Nomalization. Systems analysis and design books seem to still be pushing third normal forms, but recent database books just skim the deffinition of normalization and move on
As I was taught it, the “objective of normalization is to transform this large schema into a collection of smaller schemas that, as much as possible, preserve all the functional dependencies while removing the possibilities for anomalies.”