Data normalization questions

ultrafilter · February 26, 2006, 9:04pm

I’ve run into a couple tables where I’m having some trouble figuring out how to properly normalize them. Both of them have the basic structure of (PK, F1, F2) where the meaning of F2 depends on the value of F1. I thought there’d be a clear textbook answer for this, but I haven’t been able to find anything.

There are some differences between the two situations. In the first table, for certain values of F1, I expect F2 to be NULL, and for other values I expect F1 to be non-NULL. In the second table, if F1 has a particular value, there’s a second record with a different specificed value of F1, and I need the data in both of those. In both cases there’s extra data along for the ride, but it doesn’t depend on anything but the primary key, so I don’t think it’s important.

chrisk · February 26, 2006, 9:38pm

ultrafilter:

I’ve run into a couple tables where I’m having some trouble figuring out how to properly normalize them. Both of them have the basic structure of (PK, F1, F2) where the meaning of F2 depends on the value of F1. I thought there’d be a clear textbook answer for this, but I haven’t been able to find anything.

There are some differences between the two situations. In the first table, for certain values of F1, I expect F2 to be NULL, and for other values I expect F1 to be non-NULL. In the second table, if F1 has a particular value, there’s a second record with a different specificed value of F1, and I need the data in both of those. In both cases there’s extra data along for the ride, but it doesn’t depend on anything but the primary key, so I don’t think it’s important.

You’re right, this isn’t a textbook situation, though as someone with some relational experience, I have a few tips:

“for certain values of F1, I expect F2 to be NULL, and for other values I expect F1 to be non-NULL.” Do you mean you expect F2 to be non-null? If not, I’m not quite sure what you mean here.
If my guess was correct, then this is fine. Being able to ‘guess’ about whether F2 has relevant data in it or not from the value of F1 is fine, and in fact perfectly normalized. Having ‘nulls’ where you can’t predict them is less desirable, in fact, as those generally represent missing data, although sometimes the absence of data, unflagged by context, is not a particular problem.

“if F1 has a particular value, there’s a second record with a different specificed value of F1, and I need the data in both of those.”

This sounds undesirable in a relational database… I can’t find a normal form about it, but I would certainly recommend moving the fields of the second record into the first record, as additional fields, if this is at all possible. If not, then you haven’t explained it well enough for me to make another suggestion.

“Both of them have the basic structure of (PK, F1, F2) where the meaning of F2 depends on the value of F1.”

This is something to be wary of, though sometimes it’s the best thing to do. If you have a situation where different fields of F1 flag several clearly different ‘secondary fields’, then sometimes it’s better to put all of those fields into the same table - or to break the table up into several different tables, each of which has the appriopriate fieldname.

I hope these suggestions have been somehow helpful.

Topic		Replies	Views
Database question: difference between normalized and denormalized data? Factual Questions	19	7858	August 11, 2005
Why does SQL assume you don't want to indlude NULLs in queries? Factual Questions	46	13064	May 29, 2010
Data Normalization Forms of relational databases Factual Questions	8	1018	June 22, 2003
Extract... data... Gnnnn... life... force... draining... away... The BBQ Pit	12	1820	April 2, 2003
Help Needed - Normalizing data for an Access Database Factual Questions	10	954	January 9, 2004

Data normalization questions

Related topics