Let’s say that I have a database (and, by coincidence, I do!) of cartoons,linked to a table of the DVDs they appear in, and that linked to the people who do the commentaries on each cartoon for that DVD.
So, if I do a query for that DVD, the cartoons and the commentators, it might return something like this:
DVD Title Commentators
-----------------------------------------------------------------------------------------
The Simpsons Season 10 Lard of the Dance Mike Scully
The Simpsons Season 10 Lard of the Dance Ron Hague
The Simpsons Season 10 Bart the Mother Matt Groening
The Simpsons Season 10 Bart the Mother Mike Scully
The Simpsons Season 10 Bart the Mother George Meyer
But now, lets say that I had a cartoon that didn’t have any commentators. The SQL doesn’t return anything since there’s nothing to link it to at the end. Is there anyway to have the SQL return at least one entry, even if the last one has a null value for the commentator?
Like this:
DVD Title Commentators
-----------------------------------------------------------------------------------------
The Simpsons Season 10 Lard of the Dance Mike Scully
The Simpsons Season 10 Lard of the Dance Ron Hague
The Simpsons Season 10 Bart the Mother Matt Groening
The Simpsons Season 10 Bart the Mother Mike Scully
The Simpsons Season 10 Bart the Mother George Meyer
The Simpsons Season 10 Lisa gets an A NULL
What Spud said; what you want is an outer join. So if your original query was something like:
Select DVD, Title, Commentators
from tableDVDs
join tableCommmentators on tableDVDs.dvdID = tableCommentators.dvdID
it’d change to something like:
Select DVD, Title, Commentators
from tableDVDs
LEFT OUTER join tableCommmentators on tableDVDs.dvdID = tableCommentators.dvdID
CAVEAT: I’ve only had one cup of coffee and I don’t know your table structure or the exact brand of SQL you’re using. But outer join is definitely your friend here.
If I were building this, I’d have 3 tables: one for DVDs, one for Cartoons, and one for Commentators. Not only does that de-duplicate the data (meaning: reduced storage, decreased query time, more effective memory caching), but it makes it a lot easier to answer questions like: “how many DVDs did episode 12 appear on?” “How many episodes did Bob Bobson comment on?” “Bob Bobson gave us a new headshot, which DVDs do we need to change the cover of?”
Anyway, we don’t know how the tables are laid out based on the OP anyway.
I think you’d actually need 4 (or even 5, depending on how you choose to model it) tables - 3 as you describe, and a fourth to hold all instances of a commentator on a cartoon on a DVD. Joining tables also count as tables, even if they can’t be related to a physical entity!
Incidentally, normalisation doesn’t always mean decreased query time, because it increases the number of joins that have to be made. There are good arguments for denormalisation in certain large databases, but in this case the answer is clearly to go 3NF.
Something like (Making WAG on how you connect your three tables)
SELECT DVD.Name, Cartoon.Episode_Title, ISNULL( Commentator.Name, 'None')
FROM DVD
INNER JOIN Cartoon
ON DVD.Key = Cartoon.DVD_Key
LEFT JOIN Commentator
ON Cartoon.Key = Commentator.Cartoon_Key
Cartoon_DVD_Link (connects cartoon key to DVD key)
DVD
DVD_Commentators_Link (connects DVD key to commentators link)
Commentators
Here’s another issue I came across while trying to implement the OUTER JOIN. A cartoon may have commentators on it from different DVDs. So, not only does the DVD_Commentators_Link have to match the DVD, it also has to match the specific cartoon.
Right now the way I’m doing it is backend programming where it looks for the double match (cartoon and DVD) but I was hoping there was a simpler way of doing it.
At first I wasn’t sure why this was a complex SQL Query… your clarifications have answered that question.
Normally the only reason I put an intermediate linked table into the mix if I have a many to many relationship.
Is there a 6th table that is the Commentators_Cartoon_Link or does the DVD_Commentators_Link include the Cartoon key as well? I’m assuming it has both.
This is venturing away from ANSI SQL to something more SQL Server specific, your flavor of DBMS may implement differently.
SELECT DVD.Name,
Cartoon.Episode_Title,
CASE
WHEN DCL.Commentator_Key IS NOT NULL
THEN SELECT Commentator.Name
FROM Commentator
WHERE Commentator.Key = DCL.Commentator_key
ELSE 'None'
END AS Commentator_Name
FROM Cartoon
INNER JOIN Cartoon_DVD_Link CDL
ON CDL.Cartoon_Key = Cartoon.Key
INNER JOIN DVD
ON CDL.DVD_Key = DVD.Key
LEFT JOIN DVD_Commentators_Link DCL
ON DCL.Cartoon_Key = Cartoon.Key
AND DCL.DVD_Key = DVD.Key
To allow for the most flexibility I’d set it up using 5 tables. The basics are the DVD table with the primary identifier as DVD_ID, the TOONS table with the primary identifier as TOON_ID, and finally the COMMENTATOR table with the primary identifier as COM_ID. These tables can also contain data that is unique to that item in the table. DVD could include format, date purchased, cost, etc. TOONS could have plot summary, guests, production date, etc. COMMENTATOR can have BIO, Birth/Death Dates, Photo, etc.
Then I would have two link tables. DVD_TOON link would just have the DVD_ID and TOON_ID. This way you can have the same cartoon on multiple DVD’s. Then you could have the DVD_TOON_COM link table which would connect the commentator(s) to the TOON on the Specific DVD.
That setup should give you the flexibility to do pretty much whatever you need.