Sometimes I have trouble with classes not because I don’t understand the material but because the teacher uses a metaphor that doesn’t work for me. The metaphor makes my brain go “wait, wot?” and then spin on trying to make the metaphor fit the topic when I can list all of these reasons why it doesn’t and I lose track of what I’m supposed to be learning and when my brain comes up for air, the teacher is on Chapter 10.
Case in point, did a beginning SQL class at work this morning. I’ve done SQL classes in college and a few times since then and finally realized why I have so much trouble remembering the difference between inner joins and outer joins and left inners and right inners and blahblahblah…
This concept is usually taught by going back to set theory with Venn diagrams. I understand set theory and am really good with the logic. But it doesn’t work for me as a metaphor for table joins. Why? I’m a spacial thinker and usually build 3D maps of things in my head as I learn them. I’m also very much a literalist. Venn diagrams don’t work for me as a way to visualize a database. For things like database tables that don’t have a place in the physical universe, I prefer to visualize them as they’re described in the DBA schema representation. So when I need to learn the concept of a join, I would visualize the result set of joining the two schemas.
As a literalist, I also have trouble with the terms left, right, inner and outer. Table A is not really to the left of Table B. It could be above it, or below it, in front of it or behind it in 3 dimensional space as much as to the left of it. It breaks down as a metaphor especially when you have a complex query that joins more than two tables. Is Table A to the left of Table B but to the right of Table C? Obviously not, so it’s confusing to try to use those relative terms.
So I never remember the difference between a left inner join and a right outer join because those terms seem illogical to me and don’t help me visualize the concept. Anybody else struggle with this kind of thing?
Sure, I think everyone has situations where a certain visualization works easily for them and others find it difficult or counterintuitive. Since I’ll take any chance to bring up Feynman, here’s part of an interview where he discusses something like this: Video
It is in the SQL statement, which is what matters. I suppose you might try a mental visualization of the statement itself, as a single left-to-right chain of tables.
It’s not if the schema has long names and uses aliases, so that the query gets long. then people tend to list them one line at a time (with indents that don’t work here), in which case the tables are one above the other.
SELECT A.USER_ID ID,
A.FULL_NAME NAME,
A.USERS_AGE AGE,
A.CHECK_AMOUNT AMOUNT
FROM SCL_SPRING_CUSTOMERS A,
WEF_MACHINE_ORDERS B
WHERE A.ID = B.CUSTOMER_ID;
SELECT
A.USER_ID ID,
A.FULL_NAME NAME,
A.USERS_AGE AGE,
A.CHECK_AMOUNT AMOUNT
FROM SCL_SPRING_CUSTOMERS A, WEF_MACHINE_ORDERS B
WHERE A.ID = B.CUSTOMER_ID;
You have to use spaces to indent, but it's better than nothing!
Back to the original discussion, though - I've always found it relatively easy to visualise LEFT or RIGHT joins as reading through the SQL statement from right to left, but then I tend to use the ANSI syntax while writing queries, so my statement would be
SELECT
A.USER_ID ID,
A.FULL_NAME NAME,
A.USERS_AGE AGE,
A.CHECK_AMOUNT AMOUNT
FROM
SCL_SPRING_CUSTOMERS A
LEFT JOIN
WEF_MACHINE_ORDERS B
ON A.ID = B.CUSTOMER_ID;
To me, that makes intuitive sense - there's one table that's the primary selection, and then you're joining onto other tables, choosing to retain all or some rows from both tables, depending on the situation. I agree that with a huge number of joins (which we often get in data warehouses) it can get messy, particularly to debug.
It also has the advantage that it allows a clear separation of join predicates (using ON) from filter clauses (using WHERE), whereas the old syntax has a jumble of it all. Please see http://www.orafaq.com/node/2618 for more discussion of this, although in this, as in all things syntax, YMMV.
How in the world does that statement without the join called out explicitly even work? Is that some kind of implied inner join?
I have a feeling ANSI syntax is a best practice kind of thing; not strictly required by the language, but I do believe you’d get your ass kicked if you tried to pass off that ambiguous stuff in a professional IT setting.
But anyway, while I can visualize joins and what-not, I tend to think of it in more of a narrative sense than anything else. So rather than trying to imagine some sort of visual metaphor for how tables A and B are left-joined, it’s more of a “all of table A, and the corresponding records for table B where they match on the join columns” A right join is the same; all of table b, and the corresponding ones from table A where their join columns match" Inner joins are more like “only the ones in table A and table B where their join columns match”
I guess it’s probably more a matter of having done a LOT of professional SQL coding over the past decade and a half, so that a lot of it is more or less intuitive by now.
[QUOTE=bump]
How in the world does that statement without the join called out explicitly even work? Is that some kind of implied inner join?
[/QUOTE]
Oracle will let you get use this (or it certainly used to) it’s an inner join on the id columns in the where clause. I never saw an explicit JOIN in a statement until I saw MySQL queries. For an outer “join” in the where clause you put a (+) after the column where you wanted the outer join.
Yes, but you’ll still need inner (mandatory) vs outer (optional) joins even then.
I hate those terms. “Inner” just means the match has to be found. “Outer” means that one or both sides are optional. I have no clue where inner and outer come from. In fact I only include “outer” if both sides are optional; otherwise it’s implied by the “left” or “right”.
And yeah, left vs right just means which side of the word “join” they appear on when typing, and tells you which side must be found:
A left join B: you always have to have A (the one on the left), but you don’t always have to have B.
A right join B: you always have to have B (the one on the right) but you might not have A.
A outer join B: you have to have A, or B, or both.
Personally I only use left joins, just as a convention of most mandatory goes first, less mandatory goes next and so on.