SQL question

I write quite a bit of SQL code, but I’ve never come across this before:

INNER { OPTION SWAP_ORDER=“True”, SEMIJOIN=“True” } JOIN

and googling isn’t getting me anything that makes sense…what the heck is the SWAP_ORDER option? Everything else about the query looks straightforward but I can’t figure out what this option is doing and if it’s something useful. Any SQL experts here know what it does?

Is this any help

Not really. That’s the only site I can find, and I have no idea what they mean by “the order of the join”.

At first I thought that inner joins weren’t commutative, but I think that they are, and this is just a bad example.

I find this reference helpful. It would make a difference for right and left joins, but then you could just change a single keyword instead of adding an option.

Maybe it makes a difference to the query plan, so it’s included for performance reasons.

It’s a query tuning thing. The order of (inner) joins can affect query performance but not the actual results you get. Though the optimizer doesn’t just run things in the order you have them written, sometimes it comes up with an execution plan that is very far from optimal. In that case, it might help to give it some guidance as to the order of your joins. So the swap_order you’ve run into is probably the result of someone trying to speed up a slow query.

Yeah, that’s not really a helpful manual reference. The only thing I can think of is that:

select * FROM table1 INNER { OPTION SWAP_ORDER=“True”} JOIN table2 on table1.field1 = table2.fieldB

is equivalent to

select * FROM table2 INNER JOIN table1 on table2.fieldB = table1.field1

That is, the fields from table2 will be returned towards the left, and the fields from table1 towards the right. Can’t really figure out why this would be important, even in “queries with complex joins.” (At least, not for anybody who can write out an explicit select list…)

ETA: Hmm… a few of the other opinions that appeared while I was typing this might be on better tracks…

And I’m thinking your idea might be the right one. It doesn’t sound practical, but re-reading the manual entry, I’m inclined to think this is what it’s for.

Am I missing something on this page? I’m not finding any reference to the SWAP_ORDER option there. I get the difference between left/right, inner/outer, etc., as I use them all the time. I’m just trying to see how this piece of code is doing anything meaningful.

It was just a reference I use a lot. It was helpful to point out that switching the order of the join won’t affect the result set, but as **Mithras **pointed out, it probably affects the query plan

So how is that command more effective than just typing the correct join in the first place? I tried the query that way, and then the same query but replaced it with an inner join and the execution time was almost identical—the 2nd way was about 0.2 seconds faster.

From bob++'s link, these options appear to be Cognos-specific. IME, Cognos can make some pretty inefficient calls to the database, and I guess these options were added to provide a little help.

Also in my experience, the optimizer in most RDBMS does a better job than I can on inner joins, and most slowness issues in those queries could be fixed with an index.

So, I’m having trouble coming up with an non-completely complex example where this SWAP_ORDER flag would make a real difference.

I think, though, since the SEMIJOIN flag is also set, that there might be something going on. It looks like the SEMIJOIN will pre-fetch the values from the right side of the join and, if the list is small enough, convert the join to in IN clause. So,
INNER {OPTION SWAP_ORDER=“true”, SEMIJOIN="true} JOIN on a.id=b.id
would become, more or less
WHERE b.id IN (1,2,3,5,10,30)

This could probably provide some improvement in conditions where the number of rows in a is much smaller than the number of rows in b.

It does feel, though, like something where someone went to a site like stackoverflow and said “my query is slow-HALP!”, and found out about this from someone who used it once in a esoteric edge case, and it sort of worked, so now they try it whenever query perfomance needs a boost.

I think if we came across something like this on my team during code review, we’d try to re-write the query to make it a bit more understandable what is going on.

The guy who wrote this does a lot of work in Cognos, so that makes some sense to me. Unfortunately he’s out on disability for the next few months, so i can’t ask him what he was trying to do.

It’s going to greatly depend on the query - I’m sure we’ve both written queries that took minutes when written one way and seconds when written another, even though they return the same result set. It could be that for your, probably fairly simple, test case, there’s not much of a difference, but for a complex query with subqueries and dozens of where clauses and billions of database rows, there’s a big difference, and rewriting it would be a pain.

This sounds like the original programmer could make better use of those options consisting of program lines beginning with

I only use SQL Server and Oracle now and those optimizers are so good you don’t have to tinker with them. As Aktep said, usually all you need to do is to add an index.

In the old days you would try strange things like changing the order of tables in the FROM clause, or WHERE a.key=b.key AND a.key=c.key AND b.key=c.key