Simple MySQL query question.

I need to pull records from one table, only if the entry is referenced in another table.

Basically for the purposes of this, table 1 looks like:

Key Name
1 Spring
2 Summer
3 Fall
4 Winter

Table 2 looks like:

Key Name Type Table1_Key
1 Snow 1 4
2 Harvest 1 3
3 April 2 1
4 Santa 2 3

I need to pull only the records from table #1 that are associated with specific “Type” values in table two. So, for example, I might want to call all lines from table #1 that are associated with Type 1 in the second table (so, entries 3 and 4 in the first table, but not 1 and 2).

So, the results would be:

3 Fall
4 Winter

Does this make sense?

Without access to a DB where I can create a scratch table for testing (so probably wrong somehow :smack: ):

SELECT key, name from Table1
WHERE key in (SELECT DISTINCT table1_key from Table2 where type=‘1’);

If these tables are larger than the demos above, you’ll definitely want an index on Table2.type!

I’ve found subqueries to be slow in mySQL, so I’d do it with a join:

select table1.key, table1.name
from table1, table2
where table2.table1_key=table1.key and table2.type=1;

You’re probably right… and I managed to miss the “My” part of MySQL in the OP :o (so I was offering a generic SQL query.)

Small tables; table 1 is about 15 entries, and table 2 has maybe 200.

So, right now I had this:

SELECT category.category_id, category.category_name FROM category
. . . which called all entries from table 1.

Now I should have:

SELECT category.category_id, category.category_name FROM category WHERE category.category_id IN (SELECT DISTINCT product.category_id FROM product WHERE type=‘1’)
??

So, I’m pulling the relevant entries from the table ‘category’ where the category_id matches to the category_id field in my ‘product’ table, where the product has a type of 1?

Could you explain the DISTINCT command for me? I’m an SQL newbie, learning it as I go.

Oh, gotcha! That one makes quick sense. :slight_smile:

The DISTINCT operator eliminates repeated values in your result, leaving you with distinct values.

Let’s say you had a table Customers, with the following entries:

ID Name
01 John
02 Mike
03 John
04 Bill
05 John

“SELECT Name from Customers” would get you a result with 5 rows, whereas “SELECT DISTINCT Name FROM Customers” would return only 3 rows:

John
Mike
Bill

Duplicate values have been dropped.

Re: the original question, a join will be much quicker than a subquery, so I’d use that. If you want to get all values from table2 where they match values in table1, and organise them according to the key, you need to add a GROUP BY clause to the join. I don’t think this is what you want, so I’ll refrain from posting the exact syntax here, but grouping is a useful thing.

The DISTINCT in the subquery doesn’t change the meaning up the query, and shouldn’t change the result. The IN operator returns TRUE if the value being searched for (in this case, Table1.key) is in the set returned by the subquery, regardless of how many times it appears there.

On the other hand, if you want a join with exactly the same results as the subquery you will need a DISTINCT in the outer query:

SELECT DISTINCT Table1.key, Table1.name
FROM Table1, Table2
WHERE Table1.key = Table2.Table1_key
AND Table2.type = ‘1’

Without the DISTINCT the query can return duplicate rows. Either you need the DISTINCT in this query or your application has to be able to cope with duplicates. DISTINCT processing takes time, and might make the join query no faster than the subquery.

Many database systems will use join processing for this type of subquery (the most sophisticated database systems convert almost all subqueries to joins). I don’t know whether MySQL does this. I will say, though, that I believe it’s good policy to write queries for the semantics you want and leave performance tuning for a later stage of development. In this case, although MySQL may be slower at subqueries than it is at joins, it might not matter in the particular application (with the amount of data you say you have, the difference in performance might be negligible). I believe it’s better to write queries that describe the results you want in the clearest way possible, and only twiddle with them when necessary.