PDA

View Full Version : Is this legitimate use of SQL?

Mangetout
05-07-2003, 02:53 AM
I have a problem to solve;
I have a stock of postage stamps of four different denominations - 1p, 5p, 19p and 27p
Postage is charged in bands:
Second class postage for an item weighing >150g<200g is 54p - the most efficient way to make this up is to use 2x27p stamps
First class postage for the same item costs 72p; one way to make this up would be 1x27p + 2x19p + 1x5p + 2x1p (6 stamps in 4 denominations), but another would be 3x19p + 3x5p (6 stamps in 2 denominations), yet another would be 2x27p + 3x5p + 3x1p (7 stamps in 3 denominations.
Efficiency is here defined as the firstly the least number of stamps and if there is more than one possible solution, the least number of denominations, so the 6 stamps in 2 demoninations solution appears to be the best.

Anyway, I know how I will do it, but I want to know if the SQL I intend to use is non-standard or frowned upon in some way.

Suppose I have two tables:
Table1:
Num
1
2
3

Table2:
Alph
A
B
C

(Using Jet SQL in Microsoft Access), this query:
SELECT Table1.Num, Table2.Alph
FROM Table1, Table2;

Yields the desired result of every possible combination of the two:
Num, Alph
1,A
2,A
3,A
1,B
2,B
3,B
1,C
2,C
3,C

It does so because there is no join specified.

Is this a quirk of Jet SQL (or a quirk of SQL generally), or is it a documented and approved technique? i.e. Am I going to find that an attempt to implement similar techniques on other DB apps fails miserably?

05-07-2003, 02:57 AM
It's standard, it's called a cartesian product and occurs if you fail to specify an existing foreign key between tables as well.

Mangetout
05-07-2003, 03:02 AM
Yay!

Thanks.

05-07-2003, 03:05 AM
Fluffycat (http://www.fluffycat.com/sql/) is my favourite beginners SQL site and has a good explanation with a movie database.

Mangetout
05-07-2003, 04:53 AM
Thanks, I'll take a look; I'm entirely self-taught on SQL and visual programming, so I tend to stumble across these sorts of things and wonder whether they are documented features or not.

My postage calc applet is finished now and it works like a dream...

Postage for 164g item is 72p

3 x Second Class (19p) Stamps
3 x 5p Stamps

A total of 6 Stamps in 2 denominations

JeffB
05-07-2003, 08:38 AM
There's nothing inherently wrong with not joining two table, but DO NOT DO THIS WITH VERY LARGE TABLES!

Not that I would know anything about it.

Terminus Est
05-07-2003, 08:51 AM
Heck, don't do this even with two reasonably-sized tables.

100 records x 100 records = PC LOAD LETTER

ultrafilter
05-07-2003, 09:50 AM
As others have alluded to, the size of the Cartesian product of two tables is the product of the sizes of the tables. Be careful.

Mangetout
05-07-2003, 09:53 AM
Hmmm, In another database, I have a table of keys (stock in stores) that is checked for completeness against a query like this; there are 2000 or so products, 180 or so stores; the query returns 350K+ records - still works OK, is just a little slow...

Anyway, for an application like this one (the postal charges calculator, it is actually four such queries - each of which builds the multiples of 1 to 10 of the each stamp denomination, then another one that draws all possible combinations together - without criteria it returns 14641 rows (each row describing an unique way to combine the stamps to make a value); it seems to run very fast.

Some dull facts that arose during the process... given up to ten each (including the possibility of zero) of the four denominations (actually now updated to 28p, 20p, 5p and 1p as a result of an imminent tariff change), there is (obviously only one way to make the minimum value - 1 x 1p and there is only one way to make the maximum (10 of each). There are 50 different ways to make up the values of £2.38, £2.46, £2.66, £2.74, £2.94 and £3.02

Bippy the Beardless
05-07-2003, 11:13 AM
Just remember that 'Cartesian product' is a dirty word to most dba's, as they usually only occur as an error, and can be grindingly slow.
There is usually (allways?) a 'better 'alternative such as doing the test within a programming language that has accessed the db tables.
But 'better' is very relative, as what you are doing is simple and works it is the best sollution (but will become a pain if you start keeping 20 denominations of stamps).
Cheers, Bippy

EasyPhil
05-07-2003, 11:18 AM
Originally posted by Bippy the Beardless
Just remember that 'Cartesian product' is a dirty word to most dba's, as they usually only occur as an error, and can be grindingly slow.
There is usually (allways?) a 'better 'alternative such as doing the test within a programming language that has accessed the db tables.
But 'better' is very relative, as what you are doing is simple and works it is the best sollution (but will become a pain if you start keeping 20 denominations of stamps).
Cheers, Bippy

Which is why you may want to consider a different method now as opposed to later i.e. thing for tomorrow rather than for today.

Mangetout
05-07-2003, 05:28 PM
Originally posted by Bippy the Beardless
There is usually (allways?) a 'better 'alternative such as doing the test within a programming language that has accessed the db tables.
But 'better' is very relative, as what you are doing is simple and works it is the best sollution (but will become a pain if you start keeping 20 denominations of stamps).
Cheers, Bippy Sound advice; I'm not likely to expand this project beyond changing the pricing scale or the denominations of the stamps I keep (both of which I have catered for in the front end) - it is just to cost up my eBay items.

Trouble is that this particular sort of problem is a bit like a combination of finding factors and the travelling salesman puzzle (albeit on a small scale); I can't think of another way to do it than essentially test every combination (OK, I could probably whip up an algorithm that eliminated the need to test half of the types of combination, or something) - if anyone has some ideas on how it could be done more elegantly, I'd be genuinely interested to hear about them.

Reproducing a process where the entire cartesian product is required and used (every single returned record) in code would involve mucho iteration and looping, wouldn't it?

Bippy the Beardless
05-07-2003, 05:56 PM
Yes the cartesian product may need to be generated at one point. The problems with using the database are only really problems if you want to do something else with the database at the same time. So cartestian products are bad on multi user systems, or multi task systems that have to opperate at a reasonable speed at all times and cannot be allowed to slow down whilst a computation of this sort is occuring.
If space is not a problem (and with modern cheep hard drives this is often the case) a system where you create a table from your cartesian select statement, and then keep referring back to that table may well be advantageous. You would then just recreate the table any time a buisness rule (stamp price in this case) changes. So you don't do the cartesian each time you make a postage calculation.

Now as with all computational problems, it is possible to go to extreme lengths in improving the efficiency of the way something is calculated.
Don't get carried away in improving efficiency for something which does not need to be any more efficient.

I bolded that, because so many otherwise very good programmers come unstuck by optimizing code that does not need it, leading to obscure and difficult to maintain code. If it is not time or memory critical, write code for best possible understanding (comments Og damn it!)
don't optimize it.

Mangetout
05-07-2003, 06:09 PM
Interesting; I can see exactly why you're suggesting it (the cartesian product doesn't change much, but that seems to go against the sacred rule of 'don't store anything that can be calculated'. In any case, I suspect the break-even point is quite a distance away from my humble postage calc.

Bippy the Beardless
05-07-2003, 07:20 PM
'Don't store anything that can be calculated', is a dodgy rule these days with very cheap disks and high multi user usage. But the other rule that it breaks 'don't store the same data twice' would be another reason to stick with the on-the-fly calculation method. But we are falling into my programmer's optimisation trap ;) we probably allready spent more time thinking about this and chatting (nice as it has been) than could ever be saved by optimising a rarely used function.

ultrafilter
05-07-2003, 08:37 PM
Mangetout: There's a very straightforward dynamic programming algorithm that will find the least possible number of stamps very quickly. I'd have to do a little bit of work to rediscover it, as I lost the notes for my algorithms class, but it's pretty straightforward. I'll get back to you.

ultrafilter
05-07-2003, 09:09 PM
All right, I think I've got this. We're going to order the value of the stamps (ck) in decreasing order. Let C(i, j) be the least number of stamps necessary to come up with a value of j pence using at most i coins. C(i, 1) = 1, and C(1, j) = j/c0 if c0|j, and 0 otherwise.

In general, C(i, j) = mink({C(i - 1, j - ck)}) + 1. In this case, k ranges from 0 to 3, with c0 = 27, c1 = 19, c2 = 5, and c3 = 1.

The value you'll be interested in is C(n, p), where p is the total value you want, and n is the least integer such that C(n, p) > 0. Note that n < p because of the 1p stamp. So I guess you're really interested in C(p, p).

Now here's the key to making this work: Don't do it recursively! If you do, you'll spend an inordinate amount of time solving the same subproblems over and over again. Instead, make a p x p table, and start at the top left. Fill in the base cases using the definition in the first paragraph, and then fill in the rest using the recurrence relation. Store a pointer in each cell to the cell that was used to calculate it (the cell with the minimum value) so you can retrieve the set of stamps to use, in addition to the number.

This algorithm runs in time Q(p2), and uses the same amount of space. It also can be re-used for other problems.

Anyone who reads this is of course invited to check my math.