SQL help.

I asked this before and nobody was able to help.

So I figured I’d ask a ‘human’ question and someone can attempt to convert it into SQL.

but before I do that… What is the technical name (if there is one) for a query that combines a sum from grouped data with a first and/or last entry of a specific field for that data?

For the sake of argument I have a table with account number, balance forward, and amount.

I want the SUM of the amount field, and the FIRST and LAST value of balance forward field for each account.

so my ‘english’ question is this - “give me the first value for balance forward, the last value for balance forward, a sum of amount, where the date is yesterday, for each account”

The annoying thing is the above should be a straight-forward (and commonly required) thing to do… but the SQL is proving hard for me to work out.

Unless I am missing some obvious method.
Halp!

Sorry, the data is in time order…

So the top would be the balance at the beginning of the day, and the bottom would be the balance at the end of the day.

for one day (yesterday) for all accounts (so it’s grouped… so I end up with three values for each account - a start bal, an end bal, and a sum of the amount)

Could you show the desciption for the table you are working with?

You mean all the fields, their data types, sizes etc?

There are a lot of fields. Only three of them are relevant to my GQ - Account number, balance forward (which is different for each record) and amount.

If it helps - it’s a table of bet transactions, each record corresponds to one bet. Each record updates the ‘balance_forward’ field.

SELECT AccountNumber,
(SELECT TOP 1 BalanceForward FROM table1 WHERE AccountNumber = T.AccountNumber AND DateEntered >= GetDate() ORDER BY DateEntered ASC) AS First,
(SELECT TOP 1 BalanceForward FROM table1 WHERE AccountNumber = T.AccountNumber AND DateEntered >= GetDate() ORDER BY DateEntered DESC) AS Last,
SUM(Amount) As SumAmount
FROM table1 T
WHERE DateEntered >= GetDate()
GROUP BY AccountNumber

You’ll have to jiggle with the date stuff to get yesterday’s data and it depends on your your particular database server.

Well basically I wondered it you have a timestamp for each transaction

Which flavor of SQL are you using? Oracle? MySQL? Microsoft? Different flavors will have different answers.

That’s a correllated subquery. The specific example is for MS-SQL/T-SQL, but it’s pretty universal and supported by the ANSI-92 standard.

*** Ponder

That’s interesting, I mostly use Oracle and never heard of it, and just found this page that talks a little about different platforms.

A properly designed correlated subquery shouldn’t need to resort to DB-specific TOP/LIMIT/ROWNUM tricks, as long as the data has decent recognizable ordering, such as a datetime or sequence number on each row.

*** Ponder

Yes. Each record has an exact timestamp… so the first record will be the earliest, and vice versa.

I would try:



SELECT accountnumber, SUM(amount),
    (SELECT balance FROM thetable 
        WHERE timestamp = (SELECT MIN(timestamp) 
                FROM thetable 
               WHERE accountnumber = T.account_number)) AS balance_start,
    (SELECT balance FROM thetable 
        WHERE timestamp = (SELECT MAX(timestamp) 
                FROM thetable 
                WHERE accountnumber = T.account_number)) AS balance_end
FROM thetable T
GROUP BY accountnumber


Manduck beat me to it.

I apologize for that earlier link; I didn’t read it carefully. The author of that guide used the TOP device in their examples, which is fine for T-SQL, but that might not be the DB-flavor you’re using. Manduck’s example should work on any SQL-92-compliant database.

*** Ponder

Am I missing something? If the rows are guaranteed to be in the correct order, it seems like this should do the trick:


SELECT AccountNumber, First(BalanceForward) AS Start, Last(BalanceForward) AS End, Sum(Amount) AS Total
FROM Table
WHERE TimeStamp = Date()-1
GROUP BY AccountNumber

Those functions aren’t Microsoft-specific, are they? (I know that Date() is, but there’s sure to be something like it in any language.)
If the rows aren’t always in the right order, it’s slightly more complicated, but I’d try to put a single subquery in the FROM clause, not two in the SELECT clause.

I may be way off, because I’m mostly self-taught.



SELECT accountnumber, SUM(amount),
    (SELECT balance FROM thetable 
        WHERE timestamp = (SELECT MIN(timestamp) 
                FROM thetable 
               WHERE accountnumber = T.account_number)) AS balance_start,
    (SELECT balance FROM thetable 
        WHERE timestamp = (SELECT MAX(timestamp) 
                FROM thetable 
                WHERE accountnumber = T.account_number)) AS balance_end
FROM thetable T
GROUP BY accountnumber


I don’t believe that takes yesterday into account. I’t will find the life of the account.



SELECT accountnumber, SUM(amount),
    (SELECT balance FROM thetable 
        WHERE timestamp = (SELECT MIN(timestamp) 
                FROM thetable 
               WHERE accountnumber = T.account_number
                  AND timestamp like (to_date("DD-MON-YYYY"), (SYSDATE -1))
     )) AS balance_start,
    (SELECT balance FROM thetable 
        WHERE timestamp = (SELECT MAX(timestamp) 
                FROM thetable 
                WHERE accountnumber = T.account_number
                     AND timestamp like (to_date("DD-MON-YYYY"), (SYSDATE -1))
     )) AS balance_end
FROM thetable T
GROUP BY accountnumber


Gah stupid timeout



SELECT accountnumber, SUM(amount),
    (SELECT balance FROM thetable 
        WHERE timestamp = (SELECT MIN(timestamp) 
                FROM thetable 
               WHERE accountnumber = T.account_number
                  AND timestamp like (to_date("DD-MON-YYYY"), (SYSDATE -1))
     )) AS balance_start,
    (SELECT balance FROM thetable 
        WHERE timestamp = (SELECT MAX(timestamp) 
                FROM thetable 
                WHERE accountnumber = T.account_number
                     AND timestamp like (to_date("DD-MON-YYYY"), (SYSDATE -1))
     )) AS balance_end
FROM thetable T
WHERE  timestamp like (to_date("DD-MON-YYYY"), (SYSDATE -1))
GROUP BY accountnumber


First() and Last() sound like functions specific to some particular DB-flavor, and shouldn’t be relied upon. The OP never specified what DB they were using, unless I missed it?

The Date()-1 construct is also DB-specific. I hate translating date functions between DB flavors; they aren’t covered in the SQL-92 standard, so everybody does them differently.

One should never assume the rows are in a specific order in the database, this is why we have key and index fields.

Because the OP’s requirement is to join on 2 different rows, I don’t think there’s any way around doing 2 different subselects to get the data they want.

Wolfman’s altered example also relies on DB-specific date functions. Lobsang will have to look up the correct syntax for their DB, and also do whatever tricks are necessary to split out the day component of the datetime to cover one specific date.

*** Ponder

Yeah, I took it as a given that anything date-related would need to be translated, and have conversion functions added.
Here’s my single-subquery version, without First, Last, or the assumption of record-order. It does assume that each TimeStamp is unique.


SELECT minmax.AccountNumber, small.BalanceForward AS Start, big.BalanceForward AS End, minmax.Total
FROM Table AS small
     INNER JOIN (
          [SELECT AccountNumber, Min(TimeStamp) AS FirstTime, Max(TimeStamp) AS LastTime, Sum(Amount) AS Total
          FROM Table
          WHERE TimeStamp = Date()-1
          GROUP BY AccountNumber]. AS minmax
               INNER JOIN Table AS big ON minmax.AccountNumber = big.AccountNumber AND minmax.LastTime = big.TimeStamp
     ) ON small.AccountNumber = minmax.AccountNumber AND small.TimeStamp = minmax.FirstTime