SQL Gurus - Row / column transpose - sorta

We have a goofy problem that I’m hoping to solve with reasonable elegance. If anyone has any clues, I’d appreciate a hint.

We’re using MS SQL Server Reporting Services to generate a report to match a government form’s format. Simplifying heroically, the end result resembles a spread sheet where each row represents a category of data, each column represents a day of the week, and the cell contents is a total.

The idea is that on Monday you run the report and only Column1 has data; the other columns are blank/zero/null. On Tuesday the first 2 columns are filled in, etc.

The original input params are start date & number of days, where number of days is constrained to 1…7.

For the rest of the example, lets’ ignore the difference between dates & days of the week. I’ll use DOW because it make the example read better. But we all understand the difference.

So right now we have a fairly complex query which eventually gets us to an intermediate result set like this (asuming our source data & parameters results in 2 categories & 3 days)


Category DayOfWeek Total
Alpha    Monday    110
Alpha    Tuesday   120
Alpha    Wednesday 130
Bravo    Monday    210
Bravo    Tuesday   220
Bravo    Wednesday 230

And we want to send back a final result set that looks like


Category Day1 Day2 Day3 Day4 Day5 Day6 Day7
Alpha    110  120  130  {}   {}   {}   {}
Bravo    210  220  230  {}   {}   {}   {}


Where {} stands for null or zero, we’re not sure which works better. Then the report definition in SSRS can trivially convert that into the desired display, blanking the null/zero fields.

For the first pass at a solution, we can assume that the upstream query always produces a record for each category for each date, even if the total for that category/date = 0. Bonus points for a solution which will insert zeros into the matrix where input records are missing. i.e. from the example above, assume the Bravo Tuesday record was not in the intermediate result set and we want the same output as above, with ‘0’ where ‘220’ is now.

The total number of categories is dozens, not thousands, & we have plenty of horsepower so the algorithm can be a messy if necessary.

Right now we can trivially see how to iteratively build up a temp table one record at a time by querying out each needed cell value from the intermediate table then inserting the 7 queried values into the 7 columns of the temp table. Like you’d attack the problem in a typical high level langauge.

But that just screams KLUDGE!!! to us. We’re thinking there ought to be a more “natively SQL-style” approach for lack of a better term.

Our Google-fu is weak these days. And we know this concept, translate a record set to a matrix, is a commonly needed one. And OLAP is out of bounds. We need to solve via plain T-SQL code.

Ideas anyone?

google CROSSTAB queries.

That would work. Or…

update FinalResult set Day1 = Total
where DayOfWeek = ‘Monday’ and a.Category = b.Category
from FinalResult a
inner join DayOfWeek b
on (whatever)

But you’d have to INSERT INTO FinalResult first, then you’d have to run the above query seven times.

The following tutorial works in Oracle - I believe the DECODE function is unique in Oracle, however there is an IF/THEN/ELSE construct in SQL Server, right?

The crosstab works with known column values - if you don’t anticipate certain results they will end up in the “Other” column.

I would now regard this as a curiosity since there are Data Warehouse functions that will produce dazzling results. But here it is anyway.

BTW - Decode is DECODE(expression, match1, replacewith1, match2, replacewith2… matchN, replacewithN, NoMatchReplaceWith)



SQL> set echo on
SQL> rem ============ raw data from tables: emp, training, emp_training ============
SQL> select * from emp order by ename;
     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO
---------- ---------- --------- ---------- ------------------- ---------- ---------- ----------
      7876 ADAMS      CLERK           7788 1987-07-13 00:00:00       1100                    20
      7499 ALLEN      SALESMAN        7698 1981-02-20 00:00:00       1600        300         30
      7698 BLAKE      MANAGER         7839 1981-05-01 00:00:00       2850                    30
      7782 CLARK      MANAGER         7839 1981-06-09 00:00:00       2450                    10
      7902 FORD       ANALYST         7566 1981-12-03 00:00:00       3000                    20
      7900 JAMES      CLERK           7698 1981-12-03 00:00:00        950                    30
      7566 JONES      MANAGER         7839 1981-04-02 00:00:00       2975                    20
      7839 KING       PRESIDENT            1981-11-17 00:00:00       5000                    10
      7654 MARTIN     SALESMAN        7698 1981-09-28 00:00:00       1250       1400         30
      7934 MILLER     CLERK           7782 1982-01-23 00:00:00       1300                    10
      7788 SCOTT      ANALYST         7566 1987-07-13 00:00:00       3000                    20
      7369 SMITH      CLERK           7902 1980-12-17 00:00:00        800                    20
      7844 TURNER     SALESMAN        7698 1981-09-08 00:00:00       1500          0         30
      7521 WARD       SALESMAN        7698 1981-02-22 00:00:00       1250        500         30

14 rows selected.

SQL> select * from training;
TRAININGNO DESCRIPTION
---------- ------------------------------
        10 BASIC ORIENTATION
        20 YEARLY SAFETY REVIEW
        30 BUSINESS CONDUCT
        40 WORKING TOGETHER
        98 Oddball training 1
        99 Oddball training 2

6 rows selected.

SQL> select * from emp_training;
     EMPNO TRAININGNO TRAINDATE
---------- ---------- -------------------
      7369         10 2006-03-18 00:00:00
      7369         20 2006-03-19 00:00:00
      7521         10 2006-03-18 00:00:00
      7521         20 2006-03-19 00:00:00
      7521         40 2006-05-18 00:00:00
      7566         10 2006-03-18 00:00:00
      7566         20 2006-03-19 00:00:00
      7698         10 2006-03-18 00:00:00
      7698         20 2006-03-19 00:00:00
      7839         10 2006-03-18 00:00:00
      7839         20 2006-03-19 00:00:00
      7839         30 2006-04-26 00:00:00
      7902         20 2006-03-19 00:00:00
      7934         10 2006-03-18 00:00:00
      7934         20 2006-03-19 00:00:00
      7934         30 2006-04-26 00:00:00
      7934         40 2006-05-18 00:00:00
      7788         10 2006-03-18 00:00:00
      7876         10 2006-03-18 00:00:00
      7369         10 2005-03-17 00:00:00
      7521         10 2006-03-17 00:00:00
      7369         98 2006-01-08 00:00:00
      7934         99 2006-01-26 00:00:00

23 rows selected.

SQL> rem ============ find out who has taken training ===============================
SQL> select e.ename, t.description
  2  from emp e, training t, emp_training et
  3  where e.empno=et.empno and
  4      et.trainingno=t.trainingno
  5  order by 1,2;
ENAME      DESCRIPTION
---------- ------------------------------
ADAMS      BASIC ORIENTATION
BLAKE      BASIC ORIENTATION
BLAKE      YEARLY SAFETY REVIEW
FORD       YEARLY SAFETY REVIEW
JONES      BASIC ORIENTATION
JONES      YEARLY SAFETY REVIEW
KING       BASIC ORIENTATION
KING       BUSINESS CONDUCT
KING       YEARLY SAFETY REVIEW
MILLER     BASIC ORIENTATION
MILLER     BUSINESS CONDUCT
MILLER     Oddball training 2
MILLER     WORKING TOGETHER
MILLER     YEARLY SAFETY REVIEW
SCOTT      BASIC ORIENTATION
SMITH      BASIC ORIENTATION
SMITH      BASIC ORIENTATION
SMITH      Oddball training 1
SMITH      YEARLY SAFETY REVIEW
WARD       BASIC ORIENTATION
WARD       BASIC ORIENTATION
WARD       WORKING TOGETHER
WARD       YEARLY SAFETY REVIEW

23 rows selected.

SQL> rem ============ show employees even if they haven't taken training ===========
SQL> select e.ename, t.description
  2  from emp e, training t, emp_training et
  3  where e.empno=et.empno(+) and
  4      et.trainingno=t.trainingno(+)
  5  order by 1,2;
ENAME      DESCRIPTION
---------- ------------------------------
ADAMS      BASIC ORIENTATION
ALLEN
BLAKE      BASIC ORIENTATION
BLAKE      YEARLY SAFETY REVIEW
CLARK
FORD       YEARLY SAFETY REVIEW
JAMES
JONES      BASIC ORIENTATION
JONES      YEARLY SAFETY REVIEW
KING       BASIC ORIENTATION
KING       BUSINESS CONDUCT
KING       YEARLY SAFETY REVIEW
MARTIN
MILLER     BASIC ORIENTATION
MILLER     BUSINESS CONDUCT
MILLER     Oddball training 2
MILLER     WORKING TOGETHER
MILLER     YEARLY SAFETY REVIEW
SCOTT      BASIC ORIENTATION
SMITH      BASIC ORIENTATION
SMITH      BASIC ORIENTATION
SMITH      Oddball training 1
SMITH      YEARLY SAFETY REVIEW
TURNER
WARD       BASIC ORIENTATION
WARD       BASIC ORIENTATION
WARD       WORKING TOGETHER
WARD       YEARLY SAFETY REVIEW

28 rows selected.

SQL> rem ============ crosstab - part 1, raw data ==================================
SQL> select e.ename, et.trainingno
  2  from emp e, training t, emp_training et
  3  where e.empno=et.empno and
  4      et.trainingno=t.trainingno
  5  order by 1,2;
ENAME      TRAININGNO
---------- ----------
ADAMS              10
BLAKE              10
BLAKE              20
FORD               20
JONES              10
JONES              20
KING               10
KING               20
KING               30
MILLER             10
MILLER             20
MILLER             30
MILLER             40
MILLER             99
SCOTT              10
SMITH              10
SMITH              10
SMITH              20
SMITH              98
WARD               10
WARD               10
WARD               20
WARD               40

23 rows selected.

SQL> rem ============ crosstab - part 2, specific row values become cols ===========
SQL> select e.ename,
  2       decode(et.trainingno,10,1) t10,
  3       decode(et.trainingno,20,1) t20,
  4       decode(et.trainingno,30,1) t30,
  5       decode(et.trainingno,40,1) t40,
  6       decode( et.trainingno,10,0,20,0,30,0,40,0,1) oth
  7  from emp e, training t, emp_training et
  8  where e.empno=et.empno and
  9      et.trainingno=t.trainingno
 10  order by 1;
ENAME             T10        T20        T30        T40        OTH
---------- ---------- ---------- ---------- ---------- ----------
ADAMS               1                                           0
BLAKE               1                                           0
BLAKE                          1                                0
FORD                           1                                0
JONES               1                                           0
JONES                          1                                0
KING                1                                           0
KING                           1                                0
KING                                      1                     0
MILLER              1                                           0
MILLER                                                          1
MILLER                                               1          0
MILLER                         1                                0
MILLER                                    1                     0
SCOTT               1                                           0
SMITH               1                                           0
SMITH               1                                           0
SMITH                                                           1
SMITH                          1                                0
WARD                1                                           0
WARD                1                                           0
WARD                           1                                0
WARD                                                 1          0

23 rows selected.

SQL> rem ============= crosstab - part 3, collapse using sum and group by ==========
SQL> select ename,
  2       sum(t10) ORI,
  3       sum(t20) SAF,
  4       sum(t30) BUS,
  5       sum(t40) WRK,
  6       sum(oth) OTH
  7  from (
  8   select e.ename,
  9          decode(et.trainingno,10,1,0) t10,
 10          decode(et.trainingno,20,1,0) t20,
 11          decode(et.trainingno ,30,1,0) t30,
 12          decode(et.trainingno,40,1,0) t40,
 13          decode(et.trainingno,10,0,20,0,30,0,40,0,1) oth
 14   from emp e, training t, emp_training et
 15   where e.empno=et.empno and
 16         et.trainingno=t.trainingno
 17     )
 18  group by ename;
ENAME             ORI        SAF        BUS        WRK        OTH
---------- ---------- ---------- ---------- ---------- ----------
ADAMS               1          0          0          0          0
BLAKE               1          1          0          0          0
FORD                0          1          0          0          0
JONES               1          1          0          0          0
KING                1          1          1          0          0
MILLER              1          1          1          1          1
SCOTT               1          0          0          0          0
SMITH               2          1          0          0          1
WARD                2          1          0          1          0

9 rows selected.

SQL> rem ============= crosstab - variant: Yes/No indicator =======================
SQL> select ename,
  2       decode(sum(t10),0,'   ',' x ') ORI,
  3       decode(sum(t20),0,'   ',' x ') SAF,
  4       decode(sum(t30),0,'   ',' x ') BUS,
  5       decode(sum(t40),0,'   ',' x ') WRK,
  6       decode(sum(oth),0,'   ',' x ') OTH
  7  from (
  8   select e.ename,
  9          decode(et.trainingno,10,1,0) t10,
 10          decode(et.trainingno,20,1,0) t20,
 11          decode(et.trainingno,30,1,0) t30,
 12          decode(et.trainingno,40,1,0) t40,
 13          decode( et.trainingno,10,0,20,0,30,0,40,0,1) oth
 14   from emp e, training t, emp_training et
 15   where e.empno=et.empno and
 16         et.trainingno=t.trainingno
 17     )
 18  group by ename;
ENAME      ORI SAF BUS WRK OTH
---------- --- --- --- --- ---
ADAMS       x
BLAKE       x   x
FORD            x
JONES       x   x
KING        x   x   x
MILLER      x   x   x   x   x
SCOTT       x
SMITH       x   x           x
WARD        x   x       x

9 rows selected.

SQL> rem ============= crosstab - part 3, variant: show all employees ===========
SQL> select ename,
  2       decode(sum(t10),0,'   ',' x ') ORI,
  3       decode(sum(t20),0,'   ',' x ') SAF,
  4       decode(sum(t30),0,'   ',' x ') BUS,
  5       decode(sum(t40),0,'   ',' x ') WRK,
  6       decode(sum(oth),0,'   ',' x ') OTH
  7  from (
  8   select e.ename,
  9          decode(et.trainingno,10,1,0) t10,
 10          decode(et.trainingno,20,1,0) t20,
 11          decode(et.trainingno,30,1,0) t30,
 12          decode(et.trainingno ,40,1,0) t40,
 13          decode(et.trainingno,null,0,10,0,20,0,30,0,40,0,1) oth
 14   from emp e, training t, emp_training et
 15   where e.empno=et.empno(+) and
 16         et.trainingno=t.trainingno(+)
 17     )
 18  group by ename;
ENAME      ORI SAF BUS WRK OTH
---------- --- --- --- --- ---
ADAMS       x
ALLEN
BLAKE       x   x
CLARK
FORD            x
JAMES
JONES       x   x
KING        x   x   x
MARTIN
MILLER      x   x   x   x   x
SCOTT       x
SMITH       x   x           x
TURNER
WARD        x   x       x

14 rows selected.

SQL>
SQL> spool off


I usually do this as follows, not claiming it’s elegant, but it’s a habit:

select
category
,sum(case when DayOfWeek=‘Monday’ then Total else 0 end) as MonTot
,sum(case when DayOfWeek=‘Tuesday’ then Total else 0 end) as TueTot
etc.
From sometable
group by
category

If you’re using SQL Server 2005, look at the new PIVOT clause.

What you want is something along the lines of this:



SELECT * FROM IntermediateQuery

PIVOT
(
    SUM(Total)
    FOR [Day]
    IN ([Monday], [Tuesday], [Wednesday] ...)
)
AS p


Because you have a limited list of days, that would be the simplest way to do it, I think.

I’m also claiming the bonus points for my solution, since it will do almost exactly what you want, except that instead of 0 you will get a NULL in the matrix. As you noted, this can be dealt with in RS.

Of course, if you’re still using SQL Server 2000, then we need to re-work the whole thing without the PIVOT, and then we’re into the messy land of cross-tab queries. Hopefully it should be still reasonably simple, though, because you have a fixed set of categories and days of the week.

Thanks everyone.

We had arrived at something about like RaftPeople showed, which is a variant on K364’s idea once you allow for dialect differences.

We do use SQL Server 2005 and the real hot setup seems to be Devorin’s PIVOT approach.

Our shop uses SQL just enough to be dangerous, and so we’re kinda weak on the latest features.

I don’t need any further help here, but if anyone wants to keep playing around with this thread feel free.

Thanks again.