"Aggregating" strings in SQL

I’m wondering if there’s a simple way to do the following. Essentially, I want to do an equivalent of a sum() on a character field, to get a list of its values. Say I have a classes table like:

ClassName  TeacherID  Students
Econ 101           1        50
Econ 102           1        30
Calc 201           2        22

I want to get a summary result of the classes for each teacher, like:

TeacherID  Classes
        1  Econ 101 Econ 102
        2  Calc 201

Is there anything similar to a select TeacherID, sum(Students) group by 1 or does it need to be more complicated?

If you just want to know how many classes a teacher teaches, you can just do a COUNT() with a GROUP BY. But if you want to know which classes, I think you’re going to have to use a pivot table of some sort.

If you absolutely have to do it in SQL, it’s a little tricky and the best solution is going to depend on which DBMS you’re using. On the other hand, it’s easy if you can use another scripting language of some kind to make the output. What are your constraints?

Which variant of SQL?

Straight SQL isn’t meant to lay out results the way that you are trying to do it. You are trying to create multiple fields out of a single field and that violates good database design. In essense, SQL doesn’t want you to force the classes to list side by side. It is built to list them as separate rows in one column.

That said, you can do what you are trying but it terrible to write and requires many layers of subqueries. I can tell you how to do it but it is an ugly solution and not that easy to write.

The data is in SQL Server, but the solution doesn’t have to be in pure SQL. I was planning to just write a little script to concatenate the strings, so that’s no problem, but I wanted to check if there was a tricky SQL method instead.

A while ago I had to do basically the opposite operation to split a list, and it turns out there’s a nice method with a table of numbers for that. Just wondering if there’s something similar here.

In SQL Server, if you know the maximum number of classes each teacher might have, it’s a pretty simple trick. It’s more complicated if your query has to support a large number of classes per teacher.

    SELECT TeacherID, 
           "1" + ', ' + isnull("2", '') + ', ' + isnull("3", '') + ', ' + isnull("4", '') + ', ' + isnull("5", '') + ', ' + isnull("6", '') + ', ' + isnull("7", '') + ', ' + isnull("8", '') AS Classes
      FROM ( SELECT TeacherID, ClassName, 
                    ROW_NUMBER() OVER (PARTITION BY TeacherID ORDER BY ClassName)
               FROM <your_table_here>) P ( TeacherID, ClassName, seq )
     PIVOT ( MAX( ClassName ) FOR seq IN ( "1", "2", "3", "4", "5", "6", "7", "8" ) ) AS P_ ;

(note, not tested. should work for up to 8 classes.)

This should also work with any other database that supports the pivot syntax, with maybe some modification. Or you can use a traditional pivot query. But you still need to know the max# of classes.

There is a way but you have to use subqueries as class name fields in your main SELECT clause. To list the classes in alphabetical order, you would choose the min(class name) for that professor id. It gets trickier after than. The second class has a subquery with a subquery of its own as part of the WHERE clause. In short, you select the min(class name) for that professor WHERE the class name isn’t the min(class name) for that professor. That gives you the second one one in the series. You keep going and adding more levels for each class down the line. I think you can only have seven layers of subqueries in some versions of SQL but that should be enough for classes per professor.

If you can understand what I just described, it is as long and ugly as it sounds. It will be a huge query when you are done likely several hundred lines so don’t say I didn’t warn you.

There are a lot of different ways to do aggregate concatenation, some more difficult than others. See here.

If you are on SQL Server 2005 or higher, using FOR XML might be the simplest. In your specific case, this should work.

SELECT t1.TeacherId,
   ( SELECT ClassName + ' ' 
       FROM zTest t2
      WHERE t2.TeacherId = t1.TeacherId
      ORDER BY ClassName
        FOR XML PATH('') ) AS Classes
  FROM zTest t1
  GROUP BY TeacherId 

  1. You can use a cursor to concatenate strings from a SELECT FROM GROUP BY query, storing the concatenated strings in a temporary table. Then, return the temporary table.

  2. Use another language/layer location. If you are building a method stack that is accessible through an API, just make the stored procedure or whatever return a GROUP BY, then do the transformation in Java, C#, or whatever you are using, and return the transformed data to the caller.

  3. You said you are using SQL Server. Can you use an Extended Stored Procedure?

I’d suggest that this is the simplest method by far. It’s not “pure” SQL, and it’ll only work on SQL Server 2005 or later (as TroutMan notes), but it does have the benefit of being quick and simple.

More ways to concatenate strings than you can shake a stick at: http://www.projectdmx.com/tsql/rowconcatenate.aspx. I’ve used several of these, but FOR XML beats all the others hands down for simplicity.

Thanks all for the information.

in programmish this would be called “concatenate string list using sql” or similar. Google returns for instance this page http://databases.aspfaq.com/general/how-do-i-concatenate-strings-from-a-column-into-a-single-row.html

A brief look at this page should convince you that doing the concatenation in java, php or whatever other actual client app that you are using is a much better idea than trying to keep it inside SQL or TSQL.

To each their own. I don’t see anything too complex or confusing about the method I provided above, which is included in the page you linked to. I guess it comes down to whatever language and environment you feel most comfortable in.