# Statistics: is there a kind of regression that does this?

There’s a kind of regression I think should exist but I don’t know what it’s called. Or perhaps there’s actually just a way of accomplishing this with a general linear model, and I simply don’t see how. Or maybe what I’m picturing wouldn’t work.

Imagine a big population of things that have a variety of properties. But imagine further that they tend to derive from a small population of prototypical things, each with particular values of these same properties. I want to try to understand each member of the big population as a mix of the prototypes in the small population.

Here’s an example. Suppose the things are people with skills, and the skills are taught in university courses. But we don’t think of the people as having had this course, that course, et cetera. We think of them as having a degree in this field or that field. Well, all the people with a degree in a certain field, say electrical engineering, will tend to have had many similar courses, but not exactly. Still, you’ll find they tend to have a skill set different from people with a degree in nursing.

What I want to do is, first, apply the skill measurement tests to all the electrical engineers, all the nurses, and all the other graduates on a field-by-field basis, so I have an average skill profile for each field. Then I want to look at each person’s skill results and say this person is best fit by adding together this much electrical engineer, this much nurse, and so forth.
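The two steps above can be sketched as an ordinary least-squares fit with the field averages as predictors. This is only one possible reading of the idea, not an established named method; the skill names, the scores, and the variable names (`scores`, `fields`, `person`) are all made up for illustration.

```python
import numpy as np

# Hypothetical test scores; columns: circuits, programming, anatomy, patient care.
scores = np.array([
    [9.0, 8.0, 1.0, 2.0],   # electrical engineer
    [8.0, 7.0, 2.0, 1.0],   # electrical engineer
    [1.0, 2.0, 9.0, 8.0],   # nurse
    [2.0, 1.0, 8.0, 9.0],   # nurse
])
fields = np.array(["ee", "ee", "nurse", "nurse"])

# Step 1: the prototype for each field is its average skill profile.
field_names = np.unique(fields)  # sorted: ["ee", "nurse"]
prototypes = np.array([scores[fields == f].mean(axis=0) for f in field_names])

# Step 2: explain one person's profile as a weighted sum of the prototypes.
# This person is constructed as 70% "ee" + 30% "nurse" so the fit is exact.
person = 0.7 * prototypes[0] + 0.3 * prototypes[1]

# Least squares: find weights minimizing ||prototypes.T @ weights - person||.
weights, _, _, _ = np.linalg.lstsq(prototypes.T, person, rcond=None)

for name, w in zip(field_names, weights):
    print(f"{name}: {w:.2f}")
```

Plain least squares does not stop a weight from going negative or force the weights to sum to 1. If "this much electrical engineer" has to be a nonnegative share, a constrained fit such as nonnegative least squares would be the closer match.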

Or, for another example, suppose I have baked goods made of flour, sugar, egg, cinnamon, salt, yeast, et cetera. I want to analyze all of these and average together all the instances of donut, all of cake, all of bread, and all of crackers, and these averages become my prototypes. Then when I am confronted with an instance of cookie, I can say it’s midway between cake and crackers.

Does this sound like a special kind of regression that has a name? Or is all of this just muddled somehow, or obviously a common kind of regression?

Thanks!

Thanks, Sage, but that identifies groups. I want to attribute an individual’s blend of properties to contributions from each group.

Look at faces. There’s eye separation, size of eyebrows, prominence of chin, etc. I want to look at my Smith ancestors and my Jones ancestors, and wind up attributing 63% of my face to the Smiths and 37% to the Joneses. That’s the method I want.
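With only two prototypes, the share can be written down directly: find the weight `w` between 0 and 1 that makes `w * smith + (1 - w) * jones` closest to my own measurements. A minimal sketch follows; the function name `blend_two` and the face measurements are invented for illustration.

```python
def blend_two(x, a, b):
    """Return w in [0, 1] minimizing the squared distance
    between x and the blend w*a + (1 - w)*b."""
    d = [ai - bi for ai, bi in zip(a, b)]                      # direction from b to a
    num = sum((xi - bi) * di for xi, bi, di in zip(x, b, d))   # (x - b) . (a - b)
    den = sum(di * di for di in d)                             # |a - b|^2
    w = num / den                                              # unconstrained optimum
    return min(1.0, max(0.0, w))                               # clip to a valid share


# Hypothetical average measurements: eye separation, eyebrow size, chin prominence.
smith = [60.0, 8.0, 12.0]
jones = [70.0, 4.0, 20.0]

# A face built as 63% Smith + 37% Jones, to check that the share is recovered.
me = [0.63 * s + 0.37 * j for s, j in zip(smith, jones)]

w = blend_two(me, smith, jones)
print(f"Smith: {w:.0%}, Jones: {1 - w:.0%}")
```

A real face will not lie exactly on the line between the two family averages, so the leftover distance after the fit shows how much of the face neither family explains.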

I think cluster analysis would sort my family into a cluster of Smiths and a cluster of Joneses, and would most likely put me with the Smiths.

This may help:

Perhaps you have not asked a statistical question? If you average some points to produce a well-defined Smith point, and some other points to get a Jones point, and so on, you just end up with a well-defined set of points. Then any point in their convex hull can be represented as a convex combination of the extreme points, that is, a linear combination whose coefficients are nonnegative and sum to 100%.
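For a concrete picture of those coefficients, here is a small sketch with three prototype points in the plane. The coordinates and the third family ("Brown") are invented for illustration. When the prototypes are affinely independent, as these three are, the coefficients are unique and come from a single linear solve. With more prototypes than that, several different blends can describe the same point.

```python
import numpy as np

# Hypothetical prototype points in a 2-D property space.
prototypes = np.array([
    [0.0, 0.0],   # Smith
    [4.0, 0.0],   # Jones
    [0.0, 4.0],   # Brown
])
point = np.array([1.0, 2.0])

# Unknown weights w satisfy:  prototypes.T @ w = point  and  sum(w) = 1.
# Stack both conditions into one square system and solve it.
A = np.vstack([prototypes.T, np.ones(len(prototypes))])
b = np.append(point, 1.0)
w = np.linalg.solve(A, b)

# The point lies in the convex hull exactly when no weight is negative.
inside = bool(np.all(w >= -1e-12))

print(w, inside)
```

For this point the weights are 25% Smith, 25% Jones, and 50% Brown. A point outside the triangle would still get weights summing to 100%, but at least one of them would be negative.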