(I feel like I should know this but everything feels overly simplified or overly complicated)
We have 50 applicants for a job and a 5-person hiring team. Everyone on the team looks at each applicant’s resume and cover letter and scores every applicant from 0 to 5 on 20 different metrics, for a total of 100 possible points per applicant. HR wants to bring in the top cumulative scorers for interviews.
The problem I see is that some people on the hiring team have a mean total of around 30 and others have a mean of around 80. Using a cumulative score weights the decision toward scorers with a larger mean.
Possible solution 1: Rank the applicants 1-50 for each person on the hiring team prior to comparing. (Seems too simple, but if it’s pretty standard I’ll do it)
Possible solution 2: (I know just enough stats to get me in trouble) Use z-scores? Can I sum each applicant’s z-scores, computed from each team member’s individual mean and SD?
Possible solution 3: ??? I’d love something simple and defensible but not dumbed down to the point of uselessness.
Thanks
That’s what I’d do. This is the 1-dimensional case of the Mahalanobis distance.
Instead of using these “z-scores” on a judge’s totals, you might do this independently for each of the 20 metrics. Then you could have a 20-Dimensional Mahalanobis distance. 
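For anyone who wants to try the summed z-score idea on judges' totals, here is a minimal sketch. The `scores` layout (judge name mapped to a list of totals, one per applicant, same applicant order for every judge), the function name, and the numbers are all made up for illustration, and using the sample SD rather than the population SD is an assumption, not something settled in this thread.

```python
from statistics import mean, stdev

def summed_z_scores(scores):
    """Standardize each judge's totals, then sum the z-scores per applicant.

    scores: dict mapping judge name -> list of totals, one per applicant,
            in the same applicant order for every judge (assumed layout).
    """
    n_applicants = len(next(iter(scores.values())))
    totals = [0.0] * n_applicants
    for judge_scores in scores.values():
        m = mean(judge_scores)
        s = stdev(judge_scores)  # sample SD (assumed choice; pstdev would also work)
        if s == 0:
            continue  # a judge who gave every applicant the same total adds no information
        for i, x in enumerate(judge_scores):
            totals[i] += (x - m) / s
    return totals

# Hypothetical data: three applicants, one harsh judge and one lenient judge.
scores = {
    "harsh": [28, 30, 32],
    "lenient": [90, 80, 70],
}
raw_sums = [sum(col) for col in zip(*scores.values())]
print(raw_sums)                 # plain cumulative totals
print(summed_z_scores(scores))  # totals after per-judge standardization
```

In this made-up example the raw sums put the first applicant on top, while the standardized sums treat the two judges' opposite orderings as cancelling each other out.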
Thanks, **septimus**. I decided to wait on the 20-Dimensional Mahalanobis distance until HR really pisses me off.
The z-scores did a much better job of representing what everyone was “feeling” about the applicants. A huge improvement over the cumulative totals. I just hope HR doesn’t give any pushback; they aren’t the brightest bulbs.
I did the equivalent of this in a different context, where I had 70 judges, each scoring a subset of about 400 items, with a total of 4 judges for each item. It was really important to normalize the scores as you did. I compared rank ordering and Z scores. The results were very similar, which gave me enough confidence to not worry about it and use either one.
That’s good to know, JWT. I chose to use the Z scores because it limited any issues with ties. Rank ordering was producing a lot of ties because of the low variability in the scores.
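To see the tie issue side by side, here is a rough sketch that sums ranks and sums z-scores for the same totals. The helper names and the two judges' numbers are hypothetical, and giving tied scores their average rank is just one common convention assumed here.

```python
from statistics import mean, stdev

def average_ranks(values):
    """Rank from highest (rank 1) to lowest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the tied group
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def z_scores(values):
    """Standardize one judge's totals using the sample SD (assumed choice)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical totals for three applicants from two judges who disagree.
judges = [
    [10, 20, 60],
    [60, 30, 10],
]
rank_sums = [sum(r) for r in zip(*(average_ranks(j) for j in judges))]  # lower is better
z_sums = [sum(z) for z in zip(*(z_scores(j) for j in judges))]          # higher is better
print(rank_sums)
print(z_sums)
```

With these numbers all three applicants end up with the same rank sum, while the z-score sums still separate them, because z-scores keep the size of the gaps between scores and ranks throw it away.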