Simple statistics question

Jragon · December 15, 2014, 1:49am

Except ultrafilter has been correcting for small finite populations this entire time. Everything he’s said has been along the lines of “the proportion of the sample size to the population…” In fact, I think you both agree about Mary’s uncertainty with 50 samples. It’s whatever you’re doing around 30 samples where somehow John has a better estimate that’s confusing.

ultrafilter · December 15, 2014, 2:10am

I’m not willing to take your word for this. Please provide a cite.

OldGuy · December 15, 2014, 3:16am

I’m really not sure what you wish a cite for?

Mary wants to know the population mean which is defined as (x[sub]1[/sub]+ … + x[sub]50[/sub])/50.

She knows x[sub]1[/sub] + … + x[sub]30[/sub] exactly after her sample of 30.

A standard assumption about each of the unknown x[sub]31[/sub], … , x[sub]50[/sub] is that they have mean
m = (x[sub]1[/sub]+ … + x[sub]30[/sub])/30 and variance v = (x[sub]1[/sub][sup]2[/sup] + … + x[sub]30[/sub][sup]2[/sup] - 30m[sup]2[/sup])/29 (or divide by 30 if you wish the MLE).

Assuming the remaining x’s are independent, var[x[sub]31[/sub] + … + x[sub]50[/sub]] = 20v.

Var[k*z] = k[sup]2[/sup] var[z] for any constant k

so var[(x[sub]1[/sub] + … + x[sub]50[/sub])/50] = var[(x[sub]1[/sub]+ … + x[sub]30[/sub])/50] + var[(x[sub]31[/sub] + … + x[sub]50[/sub])/50] = 0 + var[(x[sub]31[/sub] + … + x[sub]50[/sub])/50] = 20v/50[sup]2[/sup]

Which step(s) do you want a cite for?
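For anyone who wants to follow the arithmetic in the steps above, here is a minimal sketch. The 30 observed values are made up for illustration (nothing in the thread supplies data), and the code only reproduces the calculation as stated, under the same assumption that the 20 unseen values are independent with variance v.

```python
from statistics import mean, variance

# Hypothetical sample of 30 observed values (illustrative only, not from the thread).
observed = [float(i % 7) for i in range(30)]
N, n = 50, len(observed)

m = mean(observed)       # m = (x1 + ... + x30) / 30
v = variance(observed)   # v with divisor n - 1 = 29, as in the post

# Variance of the sum of the N - n unseen values, assumed independent with variance v.
var_unseen_sum = (N - n) * v

# Dividing a sum by N scales its variance by 1 / N**2.
var_population_mean = var_unseen_sum / N**2

print(m, v, var_population_mean)
```

The observed part contributes zero variance because it is known exactly, so only the unseen 20 values enter the result.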

ultrafilter · December 15, 2014, 3:42am

This is not the variance of Mary’s estimator. The variance of Mary’s estimator depends only on the observations that she’s seen, and assuming independence, has absolutely no relationship to the unobserved part of the population.

Mary’s estimator is (x[sub]1[/sub] + x[sub]2[/sub] + … + x[sub]30[/sub])/30, and its variance, assuming that she’s sampling from an infinite population, is Var(x[sub]1[/sub]) / 30. Since she’s not sampling from an infinite population, we have to multiply by the finite population correction factor that I described above.
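A small exact check of this claim can be sketched as follows. The population values and sizes are hypothetical, and the factor (N-n)/(N-1) is the usual textbook form of the correction when the population variance uses divisor N; that it matches the factor "described above" is an assumption, since that earlier post is not shown here.

```python
from itertools import combinations
from statistics import mean, pvariance

# Hypothetical small population (N = 6) and sample size (n = 3); illustrative values only.
population = [2.0, 4.0, 4.0, 7.0, 9.0, 10.0]
N, n = len(population), 3

# Exact variance of the sample mean over all equally likely samples
# drawn without replacement (every 3-element subset of the population).
sample_means = [mean(s) for s in combinations(population, n)]
exact_var = pvariance(sample_means)

sigma2 = pvariance(population)   # population variance, divisor N
infinite_pop_var = sigma2 / n    # Var(x1) / n, ignoring the finite population
fpc = (N - n) / (N - 1)          # assumed textbook finite population correction
corrected_var = infinite_pop_var * fpc

print(exact_var, infinite_pop_var, corrected_var)
```

In this sketch the corrected value agrees with the exact enumeration, while the uncorrected Var(x[sub]1[/sub])/n overstates the variance.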

This is really basic stuff that even a non-mathematical introductory textbook will cover. The book I linked earlier by Friedman et al. is a really fantastic discussion of the concepts of statistics and I recommend it to anyone who’s willing to put in the work to understand them.

OldGuy · December 15, 2014, 4:53am

I apologize. I couldn’t see the error because my math was correct. My assumption was wrong. In case anyone still cares:

What we want when sampling n out of N is not the variance of (x[sub]1[/sub] +…+ x[sub]N[/sub])/N but the variance of the estimator’s error, which is (x[sub]1[/sub] +…+ x[sub]N[/sub])/N - (x[sub]1[/sub] +…+ x[sub]n[/sub])/n. This is the variance of (x[sub]n+1[/sub] +…+ x[sub]N[/sub])/N + (n-N)(x[sub]1[/sub] +…+ x[sub]n[/sub])/(nN). Assuming independence, as usual, a little algebra gives the variance to be

[(N-n)/N]v/n

This is the finite population correction mentioned above, except that if you don’t know the population variance v you use v/(n-1) rather than v/n.
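The "little algebra" can be checked exactly. Under the independence assumption, the error is a weighted sum of the x's, so its variance is v times the sum of the squared weights: each sampled x carries weight 1/N - 1/n and each unsampled x carries weight 1/N. The function name below is made up for this sketch.

```python
from fractions import Fraction

def error_variance_factor(N: int, n: int) -> Fraction:
    """Sum of squared weights in the estimator error, so Var(error) = factor * v."""
    seen = Fraction(1, N) - Fraction(1, n)   # weight on each sampled x, equal to (n-N)/(nN)
    unseen = Fraction(1, N)                  # weight on each unsampled x
    return n * seen**2 + (N - n) * unseen**2

N, n = 50, 30
factor = error_variance_factor(N, n)
closed_form = Fraction(N - n, N) / n   # [(N-n)/N] / n

print(factor, closed_form)   # prints 1/75 1/75
```

For Mary's case (n = 30, N = 50) both expressions give v/75, and the factor drops to zero when n = N, as it should when the whole population has been observed.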
