Well, it’s cute. But what possible use is it to analyze a bimodal variable as if it were normally distributed? Isn’t that going to lead you seriously astray?
Before anybody points it out, I will admit that I’m pretty ignorant about probability and statistics. Which means I need to blindly trust statisticians a lot, and this video is making me wonder whether I should.

Well, if I can dredge this up correctly from years ago when I faked my way through calc-based statistics, the idea here is that the CLT is about how sample means vary from one sample to the next (roughly normally, even when the population itself isn’t normal). So it lets you come up with an estimate of the variation around your estimate of the population mean, not the real distribution of the real population. This is still useful because a lot of statistical tests are geared toward comparing samples to make inferences about the populations from which they’re drawn.
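If it helps to see that rather than take my word for it, here is a rough simulation sketch. Everything in it is made up for illustration: the bimodal population (two humps, at 0 and at 10), the sample size of 50, and the names `draw_bimodal` and `sample_mean`. It only keeps the mean of each sample and then checks how those means are spread out.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def draw_bimodal():
    """One draw from a made-up bimodal population: humps at 0 and at 10."""
    center = 0.0 if random.random() < 0.5 else 10.0
    return random.gauss(center, 1.0)

def sample_mean(n):
    """Mean of one sample of size n from the bimodal population."""
    return statistics.fmean([draw_bimodal() for _ in range(n)])

# Draw many samples of size 50 (an arbitrary choice) and keep only the means.
means = [sample_mean(50) for _ in range(2000)]

center_of_means = statistics.fmean(means)
spread_of_means = statistics.stdev(means)

# Fraction of sample means within one standard deviation of their center.
# For a normal distribution this fraction is about 0.68.
within_one_sd = sum(
    abs(m - center_of_means) <= spread_of_means for m in means
) / len(means)

print(center_of_means, spread_of_means, within_one_sd)
```

The population has two humps and almost nothing near 5, but the sample means should pile up around 5 in a single hump, with about two thirds of them within one standard deviation of the center. That is the only thing being treated as normal: the means, not the variable itself.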

So for instance, a t-test asks whether the means of two samples are more different from each other than you’d expect by chance. Even if you can’t completely describe the two populations, you can estimate how much the sample means would vary if you drew multiple samples over time. If the variation around your estimates of the means for your two samples doesn’t overlap (too much), you can conclude that the means of the populations from which you drew them are probably different.
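Here is a hedged sketch of that comparison, again on invented bimodal data. I am assuming the Welch form of the t statistic (difference in means divided by its estimated standard error) and using |t| > 2 as a rough rule of thumb for "different enough", which is close to the usual 5% cutoff at this sample size. The helper names `welch_t` and `one_t` and the shift of 5 are just for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def draw_bimodal(shift=0.0):
    """One draw from a made-up bimodal population, optionally shifted."""
    center = 0.0 if random.random() < 0.5 else 10.0
    return random.gauss(center, 1.0) + shift

def welch_t(a, b):
    """Welch's t statistic: difference in means over its estimated standard error."""
    se = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.fmean(a) - statistics.fmean(b)) / se

def one_t(shift, n=50):
    """t statistic for one pair of samples; the second population is shifted."""
    a = [draw_bimodal() for _ in range(n)]
    b = [draw_bimodal(shift) for _ in range(n)]
    return welch_t(a, b)

# Both samples from the SAME population: how often is |t| > 2 by chance alone?
same = [one_t(0.0) for _ in range(2000)]
false_alarm_rate = sum(abs(t) > 2 for t in same) / len(same)

# Second population shifted by 5 (arbitrary): how often is |t| > 2 now?
shifted = [one_t(5.0) for _ in range(2000)]
detection_rate = sum(abs(t) > 2 for t in shifted) / len(shifted)

print(false_alarm_rate, detection_rate)
```

When both samples come from the same population, a large |t| should be rare (roughly one time in twenty), and when the populations really differ it should show up nearly every time, even though neither population is anywhere near normal.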

Basically, the question you’re asking with the t-test is: I have two samples that have different means, but are they different enough that I’m sure I wouldn’t have gotten this difference just due to the variation among samples taken from the same population? Estimating the variation around the sample means gives you a way to answer that.

Er. If that makes any sense?

I think it helps — if I’m correct in understanding you. Are you saying that even though the mean may not accurately reflect the distribution, the t-test calculations contain measurement of central tendency and therefore will distinguish between the unimodal and bimodal populations?
