Hi. I have no background in statistics, so please keep it fundamental for me. But here are some basic questions about statistics and standard deviation...

Assume a study of 20 samples (please don't mind the sample size) found that, for example, the average American plays 3 ± 2 hours of games a day.

But suppose we want to predict/forecast how many hours the whole US (say 300 million people) spends on gaming per year (say 300 days, for easy calculation). Of course the answer won't be 270 ± 180 billion hours per year. Firstly, that standard deviation would be too large for the estimate to be meaningful. And secondly, we expect that if you add that many samples together, the fluctuations should cancel each other out, so we should end up much closer to 270 billion hours rather than the huge 90–450 billion hour range that the standard deviation implies...
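To sanity-check that "fluctuations cancel out" intuition, here is a quick simulation sketch with made-up numbers (each person's daily hours drawn with mean 3 and SD 2, everyone independent; the truncation at 0 and the independence are my assumptions, not part of the study):

```python
import random
import statistics

random.seed(0)

# Assumed toy model: one person's daily gaming time, mean 3 h, SD 2 h,
# clipped at zero so nobody plays negative hours.
def person_hours():
    return max(0.0, random.gauss(3, 2))

n = 10_000          # number of "people" whose hours we sum
reps = 200          # redraw the whole population this many times

totals = [sum(person_hours() for _ in range(n)) for _ in range(reps)]

mean_total = statistics.mean(totals)
sd_total = statistics.stdev(totals)

# The SD of a sum of n independent draws grows like sqrt(n)*sigma,
# while the mean of the sum grows like n*mu, so the *relative*
# fluctuation shrinks like 1/sqrt(n).
print(mean_total / n)         # per-person average, close to 3
print(sd_total / mean_total)  # relative spread of the total: tiny
```

So the spread of the population total is nowhere near ±180 billion; relative to the total, it shrinks as the population grows.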

**So my first question is:** is there a name or a rule in statistics pointing out that if you want to use a small sample to predict a much larger population, it's better not to use the raw standard deviation, because it's misleading and useless if you do? And maybe it's better to use some other measure?

**My second question is:** what is the difference between, say, picking a small study sample and monitoring it for a much longer time (for example, picking 5 random people and recording how much they play every day for a month), versus picking 150 people and asking how many hours of games they played today? I know the first is skewed if you don't pick the right candidates (ones who can represent the whole population), while the latter is skewed if you pick the wrong time or place (when a blockbuster game has just been released, or during a weekend, which will show a much higher average gaming time). But is there a more academic answer that shows the difference between the two? If there is even a name for it, that would be perfect.

Thanks guys
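To make the second question concrete, here is a simulation sketch of the two designs. All the numbers are invented for illustration: people's personal averages differ with SD 1.5, and each person's day-to-day hours wobble with SD 1.0 around their own average:

```python
import random
import statistics

random.seed(1)

# Hypothetical variance components (assumed, not from any real study):
BETWEEN_SD = 1.5   # how much people's personal averages differ
WITHIN_SD = 1.0    # day-to-day noise around one person's average

def study_mean(n_people, n_days):
    """Average gaming hours over n_people, each observed for n_days."""
    obs = []
    for _ in range(n_people):
        personal_mean = random.gauss(3.0, BETWEEN_SD)
        for _ in range(n_days):
            obs.append(personal_mean + random.gauss(0, WITHIN_SD))
    return statistics.mean(obs)

# Repeat each design many times and see how much its estimate wobbles.
reps = 2000
a = [study_mean(5, 30) for _ in range(reps)]    # 5 people, 30 days each
b = [study_mean(150, 1) for _ in range(reps)]   # 150 people, 1 day each

print(statistics.stdev(a))  # larger: only 5 independent people
print(statistics.stdev(b))  # smaller: 150 independent people
```

Both designs collect 150 observations, but the 5-person design's estimate wobbles much more from run to run, because the 30 days from one person are correlated with each other; only the number of independent people really drives down the between-person uncertainty.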
