Originally posted by lapdog2001@Mar 3 2005, 02:33 PM
I remember from my stats class that a sample size of 40 or more is statistically relevant. Will the results be better with 16,000? Yes, but you could get away with 40!
LapDog  :9
[post=288089]Quoted post[/post]
Sample size selection is far more complex than that, and there is no single number that describes a statistically relevant sample size. Inexperienced people who run surveys sometimes use very simple formulas to determine sample size. However, these equations are often valid only for a simple binary choice, e.g. "Do you prefer Coke or Pepsi?", and only where some other assumptions are met. I often see pollsters misapply these sample size calculations to more complex situations and get bogus results, which they nevertheless report to the media as having some high level of confidence or small margin of error, based on incorrect use of a formula they don't fully understand.
Let's say you are trying to determine the average length of something, like a penis. If the length of the thing varies only a little (i.e. has a small standard deviation and a nice normal distribution), you need to take only a few samples to be reasonably confident that your sampling is representative. But if the length varies very widely (or if the distribution is weird), you need to take a lot more samples. The number of samples required is in fact approximately proportional to the square of the ratio of the standard deviation to the desired precision (assuming a normal distribution). Thus, if the variability (standard deviation) of the length doubles, you need to take 4 times as many samples.
The actual equation which applies in the penis-length case, in simplified form, is n = (Zae * s/B) squared, where n is the sample size, s is the standard deviation, B is the precision, and Zae is a parameter depending on the desired confidence level. For the commonly used 95% confidence level, Zae=1.96. (Loosely, the confidence level is how often an identical repeated study would produce an interval that captures the true value.) It is then necessary to apply two correction factors. The first is usually obtained from a table, and corrects for the fact that the simplified equation underestimates a bit. The second corrects for the fact that the total population is finite. An interesting thing is that you have to do a preliminary study first, to get some idea of the standard deviation, so that you can calculate the sample size for the real study. Then, after you have done your real study, you will have a better idea of the actual standard deviation, and can determine whether your sample was indeed adequate. Larger than calculated sample sizes are often used, to minimize the risk of discovering that the original sample was too small.
Example: Let's say you want to find the average penis length to a quarter inch precision, with 95% confidence. A preliminary study shows that the standard deviation is 1 inch. From the formula above, with s=1, B=0.25, and Zae=1.96, you get (1.96 * 1/0.25) squared = 61.5, which rounds up to 62 samples. The correction table gives the corrected sample size as 78. Because the total population of men with a penis is much, much larger than 78, you would not have to apply the finite population correction, as it only matters when the sample size is a significant portion of the total population.
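For anyone who wants to check the arithmetic, here is a minimal Python sketch of the simplified formula. The function names are made up for illustration. The table-based correction (62 to 78) is not reproduced, since it comes from a lookup table, and the finite population formula shown is the common textbook form n' = n / (1 + n/N), which I am assuming here rather than quoting from the reference.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided normal critical value (about 1.96 for 95% confidence)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def uncorrected_sample_size(s: float, B: float, confidence: float = 0.95) -> int:
    """Simplified formula n = (Z * s / B)^2, rounded up.

    The table-based correction mentioned in the post is NOT applied here.
    """
    z = z_for_confidence(confidence)
    return math.ceil((z * s / B) ** 2)


def finite_population_correction(n: float, population: float) -> float:
    """Assumed textbook form: n' = n / (1 + n / N)."""
    return n / (1 + n / population)


# The example from the post: s = 1 inch, B = 0.25 inch, 95% confidence.
n = uncorrected_sample_size(s=1.0, B=0.25)

# Doubling the standard deviation roughly quadruples the sample size.
n_double_sd = uncorrected_sample_size(s=2.0, B=0.25)

print(n, n_double_sd)

# The correction barely matters for a huge population...
print(finite_population_correction(78, 1e9))
# ...but matters a lot when the population is only 200.
print(finite_population_correction(78, 200))
```

With a population of 200 (a made-up number), the corrected size drops from 78 to about 56, which is why the correction only matters when the sample is a sizable share of the population.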
You can see this equation explained, along with the correction factors, courtesy of the U.S. Fish and Wildlife Service, here:
http://fire.fws.gov/ifcc/monitor/RefGuide/...m#EQUATION%20#1
The example I cited corresponds to the case of their equation 1, "Determining the necessary sample size for estimating a single population mean or a single population total with a specified level of precision."
However, this example made some assumptions. One of them was that there was a single population, that is, that all persons in the population are basically identical, with some random statistical variation among them. Of course, this isn't true in the real world, since there are different races, ethnic gene pools, selective breeding effects, environmental factors, etc., leading to a heterogeneous population. Therefore, this estimate of sample size isn't really valid. The real sample size necessary would be much larger when these effects are taken into account.
The sample sizes become much greater if the object of the research is to stratify a heterogeneous population. For example, if we ask the penis length survey participants a question about ethnic background, and then try to divide up the population on this basis, to answer some question like "Do Germans, Greeks or Poles have bigger cocks?" (my background is all three), we have stratified the population, and our sample size must be met in each stratum. This is a problem in trying to select a sample size in advance, since we don't know how many Greeks, Germans, or Poles will respond. So, we might have to do a preliminary study on this, too, so that we can choose a sufficient sample size to allow for stratification.
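To get a feel for how quickly stratification inflates the total, here is a rough Python sketch. The respondent shares are invented purely for illustration (in practice they would come from the preliminary study), and the function name is hypothetical. It only sizes the total so that the smallest stratum is expected to reach the per-stratum target; it guarantees nothing about the actual turnout.

```python
import math


def total_for_strata(n_per_stratum: int, shares: dict[str, float]) -> int:
    """Total respondents needed so the smallest stratum is EXPECTED
    to contain at least n_per_stratum people.

    shares maps each stratum to its assumed fraction of respondents.
    """
    smallest_share = min(shares.values())
    return math.ceil(n_per_stratum / smallest_share)


# Hypothetical shares, NOT real data.
assumed_shares = {"German": 0.50, "Greek": 0.25, "Polish": 0.25}

# 78 is the corrected per-stratum sample size from the example above.
total = total_for_strata(78, assumed_shares)
print(total)  # 312
```

So even with only three strata and fairly even shares, the total is four times the single-population figure, and a rarer group would push it much higher.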
Further, the sample size equation becomes more complicated if you are trying to measure something other than the simple mean of a single variable. For example, let's say you are trying to measure the difference in opinion at two times. This is more complex, because the reliability of the final answer depends on the variability of two different studies. You can see an example of this in Equation 2 of the above reference. Any time a study involves comparison of multiple variables that are interrelated, larger sample sizes are needed.
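As a rough illustration of why comparisons cost more, here is a Python sketch using a standard textbook formula for detecting a difference between two independent means: n per group = 2 * (s * (Za + Zb) / D) squared, where D is the smallest difference you want to detect and Zb depends on the desired power. This is my assumption of a typical form; it may not match Equation 2 of the reference exactly, and the 80% power figure is just a common default.

```python
import math
from statistics import NormalDist


def two_sample_size(s: float, min_difference: float,
                    confidence: float = 0.95, power: float = 0.80) -> int:
    """Per-group sample size to detect a difference between two means.

    Assumes independent samples, equal standard deviations, and
    normally distributed data (textbook approximation).
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (s * (z_alpha + z_beta) / min_difference) ** 2)


# Same numbers as the single-mean example: s = 1 inch, difference = 0.25 inch.
n_per_group = two_sample_size(s=1.0, min_difference=0.25)
print(n_per_group)  # 252 per group
```

Compare that with 62 for the single-mean case using the same standard deviation and precision: about four times as many per group, and there are two groups.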
The condom study is interesting because it includes both of these issues, plus others. The study takes several data sets on each participant, because it provides different sample condoms to use at different times (every two weeks, a fresh batch is sent). Since the object of the study is presumably to tell if one kind of condom is better than the others, the issue of measuring a difference comes into play.
Further, they probably intend to stratify the population. They ask a lot of questions about sex and sexual habits, in surprisingly frank ways. I think a lot of this is just informational, and is not likely to be used to stratify the population. However, there are obvious ways in which it is necessary to stratify, and they do ask questions relevant to these. For example, they ask if you fuck men or women (and in which orifice), what type of lubrication you use, how long your intercourse lasts, how hard you fuck, etc. Since these factors may affect condom performance, they probably will stratify on this basis for at least portions of the analysis. All told, 2000 men seems a pretty small sample size to me.