Statistics 101

mkirda · Feb 2, 2004

After arguing with Kalk about this over and over, and trying to get him to run the numbers so he could understand, I gave up.

Instead, I ran them myself, using numbers from Peter Rubec's published paper and population estimates provided by Kalkbreath.

Statistics 101 for Kalk.

Lesson #1: What sample size is required for a population of size N?

The required sample size depends on how certain and how precise you need the result to be.

First thing, you want to be reasonably certain of the results.
Typical scientific analyses look for a 'Confidence Level' of 95% or greater.
Moving from a confidence level of 95% to 99% doesn't require as great a sample size increase as might be expected.

Second thing, you want to be reasonably sure that the estimated average or mean is close to the true value.
In other words, the error bars should be reasonably small.

(You might have noticed this in presidential polls, where they talk about the margin of error as being plus or minus 3%...)

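Shrinking the error bars becomes quite 'costly' in terms of sample size.
Moving from 5% to 3% to 1% requires a much larger sample, regardless of confidence level, because the required sample size grows with the inverse square of the error bar: halving the error bar roughly quadruples the sample.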

In order to illustrate this, I have created the chart at the bottom of the post:

Read this as xxx/yyy: xxx is the sample size required for a 95% confidence level.
yyy is the sample size needed for a 99% confidence level.
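
For anyone who wants to reproduce a chart in this xxx/yyy format, here is a minimal Python sketch. It assumes the usual worst-case formula (a 50/50 split) with a finite population correction, z-values of 1.96 and 2.576, and rounding to the nearest whole fish. The names `sample_size` and `cell`, and the populations and error bars shown, are just my picks for illustration, so the output may differ slightly from the chart.

```python
# z-values for the two confidence levels (standard normal critical values)
Z = {95: 1.96, 99: 2.576}

def sample_size(population, margin, z, p=0.5):
    """Sample size needed to estimate a proportion within +/- margin."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2   # size for an infinite population
    n = n0 / (1 + (n0 - 1) / population)      # finite population correction
    return round(n)                           # assumption: round to nearest

def cell(population, margin):
    """One chart cell in xxx/yyy form: 95% size / 99% size."""
    return f"{sample_size(population, margin, Z[95])}/{sample_size(population, margin, Z[99])}"

for population in (50_000, 1_000_000):
    for margin in (0.05, 0.03, 0.01):
        print(f"N={population:>9,}  error bar={margin:.0%}  {cell(population, margin)}")
```

Note how the 99% figure is less than double the 95% figure in every cell, while tightening the error bar blows the numbers up much faster.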

Just for fun, I ran the numbers for what is required for a population of 50,000, with a confidence level of 95% and error bars that were higher than ideal, just to see the sample sizes required...

N=50,000 with 95% confidence and 15% error bars requires a sample size of just 43.
If you are willing to live with 20% error bars, you can sacrifice just 24 fish.
The same sample sizes are required for N=100,000.
For N=1,000,000, sample size is the same.
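
To check that population size barely matters here, a quick sketch under the same assumptions as a standard online calculator (worst-case 50/50 split, z = 1.96 for 95%, finite population correction). The function name `sample_size` is mine, and I round to the nearest whole fish; strictly rounding up would give 25 rather than 24 at the 20% error bar for the larger populations.

```python
def sample_size(population, margin, z=1.96, p=0.5):
    """Unrounded sample size for a proportion, with finite population correction."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2   # infinite-population size
    return n0 / (1 + (n0 - 1) / population)   # shrink slightly for finite N

for margin in (0.15, 0.20):
    for population in (50_000, 100_000, 1_000_000):
        n = sample_size(population, margin)
        print(f"N={population:>9,}  error bar={margin:.0%}  n={n:.2f}  (rounds to {round(n)})")
```

The unrounded values differ only in the second decimal place across the three populations, which is why the required sample is the same for all of them.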

Now, using the original data set provided by Peter, 7,703 fish were tested for cyanide during 1996 to 1999.
25% of these fish tested positive for cyanide.

We are going by Kalkbreath's assertion that 14,000,000 fish were exported per year during this time period.
Additionally, we are going to make an assumption that the sample size was the same each year.

In other words, roughly 2567 MO fish were tested each year out of a total of 14,000,000 being exported.

Based on this number, I can say that I am 99% confident that the percentage was 25%, plus or minus 3%.
Actually, when I run the numbers again, I can reduce the error bars to 2.5% with a sample size of 2662, a sample size we are extremely close to.
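
Here is a rough way to check those error bars yourself. This sketch assumes the normal approximation for a proportion, z = 2.576 for 99% confidence, and a finite population correction. The function name `margin_of_error` is mine. The worst-case figure uses a 50/50 split; the second figure plugs in the observed 25% instead, which is an extra calculation not in the numbers above.

```python
import math

def margin_of_error(sample, population, z, p=0.5):
    """Half-width of the confidence interval for a proportion."""
    standard_error = math.sqrt(p * (1 - p) / sample)
    fpc = math.sqrt((population - sample) / (population - 1))  # finite population correction
    return z * standard_error * fpc

# 2567 fish tested per year out of an assumed 14,000,000 exported
worst_case = margin_of_error(2567, 14_000_000, z=2.576)           # p = 0.5
observed = margin_of_error(2567, 14_000_000, z=2.576, p=0.25)     # p = observed 25%

print(f"worst case: +/- {worst_case:.2%}")
print(f"using the observed 25%: +/- {observed:.2%}")
```

Both come in under 3%, and the worst-case figure lands right around 2.5%.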

I hope that this chart will be useful for all in understanding Kalkbreath's arguments for what they are.

Regards.
Mike Kirda

Kalkbreath · Feb 2, 2004

Now run your numbers based on actual fish: 6.5 million damsels {45%} of 14,000, and 1850 samples? But 5 million of these damsels were represented by only two hundred fifty of the 1850 samples {twenty-five samples for five species}. That's 5 million fish and two hundred fifty samples. Now do the math.

mkirda · Feb 2, 2004

Kalkbreath said:
Now run your numbers based on actual fish: 6.5 million damsels {45%} of 14,000, and 1850 samples? But 5 million of these damsels were represented by only two hundred fifty of the 1850 samples {twenty-five samples for five species}. That's 5 million fish and two hundred fifty samples. Now do the math.

For N=5,000,000
95% confidence level with 6% error bar requires a mere 267 fish.

If you require a 250 fish sample size, it raises the error bar to 6.2%.
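
Both of those figures can be checked with a few lines of Python. This is a sketch assuming a worst-case 50/50 split, z = 1.96 for 95% confidence, and a finite population correction; the function names are mine.

```python
import math

Z_95 = 1.96          # critical value for 95% confidence
N = 5_000_000        # population of damsels in question

def sample_size(population, margin, z=Z_95, p=0.5):
    """Unrounded sample size needed for a given error bar."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    return n0 / (1 + (n0 - 1) / population)   # finite population correction

def margin_of_error(sample, population, z=Z_95, p=0.5):
    """Error bar you get from a given sample size."""
    fpc = math.sqrt((population - sample) / (population - 1))
    return z * math.sqrt(p * (1 - p) / sample) * fpc

print(f"sample needed for 6% error bar: {round(sample_size(N, 0.06))}")
print(f"error bar with 250 samples: {margin_of_error(250, N):.1%}")
```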
