A hypothesis is a statement that has yet to be proved. In statistics, the hypothesis is about the value of a population parameter, and can be tested by carrying out an experiment or taking a sample of the population.
The test statistic is the result of the experiment, or the statistic calculated from the sample.
In order to perform a hypothesis test, two hypotheses are required:
The null hypothesis, H₀, is the one you assume to be correct.
The alternative hypothesis, H₁, describes how the parameter would differ if that assumption were wrong. It is what you are looking for evidence of.
A specific threshold for the probability must also be defined. Assuming H₀ is true, find the probability of getting a result at least as extreme as the test statistic. If this probability is lower than the threshold, there is sufficient evidence to reject H₀. If it is above the threshold, there is insufficient evidence to reject H₀. This threshold is known as the significance level, and is typically set at 1%, 5% or 10%.
When ending a hypothesis test, you must conclude by saying whether or not there is sufficient evidence to reject H₀. Do not say that you accept or reject H₁.
Critical Regions & Values
If the test statistic falls within the critical region, there is sufficient evidence to reject H₀. The critical value is the first value to fall inside the critical region. The acceptance region is the set of values that are not in the critical region; if the test statistic falls there, there is insufficient evidence to reject H₀.
The actual significance level is the probability of incorrectly rejecting the null hypothesis. What this actually means is that:
the actual significance level is the probability, assuming H₀ is true, that the test statistic falls in the critical region. For a discrete distribution it is usually slightly lower than the chosen significance level.
One- and Two-Tailed Tests
Hypothesis tests can be one-tailed or two-tailed. This refers to how many critical regions there are:
For a one-tailed test, H₁: p < ... or H₁: p > ... and there is only one critical region
For a two-tailed test, H₁: p ≠ ... so there are two critical regions, one on each 'tail'
See the examples below.
Hypothesis Tests on Binomial Distributions
Often, hypothesis tests are carried out on discrete random variables that are modelled with a binomial distribution.
A discrete random variable, X, is distributed as B(12, p). Officially, p = 0.45. However, there is a suspicion that the probability is, in fact, higher. Find, at the 5% significance level, the critical region and actual significance level of the hypothesis test that should be carried out.
Write out the hypotheses & test statistic
H₀: p = 0.45 H₁: p > 0.45 X∼B(12, p)
Since we are only looking at whether or not the probability is more than 0.45, it is a one-tailed test. Therefore, look for the first value of X for which the cumulative probability is more than 0.95 (1 - 0.05, the 5% significance level). The critical region starts at the next value up, because P(X ≥ c) = 1 - P(X ≤ c - 1).
When X∼B(12, 0.45), P(X ≤ 7) = 0.888 and P(X ≤ 8) = 0.964
The first value to have a cumulative probability of more than 0.95 is 8, so P(X ≥ 9) is less than 0.05:
The critical value is 9
The critical region is X > 8
Find the actual significance level
P(X ≥ 9) = 1 - P(X ≤ 8) = 1 - 0.964 = 0.036
0.036 = 3.6 %
The actual significance level is 3.6%
Write a conclusion
If 12 trials were carried out, and 9 or more of them were successful, there would be sufficient evidence to reject H₀, suggesting the probability is indeed higher than 0.45
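The search for a one-tailed critical region can also be checked numerically. The following is a minimal sketch using only the Python standard library; the function names binom_cdf and upper_critical_value are made up for this illustration, and it assumes the rule that the tail probability must not exceed the significance level.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def upper_critical_value(n, p, alpha):
    """Smallest c with P(X >= c) <= alpha, returned with that tail probability."""
    for c in range(n + 1):
        tail = 1 - binom_cdf(c - 1, n, p)  # P(X >= c)
        if tail <= alpha:
            return c, tail
    raise ValueError("no critical region exists at this significance level")


# H0: p = 0.45, H1: p > 0.45, n = 12, 5% significance level
c, actual = upper_critical_value(12, 0.45, 0.05)
print(f"critical region: X >= {c}")
print(f"actual significance level: {actual:.3f}")
```

For B(12, 0.45) this reports the region X ≥ 9 with an actual significance level of about 0.036.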
a. A manufacturer of kebab-makers (a kebab-maker-maker, if you will) claims that just 25% of the kebab-makers he makes make low quality kebabs. At the 10% significance level, find the critical region for a test of whether or not the kebab-maker-maker's claim is true for a sample of 12 kebab-makers.
Write out the hypotheses and test statistic
H₀: p = 0.25 H₁: p ≠ 0.25 X∼B(12, p)
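We do not know if the probability could be more or less than 0.25, so the test is two-tailed.
Therefore, divide the significance level by two, and find the critical region. The lower tail is made up of the values with a cumulative probability of less than 0.05. The upper tail starts one value above the first value with a cumulative probability of more than 0.95.
When X∼B(12, 0.25), P(X ≤ 0) = 0.032, P(X ≤ 1) = 0.158, P(X ≤ 5) = 0.946 and P(X ≤ 6) = 0.986
The critical region is in two parts, one at each 'tail' of the values: P(X ≤ 0) = 0.032 and P(X ≥ 7) = 1 - 0.986 = 0.014 are both below 0.05, whereas P(X ≥ 6) = 1 - 0.946 = 0.054 is not.
The critical region is X < 1, X > 6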
b. A random sample of 12 kebab-makers is taken, and 5 are found to make low quality kebabs. Does this imply the kebab-maker-maker is lying?
See if 5 is in the critical region
5 is not < 1, and it is below the start of the upper tail
5 does not lie within the critical region found in part a, so there is insufficient evidence to reject H₀. The sample gives no reason to doubt the kebab-maker-maker's claim.
Alternatively, find the probability of getting 5 or more
When X∼B(12, 0.25), P(X ≤ 4) = 0.842, so P(X ≥ 5) = 1 - 0.842 = 0.158
0.158 is more than 0.05 (half of the 10% significance level, because the test is two-tailed). Therefore, there is insufficient evidence to reject H₀, and no reason to doubt the kebab-maker-maker's claim.
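A two-tailed binomial critical region can be checked in the same way. This is a minimal sketch using only the Python standard library; the function names are made up for this illustration. It assumes the rule that each tail may hold at most half of the significance level. Some exam questions instead ask for tail probabilities as close as possible to that half, which can give a different region.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_tailed_region(n, p, alpha):
    """Values of X in each tail, with at most alpha/2 probability per tail."""
    half = alpha / 2
    lower, total = [], 0.0
    for k in range(n + 1):        # build the lower tail upwards from 0
        total += binom_pmf(k, n, p)
        if total > half:
            break
        lower.append(k)
    upper, total = [], 0.0
    for k in range(n, -1, -1):    # build the upper tail downwards from n
        total += binom_pmf(k, n, p)
        if total > half:
            break
        upper.append(k)
    return lower, sorted(upper)


# H0: p = 0.25, H1: p != 0.25, n = 12, 10% significance level
lower, upper = two_tailed_region(12, 0.25, 0.10)
observed = 5
print("lower tail:", lower)
print("upper tail:", upper)
print("reject H0:", observed in lower or observed in upper)
```

With 5 low quality kebab-makers observed, the value is in neither tail, so the sketch reports that H₀ is not rejected.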
Hypothesis Tests on Normal Distributions
You can carry out hypothesis tests on the mean of a normally distributed random variable by looking at the mean of a random sample taken from the overall population.
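To find the critical region or critical value, you need to standardise the test statistic. For a sample of size n from a population with mean μ and standard deviation σ, where the sample mean is X̄:
Z = (X̄ - μ) / (σ/√n)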
Then, you can use the percentage points table to determine critical regions and values, or you can use the inverse normal distribution function on a scientific calculator.
The kebabs that the kebab-maker makes have diameter D, where D is normally distributed with a mean of 4.80 cm. The kebab-maker is cleaned, and afterwards a sample of 50 kebabs is made and measured, to see if the mean of D has changed as a result of the cleaning. D is still normally distributed with standard deviation 0.250 cm.
Find, at the 5% significance level, the critical region for the test.
Write out your hypotheses
H₀: μ = 4.8 H₁: μ ≠ 4.8
Assume H₀ is true:
Sample mean of D, D̄ ∼ N(4.8, 0.25²/50)
Z = (D̄ - 4.8) / (0.25/√50)
Z ∼ N(0, 1)
The test is two-tailed, so the area in each tail should be 0.025 (half of 5%):
The corresponding values of Z are ±1.96, so solve for D̄ at each one
(D̄ - 4.8) / (0.25/√50) = -1.96
D̄ - 4.8 = -0.0693
D̄ = 4.731
(D̄ - 4.8) / (0.25/√50) = 1.96
D̄ - 4.8 = 0.0693
D̄ = 4.869
The critical region is when the sample mean is smaller than 4.731 cm or larger than 4.869 cm
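The critical values for the sample mean can be checked with the inverse normal function. This is a minimal sketch using NormalDist from the Python standard library; the function name mean_critical_values is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_critical_values(mu0, sigma, n, alpha):
    """Two-tailed critical values for the sample mean under H0: mu = mu0."""
    se = sigma / sqrt(n)                      # standard deviation of the sample mean
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    return mu0 - z * se, mu0 + z * se


# H0: mu = 4.8, sigma = 0.25, sample of 50, 5% significance level
low, high = mean_critical_values(4.8, 0.25, 50, 0.05)
print(f"reject H0 if the sample mean is below {low:.3f} or above {high:.3f}")
```

For the kebab diameters this gives critical values of about 4.731 cm and 4.869 cm.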
Hypothesis Tests for Zero Correlation
You can use a hypothesis test to determine whether the product moment correlation coefficient, r, of a sample indicates that there is likely to be a linear relationship in the wider population. The population correlation coefficient is written ρ, and the null hypothesis is H₀: ρ = 0.
Use a one-tailed test if you want to test whether the population ρ is > 0, or whether it is < 0
Use a two-tailed test if you want to see whether there is any sort of linear relationship, so ρ ≠ 0
The critical region can be determined using a table of critical values for the product moment correlation coefficient. It depends on the significance level and the sample size.
To calculate the product moment correlation coefficient of the sample, use your calculator (see notes sheet on regression & correlation).
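As a rough illustration of the comparison step, the sketch below calculates the sample coefficient and compares it with a critical value supplied by the user. The function names and the data are made up for this example. The critical value 0.5494 is an assumed table entry for a sample of 10 at the 5% level, one-tailed, and should be checked against your own table.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient r of paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def in_critical_region(r, critical_value, alternative):
    """alternative: 'greater' (rho > 0), 'less' (rho < 0) or 'two-sided' (rho != 0)."""
    if alternative == "greater":
        return r > critical_value
    if alternative == "less":
        return r < -critical_value
    return abs(r) > critical_value


# Made-up sample of 10 pairs, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.2, 7.8, 9.1, 9.9]
r = pmcc(xs, ys)

# Assumed table value for n = 10, one-tailed, 5% level; check your own table.
critical_value = 0.5494
print(round(r, 3), in_critical_region(r, critical_value, "greater"))
```

If the function returns True, r lies in the critical region and there is sufficient evidence to reject H₀: ρ = 0.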