Lab 4 Instructions
Lab 4: What Do Confidence and Level of Significance Really Mean?
As a class, we will analyze what the confidence in a confidence interval and the level of significance in a hypothesis test really mean. You will both submit a paper in Canvas and record your four confidence intervals and four hypothesis test results in the Google Doc file found here. We will look at the results as a class. Submit your answers in Lab 4 Answers.
- Go to the data set posted here. The data was taken from a lung cancer study, so happy stuff.
- You or your group will take random samples of size 40 from the data set. Record the patient number, age, and gender of your sampled individuals. There are 2,538 data points (individuals) from which you will be sampling. If you do this as a group of four, for example, each group member will take one sample. You will take these samples in four ways:
- The first sample will use the systematic sampling method. Choose a k somewhere between 30 and 150. Choose a random starting data point (it doesn’t have to be in the first block of data) and collect your sample by taking every kth individual from there. If you reach the end of the data set and you don’t have a sample size of 40, return to the beginning block and continue.
- The second sample will be gathered using cluster sampling. Split the data into groups of five and use a random number table (in a book or online) or the random number generator found here to randomly select 8 of the groups to include in your sample. How you use the random number table to do this is up to you.
- The third sample will use the stabbing method. Scroll to a random block of data, stab the block (with your finger!) and select the individual you stabbed. Repeat until you have 40 individuals.
- The fourth sample will be gathered using a method of your choice (not a repeat of the other three). You must describe your sampling design.
- What are the mean and sample standard deviation of the age of your sampled individuals in each of the four samples? What is the proportion of females in each of your four samples?
- For each sample, you will calculate a 90% confidence interval for the mean age of individuals in the study. Since all samples are n=40, all the critical values will be the same, so this should go quickly. List all the confidence intervals you found.
- If all the confidence intervals overlap, does that imply that the population mean must be in that overlap?
- What percentage of the confidence intervals generated by the class for this lab will fail to contain the population mean?
- I claim 49.5% of the study population is female. For each sample, conduct a hypothesis test to see if I am wrong, that is, to see if the population proportion of female patients is not 49.5%. Use a 5% level of significance. State your conclusion for each hypothesis test.
- What is a Type I error? Of all the hypothesis tests done by this class for this lab, what percentage will commit a Type I error?
- How many of your hypothesis tests agree? If different samples gave you different hypothesis test results (i.e., some rejected the null hypothesis and some did not), does that mean your sampling design for the incorrect hypothesis test(s) was faulty?
- In a Word file, include your calculations (use the equation editor included in Word to enter them) for the four confidence intervals and your four hypothesis tests in full. Include your four samples with details on how you gathered them. Also copy and paste the blocks of data from your four samples into the Word file.
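If you would like to check your row selection for the systematic and cluster samples, the two designs can be sketched in Python as picking positions 1 through 2,538 in the data set. This is only an illustration, not a required method: the function names are made up for this sketch, and the choice to leave out the last three rows when forming groups of five (2,538 is not a multiple of 5) is an assumption you are free to handle differently.

```python
import random

N = 2538          # individuals in the data set (from the lab description)
SAMPLE_SIZE = 40  # required sample size

def systematic_sample(k, start, n=SAMPLE_SIZE, population=N):
    """Take every k-th position beginning at `start` (1-based),
    wrapping back to the beginning when the end of the data is reached."""
    return [(start - 1 + i * k) % population + 1 for i in range(n)]

def cluster_sample(rng, cluster_size=5, clusters=8, population=N):
    """Randomly choose whole groups of consecutive positions.
    Assumption: only the 507 complete groups of five are eligible."""
    n_clusters = population // cluster_size
    chosen = rng.sample(range(n_clusters), clusters)  # distinct groups
    return sorted(c * cluster_size + j + 1
                  for c in chosen
                  for j in range(cluster_size))

# Example use with hypothetical choices of k, start, and seed
rows_systematic = systematic_sample(k=60, start=2500)
rows_cluster = cluster_sample(random.Random(4))
```

For some values of k (for example k = 94), wrapping around the data set brings you back to a position you already selected, so it is worth checking your 40 positions for repeats.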
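The confidence interval and hypothesis test calculations above can also be checked with a short script. The sketch below assumes the usual formulas: sample mean plus or minus critical value times s/√n for the interval, and a two-sided one-proportion z-test with the standard error computed from the claimed 49.5%. Whether your critical value comes from the t distribution with 39 degrees of freedom (about 1.685 from a table) or from the normal distribution (about 1.645) depends on what we used in class, so it is passed in as an argument. The function names are hypothetical, and your handwritten work in Word is still what gets graded.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_ci(ages, crit):
    """Confidence interval for the mean: x-bar +/- crit * s / sqrt(n).
    `crit` is the critical value you looked up (t or z)."""
    n = len(ages)
    x_bar = mean(ages)
    margin = crit * stdev(ages) / sqrt(n)  # stdev is the sample std. dev.
    return x_bar - margin, x_bar + margin

def proportion_z_test(females, n, p0=0.495, alpha=0.05):
    """Two-sided z-test of H0: p = p0 against Ha: p != p0."""
    p_hat = females / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha  # True means reject H0

# z critical value for 90% confidence (5% in each tail)
z_crit = NormalDist().inv_cdf(0.95)

# Example use with a made-up count, not real lab data
z, p_value, reject = proportion_z_test(females=24, n=40)
```

Pass your own list of 40 ages to `mean_ci` and your own count of females to `proportion_z_test`, then compare the output with the calculations you typed into Word.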