These comments were produced as part of the final project for my CI 431 class. They are intended to assist teachers in using this lesson.
1) Goals of project:
The goal of my lesson is to teach how to find confidence intervals and test for differences using the bootstrap method. Along the way, students investigate several statistical ideas, such as randomness, how to sample from a data set, and how to make decisions based on statistical evidence. My hope is that students come away able to formulate their own problems for which they have some sort of data and understand how to run an analysis using the bootstrap. I chose the bootstrap method because I believe it eliminates some of the more threatening facets of statistical analysis, such as choosing the proper formulas and looking up tabular values. The method also involves few complicated calculations, so it can be taught at lower levels that usually would not be introduced to statistical decision making.
2) Level:
This lesson is geared toward high school mathematics or statistics classes that have an interest in investigating statistical decision making. There are no special mathematical prerequisites, so classes at many levels can use this lesson.
3) Outline:
The ideas behind sampling - The first section of the lesson explores how to sample from a data set. The main assumption to take from this section is that, since we do not know anything about the population from which the data was taken, we treat our data as the population and take samples from it. Other ideas to take from this section are randomness and the difference between populations and samples. Sample exercises and questions can be found in the section.
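For teachers who want to see the sampling idea outside the spreadsheet, the following is a minimal Python sketch, not the module's Excel macro. It assumes the usual bootstrap convention of drawing with replacement, and the data set `heights` and the helper name `bootstrap_sample` are hypothetical, invented only for illustration.

```python
import random


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def bootstrap_sample(data, rng):
    """Draw one resample, the same size as the data, with replacement.

    The data set stands in for the unknown population, so each draw
    picks any original value with equal probability.
    """
    return [rng.choice(data) for _ in data]


# Hypothetical data set, invented for illustration only.
heights = [61, 64, 65, 66, 68, 69, 70, 72]

rng = random.Random(0)  # fixed seed so a class can reproduce the same draws
resample = bootstrap_sample(heights, rng)
print("resample:", resample)
print("mean of resample:", mean(resample))
```

Because values are drawn with replacement, some original values may appear more than once in a resample and others not at all, which is a useful point for class discussion.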
Creating confidence intervals - This section explores how to analyze the trials after repeated sampling is done. Students are expected to analyze the distribution of trials and determine the likelihood of certain trials occurring. The main ideas to take from this section are normal distributions, significant events, and how to formulate confidence intervals that help in making decisions based on the results. Sample exercises and questions can be found in the section.
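One common way to turn repeated trials into a confidence interval is the percentile method: sort the trial means and cut off equal tails. The sketch below assumes that method, which may differ from what the module's macros do; the data, the 90% level, the 1000 trials, and the function names are all hypothetical choices.

```python
import random


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def bootstrap_means(data, trials, rng):
    """Mean of each of `trials` resamples drawn with replacement."""
    return [mean([rng.choice(data) for _ in data]) for _ in range(trials)]


def percentile_interval(trial_means, confidence=0.90):
    """Keep the middle `confidence` share of the sorted trial means."""
    ordered = sorted(trial_means)
    tail = (1 - confidence) / 2
    low_index = round(tail * len(ordered))             # trials cut from the bottom
    high_index = round((1 - tail) * len(ordered)) - 1  # last trial kept at the top
    return ordered[low_index], ordered[high_index]


# Hypothetical data set, invented for illustration only.
heights = [61, 64, 65, 66, 68, 69, 70, 72]

rng = random.Random(1)
means = bootstrap_means(heights, 1000, rng)
low, high = percentile_interval(means, confidence=0.90)
print("90% interval for the mean:", low, "to", high)
```

With 1000 trials and a 90% level, the 50 smallest and 50 largest trial means fall outside the interval, which mirrors the idea of identifying unlikely trials by looking at the tails of the distribution.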
Sampling from two data sets - This section explores how to sample when we are given two data sets and some difference between them is of interest. The main assumption from this section is that we first assume there is no difference between the two (the null hypothesis) and then see whether repeated sampling supports or contradicts that hypothesis. Because we assume there is no difference, we sample from the two data sets combined. Another important assumption is that, once the samples are taken, we cannot assign certain trials as representative of one of the data sets. Therefore, for the analysis we are interested in finding which samples show absolute differences exceeding the difference between the original data sets. The main ideas to take from this section are the assumptions listed above. Sample exercises and questions can be found in the section.
Finding significant differences between two means - This section explores how to determine whether the difference between the two actual data sets is significant compared with the differences produced by sampling. Again, the distribution of the trials will be approximately normal, and it is important for the student to understand that the means of repeated samples tend toward a normal distribution as the sample size grows (the Central Limit Theorem). It is also important for the student to analyze the trials and decide what they consider significant when comparing the random sampling with the actual difference between the data sets. The main ideas from this section are significance, decision making, and normal distributions from sampling. Sample exercises and questions can be found in the section.
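The two-data-set procedure described in the last two sections can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the module's macro: the two groups are pooled, resamples are drawn with replacement from the pool, and the result is the proportion of trials whose absolute difference in means is at least as large as the observed one. The scores, the 2000 trials, and the function name `pooled_bootstrap_test` are hypothetical.

```python
import random


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pooled_bootstrap_test(group_a, group_b, trials, rng):
    """Return the observed absolute difference in means and the share of
    pooled resamples whose absolute difference is at least that large."""
    observed = abs(mean(group_a) - mean(group_b))
    pooled = group_a + group_b  # no-difference assumption: one combined data set
    at_least_as_large = 0
    for _ in range(trials):
        # Draw two fake groups of the original sizes from the pooled data.
        fake_a = [rng.choice(pooled) for _ in group_a]
        fake_b = [rng.choice(pooled) for _ in group_b]
        if abs(mean(fake_a) - mean(fake_b)) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / trials


# Hypothetical test scores for two classes, invented for illustration only.
class_a = [72, 75, 78, 80, 83, 85]
class_b = [65, 68, 70, 74, 76, 79]

rng = random.Random(2)
observed, proportion = pooled_bootstrap_test(class_a, class_b, 2000, rng)
print("observed difference in means:", observed)
print("share of trials at least that large:", proportion)
```

A small proportion suggests that a difference as large as the observed one rarely arises by chance when the groups really are alike. Where to draw the line for "small" is the judgment the lesson asks students to make for themselves.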
Performing their own analysis - This last section gives example problems for the student to do using the techniques described throughout the lesson. Here the student gets a feeling for how to model problems statistically and to make decisions based on their analysis.
4) Instructor commentary:
This module uses Excel 5.0 spreadsheets to explore the bootstrap method of statistical analysis. However, the user does not need to know how to use Excel to do the lesson. The only requirement is knowing how to push buttons that run specially created macros to simulate the statistical sampling and trials. The spreadsheet should be downloadable in either PC or Mac environments. The lessons are self-describing and contain questions that lead the student through the analyses and explore the main statistical ideas of the lesson (sampling, randomness, decision-making, modeling, etc.).
Organizing the class into small groups to proceed through the lesson is certainly possible and is recommended. Larger group discussions can also be effective for the questions that ask students to discuss certain ideas.
5) Self-Critique of the module: (1-10 rating scale)