Information Works! 2000

Technical Brief
Statistical Significance

Researchers want to know, and to be able to say with some degree of confidence, whether the relationships they have found among various types of data differ from relationships they would find solely due to chance. Statistical significance is a measure of the degree of confidence we have in a relationship. Most researchers are willing to declare a relationship statistically significant if the chance of observing a relationship at least that strong in the sample, when chance alone is at work, is less than 5%, assuming no other factors are affecting the data set. (Statistical modeling is based only on the factors included in the model and by its artificial nature automatically excludes all other factors.) In other words, a relationship is considered statistically significant if it is stronger than 95% of the relationships among the selected variables that we would expect to see just by chance.
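One way to make the 5% rule concrete is a permutation test, sketched below in Python. This is an illustration only, not the method used for the school reports: the data values, the function names, and the choice of 2,000 shuffles are all assumptions made for the example. The idea is to shuffle one variable many times, which destroys any real relationship, and count how often chance alone produces a relationship at least as strong as the one actually observed.

```python
import random


def pearson_r(xs, ys):
    """Correlation coefficient: strength of the straight-line relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Share of random shufflings whose relationship is at least as strong
    as the observed one (a two-sided permutation test)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_strong = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_strong += 1
    return at_least_as_strong / trials


# Hypothetical data: mother's education level (coded 1-5) and test score.
mother_education = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
achievement = [48, 52, 50, 55, 57, 54, 60, 63, 61, 66]

p = permutation_p_value(mother_education, achievement)
print("chance of a relationship this strong by chance alone:", p)
print("statistically significant at the 5% level:", p < 0.05)
```

For these made-up numbers the relationship is strong, so very few shufflings match it and the result falls well under the 5% threshold.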

Thus, on the second field of the school report charts, the band (range) illustrating the statistical model's projection represents this 95% confidence level; that is, there is no more than a 5% possibility that a school's actual scores would lie outside the band due solely to chance. This confidence level for the model (represented by the range of the band) includes not only the actual numerical calculations for scores but also the statistical errors that are part of the model. (Please note that all models have statistical errors and take them into account during calculations.) However, because this model is based on statistical probability, there is the possibility that a school could lie above or below the band this year, or even in subsequent years, solely by chance.

One goal of the SALT initiative is to shift the fundamental paradigm of school improvement in Rhode Island toward a blend of a rich variety of data sources for measuring, improving, and judging school performance. This means that the statistical model alone should not be used to assess a school but must be coupled with other data from independent sources that confirm that the results are not due to chance. For example, by using a combination of SALT survey data, the observations of independent observers in the course of SALT visits, analysis of selected samples of actual student work, and other forms of local student assessment results, an observer could confirm that these "adjusted" assessment results are not due to chance. The RI Skills Commission is working on a similar paradigm shift at the level of the individual high school student in its efforts to design a Certificate of Initial Mastery (CIM) that credentials a student's achievement on the basis of having observed a rich array of data that demonstrates convincingly that the student has the requisite competencies for life, living and employment in the 21st century.

The meaning of statistical significance also holds true for any data results for individual students, groups of students, or groups of schools. As a rule of thumb, for example, statisticians routinely conduct their studies accounting for the fact that 5% of any data set is likely flawed due to entry or other data errors. Statistical probability underscores why it is always inadvisable to judge schools solely, for example, on the results of state testing programs, which by their very nature are limited in time, scope, and complexity and, above all else, are themselves subject to the laws of probability and statistics.

Please note that statistical significance does not mean that two variables have a relationship that is necessarily of practical importance. For example, a school may sit just outside the top of a band several years in a row. Quite possibly the school's staff and its parents might attach very different values to the three-percentage-point difference between being within or just outside the band. The school's staff may interpret the percentage points as evidence that they are doing better than other schools and thus do not need significant improvement. The parents might see these percentage differences as evidence of an even larger gap between current proficiency and proficiency on the part of ALL students in the school. Conversely, a very interesting relationship may sometimes be missed if it fails to achieve statistical significance and there are no complementary observations to flesh out this subtle relationship. Without a very strong relationship, a sufficiently large sample, or complementary observations, chance is hard to rule out.

In terms of Rhode Island achievement results, for example, certain schools show disaggregations of student achievement results among groups of students (whites versus other groups, LEP versus non-LEP) which are not statistically significant due solely to the fact that the sample does not include a sufficient number of individuals to achieve statistical significance. Conversely, other schools show "gaps" which are statistically significant, but the 2-3 point difference may be educationally unimportant given the possibility of variation in achievement test scores from one day to the next among the same (or similar) group of students. In other words, there is a natural fluctuation in individual (and sometimes whole class) performance due to other factors.
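The role of sample size can be sketched with a simple calculation. The Python example below is a rough illustration under stated assumptions, not the model behind the school reports: it assumes two equal-sized groups, a 3-point gap in average scores, a score spread (standard deviation) of 10 points in each group, and a normal approximation. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(gap, sd, n_per_group):
    """Approximate chance of seeing a gap at least this large between two
    group averages when the groups do not really differ.

    Assumes equal group sizes, the same spread (sd) in both groups, and a
    normal approximation. These are illustrative simplifications.
    """
    standard_error = sd * sqrt(2 / n_per_group)
    z = gap / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same hypothetical 3-point gap, observed with different group sizes.
for n in (20, 200, 2000):
    p = two_sided_p_value(gap=3.0, sd=10.0, n_per_group=n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{n:5d} students per group: p = {p:.4f} ({verdict})")
```

Under these assumptions the identical 3-point gap is not statistically significant with 20 students per group but is significant with 200 or 2,000. The size of the gap, and therefore its educational importance, is the same in every case; only the ability to rule out chance changes.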

Diagram "A"

The simplest kind of visual description of a relationship between two variables is a straight line. Imagine, if you will, making a scatter plot of a whole set of spending-per-student data and then drawing a straight line that comes as close as possible to all the points in the scatter plot. (See Diagram "A".)
We call this procedure "regression," the resulting line the "regression line," and the formula that describes the line the "regression equation." The word "regression" originated in Francis Galton's work in the late 1800s, when he realized that for many relationships there was "regression" (reversion) toward what he termed "mediocrity." We now call this frequently seen statistical phenomenon "regression toward the mean." Human height data, for example, show that if two parents both have above-average heights, their children tend to be taller than average but, on average, closer to the average than their parents.
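Regression toward the mean can be illustrated with a small simulation. The Python sketch below is a toy model, not real height data: it assumes that a parent's height (standing in for the average of the two parents) and a child's height each consist of a shared family component plus an independent individual component. The average height of 68 inches and both spreads of 1.8 inches are made-up values chosen only for illustration.

```python
import random


def simulate_families(n=20000, mean=68.0, shared_sd=1.8,
                      individual_sd=1.8, seed=1):
    """Return (parent_height, child_height) pairs from a toy model in which
    parent and child share a family component and each adds an
    independent individual component."""
    rng = random.Random(seed)
    families = []
    for _ in range(n):
        shared = rng.gauss(0, shared_sd)
        parent = mean + shared + rng.gauss(0, individual_sd)
        child = mean + shared + rng.gauss(0, individual_sd)
        families.append((parent, child))
    return families


population_mean = 68.0
families = simulate_families(mean=population_mean)

# Keep only the families whose parent height is above average.
tall = [(p, c) for p, c in families if p > population_mean]
avg_parent = sum(p for p, _ in tall) / len(tall)
avg_child = sum(c for _, c in tall) / len(tall)
share_children_above = sum(1 for _, c in tall if c > population_mean) / len(tall)

print(f"average height of above-average parents: {avg_parent:.2f}")
print(f"average height of their children:        {avg_child:.2f}")
print(f"share of those children above average:   {share_children_above:.2f}")
```

In this toy model the children of above-average parents are, on average, still above the overall average, but they sit closer to it than their parents do. That pull back toward the average is what the term "regression toward the mean" describes.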

Diagram "B"

Imagine, if you will, plotting achievement scores for grade eight students on a particular achievement test. The vertical axis can be achievement scores recorded as a number. The horizontal axis can be the education level of the child's mother as reported by the child, also expressed as numbers assigned to each level. (Of course, this axis could be any other variable for which you have consistent data that you believe to be reliable.) The question is, then, where is the best straight line that relates these two variables (achievement score and mother's education level) to each other? You could take a ruler and try to fit a line through the scatter plot. However, different people would draw different lines, based on their best visual guess as to which line is closest to most of the points. To find the one line out of the infinite possibilities that is as close as mathematically possible to all of the points, statisticians commonly use a procedure called "least squares"; the line it produces is the "least squares line." (See Diagram "B".)
To determine the least squares line, closeness is measured along the vertical axis (in this case achievement scores): for each point, the vertical distance between the point and the line is calculated. Those distances are then squared and added up for all of the points in the sample. For the least squares line, that sum is smaller than it would be for any other line. The vertical distances are chosen because the equation is often used to predict the variable on the vertical axis when the one on the horizontal axis (mother's education level) is known.

Every non-vertical straight line, including the least squares line, can be expressed by the same formula. The standard mathematical convention is to write the equation for the line relating the two variables as y = a + bx, where y represents the variable on the vertical axis (achievement scores in our example), x represents the variable on the horizontal axis (education level of the mother in this example), and a and b are two constants specific to this particular regression line. The number a is called the intercept and the number b is called the slope. The intercept is the value of y at the point where the line crosses the vertical axis, that is, where x is zero. The slope describes how much the variable on the vertical axis (here achievement scores) changes when the variable on the horizontal axis (education level of the mother) increases by one unit. A positive slope indicates an increase in one variable as the other increases; a negative slope indicates a decrease. Thus, for example, across the overall data set of all RI schools, as a school's population becomes poorer (e.g., an increase in the number of students eligible for free and reduced lunch), achievement tends to decline (decrease in scores), which corresponds to a negative slope.
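For readers who want the calculation itself, the least squares values of a and b have a standard closed form. The notation below is introduced here for illustration and does not appear in the original brief: the data points are written (x_i, y_i) for i = 1 to n, and x-bar and y-bar denote the averages of the x and y values.

```latex
\[
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
\]
```

As a worked example with hypothetical values a = 45.8 and b = 3.6, the line predicts a score of 45.8 + 3.6(2) = 53.0 for education level 2 and 45.8 + 3.6(3) = 56.6 for level 3. The difference between the two predictions, 3.6 points, is exactly the slope.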

For further information call the Rhode Island Department of Education
at 401-222-4600 x2231.