Date of Award


Document Type


Degree Name

Master of Science (MS)


School of Teaching and Learning

First Advisor

May Jadallah


The main goal of science education has been achieving scientific literacy. However, this has been no easy task, considering that scientific literacy has many definitions that involve a plethora of activities. Assessing it therefore becomes quite challenging, especially when attempted with a single overarching instrument. Fortunately, Shamos (1995) has organized the many dimensions of scientific literacy into three levels. These dimensions can then be assessed individually, making the task of assessment less overwhelming. The highest level, true scientific literacy, contains the dimensions discussed in this study, each of which already has an individual assessment. Wenning's Nature of Science Literacy Test (2006) assesses the dimension of having a proper understanding of the nature of science. His Scientific Inquiry Literacy Test (2007) assesses the dimension of understanding the scientific processes of knowledge development. The Lawson Classroom Test of Formal Reasoning (1978, 2000) and the Inventory for Scientific Thinking and Reasoning (iSTAR) Assessment (2013) assess the dimension of using logic for induction and deduction, or what can be referred to as scientific reasoning.

The Lawson test and iSTAR assessment were designed to assess six and eight mostly overlapping reasoning dimensions, respectively. Judged against a framework developed by Wenning and Vieyra (2015), six to eight reasoning dimensions may not be enough to comprehensively assess scientific reasoning. These authors include 31 scientific reasoning skills in their framework, organized into six defined categories based on intellectual sophistication. This study was designed to create a test that addresses these 31 skills in order to assess high school students more comprehensively and systematically.

The final iteration of the test assessed 26 of the 31 skills, drawn from five of the six defined categories of intellectual sophistication. Before the final iteration came to fruition, a bank of test questions and the framework were reviewed by five experts. Following the changes made because of this review, a pilot test of 33 questions was administered to high school students in central Illinois. The statistical analysis of this pilot test showed that the test had a mean score percentage well below the ideal 50% and a KR-20 value considerably lower than the benchmark of .80. To improve the performance of the test and move these statistical values to acceptable levels, seven questions were eliminated and 12 questions were replaced or revised. These questions were chosen primarily because their item difficulty indices fell outside the acceptable .40 to .60 range and their point-biserial discrimination indices fell below the desirable .20 value. A second test of 26 questions reflecting these changes was administered to different high school students in central Illinois. The end result was a test that had a mean score percentage relatively close to the ideal 50% and a KR-20 value higher than the benchmark of .80. Through the expert review and two rounds of testing needed to reach acceptable statistical values, a valid and reliable scientific reasoning test for high school students was created, one that addresses skills above and beyond the dimensions of the Lawson test and iSTAR.
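
For readers unfamiliar with the item statistics named above, the following is a minimal sketch of how item difficulty, KR-20, and the point-biserial discrimination index are conventionally computed from a matrix of dichotomously scored (1 = correct, 0 = incorrect) responses. It is not the analysis procedure used in the thesis. The function names, the choice of population variance, and the use of an uncorrected point-biserial (the item is left in the total score) are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def item_difficulty(responses):
    """Proportion of examinees who answered each item correctly.

    `responses` is a list of rows, one per examinee, each a list of 0/1 scores.
    """
    n_items = len(responses[0])
    return [mean(row[i] for row in responses) for i in range(n_items)]


def kr20(responses):
    """Kuder-Richardson Formula 20 reliability estimate.

    Assumption: population variance of total scores is used; some texts use
    the sample variance instead, which changes the value slightly.
    Raises ZeroDivisionError if every examinee has the same total score.
    """
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    pq_sum = sum(p * (1 - p) for p in item_difficulty(responses))
    return (k / (k - 1)) * (1 - pq_sum / var_total)


def point_biserial(responses, item):
    """Uncorrected point-biserial correlation between one item and total score.

    Assumption: the item itself is included in the total. Raises
    statistics.StatisticsError if everyone (or no one) answered the item
    correctly, because one of the two group means is then undefined.
    """
    totals = [sum(row) for row in responses]
    scores = [row[item] for row in responses]
    p = mean(scores)
    sd_total = pvariance(totals) ** 0.5
    mean_correct = mean(t for t, s in zip(totals, scores) if s == 1)
    mean_incorrect = mean(t for t, s in zip(totals, scores) if s == 0)
    return (mean_correct - mean_incorrect) / sd_total * (p * (1 - p)) ** 0.5
```

Under these assumptions, an item would be flagged for revision when its value from `item_difficulty` falls outside .40 to .60 or its `point_biserial` value falls below .20, and the test as a whole would be compared against the .80 benchmark using `kr20`.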


Imported from ProQuest Hanson_ilstu_0092N_10741.pdf


Page Count