3.5 Example: Review quiz statistics
Here are some results from a case-based quiz in a Dentistry course.
If you were helping a colleague interpret these quiz statistics, what would you suggest? Which question would get the highest priority for further investigation if you could pick only one?
Initial analysis
Top question ("Based on the 2018 Periodontal Classification, what Generalized Stage would you assign?")
With such a difficult question, the reliability needs to be higher to justify its appropriateness. A third of the class thinks the correct answer is Stage 2 (which is enough of a concern in its own right), and the Discrimination Index (DI) suggests this group includes top performers. To investigate further, we need to download the reports.
Bottom question ("What is the Grade diagnosis based on the 2018 classification?")
If students get the question 100% right, the DI is 0. Thus, the negative DI isn’t that concerning with an “easy” question where 96% of students got it right. The real question for the instructor is: "Should this question be more difficult?" To answer that, we need to know more about the assessment holistically.
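The point about a DI of 0 can be checked with a small sketch. It assumes the classic upper-lower definition of a discrimination index (share correct in the top group minus share correct in the bottom group); the statistic Canvas reports may be computed differently. The function name is hypothetical, the group sizes of 35 and 23 are the top and bottom student counts from the Item Analysis excerpt in this example, and the numbers of correct answers per group are made up for illustration.

```python
def discrimination_index(top_correct, top_total, bottom_correct, bottom_total):
    """Upper-lower DI (assumed definition): share correct in the top group
    minus share correct in the bottom group."""
    return top_correct / top_total - bottom_correct / bottom_total


# Everyone answers correctly: both groups sit at 100%, so the index is 0.
everyone_right = discrimination_index(35, 35, 23, 23)
print(everyone_right)

# Hypothetical easy question where the single miss is in the top group:
# the index dips below 0 even though almost everyone was right.
one_top_miss = discrimination_index(34, 35, 23, 23)
print(one_top_miss)
```

With an easy question, one or two misses in the top group are enough to push the index slightly negative, which is why the sign alone is not alarming here.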
Secondary analysis
Excerpt from the Student Analysis Report
When you download the spreadsheet, the final score appears in a column at the very end, and each question has its own column labeled with the number Canvas assigned to that question. Here we’ve pulled the two question columns of interest and removed a number of student rows from the middle for ease of analysis. Scores in this excerpt are also sorted from highest to lowest.
Student analysis
| SCORE | 1337718: …Generalized… | 13377189: ...Grade… |
| --- | --- | --- |
| 8.75 | Stage 2 | Grade B |
| 7.75 | Stage 2 | Grade B |
| 7.75 | Inflammation… | Grade B |
| 7.75 | Inflammation… | Grade B |
| 7.75 | Inflammation… | Grade B |
| 7.75 | Inflammation… | Grade B |
| … | … | |
| 3.5 | Stage 3 | Grade B |
| 0 | Stage 2 | Grade B |
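As a rough sketch of how the excerpt above could be examined once the columns are pulled out, the snippet below re-enters only the visible rows by hand (the elided middle rows are left out), sorts them by score, and tallies the answers to the Generalized Stage question. The variable names are hypothetical; with the real download you would read the rows from the spreadsheet instead.

```python
from collections import Counter

# Visible rows from the excerpt: (score, Generalized Stage answer, Grade answer).
# The rows elided from the middle of the excerpt are not included.
rows = [
    (8.75, "Stage 2", "Grade B"),
    (7.75, "Stage 2", "Grade B"),
    (7.75, "Inflammation…", "Grade B"),
    (7.75, "Inflammation…", "Grade B"),
    (7.75, "Inflammation…", "Grade B"),
    (7.75, "Inflammation…", "Grade B"),
    (3.5, "Stage 3", "Grade B"),
    (0.0, "Stage 2", "Grade B"),
]

# Sort from highest to lowest score, as in the excerpt.
rows.sort(key=lambda row: row[0], reverse=True)

# Answers given by the two highest-scoring students.
top_two_answers = [stage for _, stage, _ in rows[:2]]

# How often each Stage answer appears among the visible rows.
stage_counts = Counter(stage for _, stage, _ in rows)

print(top_two_answers)
print(stage_counts)
```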
Excerpts from the Item Analysis Report
Item Analysis - Reliability metrics
| Top student count | Middle student count | Bottom student count | Alpha |
| --- | --- | --- | --- |
| 35 | 34 | 23 | 0.540865 |
The middle group rarely makes up a true 46% of the class because the scores aren’t normally distributed; here it is 34 of 92 students, or about 37%. This can factor into decisions about acceptable reliability, which is why it’s important to check.
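The check described above is simple arithmetic on the three group counts. A minimal sketch, with hypothetical variable names:

```python
# Group sizes from the reliability metrics above.
counts = {"top": 35, "middle": 34, "bottom": 23}

total = sum(counts.values())

# Each group's share of the class, in percent, rounded to one decimal place.
shares = {group: round(100 * n / total, 1) for group, n in counts.items()}

print(total)
print(shares)
```

The middle share can then be compared against the 46% figure to see how far the actual grouping departs from it.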
Item Analysis: Question difficulty
| Question | Difficulty |
| --- | --- |
| 13377182 | 0.945652 |
| 13377183 | 0.195652 |
| 13377184 | 0.26087 |
| 13377188 | 0.576087 |
| 13377189 | 0.956522 |
| 13377190 | 0.336957 |
Three questions (13377185 through 13377187) do not have difficulty scores in the Item Analysis report because they were multiple-select questions (i.e., "check all that apply"). Their Difficulty is shown on the Quiz statistics page (it ranged from 1% to 9%, so almost all students checked an additional answer choice when they shouldn’t have), but it doesn’t carry over into the report.
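To make the Difficulty values easier to read, they can be converted into head counts. The sketch below assumes Difficulty is the proportion of students who answered correctly (consistent with the 96% figure quoted for the Grade question) and takes the class size of 92 from the sum of the top, middle, and bottom student counts. Variable names are hypothetical.

```python
TOTAL_STUDENTS = 35 + 34 + 23  # top + middle + bottom student counts

# Difficulty values from the Item Analysis excerpt.
difficulty = {
    "13377182": 0.945652,
    "13377183": 0.195652,
    "13377184": 0.26087,
    "13377188": 0.576087,
    "13377189": 0.956522,
    "13377190": 0.336957,
}

# Assumption: Difficulty = proportion of students answering correctly.
correct_counts = {q: round(p * TOTAL_STUDENTS) for q, p in difficulty.items()}
missed_counts = {q: TOTAL_STUDENTS - n for q, n in correct_counts.items()}

for question in difficulty:
    print(question, correct_counts[question], missed_counts[question])
```

Under that assumption, question 13377189 was missed by 4 students, and the hardest questions were answered correctly by only 18 to 31 of the 92.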
Conclusions
Top question ("Based on the 2018 Periodontal Classification, what Generalized Stage would you assign?")
When we look at the Student Analysis Report, the top 2 students chose Stage 2. The next action step should therefore be to determine what made Stage 2 so attractive to a third of the class, and the top performers might be a good place to start this dialogue. It should also be noted that there were only 9 questions in total on this quiz, which makes it extremely difficult to achieve high reliability for the assessment (Cronbach’s alpha above 0.7) and for each question (DI > 0.24).
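To illustrate why 9 questions make high reliability hard to reach, the sketch below applies the Spearman-Brown prophecy formula to the reported alpha of 0.540865. This is only a rough estimate, not part of the Canvas reports: it assumes any added questions would behave like the existing ones, and the function name is hypothetical.

```python
import math


def lengthening_factor(current_reliability, target_reliability):
    """Spearman-Brown: factor by which the number of comparable items
    must grow to move from the current to the target reliability."""
    return (target_reliability * (1 - current_reliability)) / (
        current_reliability * (1 - target_reliability)
    )


factor = lengthening_factor(0.540865, 0.7)

# The quiz has 9 questions; round up to a whole number of questions.
questions_needed = math.ceil(9 * factor)

print(round(factor, 2), questions_needed)
```

Under these assumptions, the quiz would need to be roughly twice as long, about 18 comparable questions, before alpha could be expected to reach 0.7.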
Curious to know what students said?
One student explained their rationale for Stage 2 as follows: the criteria presented for Inflammation include >10% BOP (which wasn’t stated in the case description) and CAL ≤ 3. They could see a generalized diagnosis of healthy, but that wasn’t a choice, so they based their choice on the CAL data of 79% 3-4 (noting that 4 is above the threshold for the Inflammation choice).
Curious to know what action was taken?
The instructor awarded credit for this justification since it’s quiz 1 and a CAL-based assessment was reasonable this early on, without a BOP provided or a choice of healthy on a reduced periodontium after successful treatment. The case was updated with an overall BOP for next time to make the correct answer more compelling.
Bottom question ("What is the Grade diagnosis based on the 2018 classification?")
In this case, it’s the first quiz, and the instructor has a range of question difficulties to help gauge each student’s readiness. There are additional questions on Grade diagnosis that are more challenging, and she’s happy to have identified 4 students who need remediation on the topic. She can now reach out to them and suggest an office hours visit. Moreover, several questions are arguably too hard, with only 20-30% of students getting them correct (as shown in the item analysis).