Psychological - W3 - Chapter 8 - Test Development - DN Part 1
This deck covers key concepts and terminology from Chapter 8 of the Psychological W3 course, focusing on test development processes and methodologies.
| Term | Definition |
|---|---|
anchor protocol | a test answer sheet developed by a test publisher to test the accuracy of examiners’ scoring p.280 |
biased test item | an item that favours one group in relation to another when differences in group ability are controlled p.271 |
binary-choice item | a multiple-choice item that contains only two possible responses (e.g., true-false) p.254 |
categorical scaling | a system of scaling in which stimuli are placed into one of two or more alternative categories that differ quantitatively with respect to some continuum p.249 |
categorical scoring | a method of evaluation in which test responses earn credit toward placement in a particular class or category; sometimes testtakers must meet a set number of responses corresponding to a particular criterion to be placed in a specific category; also called class scoring; contrast with cumulative scoring and ipsative scoring p.260 |
ceiling effect | diminished utility of a tool of assessment in distinguishing testtakers at the high end of the ability, trait, or other attribute being measured p. 259, 307 |
class scoring | a method of evaluation in which test responses earn credit toward placement in a particular class or category; sometimes testtakers must meet a set number of responses corresponding to a particular criterion to be placed in a specific category; also called categorical scoring; contrast with cumulative scoring and ipsative scoring p.260 |
comparative scaling | in test development, a method of developing ordinal scales through the use of a sorting task that entails judging a stimulus in comparison with every other stimulus used on the test p.249 |
completion item | a test item that requires an examinee to provide a word or phrase that completes a sentence p. 254 |
computerized adaptive testing (CAT) | an interactive, computer-administered testtaking process in which the items presented to the testtaker are based in part on the testtaker's performance on previous items p.15, 255-256 |
co-norming | the test norming process conducted on two or more tests using the same sample of testtakers; when used to validate all of the tests being normed, this process may also be referred to as co-validation p.138n4, 278 |
constructed-response format | a form of test item requiring a testtaker to construct or create a response, as opposed to simply selecting a response; contrast with selected-response format p.252 |
co-validation | the name that may be given to co-norming when it is used to validate all of the tests being normed p.278 |
cross-validation | a revalidation on a sample of testtakers other than the testtakers on whom test performance was originally found to be a valid predictor of some criterion p.278 |
essay item | a test item that requires a testtaker to write a composition, typically one that demonstrates recall of facts, understanding, analysis, and/or interpretation p.255 |
expert panel | in the test development process, a group of people knowledgeable about the subject matter being tested and/or the population for whom the test is being designed, who can provide input to improve the test's content, fairness, etc. p.274-275 |
floor effect | a phenomenon arising from the diminished utility of a tool of assessment in distinguishing testtakers at the low end of the ability, trait, or other attribute being measured p. 256-259 |
giveaway item | a test item, usually near the beginning of a test of ability or achievement, designed to be relatively easy, usually for the purpose of building the testtaker's confidence or reducing test-related anxiety p.263n4 |
What three criteria must be met when correcting for the impact of guessing? | (1) the correction must recognize that guesses are not normally totally random; (2) it must deal with the problem of omitted items; (3) it must allow for the fact that some testtakers are lucky in their guesses and others unlucky p.269-271 |
Guttman scale | a scale in which items range sequentially from weaker to stronger expressions of the attitude or belief being measured; constructed so that agreement with a stronger item presumes agreement with all of the milder items that precede it; named after its developer p.249 |
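
The cumulative ordering behind a Guttman scale can be checked mechanically. The sketch below is only an illustration and does not come from the textbook: it assumes responses are coded 1 (agree) or 0 (disagree) and are listed from the weakest item to the strongest, and the function name `is_guttman_pattern` is hypothetical.

```python
def is_guttman_pattern(responses):
    """Return True if a 0/1 response list fits the cumulative Guttman pattern.

    Assumption: responses are ordered from the weakest to the strongest item,
    so a consistent respondent agrees with a run of milder items and then
    disagrees with every stronger item after that point.
    """
    seen_disagreement = False
    for response in responses:
        if response not in (0, 1):
            raise ValueError("responses must be coded 0 or 1")
        if response == 0:
            seen_disagreement = True
        elif seen_disagreement:
            # Agreement with a stronger item after rejecting a milder one
            # breaks the cumulative ordering.
            return False
    return True
```

Under these assumptions, `is_guttman_pattern([1, 1, 0, 0])` returns `True`, while `is_guttman_pattern([1, 0, 1, 0])` returns `False` because the respondent endorses a stronger item after rejecting a milder one.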
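
For the card on correcting for guessing, one commonly taught correction is the classic formula score, R - W/(k - 1), where R is the number right, W the number wrong, and k the number of options per item. It is not given in this deck, so treat the sketch below as a supplementary assumption; the function name `formula_score` is hypothetical.

```python
def formula_score(num_right, num_wrong, options_per_item):
    """Classic correction-for-guessing formula score: R - W / (k - 1).

    Assumptions (not taken from the deck): every item has the same number
    of options, wrong answers result from purely random guessing, and
    omitted items are counted as neither right nor wrong.
    """
    if options_per_item < 2:
        raise ValueError("an item needs at least two options")
    return num_right - num_wrong / (options_per_item - 1)
```

With 40 right and 12 wrong on four-option items, `formula_score(40, 12, 4)` gives 36.0. The formula treats every guess as purely random, leaves omitted items out of the penalty, and cannot separate lucky from unlucky guessers, which are the concerns behind the three criteria listed in the card.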