
Psychological - W3 - Chapter 8 - Test Development - DN Part 2

Education · 20 cards · Created 25 days ago

This deck covers key concepts and terms related to test development, including scoring methods, item analysis, and various item formats.



Key Terms

ipsative scoring

approach to scoring & interpretation

responses & presumed strength of measured trait are interpreted relative to the measured strength of other traits for that testtaker

contrast with class scoring & cumulative scoring

p.260

item analysis

general term used to describe various procedures

usually statistical, designed to explore how individual items work compared to others in the test & in the context of the whole test

e.g., to explore the level of difficulty of individual items on an achievement test

e.g., to explore the reliability of a personality test

contrast with qualitative item analysis

p.262-275

item bank

a collection of questions to be used in the construction of a test

p. 255, 257-259, 282-284

item branching

in computerised adaptive testing (CAT)

the individualised presentation of test items drawn from an item bank, based on the testtaker's previous responses

p.260
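As a rough illustration (not from the text), item branching can be sketched as a rule that draws the next item from the bank depending on whether the previous response was correct. The item bank, difficulty scale (higher = harder here), and step size below are all invented for demonstration:

```python
# Hypothetical sketch of item branching in CAT: answer correctly and the
# next item drawn from the bank is harder; answer incorrectly and it is
# easier. Difficulties and the branching rule are illustrative only.

def next_item(bank, current_difficulty, was_correct, step=0.1):
    """Pick the unused item whose difficulty is closest to the new target."""
    target = current_difficulty + (step if was_correct else -step)
    return min(bank, key=lambda item: abs(item["difficulty"] - target))

bank = [{"id": i, "difficulty": d} for i, d in enumerate([0.2, 0.4, 0.5, 0.6, 0.8])]

# Testtaker answers an item of difficulty 0.5 correctly -> branch to a harder item.
print(next_item(bank, 0.5, was_correct=True)["difficulty"])  # 0.6
```

A real CAT engine would also track which items have already been administered and update an ability estimate, but the branching idea is the same.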

item-characteristic curve (ICC)

graphic representation of the probabilistic relationship between a person's level of the trait (ability, characteristic) being measured and the probability of responding to an item in a predicted way

also known as a category response curve or an item trace line

p.177, 268, 281
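One common way to model an ICC (an assumption for illustration; the deck itself doesn't specify a model) is a logistic function of trait level, where a is the item's discrimination and b its location on the trait scale. The parameter values below are made up:

```python
import math

# Illustrative two-parameter logistic ICC: probability of the predicted
# response rises with trait level theta. Parameters a (discrimination)
# and b (location/difficulty) are invented for demonstration.

def icc(theta, a=1.0, b=0.0):
    """Probability of the predicted response at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

print(round(icc(0.0), 2))    # 0.5  -- at the item's location, p = .50
print(icc(2.0) > icc(-2.0))  # True -- probability rises with trait level
```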

item-difficulty index

items cannot be too easy or too hard if they are to differentiate between testtakers' knowledge of the subject matter

a statistic obtained by calculating the proportion of the total number of testtakers who answered an item correctly

p is used to denote item difficulty

a subscript denotes the item number, e.g., p1 for item 1

can range from 0-1

the larger the item-difficulty index, the easier the item

(i.e., the higher the p, the easier the item, because p represents the proportion of testtakers passing the item)

p.263-264
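The calculation above is a simple proportion; a minimal sketch (with invented response data):

```python
# Minimal sketch: the item-difficulty index p is the proportion of
# testtakers who answered the item correctly. Response data are invented.

def item_difficulty(responses):
    """responses: list of 1 (correct) / 0 (incorrect) for one item."""
    return sum(responses) / len(responses)

item1 = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]  # 7 of 10 testtakers correct
p1 = item_difficulty(item1)
print(p1)  # 0.7  -- the higher p, the easier the item
```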

item-discrimination index

measure of item discrimination

symbolised by d

p.264-268
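The deck only names the symbol d, but one widely used extreme-groups version (an assumption here) compares how often the highest- and lowest-scoring groups answer the item correctly. Data are invented:

```python
# Hedged sketch of a common extreme-groups formula: d = (U - L) / n,
# where U and L are the numbers of testtakers in the upper- and
# lower-scoring groups who answered the item correctly, and n is the
# size of each group. Values below are invented.

def item_discrimination(upper_correct, lower_correct, group_size):
    return (upper_correct - lower_correct) / group_size

d = item_discrimination(upper_correct=9, lower_correct=3, group_size=10)
print(d)  # 0.6  -- higher d: the item separates high from low scorers
```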

item-endorsement index

the name given to the item-difficulty index (which is used in achievement testing) when used in other contexts (e.g., personality testing)

p. 263

item fairness

a reference to the degree of bias, if any, in a test item

p. 271-272

item format

a reference to the form, plan, structure, arrangement, or layout of individual test items

including whether the test items require testtakers to select or create a response

p.252-255

item pool

the reservoir or well from which items will or will not be drawn for the final version of the test

the collection of items to be further evaluated for possible selection for use in an item bank

p.251

item-reliability index

provides an indication of the internal consistency of a test

the higher the index, the greater the internal consistency

index is equal to

the product of the item-score standard deviation (s) and

the correlation (r) between the item score and the total test score

p.264
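The product defined above can be computed directly. For a dichotomous item the item-score standard deviation is s = √(p(1 − p)); the correlation value below is invented:

```python
import math

# Sketch of the item-reliability index as defined above: item-score
# standard deviation times the item-total correlation. For a dichotomous
# item, s = sqrt(p * (1 - p)). The correlation value is invented.

def item_reliability_index(p, r_item_total):
    s = math.sqrt(p * (1 - p))
    return s * r_item_total

print(round(item_reliability_index(p=0.5, r_item_total=0.60), 2))  # 0.3
```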

item-validity index

a statistic designed to provide an indication of the degree to which a test is measuring what it purports to measure

important when a test developer's goal is to maximise the criterion-related validity of a test

the higher the item-validity index, the greater the test's criterion-related validity

to calculate we must first know

the item-score standard deviation (symbolised as s1, s2, s3 etc.)

and the correlation between the item score and the criterion score

then we use the item difficulty index p1 in the following formula

s1 = √(p1(1 − p1))

the correlation between the score on item 1 and a score on a criterion measure (r1c) is multiplied by item 1's item-score standard deviation (s1)

the product is an index of an item's validity (s1 × r1c)

p.264
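The steps above translate directly into a short calculation; the values of p1 and r1c below are invented for illustration:

```python
import math

# Sketch following the formula above: s1 = sqrt(p1 * (1 - p1)), and the
# item-validity index is s1 * r1c, where r1c is the correlation between
# the item score and the criterion score. Values are invented.

p1 = 0.7                        # item-difficulty index for item 1
s1 = math.sqrt(p1 * (1 - p1))   # item-score standard deviation
r1c = 0.50                      # item-criterion correlation (assumed)
print(round(s1 * r1c, 3))       # 0.229 -- item-validity index for item 1
```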

Likert scale

summative rating scale with 5 alternative responses

ranging on a continuum from e.g., "strongly agree" to "strongly disagree"

p.247

matching item

the testtaker is presented with two columns

premises on the left & responses on the right

task is to determine which response is best matched to which premise

young testtakers (draw a line)

others typically asked to write a letter/number as a response

p.253

method of paired comparisons

a scaling method

a pair of stimuli (e.g., photos) is presented, and the testtaker selects one according to a rule

(e.g., "select the one that is more appealing")

p.248

multiple-choice format

one of the three types of selected-response item formats

three elements

a stem

a correct alternative or option

and several incorrect alternatives (referred to as distractors or foils)

p.252

pilot work

also referred to as pilot study & pilot research

preliminary research surrounding the creation of a prototype test

general objective is to determine how best to

gauge

assess, or

evaluate the targeted construct(s)

p.243-244

qualitative item analysis

non-statistical procedures designed to explore how individual test items work

both compared to other items in the test & in the context of the whole test

unlike statistical measures, they involve exploration of the issues by verbal means

(e.g., interviews & group discussions with testtakers & other relevant parties)

p.272-275

qualitative methods

techniques of data generation & analysis

rely primarily on verbal rather than mathematical or statistical procedures

p.272