Psychological - Lecture 2 - Dr Greg Yelland (DN) (incomplete)
Validity refers to how well a test measures what it claims to measure. It’s crucial because it affects the accuracy of the conclusions drawn and the decisions made based on the test results. Without validity, test results can be misleading or misused.
Key Terms
| Term | Definition |
|---|---|
| validity | How well a test measures what it purports to measure; important implications for the appropriateness of inferences made, and actions taken, on the basis of measurements |
| precision | Sensitivity & specificity; there is always a compromise between sensitivity & specificity. Usual screening process: use a sensitive test first, then a highly specific test to determine who actually has dementia (3:00) |
| accuracy | The test needs to be accurate (6:30) |
| reliability | Stability of measurement; measurement is stable over time & within itself (7:20) |
| What are the three components of reliability? | 1) Inter-rater reliability: more to do with scoring than the nature of the test. 2) Test-retest reliability: should get the same score when doing the same test twice. 3) Internal consistency: within the test, people should score consistently; items should be equally good at measuring what they are trying to measure (7:50) |
| What is test reliability? | This is not scorer reliability. Test-retest: stability over time. Internal consistency: homogeneous, i.e. all items test just one factor (e.g. anxiety) and should be equally good at assessing that factor. Need to be aware of how many factors/behaviours a test is measuring; if it is intended to measure one factor, it should only measure one (10:00) |
| What is reliability? | The proportion of total variance (σ²) made up of true variance (σ²_true). Variability in test scores: σ² = σ²_true + σ²_error. A test score is always made up of true score + error: X = T + E. Error is made up of random error & systematic error |
| Whenever we are talking about reliability & validity, we are talking about… | Correlation, or correlation coefficients, i.e. how well things are correlated on different aspects. E.g. test-retest (the correlation between the first & second time the test is taken); internal consistency (the correlation between different items on the test) (15:30) |
| What are some sources of error variance? | Test construction; test administration (environment, test-taker variables, examiner-related variables); test scoring/interpretation. Each can contain both random & systematic error (16:20) |
| What is the difference between systematic & random error variance? | Systematic: a constant, or proportionate, source of error in variables other than the target variable; should not affect variance in scores. Random: caused by unpredictable fluctuations & inconsistencies in variables other than the target variable. Systematic changes should not affect the scores; unpredictable changes will affect the correlation. The more robust the test is to fluctuation, the greater the reliability |
| How does error occur in test construction? | Through the way you select or sample test items, and whether all items consistently perform in the same way (the way you intended). Systematic error could come from an ambiguous question: some people may respond one way and others another. Random error: there may be one or two questions where someone does not have enough experience to give the standard response to the item (17:00) |
| How can error occur during test administration? | Environmental variables; test-taker variables; examiner-related variables |
| How do test-takers contribute to error? | Test-taker variables during test administration: differences between the people taking the tests. Systematic: e.g. different ages without taking age into account. Random: age, personality, etc. Issue: you don't necessarily want to minimise this by only testing 10-year-olds, because then the test is only relevant to 10-year-olds. Solution: test 10-, 11-, 12-year-olds etc., then create norms for the different ages (age norms); this takes care of the variable by having different normative data for each age (20:00) |
| How does the test environment contribute to error? | During test administration: one person may be tested in a noisy environment, another in a quiet one; testing in a group vs individually also affects test scores |
| How can examiners contribute to error? | During test administration: examiner humanness, e.g. may be exhausted by the last test and skip bits to hurry it up |
| How can test scoring/interpretation contribute to error? | Subjectively scored tests have greater error (because they rely on subjective judgements). Scoring is moving toward computer-based methods to remove this source of error, but this is not possible when it is the quality of the response that is scored (qualitative). There is much more error on qualitative than quantitative scoring (22:35) |
| What should we aim for with regard to error & reliability? | Aim to remove systematic error and minimise random error, so we get better reliability (24:35) |
| What are some reliability estimates? | Test-retest; parallel forms/alternate forms (24:50) |
| What is a test-retest reliability estimate? | The same test taken twice; then see how well the scores are correlated (24:45). Issue: how long an interval between testings? The shorter the interval, the higher the test-retest reliability, because many things can change in an individual over time. Systematic changes should not affect test-retest reliability, e.g. hot room vs cold room (everyone affected equally) (26:50); random changes will affect the correlation (27:15). The more robust the test is to fluctuation, the more reliable it is, e.g. a test not affected by time of day or amount of sleep is robust enough to wash those effects out (28:30). Participant factors will affect test-retest reliability: experience, practice, fatigue, memory, motivation, morningness/eveningness; as everyone differs in these areas, there is greater error variance. Practice effects give the test-taker a clue about what will happen next time they do the same test; this may mean that we cannot use test-retest |
| When would we use parallel or alternate forms of a test? | When we cannot use test-retest reliability, e.g. due to practice effects giving the test-taker a clue about what will be on the test next time |
| What is a parallel forms or alternate forms reliability estimate? | Parallel forms are better developed: items have been selected so that the mean & variance have been shown to be equal. Alternate forms are similar, but with no guarantee that the variance is the same (hence a source of error is introduced). Testing is a similar process to test-retest: do one test, then do the parallel or alternate form. Problem: test sampling issues (choice of items); the best of the available items usually go into the original test (unless both tests are created at the same time) (30:50) |
| What is one of the biggest problems faced when using a parallel form or alternate form of a test? | Test sampling issues (choice of items): the best items are usually used when creating the initial version of the test (unless both tests are created at the same time). Also, identifying the source of error: is it because the score is not stable over time (external), or because the different items (content) of the two tests are introducing error (internal consistency across the two tests)? (33:50) |
| Internal consistency (reliability) | Split-half testing: split the test into two halves and obtain the correlation coefficient between them |
| What is the point of split-half testing? | To obtain the internal consistency of the full version via the Spearman-Brown formula, which estimates the internal consistency of a test that is twice the length |
| When is the Spearman-Brown formula used? | To obtain the internal consistency of the full version of a split-half test; it estimates the internal consistency of a test twice the length. Not used when there is more than one factor (heterogeneity); not appropriate for speed tests. Must have homogeneity when using the split-half method, because you could otherwise end up with an imbalanced distribution of the factors across the two halves. Spearman-Brown split-half coefficient: r_SB = 2r_hh / (1 + r_hh); e.g. r_SB = 2 × 0.9 / (1 + 0.9) = 1.8 / 1.9 = 0.947 |
| When would we use Cronbach's alpha? | When we need an estimate that represents the sum of all of the individual variances in a split-half test; it estimates internal consistency for every possible split-half. A generalised reliability coefficient for scoring systems where each item is graded (it sums all of them); used when items are graded (not used with dichotomous items). Essentially an estimate of ALL possible test-retest or split-half coefficients. α can range between 0 and 1 (ideally closer to 1). Cannot measure multiple traits: the test must be homogeneous |
| When would we use Kuder-Richardson? | When the test is dichotomous; tests every possible split-half correlation or test-retest; mainly used for split-half (51:25) |
| What is an acceptable range of reliability? | Clinical: r > 0.85 acceptable; research: r > ~0.7 acceptable. Reliabilities of major psychological tests: internal consistency, WAIS r = 0.87, MMPI r = 0.84; test-retest, WAIS r = 0.82, MMPI r = 0.74 (53:35) |
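The Spearman-Brown and Cronbach's alpha coefficients above can be sketched in a few lines of plain Python. The 0.9 half-test correlation reproduces the deck's worked example; the item scores are hypothetical, chosen only to illustrate the arithmetic for a homogeneous set of graded items.

```python
from statistics import pvariance

def spearman_brown(r_hh: float) -> float:
    """Step a split-half correlation up to full-length reliability."""
    return 2 * r_hh / (1 + r_hh)

def cronbach_alpha(items: list) -> float:
    """items: one inner list of scores per item (same people, same order)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(pvariance(i) for i in items) / pvariance(totals))

print(round(spearman_brown(0.9), 3))   # 0.947, as in the worked example

# Four hypothetical people answering three graded items that all tap
# the same factor, so alpha comes out high (close to 1).
items = [[2, 4, 6, 8], [2, 4, 6, 8], [1, 4, 5, 8]]
alpha = cronbach_alpha(items)
```

Because alpha penalises item variance that does not covary with the total score, adding an item unrelated to the others would pull it down, which is why the deck stresses homogeneity.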
Summary of reliability
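The variance decomposition in the deck (X = T + E, reliability = σ²_true / σ²_total) can be illustrated with a short simulation: give each simulated person a fixed true score, add fresh random error on two testing occasions, and check that the test-retest correlation lands near the ratio of true to total variance. All numbers here are made up for illustration.

```python
import random
from statistics import pvariance

random.seed(0)
n = 10_000
true = [random.gauss(100, 15) for _ in range(n)]   # sigma^2_true = 225
test1 = [t + random.gauss(0, 5) for t in true]     # occasion 1: fresh random error
test2 = [t + random.gauss(0, 5) for t in true]     # occasion 2: fresh random error

def pearson(x, y):
    """Pearson correlation, computed by hand from population moments."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pvariance(x) * pvariance(y)) ** 0.5

expected = 225 / (225 + 25)       # sigma^2_true / sigma^2_total = 0.9
observed = pearson(test1, test2)  # test-retest r; close to 0.9 for large n
```

Note that a constant shift on one occasion (systematic error, e.g. everyone tested in a hot room) would leave `observed` unchanged, matching the deck's point that only random fluctuation erodes the test-retest correlation.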