Psychological Testing: Chapter 6: Validity
This flashcard set outlines the concept of validity in psychological assessment, focusing on the extent to which a test accurately measures what it claims to measure. It explains the importance of context in determining validity, the role of inference, and the process of validation through evidence gathering.
How Validity is Conceptualized
- Content Validity
- Criterion-Related Validity
- Construct Validity
| Term | Definition |
|---|---|
| Validity | Used in conjunction with the meaningfulness of a test score; what the test score truly means; a judgment or estimate of how well a test measures what it purports to measure in a particular context |
| Inference | Logical result or deduction |
| Valid Test | The test has been shown to be valid for a particular use with a particular population of testtakers at a particular time; validity holds within the reasonable boundaries of a contemplated usage |
| Validation | Process of gathering and evaluating evidence about validity; both the test developer and the test user may play a role in the validation of a test for a specific purpose |
| Local Validation Studies | May yield insights regarding a particular population of testtakers as compared to the norming sample described in a test manual; necessary when the test user plans to alter in some way the format, instructions, language, or content of the test |
| How Validity is Conceptualized | Content validity; criterion-related validity; construct validity |
| Trinitarian View of Validity | Views validity as comprising content, criterion-related, and construct validity, with construct validity as the umbrella validity |
| Approaches to Assessing Validity | Scrutinizing the test's content; relating scores obtained on the test to other test scores or measures; comprehensive analysis of how test scores relate to other measures and how they can be understood within some theory of the construct |
| Face Validity | Relates more to what a test appears to measure to the person being tested than to what the test actually measures; a judgment concerning how relevant the test items appear to be |
| High Face Validity | A test has high face validity if, on the face of it, it appears to measure what it purports to measure |
| Lack of Face Validity | Contributes to a lack of confidence in the perceived effectiveness of the test, with a consequential decrease in the testtaker's cooperation or motivation to do his or her best |
| Content Validity | Describes a judgment of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample |
| Test Blueprint | Emerges from the structure of the evaluation; a plan regarding the types of information to be covered by the items, the number of items tapping each area of coverage, the organization of the items in the test, and so forth; represents the culmination of efforts to adequately sample the universe of content areas that conceivably could be sampled in such a test |
| Lawshe Test | A method for gauging agreement among raters or judges regarding how essential a particular test item is |
| C.H. Lawshe | Proposed that each rater respond to the following question for each item: "Is the skill or knowledge measured by this item essential, useful but not essential, or not necessary to the performance of the job?" |
| Content Validity Ratio (CVR) | CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating the item essential and N is the total number of panelists; the CVR is negative when fewer than half the panelists indicate essential, zero when exactly half do, and positive when more than half do |
| Criterion-Related Validity | Judgment of how adequately a test score can be used to infer an individual's most probable standing on some measure of interest, the measure of interest being the criterion |
| Types of Validity Evidence under Criterion-Related Validity | Concurrent validity; predictive validity |
| Concurrent Validity | An index of the degree to which a test score is related to some criterion measure obtained at the same time (concurrently) |
| Predictive Validity | An index of the degree to which a test score predicts some criterion measure |
| Characteristics of a Criterion | An adequate criterion is relevant, valid, and uncontaminated |
| Criterion Contamination | Term applied to a criterion measure that has been based, at least in part, on predictor measures |
| Concurrent Validity | When test scores are obtained at about the same time as the criterion measures, measures of the relationship between test scores and the criterion provide evidence of concurrent validity |
| Predictive Validity of a Test | Indicated by measures of the relationship between test scores and a criterion measure obtained at a future time; how accurately scores on the test predict some criterion measure |
| Statistical Evidence of Criterion-Related Validity | Validity coefficient; expectancy data |
| Validity Coefficient | Correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure; affected by restriction or inflation of range; should be high enough to result in the identification and differentiation of testtakers with respect to target attributes |
| Pearson Correlation Coefficient | Typically used to determine the validity coefficient between two measures |
| Restriction of Range | Concerns whether the range of scores employed is appropriate to the objective of the correlational analysis |
| Incremental Validity | The degree to which an additional predictor explains something about the criterion measure that is not explained by predictors already in use |
| Expectancy Data | Provide information that can be used in evaluating the criterion-related validity of a test |
| Expectancy Table | Shows the percentage of people within specified test score intervals who subsequently were placed in various categories of the criterion; may be created from a scattergram according to the steps listed |
| Taylor-Russell Tables | Provide an estimate of the extent to which inclusion of a particular test in the selection system will actually improve selection; determine the increase over current procedures |
| Selection Ratio | Numerical value that reflects the relationship between the number of people to be hired and the number of people available to be hired |
| Base Rate | Refers to the percentage of people hired under the existing system for a particular position |
| Steps to Create an Expectancy Table | Draw a scatterplot such that each point in the plot represents a particular test score-criterion score combination, with the criterion on the Y-axis |
| Naylor-Shine Tables | Entail obtaining the difference between the means of the selected and unselected groups to derive an index of what the test is adding to already established procedures; determine the increase in average score on some criterion measure |
| Utility of Tests | Usefulness or practical value of tests |
| Cronbach and Gleser | Developed the decision theory of tests |
| Decision Theory of Tests | Classification of decision problems |
| Adaptive Treatment | Tailoring job requirements to the applicant's ability instead of the other way around |
| Base Rate | Extent to which a particular trait, behavior, characteristic, or attribute exists in the population (expressed as a proportion) |
| Hit Rate | The proportion of people a test accurately identifies as possessing or exhibiting a particular trait, behavior, characteristic, or attribute |
| Miss Rate | The proportion of people the test fails to identify accurately as having, or not having, a particular characteristic or attribute |
| Miss | Amounts to an inaccurate prediction |
| Categories of Misses | False positive; false negative |
| False Positive | Miss wherein the test predicted that the testtaker possessed the particular characteristic or attribute being measured when in fact the testtaker did not |
| False Negative | Miss wherein the test predicted that the testtaker did not possess the particular characteristic or attribute being measured when in fact the testtaker did |
| Construct Validity | Judgment about the appropriateness of inferences drawn from test scores regarding individual standings on a variable called a construct |
| Construct | An informed, scientific idea developed or hypothesized to describe or explain behavior; unobservable, presupposed (underlying) traits that a test developer may invoke to describe test behavior or criterion performance |
| Evidence of Construct Validity | Homogeneity; evidence of changes with age; evidence of pretest-posttest changes; the method of contrasted groups; convergent evidence; discriminant evidence; factor analysis |
| Homogeneity | Refers to how uniform a test is in measuring a single concept |
| How Homogeneity Can Be Increased | Use of Pearson r to correlate average subtest scores with the average total test score |
| Item Analysis Procedures | Employed in ensuring test homogeneity; one item analysis procedure focuses on the relationship between testtakers' scores on individual items and their scores on the entire test |
| Evidence of Changes with Age | Test scores should reflect progressive changes for constructs that could be expected to change over time |
| Evidence of Pretest-Posttest Changes | Evidence that test scores change as a result of some experience between a pretest and a posttest can be evidence of construct validity; the intervening experience should be one that could be predicted to yield changes in score from pretest to posttest |
| Method of Contrasted Groups | Demonstrating that scores on the test vary in a predictable way as a function of membership in some group; if a test is a valid measure of a particular construct, then groups of people presumed to differ with respect to that construct should have correspondingly different test scores |
| Convergent Evidence | Comes from correlations with tests purporting to measure an identical construct and from correlations with measures purporting to measure related constructs |
| Discriminant Evidence | When a validity coefficient shows little relationship between test scores and/or other variables with which scores on the test being construct-validated should not theoretically be correlated |
| Multitrait-Multimethod Matrix | Experimental technique that measures both convergent and discriminant validity evidence; the matrix or table that results from correlating variables (traits) within and between methods; values for any number of traits as obtained by various methods are inserted into the table, and the resulting matrix of correlations provides insight with respect to both the convergent and the discriminant validity of the methods used |
| Multitrait | Two or more traits |
| Multimethod | Two or more methods |
| Factor Analysis | Shorthand term for a class of mathematical procedures designed to identify factors, or specific variables that are typically attributes, characteristics, or dimensions on which people may differ; employed as a data reduction method in which several sets of scores and the correlations between them are analyzed; identifies the factor or factors in common between test scores on subscales within a particular test, or the factors in common between scores on a series of tests |
| Exploratory Factor Analysis | Entails estimating or extracting factors, deciding how many factors to retain, and rotating factors to an interpretable orientation |
| Confirmatory Factor Analysis | A factor structure is explicitly hypothesized and is tested for its fit with the observed covariance structure of the measured variables |
| Factor Loading | Conveys information about the extent to which the factor determines the test score or scores; each test is thought of as a vehicle carrying a certain amount of one or more abilities |
| Bias | A factor inherent in a test that systematically prevents accurate, impartial measurement |
| Intercept Bias | When a test systematically underpredicts or overpredicts the performance of members of a particular group with respect to a criterion; the name derives from the intercept, the point where the regression line intersects the Y-axis |
| Slope Bias | When a test systematically yields significantly different validity coefficients for members of different groups |
| Rating | Numerical or verbal judgment (or both) that places a person or an attribute along a continuum identified by a scale of numerical or word descriptors |
| Rating Scale | Scale of numerical or word descriptors |
| Rating Error | Judgment resulting from the intentional or unintentional misuse of a rating scale |
| Leniency/Generosity Error | Error in rating that arises from the tendency on the part of the rater to be lenient in scoring, marking, and/or grading |
| Severity Error | Opposite of the leniency/generosity error; occurs when the rater scores very critically |
| Central Tendency Error | The rater exhibits a general and systematic reluctance to give ratings at either the positive or negative extreme; all the rater's ratings tend to cluster in the middle of the rating continuum |
| Restriction-of-Range Errors | (Central tendency, leniency, and severity errors) may be overcome through the use of rankings |
| Rankings | A procedure that requires the rater to measure individuals against one another instead of against an absolute scale; the rater is forced to select first, second, third choices, and so forth |
| Halo Effect | Describes the fact that, for some raters, some ratees can do no wrong; a tendency to give a particular ratee a higher rating than he or she objectively deserves because of the rater's failure to discriminate among conceptually distinct and potentially independent aspects of a ratee's behavior |
| Fairness | The extent to which a test is used in an impartial, just, and equitable way |
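The content validity ratio defined above reduces to one line of arithmetic. A minimal sketch (panel sizes and ratings are invented for illustration):

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Lawshe's content validity ratio: CVR = (n_e - N/2) / (N/2)."""
    half = n_panelists / 2
    return (n_essential - half) / half

# 8 of 10 panelists rate the item "essential" -> positive CVR
print(content_validity_ratio(8, 10))   # 0.6
# Exactly half -> zero; fewer than half -> negative
print(content_validity_ratio(5, 10))   # 0.0
print(content_validity_ratio(3, 10))   # -0.4
```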
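A validity coefficient is ordinarily just a Pearson r computed between test scores and criterion scores. A minimal sketch (the score vectors below are invented):

```python
import statistics

def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between paired score lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical test scores and later job-performance ratings (the criterion)
test = [10, 12, 14, 16, 18]
criterion = [3, 4, 4, 5, 6]
print(round(pearson_r(test, criterion), 3))   # 0.971
```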
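The expectancy-table steps above can be sketched as: bin the test scores into intervals, then report the percentage of people in each interval who later met the criterion. All data below are invented:

```python
from collections import defaultdict

def expectancy_table(scores, passed, bins):
    """Percentage of testtakers in each score interval who met the criterion.

    bins is a list of (low, high) inclusive score intervals.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for score, ok in zip(scores, passed):
        for low, high in bins:
            if low <= score <= high:
                totals[(low, high)] += 1
                hits[(low, high)] += ok
                break
    return {b: 100 * hits[b] / totals[b] for b in totals}

# Hypothetical test scores and whether each person later succeeded (1) or not (0)
scores = [55, 62, 68, 71, 77, 83, 88, 91]
passed = [0, 0, 1, 0, 1, 1, 1, 1]
table = expectancy_table(scores, passed, [(50, 69), (70, 89), (90, 100)])
for (low, high), pct in sorted(table.items()):
    print(f"{low}-{high}: {pct:.0f}% succeeded")
```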
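The hit/miss terminology above maps directly onto a 2×2 classification tally. A minimal sketch with invented test predictions and actual outcomes:

```python
def classification_rates(predicted, actual):
    """Tally hits and the two categories of misses for a screening test.

    predicted/actual are parallel lists of booleans: does the test say the
    person has the attribute (predicted) / do they actually have it (actual).
    """
    n = len(predicted)
    hits = sum(p == a for p, a in zip(predicted, actual))
    false_pos = sum(p and not a for p, a in zip(predicted, actual))  # test said yes, truth no
    false_neg = sum(a and not p for p, a in zip(predicted, actual))  # test said no, truth yes
    return {"hit_rate": hits / n,
            "miss_rate": (false_pos + false_neg) / n,
            "false_positives": false_pos,
            "false_negatives": false_neg}

pred   = [True, True, False, False, True, False]
actual = [True, False, False, True, True, False]
print(classification_rates(pred, actual))
```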
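The item-analysis procedure mentioned under homogeneity can be sketched as correlating each item's scores with total test scores; items that correlate poorly with the total are candidates for removal. The response matrix below is invented:

```python
import statistics

def correlation(x, y):
    """Pearson r using population standard deviations."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (statistics.pstdev(x) * statistics.pstdev(y) * len(x))

# Rows = testtakers, columns = items (1 = correct, 0 = incorrect); invented data
responses = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
    [1, 1, 1],
]
totals = [sum(row) for row in responses]
for i in range(3):
    item_scores = [row[i] for row in responses]
    print(f"item {i}: item-total r = {correlation(item_scores, totals):.2f}")
```

Here items 0 and 1 track the total score closely while item 2 does not, suggesting item 2 is measuring something different from the rest of the test.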
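As a toy illustration of the factor-analysis idea (not a full routine), the dominant factor of a small correlation matrix can be extracted by power iteration; each variable's loading shows how strongly that common factor determines its scores. The correlation matrix below is invented:

```python
import math

def first_factor_loadings(corr, iters=200):
    """Approximate loadings on the first (dominant) factor of a correlation
    matrix via power iteration on its leading eigenvector."""
    n = len(corr)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the leading eigenvalue for the unit vector v
    eigval = sum(v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n))
    # Loading of each variable = eigenvector component * sqrt(eigenvalue)
    return [round(x * math.sqrt(eigval), 2) for x in v]

# Invented correlations among three related verbal subscales and one unrelated scale
corr = [
    [1.0, 0.7, 0.6, 0.1],
    [0.7, 1.0, 0.65, 0.1],
    [0.6, 0.65, 1.0, 0.05],
    [0.1, 0.1, 0.05, 1.0],
]
print(first_factor_loadings(corr))
```

The three intercorrelated subscales load heavily on the common factor while the unrelated scale does not, which is the pattern factor analysis is used to detect.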
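Intercept and slope bias can be checked by fitting a separate test-score-to-criterion regression line per group and comparing the fitted intercepts and slopes. The groups and scores below are invented:

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

# Same slope but different intercepts: a single common regression line would
# systematically underpredict one group's criterion performance (intercept bias)
group_a = ([10, 20, 30, 40], [15, 25, 35, 45])   # criterion = test + 5
group_b = ([10, 20, 30, 40], [10, 20, 30, 40])   # criterion = test
for name, (x, y) in {"A": group_a, "B": group_b}.items():
    slope, intercept = fit_line(x, y)
    print(f"group {name}: slope={slope:.2f}, intercept={intercept:.2f}")
```

Slope bias would instead show up as clearly different fitted slopes (different validity coefficients) across the groups.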