In addition to being reliable, measures must also be valid. Validity refers to whether a measure is truthful or genuine. In other words, a measure that is valid measures what it claims to measure. Several types of validity may be examined; we will discuss four types here. As with reliability, validity is measured by the use of correlation coefficients. For example, if researchers develop a new test to measure depression, they might establish the validity of the test by correlating scores on the new test with scores on an already established measure of depression, and, as with reliability, we would expect the correlation to be positive. Unlike reliability coefficients, however, there is no established criterion for the strength of the validity coefficient. Coefficients as low as .20 or .30 may establish the validity of a measure (Anastasi & Urbina, 1997). For validity coefficients, the important thing is that they are statistically significant at the .05 or .01 level. We’ll explain this term in a later chapter, but in brief, it means that the results are most likely not due to chance.
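To make the idea of a validity coefficient concrete, the following is a minimal sketch in Python. It computes the Pearson correlation between scores on a new test and scores on an established test, then estimates how often a correlation at least that large would arise by chance. The scores are made-up numbers for illustration only, and the function names are hypothetical. The chance estimate uses a permutation test, which is one of several possible ways to assess significance and is chosen here only because it needs no statistical tables.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """One-sided p-value: share of random pairings with r >= the observed r."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = pearson_r(x, y)
    shuffled = list(y)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between the two tests
        if pearson_r(x, shuffled) >= observed:
            count += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero.
    return (count + 1) / (n_permutations + 1)


# Hypothetical scores for ten participants (illustrative data, not real results).
new_test = [12, 18, 25, 9, 30, 22, 15, 27, 11, 20]
established_test = [14, 17, 27, 10, 28, 20, 18, 29, 9, 21]

r = pearson_r(new_test, established_test)
p = permutation_p_value(new_test, established_test)
print(f"validity coefficient r = {r:.2f}, permutation p = {p:.4f}")
```

If the p-value falls below .05 (or .01), the correlation would be considered statistically significant at that level, which is the standard the paragraph above describes for validity coefficients.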
Content validity is assessed by a systematic examination of the test content to determine whether it covers a representative sample of the domain of behaviours to be measured. In other words, a test with content validity has items that satisfactorily assess the content being examined. To determine whether a test has content validity, you should consult experts in the area being tested. For example, when the GRE subject exam for psychology is designed, professors of psychology are asked to examine the questions to establish that they represent relevant information from the entire discipline of psychology as we know it today.
Sometimes face validity is confused with content validity. Face validity simply addresses whether or not a test looks valid on its surface. Does it appear to be an adequate measure of the conceptual variable? This is not really validity in the technical sense, because it refers not to what the test actually measures but to what it appears to measure. For example, does the test selected by the school board to measure student achievement “appear” to be an actual measure of achievement? Face validity has more to do with rapport and public relations than with actual validity (Anastasi & Urbina, 1997).
The extent to which a measuring instrument accurately predicts behaviour or ability in a given area establishes criterion validity. Two types of criterion validity may be used, depending on whether the test is used to estimate present performance (concurrent validity) or to predict future performance (predictive validity). The SAT and GRE are examples of tests that have predictive validity, because performance on the tests correlates with later performance in college and graduate school, respectively. The tests can be used with some degree of accuracy to “predict” future behaviour. A test used to determine whether or not someone qualifies as a pilot is an example of concurrent validity. We are estimating the person’s ability at the present time, not attempting to predict future outcomes. Thus concurrent validation is used for diagnosis of existing status rather than prediction of future outcomes.
Construct validity is considered by many to be the most important type of validity. The construct validity of a test is the extent to which a measuring instrument accurately measures the theoretical construct or trait that it is designed to measure. Some examples of theoretical constructs or traits are verbal fluency, neuroticism, depression, anxiety, intelligence, and scholastic aptitude. One means of establishing construct validity is by correlating performance on the test with performance on a test for which construct validity has already been determined. For example, performance on a newly developed intelligence test might be correlated with performance on an existing intelligence test for which construct validity has been previously established. Another means of establishing construct validity is to show that the scores on the new test differ across people with different levels of the trait being measured. For example, if you are measuring depression, you can compare scores on the test for those known to be suffering from depression with scores for those not suffering from depression. The new measure has construct validity if it measures the construct of depression accurately.
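The second approach, comparing groups known to differ on the trait, can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed procedure: the scores are made-up numbers, the function name is hypothetical, and the permutation test stands in for whatever significance test a researcher would actually choose.

```python
import random


def mean(values):
    """Arithmetic mean of a list of scores."""
    return sum(values) / len(values)


def known_groups_test(group_a, group_b, n_permutations=5000, seed=0):
    """Return the mean difference (a minus b) and a one-sided permutation p-value."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign scores to groups at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            count += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero.
    return observed, (count + 1) / (n_permutations + 1)


# Hypothetical scores on the new depression measure (illustrative data only).
diagnosed = [31, 27, 35, 29, 33, 26, 30, 34]
not_diagnosed = [12, 18, 9, 15, 11, 20, 14, 16]

mean_difference, p_value = known_groups_test(diagnosed, not_diagnosed)
print(f"mean difference = {mean_difference:.2f}, permutation p = {p_value:.4f}")
```

A clearly higher mean in the diagnosed group, with a small p-value, would be consistent with the new measure tracking the construct. It would not by itself show that the measure captures depression rather than some related trait.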