DOE-HDBK-1205-97

Example: Ten trainees took a test and nine of them missed question #5. Nobody missed question #4, which tested an item very similar to the one covered by question #5.

Question #5 may be unreliable because of poor wording, unclear answers, or a typographic error that makes a wrong answer look correct. Comparing the results of two questions that test similar material is a form of alternate question reliability.
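The comparison in the example above can be sketched in a few lines of Python. This is only an illustration, not a procedure from this handbook: the function names, the data layout (one dictionary per trainee, True meaning a correct answer), and the 0.5 threshold are all assumptions chosen for the sketch.

```python
def miss_rate(responses, question):
    """Fraction of trainees who answered `question` incorrectly."""
    missed = sum(1 for answers in responses if not answers[question])
    return missed / len(responses)


def flag_inconsistent_pair(responses, q_a, q_b, threshold=0.5):
    """True when two similar questions differ in miss rate by more than
    `threshold` (an assumed cutoff, not a value from the handbook)."""
    return abs(miss_rate(responses, q_a) - miss_rate(responses, q_b)) > threshold


# Hypothetical results for ten trainees: True = answered correctly.
# Nobody missed question 4; nine of the ten missed question 5.
responses = [{4: True, 5: False} for _ in range(9)] + [{4: True, 5: True}]

print(miss_rate(responses, 4))                   # miss rate for question 4
print(miss_rate(responses, 5))                   # miss rate for question 5
print(flag_inconsistent_pair(responses, 4, 5))   # should question 5 be reviewed?
```

A flagged pair does not prove that either question is faulty; it only marks the pair for the kind of review described above.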

Example: Eight of the 10 trainees who missed question #7 chose answer (b). Does answer (b) look too similar to the correct answer? Does the lesson plan support the correct answer?

The above example could indicate the use of a testing method known as key word and tricky phrase testing. This type of testing leads trainees to memorize and recall only the key words and tricky phrases needed to pass the test instead of requiring them to learn the material; thus it is a poor method to use.
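The pattern in the question #7 example, where most trainees who miss a question pick the same wrong answer, can be tallied as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, the correct answer is assumed to be (c), and the five correct responses are invented so the data set is complete.

```python
from collections import Counter


def most_chosen_wrong_answer(choices, correct):
    """Return (answer, share) for the most popular wrong answer, where
    share is its fraction of all wrong answers; None if nobody missed."""
    wrong = [choice for choice in choices if choice != correct]
    if not wrong:
        return None
    answer, count = Counter(wrong).most_common(1)[0]
    return answer, count / len(wrong)


# Hypothetical choices for question 7. Ten trainees missed it and eight
# of those chose (b); the correct answer is assumed to be (c).
choices = ["b"] * 8 + ["a", "d"] + ["c"] * 5

print(most_chosen_wrong_answer(choices, correct="c"))
```

A single wrong answer that attracts most of the misses is a prompt to check its wording against the correct answer and the lesson plan, as the example's questions suggest.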

Test items with poor reliability are easy to recognize. If trainees who are equal in knowledge or ability have widely varying test scores, the test or test item may be unreliable. Likewise, if the same trainee takes the same test or test item twice within a short period and passes once but fails the other time, the test or test item may be unreliable. In both cases the reliability should be questioned and the test or test item should be carefully evaluated.
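The second check, a trainee who passes one sitting and fails the other, can be sketched as a comparison of two score sheets. The function name, the trainee labels, the scores, and the passing score of 80 are all hypothetical values chosen for illustration.

```python
def inconsistent_retests(first, second, passing_score):
    """Return the trainees whose pass/fail outcome changed between two
    sittings of the same test, sorted by trainee label."""
    flipped = []
    for trainee, score in first.items():
        if trainee not in second:
            continue  # no retest recorded for this trainee
        passed_first = score >= passing_score
        passed_second = second[trainee] >= passing_score
        if passed_first != passed_second:
            flipped.append(trainee)
    return sorted(flipped)


# Hypothetical scores from two sittings taken a short time apart.
first_sitting = {"A": 85, "B": 90, "C": 78}
second_sitting = {"A": 72, "B": 88, "C": 75}

print(inconsistent_retests(first_sitting, second_sitting, passing_score=80))
```

Only trainee A changes outcome here. Trainee C fails both times, which is a consistent result and is not evidence of poor reliability.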

6.2 Validity

A valid test must measure exactly what it was intended to measure. A test can be reliable but not valid, or valid but not reliable. For example, a paper-and-pencil test can be reliable in measuring knowledge of certain welding fundamentals but not valid for measuring welding skill.

Establishing the validity of tests can be a complicated and time-consuming process. Validity can be improved by:

Ensuring a good analysis of tasks has been conducted

Ensuring that knowledge and skill requirements have been identified