Example: Ten trainees took a test and nine of them missed question #5. Nobody
missed question #4, which tested an item very similar to the one
covered by question #5.
Question #5 may be unreliable due to poor wording, unclear answers, a typographic error
that makes a wrong answer look correct, etc. This is a form of alternate question reliability.
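
The comparison in the example above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names, the response-sheet layout (question number mapped to True for a correct answer), and the 0.5 gap threshold are all assumptions made for the sketch.

```python
def miss_rates(responses):
    """Return the fraction of trainees who missed each question.

    `responses` is a list of answer sheets; each sheet maps a question
    number to True (answered correctly) or False (missed).
    """
    totals = {}
    misses = {}
    for sheet in responses:
        for question, correct in sheet.items():
            totals[question] = totals.get(question, 0) + 1
            if not correct:
                misses[question] = misses.get(question, 0) + 1
    return {q: misses.get(q, 0) / totals[q] for q in totals}


def flag_inconsistent_pairs(rates, similar_pairs, gap=0.5):
    """Return pairs of similar questions whose miss rates differ by at least `gap`.

    The 0.5 default is an arbitrary illustrative threshold.
    """
    return [(a, b) for a, b in similar_pairs if abs(rates[a] - rates[b]) >= gap]


# Hypothetical answer sheets matching the example: ten trainees,
# nobody missed question #4, nine missed question #5.
responses = [{4: True, 5: False} for _ in range(9)] + [{4: True, 5: True}]
rates = miss_rates(responses)
flagged = flag_inconsistent_pairs(rates, [(4, 5)])
```

A flagged pair only says that question #5 deserves a closer look; deciding whether the wording, the answer choices, or the instruction is at fault still requires reading the item.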
Example: 8 of 10 trainees who missed question #7 chose answer (b). Does answer
(b) look too similar to the correct answer? Does the lesson plan support
the correct answer?
The above example could indicate the use of a method of testing known as key word and
tricky phrase testing. This type of testing causes the trainee to memorize and recall only
key words and tricky phrases to pass the test instead of requiring the trainee to learn the
material; thus it is a poor method to use.
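
The question #7 example amounts to counting which wrong answers were chosen. The sketch below is one way to do that count; the function names, the assumption that (c) is the correct answer, and the 0.5 share threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def distractor_counts(answers, correct):
    """Count how often each wrong choice was selected."""
    return Counter(a for a in answers if a != correct)


def dominant_distractor(answers, correct, share=0.5):
    """Return the wrong choice that attracted at least `share` of all
    wrong answers, or None if no single wrong choice dominates."""
    counts = distractor_counts(answers, correct)
    wrong_total = sum(counts.values())
    if wrong_total == 0:
        return None
    choice, n = counts.most_common(1)[0]
    return choice if n / wrong_total >= share else None


# Hypothetical data: ten trainees missed question #7 and eight of them
# chose (b). The correct answer is assumed to be (c) for this sketch.
answers_q7 = ["b"] * 8 + ["a", "d"]
suspect = dominant_distractor(answers_q7, correct="c")
```

When one wrong choice dominates like this, compare it against the correct answer and the lesson plan, as the questions in the example suggest.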
Test items with poor reliability are easy to recognize. If trainees who are equal in knowledge
or ability have widely varying test scores, the test or test item may be unreliable. Or, if the
same trainee is tested twice on the same test or test item within a short period of time and
passes once and fails the next time, the test or test item may be unreliable. In both of these
cases the reliability should be questioned and the test or test item should be carefully
reviewed.
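
The second check described above, where the same trainee passes once and fails on a retest, can be sketched as follows. The function name, the score dictionaries, and the passing score of 70 are assumptions for illustration only.

```python
def inconsistent_retests(first, second, passing=70):
    """Return the trainees whose pass/fail outcome changed between two
    administrations of the same test.

    `first` and `second` map trainee name to score. Only trainees who
    appear in both administrations are compared. The passing score of 70
    is an assumed value.
    """
    return sorted(
        name
        for name in first.keys() & second.keys()
        if (first[name] >= passing) != (second[name] >= passing)
    )


# Hypothetical scores from two administrations a short time apart.
first = {"A": 82, "B": 74, "C": 65}
second = {"A": 85, "B": 62, "C": 68}
flipped = inconsistent_retests(first, second)
```

A trainee appearing in the result is a prompt to question the test or test item, not proof that it is unreliable.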
A valid test must measure exactly what it was intended to measure. A test can be reliable
but not valid, or valid but not reliable. A paper-and-pencil test can be reliable in measuring
knowledge of certain welding fundamentals but not valid for measuring welding skill.
Establishing the validity of tests can be a complicated and time-consuming process. Validity
can be improved by:
Ensuring a good analysis of tasks has been conducted
Ensuring that knowledge and skill requirements have been identified