OUTLINE FOR THE EVALUATION OF A TEST

Title of the Test
Author(s)
Publisher
Date of Publication
Dates of previous editions
Forms Available
Cost

1. Manual

Does a manual accompany the test?
Adequacy
Is there a separate technical manual? At what cost?

2. Stated purpose of the test?

Definition of Construct
"Dumbed Down Tests" (tests designed for adults or adolescents redone for children)

3. Does the name of the test reflect the test content?

Do the names of the individual subtests (where applicable) reflect the content?

4. Form(s) of the items: (Oral, Hands-on, Multiple-choice, Fill-ins, etc.)

Are there problems with this form or content?
Is scoring ambiguous?
Do the items appear to measure what was intended? (e.g., Do reading items really test memory?)

5. Basis of the arrangement of the items in the test?

Subtests
Scales
Spiral Omnibus
Random
Hierarchical
Homogeneity: Changes within subtests
Distinctness
Sexism and other biases

6. Printing, format and arrangement of test items.

Easels and other hardware
Color use: does it help or hurt?
Readability

7. Protocols

Room to write
Answers to examinee
Report forms
Clarity
Ease of use
Do they encourage use of confidence bands? Do they offer 90% and 95% bands?
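
As a hedged illustration of the confidence-band questions above: a band is commonly built as the obtained score plus or minus a z multiple of the standard error of measurement (SEm). The sketch below assumes normally distributed error and a band centered on the obtained score (some tests center it on the estimated true score instead). The function name `confidence_band` and the sample values are hypothetical.

```python
from statistics import NormalDist


def confidence_band(score, sem, level=0.95):
    """Return (lower, upper) for a symmetric band around an obtained score.

    Assumes normally distributed measurement error; `sem` is the
    standard error of measurement in the same units as `score`.
    """
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")
    # Two-tailed critical z: about 1.645 for 90%, about 1.960 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sem
    return score - half_width, score + half_width


if __name__ == "__main__":
    # Hypothetical obtained score of 100 with an SEm of 3 points.
    for level in (0.90, 0.95):
        low, high = confidence_band(100, 3.0, level)
        print(f"{level:.0%} band: {low:.1f} to {high:.1f}")
```

A protocol that prints both bands lets the examiner see how much wider the 95% band is than the 90% band for the same SEm.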

8. Directions for administration

Clarity and adequacy?
Location (manual/protocol/both)
Flexibility
Age appropriateness

9. Directions to the examinee

Clarity and adequacy
Natural or Stilted
Boehm's basic concepts
Alternative directions

10. Time limits and bonuses?

Are they justified?
Are there alternatives?

11. Teaching items?

Scored or unscored
Adequacy of instructions
Can you teach over and over?

12. Test materials

Child safety
Ease of use
Durability

13. Scoring

Is scoring easy? objective? subjective? arbitrary? agreed upon?
Are there adequate samples of correct answers?
Rotation errors: differences on tests
Are printed norms tables also available?
Is a computer program necessary?
Is a computer program provided?

14. Raw score conversions

Interpolation
Which standard scores are reported?
Age scores: Why/why not
Grade scores: Why/why not
Percentiles
Standard scores:
Z
T
Stanines
Deviation quotients (M=100, s.d.=15 or 16)
Others
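
The standard scores listed above are all linear or normal-curve transformations of the same z score. The following is a minimal sketch, assuming a normally distributed norm group; real tests rely on published norms tables, so the percentile and stanine here are approximations only. The function name `convert_raw_score` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def convert_raw_score(raw, mean, sd, quotient_sd=15):
    """Convert a raw score to common derived scores.

    `mean` and `sd` describe the raw-score distribution of the norm group.
    Percentile and stanine are normal-curve approximations, not lookups
    in a published norms table.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (raw - mean) / sd
    # Stanines are half-SD-wide bands centered on 5, clamped to 1..9.
    stanine = min(9, max(1, math.floor(2 * z + 5.5)))
    return {
        "z": z,
        "T": 50 + 10 * z,
        "stanine": stanine,
        "deviation_quotient": 100 + quotient_sd * z,
        "percentile": 100 * NormalDist().cdf(z),
    }


if __name__ == "__main__":
    # Hypothetical subtest: raw score 30, norm-group mean 24, SD 6.
    for name, value in convert_raw_score(30, 24, 6).items():
        print(f"{name}: {value:.2f}")
```

Passing `quotient_sd=16` shows why the choice of 15 or 16 matters: the same performance yields a different deviation quotient.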

15. Standardization groups?

Total
Number per year of age
National representation
Breakdowns

16. For what groups is the test designed?

Recent
Relevant
Representational
Age
Grade
Sex
SES
Education
Geographic regions
Urban vs. rural
Ethnicity
Disabilities

17. Reliability coefficients

Internal consistency (split halves)
Alternate forms
Test-retest: practice effect and inflation of r
Length of test
Test-retest interval
SEm
SEest
Inter-rater reliability
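
Several of the reliability items above are tied together by simple classical test theory formulas. The sketch below assumes those textbook formulas: SEm = SD × sqrt(1 − r), SEest = SD × sqrt(r × (1 − r)), and the Spearman-Brown formula for the effect of test length on reliability. The function names and sample values are hypothetical; a given manual may compute these quantities differently.

```python
import math


def _check_reliability(reliability):
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")


def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - r)."""
    _check_reliability(reliability)
    return sd * math.sqrt(1 - reliability)


def se_est(sd, reliability):
    """Standard error of estimate of a true score: SD * sqrt(r * (1 - r))."""
    _check_reliability(reliability)
    return sd * math.sqrt(reliability * (1 - reliability))


def spearman_brown(reliability, length_factor=2.0):
    """Projected reliability when test length is multiplied by length_factor.

    With length_factor=2 this is the usual step-up from a split-half
    correlation to a full-length estimate.
    """
    _check_reliability(reliability)
    k = length_factor
    return k * reliability / (1 + (k - 1) * reliability)


if __name__ == "__main__":
    # Hypothetical scale with SD = 15 and reliability = .91.
    print(f"SEm:   {sem(15, 0.91):.2f}")
    print(f"SEest: {se_est(15, 0.91):.2f}")
    # Hypothetical split-half correlation of .80 stepped up to full length.
    print(f"Full-length r: {spearman_brown(0.80):.3f}")
```

Comparing a manual's reported SEm with the value implied by its own SD and reliability coefficient is a quick consistency check.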

18. Validity


For what purpose?
Content
are the questions appropriate?
are there enough questions?
level of mastery being measured?
Criterion
concurrent vs. predictive
Construct
Convergent vs. discriminant (divergent) evidence

19. Factor analysis


Exploratory
Confirmatory
Rotations
Different groups
Variance
Common
Error
Specificity
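
The three variance components above are conventionally related as follows: common variance is the subtest's communality, specific variance is reliability minus communality, and error variance is one minus reliability. A minimal sketch under that assumption, with a hypothetical function name and sample values:

```python
def partition_variance(communality, reliability):
    """Split a subtest's total variance (scaled to 1.0) into three parts.

    communality: proportion of variance shared with the common factors.
    reliability: proportion of variance that is reliable (not error).
    """
    if not 0 <= communality <= reliability <= 1:
        raise ValueError("need 0 <= communality <= reliability <= 1")
    return {
        "common": communality,
        "specific": reliability - communality,
        "error": 1 - reliability,
    }


if __name__ == "__main__":
    # Hypothetical subtest: communality .60, reliability .85.
    for part, share in partition_variance(0.60, 0.85).items():
        print(f"{part}: {share:.2f}")
```

A subtest whose specific variance is smaller than its error variance offers little basis for interpreting that subtest on its own.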

20. User friendliness


Administrator
Client: Take it yourself

21. References


Antiquity
Authors of bibliography
Relevance to current edition

22. Interpretation


Base rate
Definitions for constructs and shared abilities
Multiple comparison tables (critical values)
Significance vs. abnormality (unusualness vs. importance) (scatter)
Testing the metaphysically handicapped (dead)
What a difference a day makes
Table Games
Floors and ceilings
Descriptive terms
Errors
Cautions
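
To make the critical-value item concrete, one common approach treats a difference between two scores as statistically significant when it exceeds z × sqrt(SEm_a² + SEm_b²), with a Bonferroni adjustment when several comparisons are made at once. The sketch below assumes that approach; the function name and sample values are hypothetical. It addresses significance only. Whether a difference is unusual (its base rate) has to come from frequency data in the standardization sample.

```python
import math
from statistics import NormalDist


def critical_difference(sem_a, sem_b, alpha=0.05, comparisons=1):
    """Smallest difference between two scores that is statistically reliable.

    Uses z * sqrt(SEm_a**2 + SEm_b**2), two-tailed. `comparisons` applies a
    Bonferroni correction (alpha / comparisons) when several pairs are
    compared at once.
    """
    if comparisons < 1:
        raise ValueError("comparisons must be at least 1")
    adjusted_alpha = alpha / comparisons
    z = NormalDist().inv_cdf(1 - adjusted_alpha / 2)
    return z * math.sqrt(sem_a ** 2 + sem_b ** 2)


if __name__ == "__main__":
    # Hypothetical SEms of 3 and 4 points for the two scores compared.
    print(f"Single comparison: {critical_difference(3, 4):.1f}")
    print(f"Five comparisons:  {critical_difference(3, 4, comparisons=5):.1f}")
```

The required difference grows as the number of comparisons grows, which is why a manual's tables should state how many comparisons they assume.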
