Validating test items with SPSS
You can also assess convergent and discriminant validity using multi-trait scaling or multi-trait multi-method (MTMM) modelling (based on inter-item correlations within and between scales), or, again, SEM.

Bear in mind that if your questions reflect different underlying personal qualities (or other dimensions), for example employee motivation and employee commitment, Cronbach's alpha will not be able to distinguish between them. To separate such dimensions and then check the reliability of each (using Cronbach's alpha), you will first need to run a test such as a principal components analysis (PCA); it is worth running an exploratory factor analysis to check for multidimensionality.

The Item-Total Statistics table presents "Cronbach's Alpha if Item Deleted" in the final column. This column shows the value that Cronbach's alpha would take if that particular item were deleted from the scale. We can see that removal of any question except question 8 would result in a lower Cronbach's alpha, so we would not want to remove those questions. Removal of question 8, by contrast, would lead to a small improvement in Cronbach's alpha, and we can also see that its "Corrected Item-Total Correlation" value was low (0.128), which might lead us to consider removing this item.

Then, I would say that Item Response Theory (IRT) would not help that much unless you are interested in shortening your questionnaire, filtering out items that show differential item functioning, or using your test in some kind of computerized adaptive testing. For polytomous ordered items, the most commonly used models are the graded response model and, from the Rasch family, the partial credit and rating scale models. The Rasch models use an adjacent-odds formulation, with the idea that a subject has to "pass" successive thresholds to endorse a given response category. Be aware that these models all assume that the scale is unidimensional, i.e., that there is only one latent trait, and there are additional assumptions such as local independence (i.e., the correlations between responses are explained entirely by variation on the ability scale).
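To make the adjacent-odds idea concrete, here is a minimal Python sketch of the partial credit model; the ability value `theta` and the threshold values are made-up numbers chosen purely for illustration, not estimates from any real data.

```python
import math

def pcm_probabilities(theta, thresholds):
    """Partial credit model: the log-odds of adjacent categories is
    log[P(X = x) / P(X = x - 1)] = theta - tau_x, so a subject must
    'pass' each threshold tau_x to endorse the next category."""
    # Cumulative logit for category x is sum_{j <= x} (theta - tau_j),
    # with an empty sum (logit 0) for the lowest category.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - tau))
    shift = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - shift) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# A 4-category item (3 thresholds): higher ability shifts probability
# mass toward the higher response categories.
low = pcm_probabilities(-1.0, [-1.0, 0.0, 1.0])
high = pcm_probabilities(2.0, [-1.0, 0.0, 1.0])
```

Fitting these models in practice is usually done in dedicated IRT software rather than by hand; the sketch is only meant to show what "passing several thresholds" means in the adjacent-odds formulation.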
Cronbach's alpha is the most common measure of internal consistency ("reliability").
It is most commonly used when you have multiple Likert questions in a survey/questionnaire that form a scale and you wish to determine if the scale is reliable.
Each question was a 5-point Likert item from "strongly disagree" to "strongly agree".
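As a rough cross-check of what SPSS reports, Cronbach's alpha and the item-total columns can be computed by hand. The following is an illustrative Python sketch using NumPy and a small matrix of hypothetical 5-point Likert responses; it mirrors the "Cronbach's Alpha if Item Deleted" and "Corrected Item-Total Correlation" columns discussed above but is not SPSS output.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum(item variances) / variance(total))."""
    items = np.asarray(items, dtype=float)  # rows = respondents, cols = items
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def item_total_statistics(items):
    """Per item: alpha if that item is deleted, and the corrected
    item-total correlation (item vs. the sum of the remaining items)."""
    items = np.asarray(items, dtype=float)
    out = []
    for j in range(items.shape[1]):
        rest = np.delete(items, j, axis=1)
        alpha_if_deleted = cronbach_alpha(rest)
        corrected_r = float(np.corrcoef(items[:, j], rest.sum(axis=1))[0, 1])
        out.append((alpha_if_deleted, corrected_r))
    return out

# Hypothetical 5-point Likert responses (5 respondents x 3 items).
likert = [[1, 2, 1],
          [2, 3, 2],
          [4, 5, 4],
          [5, 5, 5],
          [3, 4, 3]]
alpha = cronbach_alpha(likert)          # close to 1: items strongly agree
stats = item_total_statistics(likert)   # one (alpha-if-deleted, r) pair per item
```

An item whose "alpha if deleted" exceeds the overall alpha, together with a low corrected item-total correlation, is the pattern that flagged question 8 above.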