How to interpret model fit results is probably one of the most frequently asked questions whenever Confirmatory Factor Analysis (CFA) and Structural Equation Modeling (SEM) come up.
Therefore I decided to write a simple guideline that explains where to find each model fit parameter in AMOS, how to interpret it, and its accepted value range, based on a careful review of the literature.
In simple terms, model fit measures how closely the covariance matrix implied by the model reproduces the observed covariance matrix. Though fitting a model to the data is not too complicated in AMOS, interpreting the results can be challenging at times for students.
The model fit output in AMOS comprises the following indexes/parameters:
- Chi-Square (CMIN)
- Goodness of Fit Index (GFI)
- Baseline Comparisons in Model Fit
- Parsimony-Adjusted Measures
- Non-Centrality Parameter (NCP)
- Index of Model Fit (FMIN)
- Root Mean Square Error of Approximation (RMSEA)
- Akaike Information Criterion (AIC)
- Expected Cross Validation Index (ECVI)
- Hoelter Index
Next, we will take each of the indexes above, provide a short description of each, and add a model fit example for interpretation. Acceptable value ranges are provided where applicable, as guidelines for when you write your research paper.
This article concludes with a table containing the most relevant model fit parameters, their ranges, and respective referencing.
Interpreting CMIN in Model Fit Results
CMIN stands for the Chi-square value (the minimum discrepancy) and tests whether the covariance matrix implied by the hypothesized model differs significantly from the observed covariance matrix. In other words, CMIN indicates whether the sample data and the hypothetical model are an acceptable fit.
In Amos, the CMIN result can be found under View → Text Output → Model Fit → CMIN and looks like the following table:
Model | NPAR | CMIN | DF | P | CMIN/DF |
Default model | 4 | 2.119 | 220 | .000 | 2.72 |
Saturated model | 19 | .000 | 0 | | |
Independence model | 24 | 3.765 | 253 | .000 | 5.35 |
Where:
NPAR = Number of Parameters for each model (default, saturated, and independence).
CMIN = Chi-square value. If significant, the model can be considered unsatisfactory.
DF = Degrees of Freedom, the number of independent values that are free to vary without violating the constraints imposed by the model.
P = the probability of getting a discrepancy as large as CMIN value if the respective model is correct.
CMIN/DF = discrepancy divided by degree of freedom.
The value of interest here is the CMIN/DF for the default model, interpreted as follows (see the sketch after the list):
- If the CMIN/DF value is ≤ 3 it indicates an acceptable fit (Kline, 1998).
- If the value is ≤ 5 it indicates a reasonable fit (Marsh & Hocevar, 1985)
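As a quick sanity check outside of AMOS, you can reproduce the relative chi-square and its p-value from the reported CMIN and DF. The following is a minimal Python sketch; the values are hypothetical, not taken from the table above:

```python
from scipy.stats import chi2

# Hypothetical values -- substitute the CMIN and DF that AMOS reports
# for your default model.
cmin, df = 598.4, 220

p_value = chi2.sf(cmin, df)  # probability of a discrepancy this large
cmin_df = cmin / df          # relative chi-square

print(f"CMIN/DF = {cmin_df:.2f}, p = {p_value:.3f}")
if cmin_df <= 3:
    print("Acceptable fit (Kline, 1998)")
elif cmin_df <= 5:
    print("Reasonable fit (Marsh & Hocevar, 1985)")
else:
    print("Poor fit by the CMIN/DF rule of thumb")
```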
Interpreting GFI in Model Fit Results
GFI stands for Goodness of Fit Index and expresses how much of the discrepancy in the observed data the model accounts for, based on the minimum discrepancy function under maximum likelihood estimation (Jöreskog & Sörbom, 1984; Tanaka & Huba, 1985).
In Amos, the GFI result can be found under View → Text Output → Model Fit → RMR, GFI and looks similar to the table below:
Model | RMR | GFI | AGFI | PGFI |
Default model | .188 | .781 | .649 | .684 |
Saturated model | .000 | 1.000 | | |
Independence model | .194 | .525 | .455 | .507 |
Where:
RMR = Root Mean Square Residual. The smaller the RMR value the better. An RMR of 0 represents a perfect fit.
GFI = Goodness of Fit Index and takes values of ≤ 1 where 1 represents a perfect fit.
AGFI = Adjusted Goodness of Fit Index, which adjusts the GFI for the degrees of freedom available for testing the model. A value of 1 indicates a perfect fit. Unlike GFI, AGFI is not bounded below by zero.
PGFI = Parsimony Goodness of Fit Index, a modification of GFI (Mulaik et al., 1989) that takes the model's degrees of freedom into account.
The value of interest here is the GFI for the default model, interpreted as follows (a computational sketch follows the list):
- A value of 1 represents a perfect fit.
- A value ≥ 0.9 indicates a reasonable fit (Hu & Bentler, 1998).
- A value of ≥ 0.95 is considered an excellent fit (Kline, 2005).
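For reference, the maximum likelihood version of the GFI can be computed directly from the sample covariance matrix S and the model-implied covariance matrix Σ̂. The function below is a minimal numpy sketch of the standard Jöreskog–Sörbom formula; the input matrices are whatever your own analysis produces:

```python
import numpy as np

def gfi_ml(S, Sigma):
    """ML Goodness of Fit Index (Joreskog & Sorbom, 1984):
    GFI = 1 - tr[(Sigma^-1 S - I)^2] / tr[(Sigma^-1 S)^2],
    where S is the sample covariance matrix and Sigma is the
    model-implied covariance matrix (both p x p arrays).
    """
    A = np.linalg.solve(Sigma, S)   # Sigma^{-1} S
    R = A - np.eye(S.shape[0])      # residual term
    return 1.0 - np.trace(R @ R) / np.trace(A @ A)
```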
Interpreting Baseline Comparisons in Model Fit Results
Baseline Comparisons refers to fit indices that compare the default model against the baseline models Amos fits automatically in every analysis, namely the saturated and independence models.
In Amos, the Baseline Comparisons results can be found under View → Text Output → Model Fit → Baseline Comparisons and look similar to the table below:
Model | NFI Delta1 | RFI rho1 | IFI Delta2 | TLI rho2 | CFI |
Default model | .957 | .890 | .966 | .900 | .965 |
Saturated model | 1.000 | | 1.000 | | 1.000 |
Independence model | .000 | .000 | .000 | .000 | .000 |
Where:
NFI = Normed Fit Index, also referred to as Delta1 (Bollen, 1989). It scales the model between the (terribly fitting) independence model and the (perfectly fitting) saturated model. A value of 1 indicates a perfect fit, while models with values < 0.9 can usually be improved substantially (Bentler & Bonett, 1980).
RFI = Relative Fit Index, derived from the NFI, where values close to 1 indicate a very good fit and 1 indicates a perfect fit.
IFI = Incremental Fit Index, where values close to 1 indicate a very good fit and 1 indicates a perfect fit.
TLI = Tucker-Lewis coefficient, also known as the Bentler-Bonett non-normed fit index (NNFI). It ranges from (but is not limited to) 0 to 1, where values close to 1 represent a very good fit and 1 represents a perfect fit.
CFI = Comparative Fit Index, truncated to the 0–1 range, where values close to 1 indicate a very good fit and 1 represents a perfect fit (Hu & Bentler, 1999).
The value of interest here is CFI for the default model. A CFI value of ≥ 0.95 is considered an excellent fit for the model (West et al., 2012).
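All five indices can be recovered from the chi-squares and degrees of freedom of the default and independence models. The sketch below uses the standard textbook formulas, which match the quantities AMOS reports; the function and variable names are my own:

```python
def baseline_comparisons(chi2_m, df_m, chi2_b, df_b):
    """Incremental fit indices computed from the default model (m)
    and the independence/baseline model (b)."""
    nfi = (chi2_b - chi2_m) / chi2_b
    rfi = 1 - (chi2_m / df_m) / (chi2_b / df_b)
    ifi = (chi2_b - chi2_m) / (chi2_b - df_m)
    tli = ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1)
    d_m = max(chi2_m - df_m, 0)              # noncentrality, default
    d_b = max(chi2_b - df_b, 0)              # noncentrality, baseline
    cfi = 1 - d_m / max(d_m, d_b) if max(d_m, d_b) > 0 else 1.0
    return {"NFI": nfi, "RFI": rfi, "IFI": ifi, "TLI": tli, "CFI": cfi}
```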
Interpreting Parsimony-Adjusted Measures in Model Fit Results
Parsimony-Adjusted Measures refers to relative fit indices obtained by applying a parsimony adjustment to several of the indices discussed so far.
Think of adjustments as penalties for less parsimonious models. In other words, the more complex the model the lower the fit index as generally a simpler explanation of a phenomenon is favored over a complex one.
In Amos, the Parsimony-Adjusted Measures results can be found under View → Text Output → Model Fit → Parsimony-Adjusted Measures and look something like this:
Model | PRATIO | PNFI | PCFI |
Default model | .970 | .984 | .991 |
Saturated model | .000 | .000 | .000 |
Independence model | 1.000 | .000 | .000 |
Where:
PRATIO = Parsimony Ratio, the ratio of the default model's degrees of freedom to those of the independence model; it is used to compute the PNFI and PCFI indices.
PNFI = Parsimony Normed Fit Index, the result of applying the parsimony adjustment (James, Mulaik & Brett, 1982) to the Normed Fit Index (NFI).
PCFI = Parsimony Comparative Fit Index, the result of applying the parsimony adjustment to the Comparative Fit Index (CFI), as sketched below.
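Assuming the standard definitions, which is also how the AMOS documentation describes these indices, the adjusted values follow directly from the degrees-of-freedom ratio:

```python
def parsimony_adjusted(nfi, cfi, df_m, df_b):
    """Parsimony-adjusted indices: each index is scaled by the ratio
    of the default model's df (df_m) to the independence model's
    df (df_b)."""
    pratio = df_m / df_b
    return {"PRATIO": pratio, "PNFI": pratio * nfi, "PCFI": pratio * cfi}
```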
Interpreting NCP in Model Fit Results
NCP stands for Non-Centrality Parameter and expresses the degree to which the null hypothesis of perfect fit is false.
In Amos, the NCP results can be found under View → Text Output → Model Fit → NCP and look similar to this:
Model | NCP | LO 90 | HI 90 |
Default model | 2.201 | 2.871 | 7.889 |
Saturated model | .000 | .000 | .000 |
Independence model | 1.765 | 3.860 | 5.986 |
Where:
NCP = the point estimate of the non-centrality parameter, typically computed as max(CMIN − DF, 0).
LO 90 = Lower boundary (NcpLo method) of a 90% confidence interval for the NCP.
HI 90 = Upper boundary (NcpHi method) of a 90% confidence interval for the NCP.
From the example table above, the population NCP for the default model is between 2.87 and 7.89 with a confidence level of approximately 90 percent.
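The bounds themselves come from inverting the non-central chi-square distribution in its non-centrality parameter. The sketch below shows the standard construction; it is not necessarily the exact algorithm AMOS uses:

```python
from scipy.optimize import brentq
from scipy.stats import ncx2

def ncp_with_interval(chi2_obs, df, level=0.90):
    """Point estimate and confidence bounds for the NCP, found by
    inverting the noncentral chi-square CDF."""
    tail = (1 - level) / 2           # 0.05 for a 90% interval
    point = max(chi2_obs - df, 0.0)

    def solve(target):
        # ncx2.cdf(x, df, lam) decreases as lam grows, so a positive
        # root exists only if the central (lam = 0) CDF exceeds target.
        if ncx2.cdf(chi2_obs, df, 0) <= target:
            return 0.0
        return brentq(lambda lam: ncx2.cdf(chi2_obs, df, lam) - target,
                      0.0, 100.0 * max(chi2_obs, df))

    return point, solve(1 - tail), solve(tail)  # NCP, LO 90, HI 90
```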
Interpreting FMIN in Model Fit Results
FMIN stands for the minimum value, F̂, of the discrepancy function (the "Index of Model Fit"): essentially the chi-square rescaled by sample size (for a single group, FMIN = CMIN / (N − 1)). It is useful to report alongside CMIN because a large sample inflates the chi-square itself.
In Amos, the FMIN results can be found under View → Text Output → Model Fit → FMIN and are displayed as in the following example:
Model | FMIN | F0 | LO 90 | HI 90 |
Default model | 1.590 | 1.019 | 1.371 | 1.683 |
Saturated model | .000 | .000 | .000 | .000 |
Independence model | 1.181 | 1.524 | 1.542 | 1.522 |
Where:
FMIN = the minimum value, F̂, of the discrepancy function. A value closer to 0 represents a better fit to the observed data, with 0 being a perfect fit.
F0 = the estimated population discrepancy, computed from the NCP (for a single group, F0 = NCP / (N − 1)); see the sketch after this list.
LO 90 = Lower boundary of a 90% confidence interval for F0.
HI 90 = Higher boundary of a 90% confidence interval for F0.
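Assuming the single-group formulas above, both quantities follow directly from the chi-square output; the values below are hypothetical:

```python
# Hypothetical chi-square output: CMIN, degrees of freedom, sample size N
cmin, df, n = 598.4, 220, 377

fmin = cmin / (n - 1)             # sample discrepancy, F-hat
f0 = max(cmin - df, 0) / (n - 1)  # estimated population discrepancy, F0
print(f"FMIN = {fmin:.3f}, F0 = {f0:.3f}")
```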
Interpreting RMSEA in Model Fit Results
RMSEA stands for Root Mean Square Error of Approximation and estimates the population discrepancy between the observed and model-implied covariance matrices per degree of freedom (Chen, 2007).
In Amos, the RMSEA results can be found under View → Text Output → Model Fit → RMSEA and are expressed as in the following table:
Model | RMSEA | LO 90 | HI 90 | PCLOSE |
Default model | .073 | .074 | .077 | .000 |
Independence model | .035 | .035 | .038 | .000 |
Where:
RMSEA = Root Mean Square Error of Approximation where values higher than 0.1 are considered poor, values between 0.08 and 0.1 are considered borderline, values ranging from 0.05 to 0.08 are considered acceptable, and values ≤ 0.05 are considered excellent (MacCallum et al, 1996).
LO 90 = Lower boundary (RmseaLo) of a 90% confidence interval of the RMSEA.
HI 90 = Higher boundary (RmseaHi) of a 90% confidence interval of the RMSEA.
PCLOSE = the p-value for testing the null hypothesis that the population RMSEA is no greater than 0.05, i.e., that the model has close fit.
The value of interest here is the RMSEA for the default model, where values ≤ 0.05 indicate an excellent model fit (MacCallum et al., 1996). The sketch below shows how the point estimate is obtained.
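Assuming the single-group case, RMSEA is the square root of the estimated population discrepancy per degree of freedom; the computation below uses hypothetical values:

```python
import math

def rmsea(chi2_obs, df, n):
    """Point estimate of RMSEA: sqrt(F0 / df), with
    F0 = max(chi2 - df, 0) / (n - 1) for a single group."""
    f0 = max(chi2_obs - df, 0) / (n - 1)
    return math.sqrt(f0 / df)

print(f"RMSEA = {rmsea(598.4, 220, 377):.3f}")  # hypothetical values
```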
Interpreting AIC in Model Fit Results
AIC stands for Akaike Information Criterion (Akaike, 1987) and is used to measure the quality of the statistical model for the data sample used. The AIC is a score represented by a single number and is used to determine which of several candidate models best fits the data set.
In Amos, the AIC results can be seen under View → Text Output → Model Fit → AIC as shown in the example below:
Model | AIC | BCC | BIC | CAIC |
Default model | 4.201 | 1.647 | 7.728 | 6.728 |
Saturated model | 2.000 | 8.698 | 7.811 | 7.101 |
Independence model | 0.765 | 3.823 | 6.749 | 4.749 |
Where:
AIC = Akaike Information Criterion score useful only when compared with other AIC scores of the same data set. The lower the AIC value the better.
BCC = Browne-Cudeck Criterion, developed specifically for the analysis of moment structures; it imposes a slightly greater penalty for model complexity than the AIC.
BIC = Bayes Information Criterion, which applies a greater penalty to complex models than AIC, BCC, and CAIC, and therefore has a greater propensity to select parsimonious models.
CAIC = Consistent Akaike Information Criterion (Atilgan & Bozdogan, 1987), reported only for a single group when means and intercepts are not explicit model parameters. CAIC penalizes complexity more than AIC and BCC but less severely than BIC. The sketch below shows how these criteria relate to the chi-square.
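Under the standard single-group definitions, which match how the AMOS documentation describes them, these criteria are simple functions of the chi-square, the number of free parameters q, and the sample size; BCC needs additional model detail, so it is omitted from this sketch:

```python
import math

def information_criteria(cmin, q, n):
    """AIC, BIC, and CAIC from the model chi-square (cmin), the number
    of free parameters (q), and the sample size (n)."""
    aic = cmin + 2 * q
    bic = cmin + q * math.log(n)
    caic = cmin + q * (math.log(n) + 1)
    return {"AIC": aic, "BIC": bic, "CAIC": caic}
```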
Interpreting ECVI in Model Fit Results
ECVI stands for Expected Cross Validation Index (Browne & Cudeck, 1993) and measures how well a model would be expected to cross-validate in another sample of the same size. Apart from a constant scale factor, it is a simple transformation of the chi-square, similar to the AIC.
In Amos, the ECVI results can be found under View → Text Output → Model Fit → ECVI and look like in the example below:
Model | ECVI | LO 90 | HI 90 | MECVI |
Default model | 2.881 | 1.233 | 3.545 | 2.900 |
Saturated model | .434 | .434 | .434 | .529 |
Independence model | 2.301 | 1.319 | 3.299 | 2.309 |
Where:
ECVI = Expected Cross Validation Index where a smaller value represents a better model fit.
LO 90 = lower limit of a 90% confidence interval for the population ECVI.
HI 90 = upper limit of a 90% confidence interval for the population ECVI.
MECVI = apart from a scale factor used in its computation, the MECVI is identical to the Browne-Cudeck Criterion (BCC). A sketch of the ECVI computation follows.
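Assuming the single-group formula, ECVI is simply the AIC rescaled by the sample size; the values below are hypothetical:

```python
# Hypothetical chi-square, number of free parameters, and sample size
cmin, q, n = 598.4, 24, 377

aic = cmin + 2 * q
ecvi = aic / (n - 1)  # ECVI = (CMIN + 2q) / (N - 1)
print(f"ECVI = {ecvi:.3f}")
```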
Interpreting HOELTER Index in Model Fit Results
The Hoelter index (also called Critical N) reports the largest sample size for which the model's chi-square would still be non-significant, that is, for which the model would be accepted.
In Amos, the Hoelter Index results can be found under View → Text Output → Model Fit → HOELTER with indices expressed as follows:
Model | HOELTER.05 | HOELTER.01 |
Default model | 228 | 201 |
Independence model | 241 | 208 |
Where:
HOELTER .05 = the largest sample size for which the default model would be accepted at the 0.05 level. Put differently, if your sample size exceeds the value reported for the default model at the 0.05 level, the default model should be rejected.
HOELTER .01 = the largest sample size for which the default model would be accepted at the 0.01 level. Likewise, if your sample size exceeds the value reported for the default model at the 0.01 level, you may reject the default model. A computational sketch follows.
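The critical N is commonly computed by scaling the critical chi-square value against the observed one. A minimal sketch, assuming the usual formula (rounding conventions may differ slightly from AMOS's output):

```python
import math
from scipy.stats import chi2

def hoelter_cn(chi2_obs, df, n, alpha=0.05):
    """Hoelter's critical N: the largest sample size at which the
    model chi-square would remain non-significant at level alpha."""
    crit = chi2.ppf(1 - alpha, df)  # critical chi-square value
    return math.floor(crit * (n - 1) / chi2_obs) + 1

print(hoelter_cn(598.4, 220, 377, alpha=0.05))  # hypothetical values
```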
Model Fit Cheat Sheet
This model fit cheat sheet summarizes the most important parameters and their accepted values according to the literature.
Acronym | Explanation | Accepted fit | Reference |
Likelihood ratio | P-value | ≥ 0.05 | Jöreskog & Sörbom (1996); |
Relative χ² | χ²/df | ≤ 2 = acceptable fit | Tabachnick & Fidell (2007); |
CMIN/DF | Chi-square divided by Degree of Freedom | ≤ 3 = acceptable fit ≤ 5 = reasonable fit | Kline (1998); Marsh & Hocevar (1985); |
GFI | Goodness of Fit Index | 1 = perfect fit ≥ 0.95 = excellent fit ≥ 0.9 = acceptable fit | Kline (2005); Hu & Bentler (1998); |
AGFI | Adjusted Goodness of Fit Index | ≥ 0.90 = acceptable fit | Tabachnick & Fidell (2007); |
CFI | Comparative Fit Index | 1 = perfect fit ≥ 0.95 = excellent fit ≥ .90 = acceptable fit | West et al. (2012); Fan et al. (1999); |
RMSEA | Root Mean Square Error of Approximation | ≤ 0.05 = excellent fit 0.05–0.08 = acceptable fit | MacCallum et al. (1996); |
RMR | Root Mean Square Residual | ≤ 0.05 = good fit ≤ 0.07 = acceptable fit | Diamantopoulos & Siguaw (2000); Steiger (2007); |
SRMR | Standardized Root Mean Squared Residual | ≤ 0.05 = acceptable fit | Diamantopoulos & Siguaw (2000); |
CN | Critical N | ≥ 200 = acceptable fit | Jöreskog & Sörbom (1996); |
References
Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52(3), 317–332.
Atilgan, T., & Bozdogan, H. (1987, June). Information-theoretic univariate density estimation under different basis functions. Paper presented at the First Conference of the International Federation of Classification Societies, Aachen, West Germany.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88(3), 588–606. https://doi.org/10.1037/0033-2909.88.3.588
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen and J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Newbury Park, CA: Sage.
Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling, 14(3), 464–504. https://doi.org/10.1080/10705510701301834
Diamantopoulos, A. & Siguaw, J. A., (2000). Introduction to LISREL: A guide for the uninitiated. London: SAGE Publications, Inc.
Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation methods, and model specification on structural equation modeling fit indexes. Structural Equation Modeling, 6(1), 56–83.
Hu, L.-t., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3(4), 424–453. https://doi.org/10.1037/1082-989X.3.4.424
Hu, L.-t., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. https://doi.org/10.1080/10705519909540118
James, L., Mulaik, S., & Brett, J. M. (1982). Causal Analysis: Assumptions, Models, and Data. Sage Publications.
Jöreskog, K. G., & Sörbom, D. (1996). LISREL 8: User's reference guide. Chicago: Scientific Software International.
Kline, R. B. (1998). Principles and practice of structural equation modeling. Guilford Press.
Kline, R. B. (2005). Principles and practice of structural equation modeling (2nd ed.). Guilford Press.
MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1(2), 130–149.
Marsh, H. W., & Hocevar, D. (1985). Application of confirmatory factor analysis to the study of self-concept: First- and higher-order factor models and their invariance across groups. Psychological Bulletin, 97(3), 562–582. https://doi.org/10.1037/0033-2909.97.3.562
Mulaik, S. A., James, L. R., Van Alstine, J., Bennett, N., Lind, S., & Stilwell, C. D. (1989). Evaluation of goodness-of-fit indices for structural equation models. Psychological Bulletin, 105(3), 430–445.
Steiger, J. H. (2007). Understanding the limitations of global fit assessment in structural equation modeling. Personality and Individual Differences, 42(5), 893–898. https://doi.org/10.1016/j.paid.2006.09.017
Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics (5th ed.). Boston: Allyn & Bacon/Pearson Education.
Tanaka, J. S., & Huba, G. J. (1985). A fit index for covariance structure models under arbitrary GLS estimation. British Journal of Mathematical and Statistical Psychology, 38(2), 197–201. https://doi.org/10.1111/j.2044-8317.1985.tb00834.x
West, S. G., Taylor, A. B., & Wu, W. (2012). Model fit and model selection in structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 209–231). New York: Guilford Press.