Cluster Analysis of Personality Types Using Respondents’ Big Five Personality Traits

Jennifer Chi (1), Yeong Nain Chi (2)
(1) School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX 75080, United States
(2) Department of Agriculture, Food, and Resource Sciences, University of Maryland Eastern Shore, Princess Anne, MD 21853, United States

This study combined three techniques: k-means clustering for data examination, discriminant analysis for classification, and a multilayer perceptron neural network for prediction. The data were collected in 2012 through an interactive online personality test based on the Big Five Personality Traits; after inadequate samples and outliers were removed, 19,692 observations remained. The k-means clustering analysis identified four distinct personality clusters from the total scores of the Big Five Personality Traits (Extraversion, Neuroticism, Agreeableness, Conscientiousness, and Openness to Experience). Discriminant analysis was then used to test the accuracy of the clustering; it indicated significant differences among the cluster means and correctly classified 95.5% of the original grouped cases. For predictive modeling, a multilayer perceptron neural network with a 5-6-4 structure was used to determine the personality classification of participants. The model correctly classified 99.4% of the training grouped cases and 99.2% of the testing grouped cases. These results offer insight into the personalities of participants, with implications for domains such as psychology, social sciences, cultural studies, and economics.
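As a rough illustration of the clustering step described above, the following is a minimal sketch of k-means (Lloyd's algorithm) with k = 4 applied to five trait totals per respondent. It is not the authors' implementation or software: the function names, the synthetic group centres, and the noise level are all hypothetical, and the data are randomly generated rather than taken from the study. The small helper at the end counts weights and biases for a fully connected network, assuming "5-6-4" denotes five inputs, six hidden units, and four outputs.

```python
import random

TRAITS = ["Extraversion", "Neuroticism", "Agreeableness",
          "Conscientiousness", "Openness to Experience"]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen observations.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster empties
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids, labels


def make_synthetic_scores(n_per_group=50, seed=1):
    """Hypothetical trait totals drawn around four made-up group centres."""
    rng = random.Random(seed)
    centres = [
        [15, 35, 25, 25, 30],
        [35, 15, 35, 35, 35],
        [25, 25, 15, 20, 40],
        [30, 30, 40, 40, 15],
    ]
    return [[rng.gauss(m, 2.0) for m in centre]
            for centre in centres for _ in range(n_per_group)]


def mlp_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected feedforward network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


if __name__ == "__main__":
    scores = make_synthetic_scores()
    centroids, labels = kmeans(scores, k=4)
    for j, centroid in enumerate(centroids):
        size = labels.count(j)
        profile = ", ".join(f"{t}: {v:.1f}" for t, v in zip(TRAITS, centroid))
        print(f"cluster {j} (n={size}): {profile}")
    print("parameters in a 5-6-4 network:", mlp_parameter_count([5, 6, 4]))
```

In practice, a statistics package or library would normally replace the hand-written loop. The sketch is only meant to show how each respondent's five totals are assigned to the nearest of four centroids, and how the centroids are then recomputed as cluster means.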

Article Details

How to Cite
J. Chi and Y. N. Chi, “Cluster Analysis of Personality Types Using Respondents’ Big Five Personality Traits”, Int. J. Data. Science., vol. 4, no. 2, pp. 116-135, Dec. 2023.

