Stroke Analysis and Prediction Using PySpark, Support Vector Machine and Random Forest Regression


Aid Semic
Sulejman Karamehic


Stroke is a medical condition in which blood vessels in the brain are blocked or rupture, causing brain damage. Symptoms appear when the supply of blood and nutrients to the brain is disrupted. According to the World Health Organization (WHO), stroke is a leading cause of death and disability worldwide. Early recognition of stroke warning signs can help lessen the severity of a stroke. Many machine learning (ML) models have been developed to predict the likelihood of a stroke. This research applies machine learning algorithms, namely Support Vector Machine and Random Forest Regression, to a range of physiological parameters, using PySpark and extensive Exploratory Data Analysis. With these methods we obtained very high accuracy scores, which are described below.
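The two model families named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn rather than PySpark, uses a synthetic dataset in place of the stroke data, and the feature count, the 0.5 threshold and all variable names are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a stroke dataset (assumption: 8 numeric features,
# binary label where 1 = stroke). The real data is not reproduced here.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Support Vector Machine: scale features first, since SVMs are scale-sensitive.
svm_model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
svm_model.fit(X_train, y_train)
svm_accuracy = accuracy_score(y_test, svm_model.predict(X_test))

# Random Forest Regression: with 0/1 targets it predicts a score in [0, 1];
# thresholding at 0.5 (an assumed cut-off) turns the score into a class label.
rf_model = RandomForestRegressor(n_estimators=100, random_state=0)
rf_model.fit(X_train, y_train)
rf_labels = (rf_model.predict(X_test) >= 0.5).astype(int)
rf_accuracy = accuracy_score(y_test, rf_labels)

print(f"SVM accuracy: {svm_accuracy:.3f}")
print(f"Random forest accuracy: {rf_accuracy:.3f}")
```

Both models are scored with the same accuracy metric on the same held-out split, so the two numbers are directly comparable. The values printed for synthetic data say nothing about the results reported in the paper.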

Article Details

How to Cite
A. Semic and S. Karamehic, “Stroke Analysis and Prediction Using PySpark, Support Vector Machine and Random Forest Regression”, Int. J. Data. Science., vol. 3, no. 2, pp. 62-70, Sep. 2022.

