Chronic Diseases System Based on Machine Learning Techniques



Abstract
This paper aims to improve the quality of life of patients with chronic diseases and to support them in following the lifestyle they need. To this end, we propose a mobile application that analyzes patient data for diabetes, blood pressure, and kidney disease, together with a system that diagnoses chronic diseases using machine learning techniques such as classification. Our idea is to recommend a lifestyle for the patient and to involve the doctor, who can write notes. Machine learning classifiers were used to predict whether a person is prone to some chronic diseases; blood pressure, diabetes, and kidney disease are considered in this work. For hypertension, the Tree algorithm showed the best accuracy, 100%. Chronic Kidney Disease (CKD) is a significant public health concern with rising prevalence. Attributes such as specific gravity, albumin, serum creatinine, hemoglobin, packed cell volume, and hypertension were used to predict whether a person has kidney disease. For kidney disease, the Random Forest algorithm showed 100% accuracy, the best among the algorithms tested. For diabetes, we considered pregnancies, glucose, blood pressure, skin thickness, insulin, diabetes pedigree function, age, and BMI to diagnose whether a patient has diabetes based on specific diagnostic measurements. For diabetes, neural networks showed the best accuracy, 76.3%.

Introduction
People with chronic diseases are at increased risk of poor health outcomes and health disparities. These diseases need daily control at home and follow-up at the hospital, yet health care monitoring for them is severely lacking. Handling patient data and records and treating chronic diseases is hard and time-consuming. Technology is not being employed to help doctors make faster and more accurate diagnoses or to recognize patients who may benefit from new types of treatment. Consider the recording of health measurements: patients with chronic diseases usually need to note daily measurements on paper, and they sometimes forget to measure or to record them. As a result, the doctor cannot fully understand the case because measurements are missing. We need a better way to store the measurements for a longer time and in a safe place. Moreover, it is difficult for the patient to decide on a diet and exercises that suit their measurements and health status. Given these problems, society needs to provide communication for people who have chronic diseases, as progress toward the goal of higher-quality health care and improved population health outcomes.
The proposed system intends to streamline the monitoring of the patient's glucose level, blood pressure, and kidney function by both the patient and their doctor(s). A mobile application will be developed for reading, storing, and sharing blood pressure, glucose level, and kidney data. The solution will offer continuous and remote monitoring of the patient's health. Patient data can be handled more efficiently by using machine learning and classification algorithms to help doctors in diagnosis and patients in following their medical status, with the aim of faster and more accurate diagnosis. We plan to replace paper recording by saving measurements in the application's database, so that they are kept for a longer time and in a safe place. Furthermore, we will notify patients when they forget to record measurements, so that the doctor can fully understand the case because all measurements are stored.
Because deciding on a daily lifestyle is complex, patients tend to neglect this important factor that affects their health status. The application will therefore allow patients to track their exercise and diet according to their health conditions. Patients will receive exercise recommendations that are consistent with their status and do not tire or hurt them. The system belongs to the healthcare domain: we focus on serving patients of all ages by organizing their measurements for some chronic diseases, and we try to catch problems early by warning them when there is a noticeable change in their measurements, as well as by providing an appropriate diet and exercise plan.
Through the application, patients can follow their medical condition and see any developments. The doctor can view the patient's record and medical status, write medical notes, and answer any questions or concerns sent by the patient. We will use data mining and machine learning algorithms to build the model.
The main advantage of our system is that it combines three major chronic diseases, while most systems focus on only one. We plan to implement three machine learning models for each disease dataset for classification. Candidate algorithms include the Decision Tree, Support Vector Machine, Artificial Neural Network, k-nearest neighbor (k-NN), and Naïve Bayes classifiers. Three algorithms will be applied to each disease's model to see which performs best. The datasets will mostly be kept separate for each disease. Accuracy is reported in the Experimental results and analysis section.
The paper is organized as follows. Section 2 surveys related work. Section 3 gives background information. Section 4 discusses the proposed machine learning methodology. Section 5 covers requirements analysis, and Section 6 the system architecture. Section 7 presents the results and analysis. Section 8 presents the conclusion and future work. Finally, Section 9 lists the references.

Hypertension related work survey
Hudson Fernandes Golino and others wrote a paper on predicting increased blood pressure using a machine learning algorithm, the classification tree. The main features for prediction were obesity-related measures, specifically body mass index (BMI), waist circumference (WC), hip circumference (HC), and waist-hip ratio (WHR). The dataset was first split into two subsets, one for each sex: 225 records for women and 153 records for men. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. For women, the overall sensitivity and specificity of the best model (tree no. 15) were 72% and 86.25%. For the men's testing sample, sensitivity decreased to 52.38% and specificity decreased to 69.70% (Golino et al., 2014).
Zhang B and others predicted blood pressure from physiological index data. They used a support vector machine regression (SVR) algorithm to close the key gap between the need for continuous measurement for prophylaxis and the lack of an effective method for continuous measurement. They used a real-time dataset with eighteen participants (12 males, 6 females). The main features include PTT, HR, PPG, I, II, III, aVF, aVR, aVL, and SpO2. The results of the algorithm were compared with those obtained from two classical machine

Diabetes related work survey
Benamina et al. presented a study on case-based reasoning (CBR) in the clinical healthcare field, where an expert's knowledge comes from intense experience gained through practice and education. CBR has emerged as a major research area in the search for a problem-solving and decision-support paradigm, and case retrieval is its important step. They tested fuzzy logic and data mining to improve the response time and the accuracy of retrieving similar cases. The goal of fuzzy logic was to reduce the complexity of computing the similarity between diabetic patients who require different monitoring plans. In their comparison, the Fispro fuzzy decision tree reached an accuracy of 81%, the Weka decision tree 73%, and the JColibri k-NN 66%. The results indicate that the proposed fuzzy decision tree helps to improve the accuracy of classifying diabetes mellitus patients and of the retrieval step of CBR, as well as the monitoring plan corresponding to the result, with the classification option.
Diabetes has increased at a high rate around the world due to unhealthy lifestyles. Habibzadeh et al. studied two randomly chosen groups of patients with type 2 diabetes: the first group learned to take care of themselves, and the second group kept their usual lifestyle. Statistical analysis using an independent t-test showed that self-organization (t = 11.24, p < 0.001), self-adjustment (t = 7.53, p < 0.001), interaction with health experts (t = 7.31, p < 0.001), blood sugar self-monitoring (t = 6.42, p < 0.001), adherence to the proposed diet (t = 5.22, p < 0.001), and total self-management (t = 10.82, p < 0.001) increased in the first group because they had learned how to take care of themselves. It is therefore vital to understand and encourage patient self-management. They recommended promoting self-management in society and among patients with type 2 diabetes (Habibzadeh et al., 2017).
Tamilvanan and Bhaskaran studied data mining, the process of discovering patterns in large data sets. They used classification techniques to find the group to which each data instance in a given dataset belongs. Classification was applied to a medical dataset for diabetes to determine which of the Naive Bayes, Random Forest, and NB-Tree algorithms is more efficient. The work of Jayalakshmi, T. and Santhakumaran, A. aimed at using artificial neural networks (ANNs) for the prediction of diabetes. The researchers identified challenges with the dataset, including missing values, which ANNs have difficulty interpreting. The paper claimed that, when the K-Nearest Neighbor (KNN) algorithm was used to replace missing values, the accuracy of diabetes prediction reached 99%. The work of S. Mustafa, K., Watan, I., G. and Enteesha, D., M. was unique in its effort to address noise in the dataset. They used K-Means clustering to split the data into several groups and manually eliminated the minority class in some of these groups. A decision tree was trained and tested on the remaining data points and reached 98% accuracy in predicting diabetes.

Kidney related work survey
Motivated by the 2 billion riyals assigned for renal replacement therapy in Saudi Arabia, Alassaf R. and others (2018) applied four machine learning techniques, ANN, SVM, NB, and k-NN, to develop a model that seeks to reduce the number of patients and the cost of therapy by diagnosing chronic kidney disease accurately. The dataset was collected from King Fahd University Hospital (KFUH) in Khobar. In their experiments, ANN, SVM, and Naïve Bayes achieved a testing accuracy of 98.0%, while k-NN achieved 93.9%. The following are some earlier works that use machine learning algorithms to diagnose CKD. They used the same dataset from the UCI Machine Learning Repository with different machine learning algorithms.
Alimran A. and others (2018) performed a comparative analysis of three modern classifiers, namely logistic regression, feed-forward neural networks, and wide & deep learning, to diagnose CKD. The dataset was obtained from the UCI Machine Learning Repository. F1-score, precision, recall, and AUC score were used to measure the performance of logistic regression, and an additional loss score was considered for the feed-forward neural network and the wide & deep model. The feed-forward neural network was the best-performing CKD diagnostic method, with 0.99 F1-score, 0.97 accuracy, 0.99 recall, and 0.99 AUC score. Logistic regression produced the lowest results of the three, and wide & deep learning with a larger number of hidden layers and neurons was found to be efficient for bigger datasets.
Using the same dataset, Charleonnan A. and others (2016) developed a system to predict chronic kidney disease using machine learning predictive models, including K-nearest neighbors (KNN), support vector machine (SVM), logistic regression (LR), and decision tree classifiers. The authors compared the performance of the four classifiers. Five-time averages of sensitivity and specificity showed that SVM's sensitivity, at 0.99, was slightly higher than that of the other methods: 0.96 for KNN, 0.94 for logistic regression, and 0.93 for the decision tree.
Using the UCI Repository, Salekin and Stankovic (2016) developed an automated machine learning solution to detect CKD and explored 24 parameters related to kidney disease. The dataset used for evaluation suffers from noisy and missing data. They evaluated solutions with three different classifiers: k-NN, RF, and neural networks. To reduce over-fitting and identify the most significant predictive characteristics for chronic kidney disease, they used two methods of feature reduction: the wrapper method and LASSO regularization. Also, through a cost analysis considering all 24 attributes, they identified a cost-effective, highly accurate detection classifier using only 5 attributes: specific gravity, albumin, diabetes mellitus, hypertension, and hemoglobin. With this approach, they achieved a detection accuracy of 0.993 using F-measure.
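The surveyed studies report sensitivity, specificity, precision, accuracy, and F-measure. As a minimal sketch of how these metrics relate to the counts of a binary confusion matrix, the following Python function computes them; the function name and the example counts are hypothetical and are not taken from any of the cited papers.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives. Assumes no denominator is zero.
    """
    sensitivity = tp / (tp + fn)   # also called recall or true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for medical datasets, where the two classes are often imbalanced and a missed diagnosis has a different cost from a false alarm.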

Hypertension Background Information
Hypertension is the term used to describe high blood pressure. Blood flow is driven by the heartbeat, with which the heart pumps blood. The pressure does not always stay the same; it varies from moment to moment depending on activity. Hypertension is an abnormally high pressure in the main arteries that persists for a long time (Cunha, 2011).

Hypertension Types
There are two main categories of hypertension: essential (primary) and secondary.
Essential hypertension: Several factors are associated with it, including salt sensitivity, kidney chemical imbalance, insulin resistance, family history, and age. Essential hypertension is usually seen in combination with type 2 diabetes, high cholesterol, and central obesity.
Secondary hypertension: In this case, high blood pressure is caused by another illness, such as kidney disease or certain cancers (especially adrenal gland cancer). Most people with secondary hypertension are likely to have an endocrine or kidney defect that, when corrected, could bring blood pressure back to normal levels. Secondary hypertension can also be caused by certain medications, especially NSAIDs (Motrin/ibuprofen) and steroids (Milechman et al., 2014).

Hypertension Representation:
Blood pressure is written as Systolic Blood Pressure (SBP) over Diastolic Blood Pressure (DBP), for example 120/80 mmHg; the slash separates the two readings rather than denoting division.
BP = SBP / DBP (1)
Systolic blood pressure is the pressure in the arteries as the heart contracts and pumps blood forward into the arteries, while diastolic blood pressure is the pressure in the arteries while the heart relaxes between beats (Zareian, 2004; Cunha, 2011).

Impact factors:
As noted above, blood pressure can rise for many reasons. Its occurrence is correlated with many underlying factors, including age, heavy consumption of salt, lack of exercise, and genetic factors (Cunha et al., 2011).

Impact of age:
In humans, ageing is a continuous and cumulative process that results in decreased physiological function across all body systems (Franceschi et al., 2008). Age affects the heart's performance in pumping blood. As a person ages, fat is deposited in the pathways of the heart's pacemaker system, which affects the heart's performance while pumping blood. With age, the arteries also lose elasticity and become stiff. To pump blood throughout the body through stiff arteries, the heart has to push with more force, which may in turn increase blood pressure (Nimmala et al., 2018).

Impact of Obesity:
Excessive accumulation of body fat and weight is a significant public health concern. The Body Mass Index (BMI) is determined from weight and height. Additional fat tissue in the body needs oxygen and nutrients to live. This increases the workload of the heart, because it must pump more blood through additional blood vessels. More circulating blood also means more pressure on the artery walls (Nimmala et al., 2018).
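As a minimal sketch of the obesity measures mentioned in this paper, the following Python functions compute BMI from weight and height using the standard definition (weight in kilograms divided by the square of height in meters) and the waist-hip ratio (WHR) from the two circumferences. The function names and the example values are hypothetical.

```python
def body_mass_index(weight_kg, height_m):
    """BMI = weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def waist_hip_ratio(waist_cm, hip_cm):
    """WHR = waist circumference divided by hip circumference (same unit)."""
    return waist_cm / hip_cm


# Hypothetical measurements for illustration only.
bmi = body_mass_index(weight_kg=70.0, height_m=1.75)
whr = waist_hip_ratio(waist_cm=80.0, hip_cm=100.0)
print(f"BMI: {bmi:.1f}, WHR: {whr:.2f}")
```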

Types of Diabetes
Type 1 diabetes: An autoimmune reaction in which the body's defense system attacks the cells that produce insulin. As a result, the body produces very little or no insulin. The exact causes are said to be linked to a combination of genetic and environmental conditions. Type 1 diabetes can develop in children and adolescents but may occur at any age. People with type 1 diabetes need daily insulin injections to keep their blood glucose levels under control; without access to insulin, they will die. Having a family member with type 1 diabetes slightly increases the risk of developing the disease, and environmental factors and exposure to certain viral infections are also risk factors. Research into risk factors for type 1 diabetes remains limited (WebMD, 2019).

Type 2 diabetes:
Type 2 diabetes is the most common type among adults and represents about 90% of all cases of diabetes. It is characterized by insulin not working properly: because the body does not respond fully to insulin (insulin resistance), blood sugar levels continue to rise and more insulin is released. For some people with type 2 diabetes, this may eventually exhaust the pancreas, leading to less and less insulin production and to high blood sugar levels (hyperglycemia). In type 2 diabetes, the body does not benefit from the insulin it produces. Due to high levels of obesity, physical inactivity, and poor nutrition, it is increasing in children, adolescents, and younger adults. Type 2 diabetes may be diagnosed in older adults (WebMD, 2019).

Patients of diabetes lifestyle
Twenty years of medical research, including a substantial long-term study, have shown that a healthy lifestyle can prevent type 2 diabetes from occurring in the first place and even reverse its progress. Patients can manage and monitor their diabetes by focusing on lifestyle changes: managing stress, eating healthily, and being physically active. These things affect the quality of life with diabetes. Patients must be active, walking and doing some exercise (Chen et al., 2015). They can follow a better eating program, such as a Mediterranean, vegetarian, or lower-carbohydrate diet (Coughlin, 2017). Stress reduction, for example through yoga postures, breathing exercises (pranayama), and meditation, is beneficial for improving health (Yadav et al., 2015). Eating healthily is essential for people with diabetes because of the effect of food on blood sugar (WebMD, 2019). Patients should focus on eating as much as the body needs, relying on specific tactics such as keeping a food record, and can use a smartphone to help regulate their diet (Monique Tello, 2019).

Chronic kidney disease background
Chronic kidney disease (CKD) is abnormal kidney function and structure. It is common, frequently unrecognized, and often exists together with other conditions. It occurs when the kidneys are damaged and cannot filter the blood properly. Chronic kidney disease, also called chronic kidney failure, describes the gradual loss of kidney function. The kidneys perform many vital functions, such as filtering wastes from the blood, managing fluids in the blood, and controlling blood pressure (Noia T.D et al., 2013). There are five stages of CKD. The most serious is stage 5 because, at this stage, the kidneys are unable to perform most of their functions. The stages are determined from the patient's Glomerular Filtration Rate (GFR).

Glomerular Filtration Rate (GFR)
Glomerular filtration rate (GFR) is the number used to figure out a person's stage of kidney disease. A math formula using the person's age, race, gender and their serum creatinine can be used to calculate a GFR. A doctor will order a blood test to measure the serum creatinine level. Creatinine is a waste product that comes from muscle activity. When kidneys are working well, they remove creatinine from the blood. As kidney function slows, blood levels of creatinine rise.
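The text does not say which formula is used to calculate the GFR. As an illustrative sketch only, the following Python code assumes the four-variable MDRD study equation (IDMS-traceable form), which uses the same inputs named above, and maps the estimate to the five stages using the commonly cited GFR thresholds. The function names are hypothetical, the choice of equation is an assumption, and the code is not intended for clinical use.

```python
def egfr_mdrd(serum_creatinine_mg_dl, age_years, female, black):
    """Estimate GFR (mL/min/1.73 m^2) with the 4-variable MDRD equation.

    Assumption: the paper does not name its formula; MDRD is used here
    only because it takes age, race, gender, and serum creatinine.
    """
    egfr = 175.0 * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def ckd_stage(egfr):
    """Map an eGFR value to a stage number from 1 to 5.

    Stages 1 and 2 additionally require other evidence of kidney damage,
    which this sketch does not model.
    """
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


# Hypothetical patient for illustration only.
estimate = egfr_mdrd(serum_creatinine_mg_dl=1.0, age_years=50, female=False, black=False)
print(f"eGFR: {estimate:.1f}, stage: {ckd_stage(estimate)}")
```

Because the estimate falls as serum creatinine rises, the sketch reproduces the relationship described above: slower kidney function, higher creatinine, lower GFR.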

Impact factors:
The most common causes of kidney disease are diabetes and high blood pressure.
• Impact of heart disease and stroke: Having kidney disease increases the chances of also having heart disease and stroke.
• Impact of diabetes: CKD attributable to diabetes, referred to as diabetic kidney disease, is characterized by reduced kidney function or the presence of kidney damage for at least three months, regardless of kidney function (Chronic Kidney Disease Basics | Chronic Kidney Disease Initiative | CDC, n.d.).
• Impact of high blood pressure: High blood pressure is a leading cause of CKD. Over time, high blood pressure can damage blood vessels throughout the body. This can reduce the blood supply to important organs like the kidneys. High blood pressure also damages the tiny filtering units in the kidneys. As a result, the kidneys may stop removing wastes and extra fluid from the blood (National Kidney Foundation, n.d.).
• Impact of obesity: Obesity results in complex metabolic abnormalities that have wide-ranging effects on diseases affecting the kidneys. Some of the harmful renal consequences of obesity may be mediated by downstream comorbid conditions such as diabetes mellitus or hypertension. Still, there are also effects of adiposity that could impact the kidneys directly. Being obese doubles the risk of developing CKD compared with someone who has a healthy body weight, while being overweight increases the risk by 1.5 times (Cva K., 2017).

Proposed Methodology
In this paper, we used a data-mining classification technique. Data mining is the extraction of knowledge from large amounts of data. Classification is the process of finding a model that describes and distinguishes data classes or concepts, so that the model can be used to predict the label of a data record or to provide a descriptive analysis of data records for making effective decisions. Classification consists of two stages. In stage 1, the training stage, the model is trained on a set of records whose class labels are already known. In stage 2, the testing stage, the model predicts the class labels of a collection of records whose class labels are unknown to it, also called test records. There are many classifiers; for the experimental analysis, we used classifiers supported by the Orange3 tool, which supports various machine-learning (ML) algorithms. Because we compared experimental results for Naïve Bayes, k-NN, tree, SVM, random forest, neural network, and other classifiers, the rest of this section discusses these ML algorithms. The experimental analysis was done on three different datasets. We used 70% of the records to train the model and 30% to test it.
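The two-stage procedure with a 70/30 split can be sketched in plain Python as follows. This is not the Orange3 workflow used in the experiments; the helper names, the fixed random seed, the toy records, and the majority-class baseline standing in for a real classifier are all assumptions made for illustration.

```python
import random
from collections import Counter


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, test_records):
    """Fraction of test records whose predicted label matches the known label."""
    correct = sum(1 for features, label in test_records if predict(features) == label)
    return correct / len(test_records)


# Hypothetical toy data: (features, label) pairs, not one of the paper's datasets.
records = [((value,), int(value >= 50)) for value in range(100)]
train, test = train_test_split(records)

# Stage 1 (training): a trivial baseline that memorizes the majority class.
majority_label = Counter(label for _, label in train).most_common(1)[0][0]

# Stage 2 (testing): score the baseline on the held-out 30% of the records.
baseline_accuracy = accuracy(lambda features: majority_label, test)
print(len(train), len(test), baseline_accuracy)
```

Keeping the test records out of training is what makes the reported accuracy an estimate of performance on unseen patients; a majority-class baseline like this one is also a useful reference point when judging a classifier's accuracy.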
Data Preprocessing: Real-world datasets are highly susceptible to noisy, missing, or inconsistent data because of their usually vast size (often several gigabytes or more) and their likely origin in multiple heterogeneous sources. Low-quality data results in low-quality mining outcomes. There are a variety of preprocessing techniques. Data cleaning may be used to eliminate noise and fix data inconsistencies. Data integration merges data from different sources into a cohesive data store, such as a data warehouse. Data transformations, such as standardization, can be applied. Data reduction can reduce data size by aggregating, deleting redundant features, or clustering. These methods are not mutually exclusive; they may work together (Han et al., 2012). We perform the needed cleaning before applying algorithms to the datasets.
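Two of the techniques named above, data cleaning and standardization, can be sketched in plain Python. The paper does not specify which cleaning steps were applied, so mean imputation of missing values and z-score standardization are assumptions chosen for illustration, as are the function names and the sample column.

```python
from statistics import mean, pstdev


def impute_with_mean(column):
    """Data cleaning: replace missing values (None) with the mean of observed values."""
    observed = [value for value in column if value is not None]
    fill = mean(observed)
    return [fill if value is None else value for value in column]


def standardize(column):
    """Data transformation: rescale to zero mean and unit (population) standard deviation.

    Assumes the column is not constant, otherwise the deviation is zero.
    """
    mu = mean(column)
    sigma = pstdev(column)
    return [(value - mu) / sigma for value in column]


# Hypothetical glucose readings with one missing entry.
glucose = [90.0, None, 110.0, 100.0]
cleaned = impute_with_mean(glucose)
scaled = standardize(cleaned)
print(cleaned)
print([round(value, 3) for value in scaled])
```

Standardization matters most for distance-based and margin-based classifiers such as k-NN and SVM, where an attribute with a large numeric range would otherwise dominate the others.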

Naïve Bayes classifier
Bayesian classifiers are statistical classifiers. They can estimate the probability of class membership, such as the probability that a given tuple belongs to a specific class (Han et al., 2012).
Naive Bayes is among the simplest probabilistic classifiers and is based on Bayes' theorem. It constructs a classification model by learning the conditional probabilities of each input attribute (Nimmala et al., 2018). The same model is used to predict the class membership of an input instance using the following equation:
$$P(y \mid x) = \frac{P(x \mid y)\, P(y)}{P(x)}$$
where P(x|y) is defined as the probability of observing x, given that y occurs. P(y|x) is called the posterior probability, and P(x) and P(y) are called prior probabilities.
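As a minimal sketch of the idea, the following Python code implements a Gaussian Naive Bayes classifier for numeric attributes. It picks the class y that maximizes P(x|y)P(y), which is equivalent to maximizing the posterior because P(x) is the same for every class. The Gaussian likelihood, the small variance floor, the function names, and the toy (glucose, BMI) data are assumptions; this is not the Orange3 implementation used in the experiments.

```python
import math
from collections import defaultdict
from statistics import mean, pvariance


def fit_gaussian_nb(rows, labels):
    """Learn the class prior and per-attribute mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        prior = len(members) / len(rows)
        # A tiny constant keeps the variance non-zero for constant attributes.
        stats = [(mean(col), pvariance(col) + 1e-9) for col in zip(*members)]
        model[label] = (prior, stats)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the largest log P(y) + sum of log P(x_i | y)."""
    def log_score(label):
        prior, stats = model[label]
        score = math.log(prior)
        for value, (mu, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        return score
    return max(model, key=log_score)


# Hypothetical (glucose, BMI) rows; label 1 means "has the disease".
rows = [(85.0, 22.0), (90.0, 24.0), (95.0, 23.0), (150.0, 33.0), (160.0, 35.0), (170.0, 34.0)]
labels = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(rows, labels)
print(predict_gaussian_nb(model, (155.0, 34.0)))
print(predict_gaussian_nb(model, (88.0, 23.0)))
```

Working with log probabilities avoids numerical underflow when many small per-attribute likelihoods are multiplied together.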

K-NN classifier
K-nearest neighbor (k-NN) is a data mining technique. It classifies an unknown sample based on the known classification of its neighbors. Suppose a set of samples with known classification is available. Each sample should be classified similarly to its surrounding samples, so if the classification of a sample is unknown, it can be predicted from the classification of its nearest neighbors (Mascherano et al., 2009). k-NN has been widely used in pattern recognition. Nearest-neighbor classifiers are based on learning by analogy, that is, by comparing a given test tuple with training tuples that are similar to it. The training tuples are described by n attributes, so each tuple represents a point in an n-dimensional space. In this way, all of the training tuples are stored in an n-dimensional pattern space. When given an unknown tuple, a k-nearest-neighbor classifier searches the pattern space for the k training tuples that are closest to the unknown tuple. These k training tuples are the k "nearest neighbors" of the unknown tuple (Pearson, 2019).
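The search for the k closest training tuples followed by a majority vote can be sketched in a few lines of Python. Euclidean distance, k = 3, the function name, and the toy two-attribute data are assumptions for illustration; the text does not fix a distance measure or a value of k.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training tuples."""
    # Pair every training tuple's Euclidean distance to the query with its label.
    distances = sorted(
        ((math.dist(row, query), label) for row, label in zip(train_rows, train_labels)),
        key=lambda pair: pair[0],
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training tuples in a 2-dimensional pattern space.
train_rows = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
train_labels = ["negative", "negative", "negative", "positive", "positive", "positive"]

print(knn_predict(train_rows, train_labels, (1.2, 1.4)))
print(knn_predict(train_rows, train_labels, (6.4, 6.9)))
```

An odd k avoids tied votes in two-class problems, and attributes are usually standardized first so that no single attribute dominates the distance.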

Support Vector Machine classifier
The support vector machine is a supervised learning algorithm that can be used for both classification and regression problems. In this algorithm, each data item is plotted as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate. Classification is then performed by finding the hyperplane that best separates the classes, maximizing the separation between the data points and the hyperplane. The points closest to the hyperplane are called the support vectors, and the distance of these vectors from the hyperplane is called the margin (Yadav A, 2018).
The main idea of SVM is to find the optimal hyperplane between the data of two classes in the training data [2]. SVM finds the hyperplane by solving an optimization problem in which 0 ≤ a_i ≤ C for i = 1, 2, …, n. The problem is how to construct a decision boundary that correctly classifies an input pattern that is not necessarily in the training set (Teng S. et al., 2010).
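The optimization problem itself did not survive in the text. The box constraint on the multipliers a_i is consistent with the standard soft-margin SVM dual, which is given below as a hedged reconstruction rather than as the paper's exact formula. The symbols x_i (training tuple), y_i ∈ {−1, +1} (class label), K (kernel function), and b (bias) are assumptions introduced here.

```latex
\max_{a}\; \sum_{i=1}^{n} a_i
  \;-\; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \, y_i y_j \, K(x_i, x_j)
\qquad \text{subject to} \qquad
\sum_{i=1}^{n} a_i y_i = 0, \quad 0 \le a_i \le C, \quad i = 1, 2, \dots, n

f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{n} a_i \, y_i \, K(x_i, x) + b \right)
```

Under this formulation, only the training tuples with a_i > 0 (the support vectors) contribute to the decision function f(x), and C controls the trade-off between a wide margin and misclassified training points.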

Artificial neural network algorithm
An artificial neural network (ANN) is a computational system in which information is processed collectively and in parallel throughout a network of nodes (neurons). The individual elements of the network, the neurons, read an input, process it, and generate an output. To create an ANN, a number of neurons must be put together, arranged in layers. A network must have an input layer (which carries the values of outside variables) and an output layer (the predictions or the result).
Weighting factors: Neurons usually receive more than one input at the same time. Each input has its own relative weight, which gives the input the impact it needs on the processing element's summation function. Weights are adaptive coefficients within the network that determine the intensity of the input signal as registered by the artificial neuron. They are a measure of an input's connection strength. These strengths can be modified in response to various training sets and according to a network's specific topology or its learning rules.
Activation function: ANNs use various activation functions, and most use sigmoid functions, also called logistic functions (Su-Hyun Han et al., 2018). The sigmoid function has the advantage of being very simple to calculate compared with other functions. It is expressed by the following equation:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
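A single artificial neuron as described above, a weighted sum of the inputs passed through the sigmoid (logistic) function 1 / (1 + e^(−x)), can be sketched in Python as follows. The function names, the bias term, and the example weights are assumptions for illustration.

```python
import math


def sigmoid(x):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias):
    """Summation function (weighted sum plus bias) followed by the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)


# Hypothetical inputs and weights: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0
output = neuron_output(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(output)
```

Training a network amounts to adjusting the weights and biases of many such neurons so that the outputs of the final layer match the known class labels.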

Requirements Analysis
There are two approaches to analyzing requirements: Structured Analysis and Object-Oriented Analysis. We decided to use the structured approach for the requirements analysis phase because we focus on processes and data more than on objects. This approach requires a use case diagram, a Data Flow Diagram (DFD), and an Entity Relationship Diagram (ERD).

Use Case Diagram
Use case diagrams are graphic diagrams that describe and display the relationships between use cases and actors (usually users and external systems).

Data Flow Diagram (DFD)
Process design can be captured with a data flow diagram (DFD), which describes system requirements visually. A data flow diagram is made up of processes, data flows, data stores, and external entities. A process is an activity or function performed for a particular business purpose; a data flow is either a single piece of data or a logical set of several pieces of data; a data store is a collection of data that is stored in a certain way; and an external entity is an individual, organization, or system that is external to the system but interacts with it (Ibrahim & Yen, 2010; Liu & Tang, 1991).

Entity Relationship diagram (ERD)
The Entity-Relationship diagram has been widely used in structured analysis and conceptual modeling. The ER approach is easy to understand, powerful enough to model real-world problems, and readily translated into a database schema. Among the typical semantic constructs of the ER model and its variations, we consider the following features:

System Architecture
An architecture diagram is a graphical representation that uses icons and lines to illustrate, explain, and communicate the overall system structure and the user requirements that the system must satisfy. It also shows the formal structure, behavior, and fundamental organization of a system, the relationships of its components to each other, and database or memory representations, with components that give a useful, implementable meaning.

System Flowchart:
We created two flowcharts to describe the system. Flowchart A describes the flow of the whole system. Flowchart B describes the machine learning part of the system.

Experimental results and analysis
We used Orange3 from Anaconda Navigator; it is a data mining tool for testing many machine learning algorithms, such as the Support Vector Machine (SVM) and Artificial Neural Network (ANN) algorithms. We used it to test these algorithms on the three datasets.

Hypertension Experimental results and analysis
We found a paper (Golino et al., 2014) on predicting increased blood pressure. Orange3 results for the hypertension dataset: we tested six algorithms, namely KNN, Tree, SVM, Random Forest, Neural Network, and Naïve Bayes. The best accuracy, 100%, was obtained with the Tree algorithm.

Diabetes Experimental results and analysis
We used a dataset with 786 records downloaded from https://www.kaggle.com/uciml/pimaindians-diabetes-database. Each patient has a single record containing pregnancies, glucose, blood pressure, skin thickness, insulin, diabetes pedigree function, age, and BMI. We tested six algorithms, namely KNN, Tree, SVM, Random Forest, Neural Network, and Naïve Bayes. The best accuracy, 76.3%, was obtained with neural networks.

Kidney Experimental results and analysis:
We used a kidney dataset with 400 records from the UCI repository, https://archive.ics.uci.edu/ml/datasets/Chronic_Kidney_Disease. Each patient has a single record containing 24 features and one target, CKD or non-CKD, which determines whether the patient has kidney disease.
Applying algorithms using the Orange3 tool: we tested six algorithms, namely KNN, Tree, SVM, Random Forest, Neural Network, and Naïve Bayes. The best accuracy, 100%, was obtained with random forest.

Conclusion
In this paper, we used three different datasets. For blood pressure, the id, gender, age, hc, wc, whr, sbp, dbp, BMI, and hypertension attributes of a person were used to predict whether the person is prone to high blood pressure (HBP). We used different classifiers to predict whether a person is susceptible to hypertension. Among all the classification algorithms used in the experimental analysis, the Tree algorithm showed the best accuracy.
For diabetes, we used the pregnancies, glucose, blood pressure, skin thickness, insulin, diabetes pedigree function, age, and BMI of a person to predict whether the person is prone to diabetes. We used different classifiers to predict whether a person is susceptible to diabetes. Among all the classification algorithms used in the experimental analysis, the neural network algorithm showed the best accuracy.
For kidney disease, predictive models were built using machine learning methods, including support vector machine (SVM) and artificial neural network (ANN) classifiers, to predict chronic kidney disease. The experimental results show that the random forest classifier gives the highest accuracy. Therefore, it can be concluded that the random forest classifier is appropriate for predicting chronic kidney disease in our case. In the future, we would like to consider other attributes such as anger, anxiety, and cholesterol level to improve the prediction performance of the classifiers.