Skin Cancer Classification Application Using Machine Learning

Melanoma is one of the predominant types of skin cancer, and the number of people affected has been increasing year after year. Deaths can be minimized by early detection, but this is where the problem lies: consulting a dermatologist does not always guarantee early detection and a correct diagnosis. The dermatologist first examines the skin visually and decides whether it is a type of skin cancer or a skin allergy, so the accuracy of the diagnosis depends directly on the dermatologist's experience. Even a small error in the inspection of the skin might cost a person's life, so a standard supporting system that helps dermatologists identify and diagnose patients is necessary. Advances in image processing and deep learning algorithms have made it possible to identify and classify the type of skin cancer from a single image. The traditional method involves many pre-processing steps, and if something goes wrong in one of them the model does not perform well and the accuracy falls short. This is where Convolutional Neural Networks come into the picture. These models require no manual feature extraction and only minimal pre-processing, but they need a large amount of data to be trained well. In this paper, we compare transfer learning with a custom deep learning model trained end-to-end from scratch. The model classifies the images into seven different classes. It is deployed locally as a web application, which gives the dermatologist a user interface to assist with identification. The model with transfer learning shows better results than the one trained from scratch. The plots show the difference between the two and the way in which they train.


Introduction
Humans are becoming more vulnerable as the decades pass. More and more diseases are affecting people, and the mortality rate due to cancer is increasing. Melanoma is a type of skin cancer that affects the surface of the skin. This type of cancer might be caused by high exposure to UV radiation [1]. With global warming taken into consideration, the danger is raised further, along with other factors such as increasingly hot climatic conditions. The most common types of skin cancer include melanoma, basal cell carcinoma, and squamous cell carcinoma [21]. Even though skin cancer, unlike other cancers, is visible to the naked eye, in some cases it does not receive much attention. There are many cases in which patients do not even realize that they have this medical condition. Moreover, some take it lightly as some kind of allergy and do not treat it properly. By doing this they bring the danger to their own doorstep.
The dataset used is taken from the ISIC (International Skin Imaging Collaboration) 2018: Skin Lesion Analysis Towards Melanoma Detection Challenge [24]. The most common type of skin cancer is basal cell carcinoma, which is not as deadly as melanoma [3]. Squamous cell carcinoma is another type; it accounts for about 20% of skin cancers and is also not as deadly as melanoma. In total, seven types are classified by the model. Early identification of these types is associated with a high rate of recovery. Fig. 1 shows a sample image that is used to train the model.

Existing System
From a dermatologist's perspective, the suspicious skin first has to be examined visually; if it requires more study, an image is captured with a high-resolution camera that reveals hidden details of the layers of the skin. Detection depends directly on the experience of the physician, which does not give a standard accuracy [9]. This can be automated with the help of state-of-the-art algorithms, which have been shown to perform these kinds of classification with great accuracy [15]. The best accuracy of the k-nearest neighbors (KNN) algorithm is reported to be 79% [8]. Taking that as a baseline, CNN models can easily outperform such models in terms of accuracy. In another approach, features are extracted from the images manually and a support vector machine (SVM) is used for classification, with an accuracy of 93.1% [9]. These systems rely on a manual or partly automated feature selection process to train and classify the types of cancer.

Methodology
The region of the skin is masked with automatic threshold segmentation; the mask can also be created by setting the pixel threshold manually. Color frequency can also be used for the same kind of cancer region segmentation. The cancerous region is masked to give a clear view of the pixels where the cancer is present [14]. The input image consists of three color values. By tuning these to the desired values, the mask can be created even more accurately, since this gives better visualization of the region than automatic segmentation. The masked region of cancer is shown in Fig. 2.
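The two masking options described above (an automatic threshold and a manually chosen pixel value) can be illustrated with a minimal sketch. It assumes Otsu's method as the automatic threshold and assumes that the lesion is darker than the surrounding skin; the paper names neither, and the function names `to_gray`, `otsu_threshold`, and `lesion_mask` are hypothetical.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to 0-255 grayscale (ITU-R BT.601 weights)."""
    return [
        [min(255, max(0, round(0.299 * r + 0.587 * g + 0.114 * b))) for r, g, b in row]
        for row in rgb_image
    ]


def otsu_threshold(gray):
    """Return the threshold that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]              # pixels with value <= t
        if w_bg == 0:
            continue
        w_fg = total - w_bg          # pixels with value > t
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def lesion_mask(gray, threshold=None):
    """Binary mask: 1 where the pixel is at or below the threshold.

    Assumption: the lesion is darker than the surrounding skin.
    Pass `threshold` to set the pixel value manually instead of using Otsu.
    """
    if threshold is None:
        threshold = otsu_threshold(gray)
    return [[1 if value <= threshold else 0 for value in row] for row in gray]


if __name__ == "__main__":
    # Toy 4x4 grayscale patch: a dark 2x2 "lesion" on bright skin.
    patch = [
        [200, 200, 200, 200],
        [200, 60, 50, 200],
        [200, 55, 65, 200],
        [200, 200, 200, 200],
    ]
    print(otsu_threshold(patch))
    for row in lesion_mask(patch):
        print(row)
```

Calling `lesion_mask(patch, threshold=52)` instead applies a manually chosen value, which corresponds to the manual tuning mentioned in the text.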

One of the main aims of the paper is to make the model easily accessible to physicians in order to support them. Fig. 3 shows a visual representation of the web application and how it works. The web application consists of a single framework that responds to the physician's request. The model is created first and is then used in the backend to predict the class. When a request comes in, the image is passed to the model, the prediction is made, and the result is displayed in the web application. In this way, the physician is assisted in the diagnosis of skin cancer.
This web application must be hosted on a cloud server so that it is accessible to all. If the dermatologist feels that the model is misclassifying certain images, the model can be re-trained on those particular sets of images to make it more accurate. The hardest part is obtaining diverse images for all types of skin cancer. If the model is trained on such images, it will eventually be more accurate on images encountered in practice.
When the application is hosted, the home page has an upload button through which the image is uploaded; a preview of the image is then shown so that the upload can be verified. On prediction, the image is passed to the Flask framework, where the class of the image is predicted and the name of the class is returned to the user interface.
Fig. 4 shows the custom model, which is built from various deep learning layers so that it exhibits high accuracy. The input image has a size of 224x224 in RGB (red, green, blue) format. MobileNet V2 is used in the front portion of the model to increase performance [16]. The output of that model is passed into several further layers to get the most out of the model. These consist of convolutional layers with batch normalization and the ReLU activation function. This block is stacked multiple times, and finally the output layer is a neural network layer with seven output nodes, preceded by a flatten layer [19]. Dropout is added to guard against overfitting at any stage of training. Each block contains convolution, activation, and max-pooling layers. CNN architectures have been a powerful asset in image processing and detection tasks [25], and they have been dominating the classification field in terms of prediction accuracy. The last neural network layer is the one that does the actual prediction. The model is trained in two configurations: the first uses weights pre-trained on ImageNet, and the other is trained end-to-end from scratch.
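The last step of this pipeline, in which the seven output nodes are turned into the class name returned to the user interface, can be sketched as follows. This is only an illustration: it assumes the output layer produces raw scores that are normalized with softmax, and the class list and its order follow the usual ISIC 2018 lesion categories rather than anything stated in the paper. The names `CLASS_NAMES`, `softmax`, and `predict_label` are hypothetical.

```python
import math

# Assumed label order (not given in the paper); the seven ISIC 2018 lesion categories.
CLASS_NAMES = [
    "Actinic keratoses",
    "Basal cell carcinoma",
    "Benign keratosis-like lesions",
    "Dermatofibroma",
    "Melanoma",
    "Melanocytic nevi",
    "Vascular lesions",
]


def softmax(scores):
    """Turn raw output-node scores into probabilities that sum to 1."""
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(scores, class_names=CLASS_NAMES):
    """Return (class name, probability) for the highest-scoring output node."""
    if len(scores) != len(class_names):
        raise ValueError("expected one score per class")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


if __name__ == "__main__":
    # Hypothetical scores from the seven output nodes for one uploaded image.
    label, confidence = predict_label([0.1, 0.2, 0.0, -1.0, 0.3, 4.0, 0.5])
    print(label, round(confidence, 3))
```

In a web backend, a request handler would call something like `predict_label` on the model output and send only the class name (and optionally the probability) back to the page.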

Confusion Matrix
The easiest way to check whether the model performs well is to use a confusion matrix. The confusion matrix gives an overall summary of the predictions that are made, in a tabulated format that can be easily interpreted. It shows both the false positive and the false negative errors in a single table. Fig. 9 shows the confusion matrix that is plotted at the end of training; it is for the end-to-end trained model. A total of 939 images are classified and the performance metrics are analyzed. We observe that Melanocytic nevi, the class with the largest number of images, is classified most accurately.
Fig. 9. Confusion matrix of the model used.
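As an illustration of how such a table is built and read, the sketch below computes a confusion matrix, the overall accuracy, and the per-class recall from integer class labels. The toy labels are made up for the example and are not the paper's results; the function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """matrix[i][j] counts samples of true class i that were predicted as class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(matrix):
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def per_class_recall(matrix):
    """For each true class, the fraction of its samples that were predicted correctly."""
    recalls = []
    for i, row in enumerate(matrix):
        row_total = sum(row)
        recalls.append(row[i] / row_total if row_total else 0.0)
    return recalls


if __name__ == "__main__":
    # Toy example with three classes (made-up labels, not the paper's data).
    y_true = [0, 0, 1, 1, 2, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 2, 0]
    cm = confusion_matrix(y_true, y_pred, n_classes=3)
    for row in cm:
        print(row)
    print(accuracy(cm), per_class_recall(cm))
```

Off-diagonal entries in a row are the missed cases for that class, and off-diagonal entries in a column are the false alarms for it, which is how one table shows both kinds of error.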

Web Application Output
The home page of the application is shown in Fig. 10. Once you upload an image, a preview of it is shown as confirmation, and you can make the prediction from there, as shown in Fig. 11.

Conclusion
The entire framework is currently deployed on a local server and needs to be hosted on a cloud platform to make it accessible for wider adoption and use. The trained custom model achieves an accuracy of around 95%. The custom model with transfer learning is more accurate than the model trained end-to-end from scratch. The model works well on most of the images, even though the dataset is difficult to train on because of the similarity between the types. Some types are so similar in nature that the model still struggles a little with them. Collecting more images of those types will be handy when it comes to classifying them. Wide adoption of this web application will help make the model even more accurate. The same web application can be adapted to fit other classification applications as well, since the model is constructed in a generic way to fit medical images. Periodic updates and new technical components can be added as needed. Further work can address database management for physicians and the collection of their details for collaborative work, which would be especially helpful in the case of a large outbreak.