Applying Support Vector Machine Classification to Breast Cancer Cells
As in many other fields, machine learning algorithms are opening the way to breakthroughs in health care systems. One method in particular that works very well for detecting hazardous cells or tumors is the Support Vector Machine (SVM).
SVMs are a set of supervised learning methods used for classification, regression and outlier detection. Their advantages are that they are effective in high-dimensional spaces, that they use only a subset of the training points in the decision function (called support vectors), which makes them memory efficient, and that they are versatile. 1) https://scikit-learn.org/stable/modules/svm.html#svm-classification
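To make the "subset of training points" idea concrete, here is a minimal sketch using scikit-learn. It is not taken from the Kaggle notebook: the synthetic two-cluster data from `make_blobs`, the linear kernel and the variable names are all assumptions chosen only for illustration.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic, well-separated two-class data (an assumption for illustration,
# not the breast cancer data set used in this post).
X_demo, y_demo = make_blobs(n_samples=200, centers=2, n_features=30, random_state=0)

# Fit a linear-kernel SVM classifier.
demo_clf = SVC(kernel="linear")
demo_clf.fit(X_demo, y_demo)

# support_vectors_ holds only the training points that define the decision function.
n_support = demo_clf.support_vectors_.shape[0]
print(f"{n_support} of {X_demo.shape[0]} training points are support vectors")
```

Only the support vectors need to be kept to make predictions, which is where the memory efficiency comes from.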
In my study, I used the Breast Cancer Wisconsin (Diagnostic) Data Set, which can be obtained from the UCI Machine Learning Repository 2) https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic), a good repository platform for data scientists, to predict whether a cancer diagnosis is benign or malignant based on several observations/features.
This data set uses 30 features. Some examples are:
– radius (mean of distances from center to points on the perimeter)
– texture (standard deviation of gray-scale values)
– smoothness (local variation in radius lengths)
– compactness (perimeter^2 / area – 1.0)
– concavity (severity of concave portions of the contour)
– concave points (number of concave portions of the contour)
– fractal dimension (“coastline approximation”)
According to the data set description, the classes are linearly separable using all 30 input features. The number of instances is 569, and the class distribution is 212 malignant and 357 benign. The target is the diagnosis: malignant or benign.
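These figures can be checked quickly. The sketch below assumes the copy of the data set that ships with scikit-learn (`load_breast_cancer`), rather than a file downloaded from UCI or Kaggle; the variable names are only illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

# Load the copy of the Wisconsin (Diagnostic) data bundled with scikit-learn.
data = load_breast_cancer()
X, y = data.data, data.target

# Number of instances and features.
print(X.shape)

# Class distribution: in this copy, label 0 is malignant and label 1 is benign.
class_counts = {
    str(name): int(count)
    for name, count in zip(data.target_names, np.bincount(y))
}
print(class_counts)
```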
You can see all the code and the detailed analysis on Kaggle.com 3) https://www.kaggle.com/resulcaliskan/breast-cancer-classification . After fitting an SVM model to this data set, the results are excellent.
The SVM model can separate malignant and benign tumors easily (0: Malignant, 1: Benign).
As seen in the chart above, the model's prediction accuracy is very high. I tested it on 114 cases that the model had never seen before: it predicted 111 of them correctly and only 3 incorrectly (111/114 ≈ 97.4%). All 3 errors are Type I errors, that is, false positives. This means the model says a patient is diseased when she actually is not, which is the less harmful kind of mistake because no malignant tumor is missed. The model's precision is 0.97, which is a very good score.
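A rough sketch of this kind of evaluation is shown here. It is not the exact code from the Kaggle notebook: the 20% hold-out (which gives 114 test cases out of 569), the `random_state`, the standard scaling step and the default RBF kernel are all assumptions, so the exact error counts may differ from the ones reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the 569 cases (114) that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Scale the features, then fit an SVM classifier (assumed settings, see above).
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true labels, columns are predicted labels (0: malignant, 1: benign).
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
accuracy = accuracy_score(y_test, y_pred)

# Benign cases flagged as malignant: the "says diseased, but actually not" errors.
benign_flagged_malignant = cm[1, 0]
# Malignant cases predicted as benign: the more dangerous kind of error.
malignant_missed = cm[0, 1]

print(cm)
print(f"accuracy = {accuracy:.3f}")
```

Reading the two off-diagonal cells of the confusion matrix separately matters in a medical setting, because a missed malignant tumor is far more costly than a false alarm.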
In conclusion, ML techniques can effectively classify tumors as malignant or benign, here with 97% accuracy. The technique can rapidly evaluate breast masses and classify them in an automated fashion. In the developing world, this kind of early breast cancer detection could save many lives.
Stay healthy with machine learning SVM models. Happy New Year to all.
Freelance Data Analyst.