The rising occurrence of melanoma has led to the development of computer-aided diagnosis systems for classifying dermoscopic images. The PH² dataset, a dermoscopic image database, was created to enable comparative studies on segmentation and classification algorithms.
Dataset:
The dataset contains 200 images, each with its lesion mask, divided into 3 classes (80 common nevi, 80 atypical nevi, and 40 melanomas).
This code is a Python script for detecting skin lesions and classifying them into three types: Common Nevus, Atypical Nevus, and Melanoma. The script imports necessary libraries, defines functions to process and analyze images, and extracts various features from the images. These features include color values, symmetry, irregularity, and average phase.
The dataset used in this script is the PH2 dataset, and the file paths are specific to Google Colab's file structure. The script reads in the dataset, processes the images, and extracts features for each image. The features are then used to calculate average values for each class. The code also visualizes the data in 3D scatter plots for better understanding of the differences between classes.
At the end, the script calculates and prints the differences between the average values of each class to provide insight into the classification process.
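As one way to make the 3D scatter-plot step above concrete, here is a minimal sketch; the function name, the `features_by_class` layout, and the three axis labels are illustrative assumptions, not the exact code from the script.

```python
# Hedged sketch of the 3D scatter-plot visualisation described above.
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (enables the 3D projection)

def plot_three_features(features_by_class,
                        axis_labels=("symmetry", "irregularity", "avg phase")):
    """features_by_class: {class_name: list of (f1, f2, f3) tuples}."""
    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    for class_name, points in features_by_class.items():
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        zs = [p[2] for p in points]
        ax.scatter(xs, ys, zs, label=class_name)
    ax.set_xlabel(axis_labels[0])
    ax.set_ylabel(axis_labels[1])
    ax.set_zlabel(axis_labels[2])
    ax.legend()
    plt.show()
```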
Read the file “PH2_dataset.txt” and extract the classes and all the images belonging to each class. It returns a dictionary of {class: [files of that class]}.
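A minimal sketch of that parsing step is below. The exact column layout of PH2_dataset.txt is an assumption here (image name in the first field, a numeric class label in the next non-empty field); adjust the field indices to the real file.

```python
# Hedged sketch: build {class_name: [image names]} from PH2_dataset.txt.
def read_ph2_classes(path="PH2_dataset.txt"):
    class_names = {0: "Common Nevus", 1: "Atypical Nevus", 2: "Melanoma"}
    classes = {name: [] for name in class_names.values()}
    with open(path) as f:
        for line in f:
            fields = [field.strip() for field in line.split("||") if field.strip()]
            if not fields or not fields[0].startswith("IMD"):
                continue  # skip header / separator rows
            image_name = fields[0]
            label = int(fields[1])  # assumed position of the class label; adjust to the real columns
            classes[class_names[label]].append(image_name)
    return classes
```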
The file is available on my GitHub for your use.
At the start I implemented a lot of candidate parameters, then judged each parameter on the basis of its quality (the number of correct classifications it makes). I then removed the bad parameters and kept only the good ones; there is a lot of room for improvement in this part.
It’s a nested loop that visits each image, applies the parameter-extraction techniques to it, and at the end calculates the average of the parameters for each class. We can see that the separation between Atypical and Common Nevus is not that good.
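A hedged sketch of that nested loop is below, assuming a `classes` dictionary like the one from the reader above and an `extract_features` callable that returns one feature vector per image; the directory layout and the `.bmp` extension are assumptions.

```python
import cv2
import numpy as np

# Sketch: per-class feature averages from a nested loop over classes and images.
def class_averages(classes, image_dir, extract_features):
    """classes: {class_name: [image names]}; extract_features(image) -> list of numbers."""
    averages = {}
    for class_name, image_names in classes.items():
        feature_rows = []
        for name in image_names:
            image = cv2.imread(f"{image_dir}/{name}.bmp")  # assumed path layout
            if image is None:
                continue  # skip missing files
            feature_rows.append(extract_features(image))
        averages[class_name] = np.mean(feature_rows, axis=0)
    return averages
```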
These are the 16 parameters I used in my final code; all the other parameters were giving errors. The classification is based on a voting mechanism rather than distance, so I can check the quality of each parameter separately.
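Below is a sketch of one plausible form of the voting rule (the exact rule in the script may differ): each parameter votes for the class whose class-average is closest to the image’s value, and the class with the most votes wins.

```python
from collections import Counter

# Sketch of a per-parameter voting classifier.
def classify_by_voting(feature_vector, class_averages):
    """feature_vector: the 16 parameter values for one image.
    class_averages: {class_name: list of per-parameter averages}."""
    votes = Counter()
    for i, value in enumerate(feature_vector):
        best_class = min(class_averages,
                         key=lambda cls: abs(class_averages[cls][i] - value))
        votes[best_class] += 1
    return votes.most_common(1)[0][0]
```

Because each parameter votes independently, counting how often an individual parameter’s vote matches the true class gives the per-parameter quality score mentioned earlier.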
This function splits the image at the vertical center and the horizontal center, mirrors one half, and XORs it with the other half, which gives all the points that are not symmetric about the given axis.
By XOR-ing we can calculate the amount of curvature of the mole, and the number of white pixels defines the shape.
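A minimal sketch of that symmetry measure, assuming the input is a binary lesion mask; the function name and the use of OpenCV’s `flip`/`bitwise_xor` are illustrative choices, not necessarily the exact calls in the script.

```python
import cv2
import numpy as np

# Sketch: count pixels that are not mirror-symmetric about a chosen axis.
def asymmetry_score(mask, axis="vertical"):
    h, w = mask.shape[:2]
    if axis == "vertical":
        left, right = mask[:, : w // 2], mask[:, w - w // 2 :]
        mirrored = cv2.flip(right, 1)        # mirror about the vertical center
        diff = cv2.bitwise_xor(left, mirrored)
    else:
        top, bottom = mask[: h // 2, :], mask[h - h // 2 :, :]
        mirrored = cv2.flip(bottom, 0)       # mirror about the horizontal center
        diff = cv2.bitwise_xor(top, mirrored)
    return int(np.count_nonzero(diff))       # number of non-symmetric (white) pixels
```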
All the color operations are handled within this function: it takes the image, splits it into BGR channels, then calculates the histogram of each channel and finds the average, mode, and standard deviation of the given image.
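A sketch of that color step is below; the function name and the return layout (mean, mode, standard deviation per channel) are illustrative assumptions.

```python
import cv2
import numpy as np

# Sketch: per-channel mean, mode and standard deviation from the BGR histograms.
def color_features(image):
    features = []
    for channel in cv2.split(image):  # B, G, R channels
        hist = cv2.calcHist([channel], [0], None, [256], [0, 256]).flatten()
        mean = float(channel.mean())
        mode = int(np.argmax(hist))   # most frequent intensity value
        std = float(channel.std())
        features.extend([mean, mode, std])
    return features
```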
Irregularity is computed by applying a threshold to the image and counting the white pixels, which gives the overall gray-level distribution over the image. Phase is calculated by applying the horizontal and vertical Sobel operators and returning the magnitudes and average phases of the image as the parameters.
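A sketch of the irregularity and phase computations under those descriptions; the threshold value (127) and the Sobel kernel size are assumptions.

```python
import cv2
import numpy as np

# Sketch: white-pixel count after thresholding, plus Sobel magnitude/phase averages.
def irregularity_and_phase(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # assumed threshold
    irregularity = int(np.count_nonzero(binary))                  # white-pixel count

    sobel_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # horizontal gradients
    sobel_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # vertical gradients
    magnitude = np.sqrt(sobel_x ** 2 + sobel_y ** 2)
    phase = np.arctan2(sobel_y, sobel_x)
    return irregularity, float(magnitude.mean()), float(phase.mean())
```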
The code is sensitive to the parameters, so it may not reproduce the exact accuracy, but it ranges between 58-75% for different training and testing sets. The final accuracy over the whole dataset was 63.3%.
This was a really interesting and challenging task, and I would like to thank Dr Usman Akram for assigning it.