Let's get started with week 4.
My mentor and I decided on the following tasks for this week:
Starting with the image clustering
This week our main focus was on the image part of clustering (general and biometric), so I will start the blog with image clustering.
Images are clustered based on their similarity. However, similarity can mean several things: similar-looking images, similar backgrounds, similar sizes, and so on. Our aim is to cluster images based on similar content.
Feature extraction: This is the extraction of useful information from the raw data, that is, the information that helps solve the problem. In computer vision there are various feature descriptors (color, edge, texture) that transform an image from one representation to another. Feature extraction can be done manually (as we did on the audio data in previous weeks) or automatically (using transfer learning).
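As a small illustration of a hand-crafted color descriptor, the sketch below turns an image into a fixed-length vector by concatenating per-channel histograms. The function name `color_histogram` and the choice of 8 bins are assumptions made for this example; this is not the code linked in this post.

```python
import numpy as np


def color_histogram(image, bins=8):
    """Concatenate per-channel histograms of an 8-bit image into one vector.

    `image` is assumed to have shape (height, width, channels) with pixel
    values in the range 0-255.
    """
    channels = []
    for c in range(image.shape[-1]):
        # Count how many pixels of this channel fall into each intensity bin.
        hist, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        channels.append(hist)
    feature = np.concatenate(channels).astype(np.float64)
    # Normalize so that images of different sizes are comparable.
    return feature / feature.sum()


# Random pixels standing in for a real image.
image = np.random.default_rng(0).integers(0, 256, size=(32, 32, 3))
print(color_histogram(image).shape)
```

A descriptor like this only captures the color distribution, which is one reason learned features are attractive when the goal is to group images by content.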
Transfer learning: In transfer learning, a deep learning (DL) model is first trained on a large dataset containing thousands or millions of samples. What the model has learned is then transferred so that it can work on another, smaller dataset with just hundreds or a few thousand images. In feature extraction, we use the representations learned by a previously trained network to extract meaningful features from new samples.
This week I tried these pretrained models:
1. VGG16
2. VGG19
3. ResNet50
4. InceptionV3
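To make the idea of feature extraction with a pretrained network concrete, here is a minimal sketch using the Keras VGG16 model. It assumes TensorFlow is installed and that the images are already loaded into an array of shape (n, 224, 224, 3); the function name `extract_features` is hypothetical, and the actual extraction code is the one linked in the table below.

```python
import numpy as np


def extract_features(images, weights="imagenet"):
    """Return one feature vector per image from VGG16 without its classifier head.

    `images` is assumed to be an array of shape (n, 224, 224, 3) holding RGB
    pixel values in the range 0-255.
    """
    # TensorFlow is imported here so the heavy dependency loads only when needed.
    from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

    # include_top=False drops the ImageNet classifier; pooling="avg" averages
    # the last convolutional feature map into one 512-dimensional vector.
    model = VGG16(weights=weights, include_top=False, pooling="avg")

    # astype makes a float copy, so the caller's array is left untouched.
    batch = preprocess_input(images.astype("float32"))
    return model.predict(batch, verbose=0)
```

The other models in the list can be used in the same way through their own `tensorflow.keras.applications` modules, each with its matching `preprocess_input`. Note that the feature length and the default input size differ between models; InceptionV3, for example, defaults to 299x299 inputs.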
S No. | Code Type | Code link |
---|---|---|
1 | Image feature extraction | Image feature extraction code |
2 | Image clustering with grid search | clustering-code with Grid |
3 | Image clustering without grid search | clustering-code withoutGrid |
S No. | Evaluation criteria | Results link |
---|---|---|
1 | Using grid search | resultswithGridsearch.html |
2 | Without using grid search | resultswithGridsearch.html |
3 | Using a dimensionality reduction technique such as PCA, with grid search | results with pca-with grid.html |
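The combination of PCA and grid search from the last row can be sketched roughly as follows. This is only an illustration under assumptions: the post does not state which clustering algorithm or scoring metric was used for the images, so K-means and the silhouette score are stand-ins here, and the data is synthetic rather than real image features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score


def grid_search_kmeans(features, k_values, n_components=10, seed=0):
    """Reduce features with PCA, then keep the k with the best silhouette score."""
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(features)
    best = None
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(reduced)
        score = silhouette_score(reduced, labels)
        if best is None or score > best["score"]:
            best = {"k": k, "score": score, "labels": labels}
    return best


# Synthetic stand-in for extracted image features (not real results).
features, _ = make_blobs(n_samples=150, centers=3, n_features=64, random_state=0)
best = grid_search_kmeans(features, k_values=range(2, 7))
print(best["k"], round(best["score"], 3))
```

Reducing the dimensionality first makes each clustering run cheaper, which matters when the grid search repeats the clustering many times.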
Getting started with biometric clustering
I started with face clustering. First, I extracted the face encodings (128-dimensional vector representations) from the images using the face-recognition library in Python.
The feature extraction code can be found here. I ran the DBSCAN algorithm with and without grid search. Both performed well, but the version with grid search gave better results.
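The two steps described above can be sketched as follows. This is a hedged illustration, not the linked code: the helper names `encode_faces` and `grid_search_dbscan`, the parameter grid, and the scoring rule (silhouette score weighted by the fraction of non-noise points) are all assumptions made for this example, and the demo at the end uses synthetic vectors instead of real face encodings.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score


def encode_faces(image_paths):
    """Collect the 128-d encoding of every face found in the given images."""
    # Imported here so the clustering part below runs without the library.
    import face_recognition

    encodings = []
    for path in image_paths:
        image = face_recognition.load_image_file(path)
        # One image may contain several faces, so extend rather than append.
        encodings.extend(face_recognition.face_encodings(image))
    return np.array(encodings)


def grid_search_dbscan(encodings, eps_values, min_samples_values):
    """Try every (eps, min_samples) pair and keep the best-scoring clustering."""
    best = None
    for eps in eps_values:
        for min_samples in min_samples_values:
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(encodings)
            mask = labels != -1  # DBSCAN marks noise points with -1
            n_clusters = len(set(labels[mask]))
            # The silhouette score needs at least 2 clusters and fewer
            # clusters than points.
            if n_clusters < 2 or n_clusters >= mask.sum():
                continue
            # Weight by coverage so that discarding most points as noise
            # is not rewarded.
            score = silhouette_score(encodings[mask], labels[mask]) * mask.mean()
            if best is None or score > best["score"]:
                best = {"eps": eps, "min_samples": min_samples, "score": score,
                        "n_clusters": n_clusters, "labels": labels}
    return best


# Synthetic stand-in: 3 "people" with 20 noisy 128-d vectors each.
rng = np.random.default_rng(0)
centers = rng.normal(0.0, 0.1, size=(3, 128))
fake_encodings = np.vstack([c + rng.normal(0.0, 0.01, size=(20, 128)) for c in centers])
best = grid_search_dbscan(fake_encodings, eps_values=[0.3, 0.4, 0.5, 0.6],
                          min_samples_values=[3, 5])
print(best["eps"], best["min_samples"], best["n_clusters"])
```

DBSCAN suits this task because the number of people in the images does not have to be known in advance; the grid search only has to tune the neighborhood radius `eps` and `min_samples`.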
S No. | Code Type | Code link |
---|---|---|
1 | Face Encoding Extraction | face-extractioncode |
2 | Face clustering | clustering-code |
3 | Results visualization | results-viz.html |
4 | Complete code | complete_bio-metric.html |