91-DOU : Day #2

Course : Cluster Analysis & Unsupervised ML in Python

Video Mins completed : 50 mins

Last Video # completed : 22

Notes

K-Means Clustering

Cost function : Sum of squared Euclidean distances between each point and the mean of its assigned cluster.
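A minimal sketch of that cost in NumPy, to make the definition concrete. The function name `kmeans_cost` and the toy data are my own illustration, not code from the course.

```python
import numpy as np

def kmeans_cost(X, means, assignments):
    """Sum of squared Euclidean distances from each point to its assigned mean.

    X           : (N, D) array of points
    means       : (K, D) array of cluster means
    assignments : (N,) integer array, cluster index of each point
    """
    diffs = X - means[assignments]   # (N, D) offsets from the assigned mean
    return float(np.sum(diffs ** 2))

# Toy data (hypothetical): two pairs of points, one pair per cluster.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
means = np.array([[0.0, 1.0], [10.0, 1.0]])
assignments = np.array([0, 0, 1, 1])

cost = kmeans_cost(X, means, assignments)
print(cost)
```

Every point here sits exactly 1 unit from its mean, so the cost is 4.0.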

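Soft K-means : Assigns each point a probability of belonging to each cluster, based on its distance from the cluster mean.

Often preferable to Hard K-means, which assigns each point to exactly one cluster with 100% probability, because it can express uncertainty for points that lie between clusters.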
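A rough sketch of how the soft assignments could be computed. It assumes the weights are a normalized exponential of the negative squared distance, with a stiffness parameter `beta`. Both the exact form and the name `beta` are assumptions for illustration; the notes above only say the probability depends on distance from the mean.

```python
import numpy as np

def soft_responsibilities(X, means, beta=1.0):
    """Return an (N, K) array; row n holds point n's membership weights.

    Assumed form: weight proportional to exp(-beta * squared distance),
    normalized so each row sums to 1.
    """
    # Squared Euclidean distance from every point to every mean, shape (N, K).
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    logits = -beta * sq_dist
    logits -= logits.max(axis=1, keepdims=True)  # avoid underflow in exp
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

# Toy data (hypothetical): the middle point is equidistant from both means.
X = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
means = np.array([[0.0, 0.0], [10.0, 0.0]])

R = soft_responsibilities(X, means, beta=1.0)
hard = R.argmax(axis=1)  # hard K-means keeps only the most likely cluster
print(R)
print(hard)
```

The middle point gets a 50/50 split under this scheme, while the hard assignment is forced to pick one cluster for it. A larger `beta` pushes the soft weights toward the hard assignment.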

K-Means clustering fails when the clusters are shaped as

  1. donut
  2. elongated clusters
  3. different density clusters.

Can only find roughly spherical clusters, because each point is simply assigned to the nearest cluster mean.
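The donut case can be checked with a small experiment. This is a sketch only: `make_donut` and `hard_kmeans` are hypothetical helpers written for these notes, and the ring radii, noise level, and seeds are arbitrary choices.

```python
import numpy as np

def make_donut(n_per_ring=200, r_inner=1.0, r_outer=5.0, noise=0.1, seed=0):
    """Two concentric noisy rings; label 0 = inner ring, 1 = outer ring."""
    rng = np.random.default_rng(seed)
    n = 2 * n_per_ring
    angles = rng.uniform(0.0, 2.0 * np.pi, size=n)
    radii = np.concatenate([np.full(n_per_ring, r_inner),
                            np.full(n_per_ring, r_outer)])
    radii = radii + rng.normal(0.0, noise, size=n)
    X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
    labels = np.repeat([0, 1], n_per_ring)
    return X, labels

def nearest_mean(X, means):
    """Index of the closest mean for every point."""
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)

def hard_kmeans(X, k, n_iter=100, seed=0):
    """Plain hard K-means, initialized from k randomly chosen data points."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        assignments = nearest_mean(X, means)
        new_means = means.copy()
        for j in range(k):
            members = X[assignments == j]
            if len(members) > 0:          # keep the old mean if a cluster empties
                new_means[j] = members.mean(axis=0)
        if np.allclose(new_means, means):
            break
        means = new_means
    return means, nearest_mean(X, means)

X, labels = make_donut()
means, assignments = hard_kmeans(X, k=2)

# Cluster indices are arbitrary, so score both possible labelings.
match = np.mean(assignments == labels)
agreement = max(match, 1.0 - match)
print(agreement)
```

With K = 2 the boundary between the clusters is a straight line (the perpendicular bisector of the two means), and no straight line can separate an inner ring from the ring around it. The agreement with the true ring labels therefore stays well short of 1.0.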

Disadvantages

  1. Need to choose K
  2. Can converge to a poor local minimum of the cost function
  3. Sensitive to initial configuration
  4. Doesn’t take into account the density of the cluster
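One common way to soften points 2 and 3 is to run K-means from several random initializations and keep the run with the lowest cost. The sketch below assumes that approach. `kmeans_once`, the three-blob toy data, and the choice of 10 restarts are all illustrative, not taken from the course.

```python
import numpy as np

def kmeans_once(X, k, seed, n_iter=100):
    """One run of hard K-means from a random start; returns (means, cost)."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        assignments = sq_dist.argmin(axis=1)
        new_means = means.copy()
        for j in range(k):
            members = X[assignments == j]
            if len(members) > 0:          # keep the old mean if a cluster empties
                new_means[j] = members.mean(axis=0)
        if np.allclose(new_means, means):
            break
        means = new_means
    # Cost = sum of squared distances to the nearest final mean.
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    cost = float(sq_dist.min(axis=1).sum())
    return means, cost

# Toy data (hypothetical): three Gaussian blobs.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [3.0, 6.0]])
X = np.vstack([c + rng.normal(0.0, 0.5, size=(50, 2)) for c in centers])

# Different seeds give different initial means and possibly different final costs.
runs = [kmeans_once(X, k=3, seed=s) for s in range(10)]
costs = [cost for _, cost in runs]
best_means, best_cost = min(runs, key=lambda run: run[1])
print(costs)
print(best_cost)
```

Restarts do not remove the need to choose K, and they only make a bad local minimum less likely, not impossible.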

91-DOU : Day #1

Course : Cluster Analysis & Unsupervised ML in Python

Video Mins completed : 125 mins

Last Video # completed : 13

Notes

Clustering applications

  1. Categorization
  2. Search : Closest neighbors for an item
  3. Density estimation : Estimating the probability distribution of the data.

Implemented the exercises to understand the core logic of K-Means clustering. This was unnecessary; the implementation could have been skipped. Need to move faster.