91-DOU : Day #2

Course : Cluster Analysis & Unsupervised ML in Python

Video Mins completed : 50 mins

Last Video # completed : 22

Notes

K-Means Clustering

Cost function : Sum of squared Euclidean distances between each point and the mean of the cluster it is assigned to.
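
A minimal sketch of that cost in plain Python, to make the definition concrete. The function name `kmeans_cost` and the four toy points are made up for illustration and are not from the course.

```python
def kmeans_cost(points, means, assignments):
    """Sum of squared Euclidean distances from each point to its assigned mean."""
    total = 0.0
    for point, k in zip(points, assignments):
        total += sum((p - m) ** 2 for p, m in zip(point, means[k]))
    return total


# Toy data (hypothetical): two tight pairs of points, far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
means = [(0.0, 1.0), (10.0, 1.0)]

# Each point is assigned to the mean it sits next to.
good_cost = kmeans_cost(points, means, [0, 0, 1, 1])

# Same means, but two of the points are assigned to the far mean.
bad_cost = kmeans_cost(points, means, [0, 1, 0, 1])

print(good_cost, bad_cost)
```

The sensible assignment gives a much smaller cost than the mixed-up one, which is the quantity K-means tries to push down.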

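Soft K-means : Assigns each point a probability (responsibility) of belonging to each cluster, based on its distance from that cluster mean.

More informative than Hard K-means, which assigns each point to exactly one cluster with 100% probability, so it cannot express how uncertain an assignment is.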
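
A small sketch of how the soft assignment could be computed. It assumes the common form where the responsibility of cluster k is proportional to exp(-beta * squared distance). The `beta` stiffness parameter, the function name `responsibilities`, and the example points are assumptions for illustration, not taken from the notes above.

```python
import math


def responsibilities(point, means, beta=1.0):
    """Soft K-means responsibilities of each cluster mean for one point."""
    sq_dists = [sum((p - m) ** 2 for p, m in zip(point, mean)) for mean in means]
    # Subtracting the smallest distance keeps exp() from underflowing;
    # it cancels out in the normalisation below.
    d_min = min(sq_dists)
    weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


means = [(0.0, 0.0), (4.0, 0.0)]

# A point exactly halfway between the two means is shared equally.
midpoint = responsibilities((2.0, 0.0), means)

# A point close to the first mean is almost entirely claimed by it.
near_first = responsibilities((0.5, 0.0), means)

# Hard K-means corresponds to keeping only the largest responsibility.
hard_label = max(range(len(means)), key=lambda k: near_first[k])

print(midpoint, near_first, hard_label)
```

A larger `beta` makes the responsibilities sharper, so the result approaches the hard assignment.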

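K-Means clustering fails when the data contains

  1. donut-shaped clusters (one ring around another)
  2. elongated clusters
  3. clusters of different densities.

It can only find roughly spherical clusters, because each point is assigned purely by its distance to the nearest cluster mean.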
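
One way to see the donut failure without running the full algorithm is to compare the K-means cost of the "correct" ring-by-ring partition with a plain left/right split. The sketch below uses made-up data (two rings of 40 points, radii 1 and 5) and hypothetical helper names.

```python
import math


def ring(radius, n):
    """n evenly spaced points on a circle centred at the origin."""
    return [
        (radius * math.cos(2 * math.pi * (i + 0.5) / n),
         radius * math.sin(2 * math.pi * (i + 0.5) / n))
        for i in range(n)
    ]


def cluster_cost(points):
    """Sum of squared distances from the points to their own mean."""
    n = len(points)
    mean = tuple(sum(c) / n for c in zip(*points))
    return sum(sum((p - m) ** 2 for p, m in zip(pt, mean)) for pt in points)


inner, outer = ring(1.0, 40), ring(5.0, 40)
donut = inner + outer

# Partition a human would pick: inner ring vs outer ring.
# Both rings share the same mean (the origin), so the cost is large.
ring_cost = cluster_cost(inner) + cluster_cost(outer)

# Partition by a straight line: left half vs right half of the plane.
left = [p for p in donut if p[0] < 0]
right = [p for p in donut if p[0] >= 0]
half_plane_cost = cluster_cost(left) + cluster_cost(right)

print(ring_cost, half_plane_cost)
```

The straight-line split scores a lower cost than the ring partition, so minimising the K-means cost pulls away from the answer we actually want.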

Disadvantages

  1. Need to choose K in advance
  2. Can converge to a local minimum of the cost instead of the global minimum
  3. Sensitive to the initial choice of cluster means
  4. Doesn’t take into account the density of the clusters
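
Disadvantages 2 and 3 can be reproduced on a tiny example. The sketch below is a bare-bones version of the standard assign/update loop (Lloyd's algorithm) in plain Python. The four points and the two initialisations are chosen by hand for illustration. Keeping the lowest-cost run out of several initialisations is a common mitigation, added here as an assumption rather than something from the notes.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, means):
    """Index of the nearest mean for every point."""
    return [min(range(len(means)), key=lambda k: sq_dist(p, means[k])) for p in points]


def kmeans(points, init_means, max_iter=100):
    """Hard K-means from a given initialisation; returns (means, labels, cost)."""
    means = [tuple(m) for m in init_means]
    for _ in range(max_iter):
        labels = assign(points, means)
        new_means = []
        for k in range(len(means)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # An empty cluster keeps its previous mean.
            new_means.append(
                tuple(sum(c) / len(members) for c in zip(*members)) if members else means[k]
            )
        if new_means == means:  # converged: nothing moved
            break
        means = new_means
    labels = assign(points, means)
    cost = sum(sq_dist(p, means[lab]) for p, lab in zip(points, labels))
    return means, labels, cost


# Corners of a wide rectangle: the natural clusters are left pair / right pair.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

_, _, good_cost = kmeans(points, [(0.0, 0.0), (4.0, 1.0)])
# Starting with the means stacked vertically gets stuck in a top/bottom split.
_, _, bad_cost = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

# Mitigation (assumption): run several initialisations, keep the lowest cost.
inits = [[(2.0, 0.0), (2.0, 1.0)], [(0.0, 0.0), (4.0, 1.0)]]
best_cost = min(kmeans(points, init)[2] for init in inits)

print(good_cost, bad_cost, best_cost)
```

Both runs converge, but the second one stops at a much higher cost, which is what a local minimum looks like in practice.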