91-DOU : Day #4

Course : Python for Machine Learning and Data Science Masterclass

Video Mins completed : 126 mins

Last Video # completed : 206

K-Means clustering: scale the data when it mixes numeric and encoded (e.g. one-hot) features, so that features with large ranges do not dominate the distance measures that drive the clustering.
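
A minimal sketch of the scaling step, assuming scikit-learn's StandardScaler is the scaler of choice (the course may use a different one). The two columns, an income-like value and a 0/1 encoded flag, are made-up illustration data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: column 0 is a large-range numeric feature,
# column 1 is a 0/1 encoded categorical feature.
X = np.array([
    [25000.0, 0],
    [48000.0, 1],
    [52000.0, 0],
    [91000.0, 1],
])

# Standardise every column to mean 0 and standard deviation 1 so that
# the large-range column no longer dominates Euclidean distances.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled.mean(axis=0))  # per-column means, approximately 0
print(X_scaled.std(axis=0))   # per-column standard deviations, approximately 1
```

Without scaling, a difference of a few thousand in the first column would swamp the 0/1 difference in the second, and K-Means would effectively ignore the encoded feature.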

Find the correlation between the features and the cluster labels to see which features have the strongest bearing on the clustering.
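
A rough sketch of the correlation check with pandas, on synthetic data where one column ("signal") separates two groups and the other ("noise") does not. The column names and data are assumptions for illustration. Note that cluster numbers are arbitrary IDs, so with more than two clusters this correlation is only a loose heuristic.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic data: "signal" has two well-separated groups, "noise" has none.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "signal": np.concatenate([rng.normal(0, 1, 50), rng.normal(10, 1, 50)]),
    "noise": rng.normal(0, 1, 100),
})

# Scale, cluster, and attach the cluster labels as a new column.
X = StandardScaler().fit_transform(df)
model = KMeans(n_clusters=2, n_init=10, random_state=0)
df["cluster"] = model.fit_predict(X)

# Absolute correlation of each feature with the cluster label,
# strongest first.
corr = (
    df.corr()["cluster"]
    .drop("cluster")
    .abs()
    .sort_values(ascending=False)
)
print(corr)
```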

To figure out the ideal K value, check the SSD (sum of squared distances), available as model.inertia_ on the fitted model, for a range of K values. SSD generally keeps falling as K grows, so look for the K beyond which it no longer drops significantly (the "elbow"); that cutoff can be taken as the number of clusters.
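
A small sketch of this elbow check, assuming synthetic data from scikit-learn's make_blobs with three true clusters. The range of K and the random_state values are arbitrary choices for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D data generated around 3 centres.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# Fit K-Means for each K and record the SSD (model.inertia_).
k_values = list(range(1, 9))
ssd = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    model.fit(X)
    ssd.append(model.inertia_)

# Print the SSD and how much it dropped compared with the previous K.
previous = None
for k, value in zip(k_values, ssd):
    drop = "" if previous is None else f"  (drop: {previous - value:.1f})"
    print(f"K={k}  SSD={value:.1f}{drop}")
    previous = value
```

On data like this the drop should be large up to K=3 and small afterwards, which is the cutoff to look for. Plotting ssd against k_values shows the same thing as a bend in the curve.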

Intro to choropleth maps, which allow clusters to be represented on a map.

https://plotly.com/python/choropleth-maps
https://medium.com/@nirmalsankalana/k-means-clustering-choosing-optimal-k-process-and-evaluation-methods-2c69377a7ee4