(eCornell) Clustering Documents With Unsupervised Machine Learning
About This Course
In this course, you will focus on measuring distance, that is, the dissimilarity between documents. The goal is to discover how alike or unlike groups of text documents are. At scale, this is a problem you might encounter when you need to group thousands of products using only their product descriptions, or when you want to recommend a movie to someone based on whether they liked a different movie. You will work with several data sets, create clusters using both hierarchical and k-means clustering, and practice with several distance measures to analyze document similarity. Finally, you will create visualizations that convey similarity clearly, so stakeholders can easily understand the key takeaways of any clustering or distance measure you produce. The course is provided by eCornell in partnership with Genashtim.
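As a rough illustration of what "distance" between documents can mean, the sketch below computes cosine distance over simple word counts. Cosine distance is an assumption chosen for illustration, since the course description does not name the specific measures it covers; the function name `cosine_distance` and the sample product descriptions are hypothetical.

```python
import math
from collections import Counter


def cosine_distance(doc_a: str, doc_b: str) -> float:
    """Return 1 minus the cosine similarity of two documents' word-count vectors."""
    # Represent each document as a bag of lowercase, whitespace-separated words.
    counts_a = Counter(doc_a.lower().split())
    counts_b = Counter(doc_b.lower().split())

    # Dot product over the words of doc_a; words missing from doc_b count as 0.
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty document shares nothing with any other; treat it as maximally distant.
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical product descriptions, made up for this example.
descriptions = [
    "stainless steel water bottle",
    "insulated steel water bottle",
    "wireless bluetooth headphones",
]

# Print the distance between every pair of descriptions.
for i in range(len(descriptions)):
    for j in range(i + 1, len(descriptions)):
        d = cosine_distance(descriptions[i], descriptions[j])
        print(f"{descriptions[i]!r} vs {descriptions[j]!r}: {d:.2f}")
```

A distance near 0 means two descriptions use nearly the same words, while a distance of 1 means they share none. Clustering methods such as hierarchical or k-means clustering group documents based on pairwise distances of this kind.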
What You'll Learn
Use and evaluate hierarchical clustering to group similar documents
Use and evaluate k-means clustering to group similar documents and measure quality