Assignment 4 Part 1&2 - Regression and K-Means Clustering

Question # 00803439 Posted By: dr.tony Updated on: 04/23/2021 05:56 AM Due on: 04/23/2021
Subject Education Topic General Education Tutorials:

Assignment 4

Data Science for Business Due 04/25/2021


Part 1: Regression

Correlation and regression analysis are related in the sense that both deal with relationships among variables. The correlation coefficient is a measure of the linear association between two variables. Values of the correlation coefficient always lie between -1 and +1, inclusive.
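For reference, the usual (Pearson) definition of the correlation coefficient for n paired observations can be written as below. The symbols x_i, y_i, \bar{x}, \bar{y}, and n are notation introduced here for illustration; the assignment itself does not state a formula, and it is only an assumption that the applet computes r this way.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

The numerator measures how x and y vary together; the denominator rescales it so the result is unit-free and bounded by -1 and +1.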

In the following Linear Regression applet, 10 points are plotted in the coordinate plane. The line in the graph is the best-fit line for these 10 points. The correlation coefficient is denoted by r.

https://www.geogebra.org/m/rJj6yr6C#material/nFJp7McJ
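As a rough illustration of what a "best-fit line" means, the sketch below fits a line by ordinary least squares. It assumes the applet uses this standard method, which the assignment does not state; the function name fit_line and the sample points are hypothetical and are not taken from the applet.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample points lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

Dragging a point in the applet changes the sums above, which is why the line moves whenever a point is repositioned.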

Interact with this applet by repositioning the points (drag them with the mouse) before you start answering the following questions:

1. Reposition the points so that the correlation coefficient (r) equals 1. What does it mean to have r = 1?

2. Reposition the points so that the correlation coefficient (r) equals -1. What does it mean to have r = -1?

3. Reposition the points so that the correlation coefficient (r) is 0 or very close to 0. What does it mean to have r = 0?

Include a screenshot for each question and compare the three scenarios in terms of the correlation between the two variables. Discuss your results.
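The three scenarios can also be checked numerically. The following is a minimal sketch, assuming r is the Pearson correlation coefficient; the helper pearson_r and the three 10-point datasets are made up for illustration and do not come from the applet.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


xs = list(range(1, 11))                 # 10 x-values, like the applet's 10 points
increasing = [2 * x + 1 for x in xs]    # all points on a rising line
decreasing = [-3 * x + 40 for x in xs]  # all points on a falling line
curved = [(x - 5.5) ** 2 for x in xs]   # symmetric U-shape around the mean of x

for name, ys in [("increasing", increasing),
                 ("decreasing", decreasing),
                 ("curved", curved)]:
    print(name, round(pearson_r(xs, ys), 6))
```

The curved dataset is worth noting in the discussion: y depends entirely on x, yet r comes out at essentially zero, because r only measures linear association.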

Part 2: K-Means Clustering

The following link contains a visualization of the K-Means Clustering algorithm.

https://www.naftaliharris.com/blog/visualizing-k-means-clustering/

Read the article and experiment with the visualization before you start answering the following questions.

In the following questions, use the same dataset so that you can compare the three initialization strategies: (1) choosing the centroids yourself, (2) choosing them randomly, or (3) using the farthest-point heuristic.

1. Use the first strategy to initialize the centroids by choosing them yourself. Include screenshots of the steps. How many iterations did the algorithm take to find the best clusters?

2. Use the second strategy to choose the centroids randomly. How many iterations did the algorithm take to find the best clusters?

3. Use the third strategy, which picks the farthest points as the centroids. How many iterations did the algorithm take to find the best clusters?

 

Discuss your conclusions about the three strategies. Add any interesting facts or observations you came across while trying this visualization.
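To complement the visualization, the comparison above can be sketched in code. This is a minimal, assumption-laden sketch of standard k-means (Lloyd's algorithm) with the three initialization strategies; it is not the code behind the linked visualization. The dataset, the function names (kmeans, init_random, init_farthest), and the reading of "farthest point" as "repeatedly pick the point farthest from the centroids chosen so far" are all assumptions made for illustration.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points, centroids, max_rounds=100):
    """Run k-means; return (centroids, labels, rounds).

    rounds counts the assign-and-update rounds performed before the
    cluster assignments stopped changing.
    """
    centroids = [tuple(c) for c in centroids]
    labels = None
    for round_no in range(1, max_rounds + 1):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(len(centroids)),
                          key=lambda j: dist2(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            return centroids, labels, round_no - 1
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = (sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members))
    return centroids, labels, max_rounds


def init_random(points, k, rng):
    """Strategy 2: pick k distinct data points at random."""
    return rng.sample(points, k)


def init_farthest(points, k, rng):
    """Strategy 3 (assumed): random first centroid, then repeatedly the
    point whose nearest chosen centroid is farthest away."""
    chosen = [rng.choice(points)]
    while len(chosen) < k:
        chosen.append(max(points,
                          key=lambda p: min(dist2(p, c) for c in chosen)))
    return chosen


# Hypothetical dataset: three Gaussian blobs of 30 points each.
rng = random.Random(0)
blob_centers = [(0, 0), (10, 0), (5, 8)]
points = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
          for cx, cy in blob_centers for _ in range(30)]

strategies = {
    "manual": [(0, 0), (10, 0), (5, 8)],  # strategy 1: hand-picked
    "random": init_random(points, 3, rng),
    "farthest": init_farthest(points, 3, rng),
}
results = {}
for name, start in strategies.items():
    results[name] = kmeans(points, start)
    print(f"{name:9s} converged after {results[name][2]} round(s)")
```

Changing the seed or the hand-picked centroids changes the round counts, which mirrors what the visualization shows: the number of iterations, and sometimes the final clusters, depend on where the centroids start.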

Tutorials for this Question
  1. Tutorial # 00798785 Posted By: dr.tony Posted on: 04/23/2021 06:01 AM
    Purchased By: 2
    Tutorial Preview
    The solution of Assignment 4 Part 1&2 - Regression and K-Means Clustering...
    Attachments
    Assignment_4_Part_12_-_Regression_and_K-Means_Clustering.ZIP (18.96 KB)
