Questions

STAT5003 (NE) Practice Tutorial Quiz 3

Multiple dropdown selections

1. Given the dendrogram obtained from hierarchical clustering of the mtcars dataset, is Maserati Bora more similar to Chrysler Imperial or to Cadillac Fleetwood?

Maserati Bora is more similar to the Chrysler Imperial because it is closer on the horizontal axis.

Evaluate each statement and indicate whether it is true or false:

A. PCA is a supervised learning algorithm used for classification tasks. [ Select ] FALSE / TRUE
B. PCA transforms the original features into a new set of correlated features called principal components. [ Select ] FALSE / TRUE
C. The first principal component captures the maximum variance in the data. [ Select ] TRUE / FALSE
D. If we run PCA multiple times, the results will differ depending on the seed (set.seed()) used. [ Select ] TRUE / FALSE

Why is it important to set scale = TRUE when performing PCA?

model <- prcomp(numeric.dat, scale = TRUE)

[ Select ]
- To increase the computational efficiency of the PCA algorithm.
- To ensure that each feature contributes equally to the analysis by standardizing the data.
- To automatically select the optimal number of principal components.
- To ensure that the principal components are orthogonal to each other.

Given two Kernel Density Estimations (KDE) of the same dataset, one created with a small bandwidth and the other with a larger bandwidth, which of the following statements is more likely to be true?

[ Select ]
- The KDE with a small bandwidth will produce a smoother and more generalized density estimate.
- The KDE with a larger bandwidth will capture finer details and more peaks in the data distribution.
- The KDE with a small bandwidth will be more sensitive to noise and show more peaks in the density estimate.
- The KDE with a larger bandwidth will overfit to the data and show more fluctuations in the density estimate.

Match each ggplot graph with the corresponding model it represents.

Graph 1: A scatter plot for the linear regression model with fitted values and residuals.

Graph 2: [ Select ]
- A biplot for the PCA.
- A scatter plot for the linear regression model with fitted values and residuals.

Given two t-SNE visualizations of the author dataset, one created with a small perplexity and the other with a large perplexity, which of the statements is more likely to be true?

[ Select ]
- Graph 2 is more likely to be created by a small perplexity as it focuses more on local structures and distinct clusters.
- Graph 1 is more likely to be created by a large perplexity as it shows finer details.
- Graph 1 is more likely to be created by a small perplexity as it is more stable and reproducible.
- Graph 2 is more likely to be created by a large perplexity as it emphasises global data structure.

[Figure: plot of chunk tsne]


Step-by-Step Analysis

This walkthrough takes each statement and option in turn and evaluates whether it is true, along with the concept behind it.

Restated question components and options:

1) A statement about Maserati Bora's similarity to Chrysler Imperial vs Cadillac Fleetwood, followed by several true/false sub-questions and matching tasks.
2) A multiple-dropdown item set:
 - A: PCA is a supervised learning algorithm used for classification tasks. [Select] FALSE / TRUE
 - B: PCA transforms the original features into a new set of correlated features called principal components. [Select] FALSE / TRUE
 - C: The first principal component captures the maximum variance in the data. [Select] TRUE / FALSE
 - D: If we run PCA multiple times, the results will differ depending on the seed (set.seed()) used. [Select] TRUE / FALSE
3) Why is it important to set scale = TRUE when performing PCA? [Select] Options about standardization, efficiency, component count, and orthogonality.
4) Model demonstration: model <- prcomp(numeric.dat, scale = TRUE), and a KDE bandwidth comparison question with four statements.
5) Graph matching: match each ggplot graph with the corresponding model it represents (Graph 1 and Graph 2 options).
6) t-SNE question: given two t-SNE visualizations with different perplexities, which statement is more likely to be true? [Select] options.
7) A final line, "plot of chunk tsne", appears as context.

Now, evaluating each option in turn:

A) Maserati Bora is equally similar to both Chrysler Imperial and Cadillac Fleetwood.
 - This statement claims equal similarity to both cars, but the accompanying setup in the prompt asserts Maserati Bora is closer to the Chrysler Imperial on the ho......
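The PCA items restated above (statements B, C and D, and the scale = TRUE question) can be checked empirically rather than taken on trust. The following is a minimal R sketch, not the quiz's own solution. It assumes the built-in mtcars data frame stands in for numeric.dat, whose contents the quiz does not show, and the variable names other than model and numeric.dat are illustrative.

```r
# Assumption: mtcars (built into R) stands in for the quiz's numeric.dat,
# whose actual contents are not shown in the source.
numeric.dat <- mtcars

# Same call as in the quiz.
model <- prcomp(numeric.dat, scale = TRUE)

# Statement B: principal component scores are mutually uncorrelated,
# so the off-diagonal entries of their correlation matrix are ~0.
score.cor <- cor(model$x)
max.offdiag <- max(abs(score.cor[upper.tri(score.cor)]))

# Statement C: prcomp() returns components in decreasing order of
# standard deviation, so PC1 explains the largest share of variance.
var.explained <- model$sdev^2 / sum(model$sdev^2)

# Statement D: prcomp() draws no random numbers, so changing the seed
# does not change the result.
set.seed(1)
run1 <- prcomp(numeric.dat, scale = TRUE)
set.seed(999)
run2 <- prcomp(numeric.dat, scale = TRUE)
same.result <- isTRUE(all.equal(run1$sdev, run2$sdev)) &&
  isTRUE(all.equal(run1$rotation, run2$rotation))

# scale = TRUE: compare PC1 loadings with and without standardization.
# Without scaling, variables on large numeric scales dominate PC1.
unscaled <- prcomp(numeric.dat, scale = FALSE)
dominant.unscaled <- names(which.max(abs(unscaled$rotation[, 1])))

print(max.offdiag)
print(round(var.explained, 3))
print(same.result)
print(round(unscaled$rotation[, 1], 2))
print(round(model$rotation[, 1], 2))
```

Comparing the two sets of PC1 loadings shows how variables with large raw variance take over the unscaled analysis, which is the reason for standardizing with scale = TRUE. The signs of the loadings are arbitrary, so only their magnitudes should be compared.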
