Question
2254 BIOSC 1544 SEC1000 Exam #2
Single-choice question
Consider the following Python code:

    import random
    import matplotlib.pyplot as plt
    from sklearn.linear_model import LinearRegression

    random.shuffle(active_compounds)
    split_index = int(len(active_compounds) * 0.6)  # Use 60% for training
    train_set, test_set = active_compounds[:split_index], active_compounds[split_index:]

    features = ["logP", "num_hbd", "num_hba", "mw", "num_rotatable_bonds"]
    target = "pKi"

    X_train = [[molecule[feat] for feat in features] for molecule in train_set]
    y_train = [molecule[target] for molecule in train_set]
    X_test = [[molecule[feat] for feat in features] for molecule in test_set]
    y_test = [molecule[target] for molecule in test_set]

    model = LinearRegression()
    model.fit(X_train, y_train)

    def plot_predictions(model, X, y):
        y_pred = model.predict(X)
        plt.scatter(y, y_pred)
        plt.xlabel("True pKi (uM)")
        plt.ylabel("Predicted pKi (uM)")
        plt.show()
        print("R² score:", model.score(X, y))

    print("Training set evaluation:")
    plot_predictions(model, X_train, y_train)
    print("Testing set evaluation:")
    plot_predictions(model, X_test, y_test)

Based on this code, which of the following statements best describes a potential issue or limitation in how the model is trained or evaluated?
Options
A. The R² score is not a valid metric for evaluating regression models.
B. The function plot_predictions incorrectly plots the true values on the x-axis and predictions on the y-axis.
C. The model cannot be used for prediction because it was trained only on a subset of molecular descriptors instead of the full dataset.
D. The model assumes a linear relationship between molecular features and pKi, which may not be valid if the true relationship is nonlinear.
E. The code incorrectly selects training and testing data by sorting molecules before splitting, introducing bias.
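The snippet in the question depends on an `active_compounds` list that is not shown. The following is a minimal sketch for experimenting with the same pipeline, assuming each compound is a dictionary keyed by descriptor name. The helper `make_molecule`, the helper `to_xy`, the value ranges, and the quadratic pKi formula are all invented for illustration and do not come from the exam. The synthetic target is deliberately nonlinear in `logP`, which makes it easy to see the kind of limitation that option D describes. Plotting is omitted so the sketch runs without a display.

```python
import random

from sklearn.linear_model import LinearRegression

random.seed(0)  # fixed seed so the synthetic data and the shuffle are repeatable

features = ["logP", "num_hbd", "num_hba", "mw", "num_rotatable_bonds"]
target = "pKi"


def make_molecule():
    """Build one synthetic record (hypothetical values, not real compounds)."""
    mol = {
        "logP": random.uniform(-1.0, 5.0),
        "num_hbd": random.randint(0, 5),
        "num_hba": random.randint(0, 10),
        "mw": random.uniform(150.0, 500.0),
        "num_rotatable_bonds": random.randint(0, 10),
    }
    # Assumed ground truth: pKi peaks at logP = 2 and falls off quadratically.
    mol["pKi"] = 8.0 - (mol["logP"] - 2.0) ** 2 + random.gauss(0.0, 0.2)
    return mol


active_compounds = [make_molecule() for _ in range(200)]

# Same shuffle-then-slice split as in the question (60% training).
random.shuffle(active_compounds)
split_index = int(len(active_compounds) * 0.6)
train_set, test_set = active_compounds[:split_index], active_compounds[split_index:]


def to_xy(molecules):
    """Convert a list of molecule dicts into a feature matrix and target list."""
    X = [[m[f] for f in features] for m in molecules]
    y = [m[target] for m in molecules]
    return X, y


X_train, y_train = to_xy(train_set)
X_test, y_test = to_xy(test_set)

model = LinearRegression()
model.fit(X_train, y_train)

r2_train = model.score(X_train, y_train)
r2_test = model.score(X_test, y_test)
print("R² (train):", r2_train)
print("R² (test):", r2_test)
```

Because the assumed relationship is symmetric around `logP = 2`, a straight line through it explains little of the variance, so both scores should come out low on this synthetic data. Real descriptor data may behave differently.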
Analysis
The question presents a Python snippet that trains a simple LinearRegression model on a subset of molecular descriptors and then evaluates it on both the training and test sets using a custom plotting function. We will evaluate each option in turn.
Option A: The R² score is not a valid metric for evaluating regression models.
- This claim is not accurate in general. R² (the coefficient of determination) is a standard regression metric that measures how much better the predictions fit the true values than a baseline that always predicts the mean of the observed data. It has limitations (for example, it can be misleading for highly skewed data or when the residuals show a nonlinear pattern), but it remains a valid and commonly used evaluation metric for regression models. The code...
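The definition of R² discussed above can be checked by hand. The sketch below computes R² = 1 − SS_res / SS_tot in plain Python; the function name `r2_score_manual` and the sample values are made up for illustration. For scikit-learn regressors such as `LinearRegression`, the `score` method used in the question reports this same quantity.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [5.0, 6.0, 7.0, 8.0]  # illustrative pKi-like values

r2_perfect = r2_score_manual(y_true, y_true)                 # 1.0
r2_baseline = r2_score_manual(y_true, [6.5, 6.5, 6.5, 6.5])  # 0.0
r2_reversed = r2_score_manual(y_true, [8.0, 7.0, 6.0, 5.0])  # -3.0

print(r2_perfect, r2_baseline, r2_reversed)
```

A perfect prediction scores 1, always predicting the mean scores 0, and predictions worse than the mean baseline score below 0, which is why a low or negative test-set R² is informative rather than invalid.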
Similar questions
Which code can be used directly to predict prices for new HDB resale cases using a trained Linear Regression model?
Many people struggle to get loans due to insufficient or non-existent credit histories. And, unfortunately, this population is often taken advantage of by untrustworthy lenders. Home Credit strives to broaden financial inclusion for the unbanked population by providing a positive and safe borrowing experience. In order to make sure this underserved population has a positive loan experience, Home Credit makes use of a variety of alternative data, including telco and transactional information, to predict their clients' repayment abilities. Home Credit can provide three types of loans:
    Cash loans are one-time loans for any purpose.
    Consumer loans will be for a specific item such as a refrigerator, washing machine or car.
    Revolving loans allow a client to borrow up to a limit, repay the loan and then borrow again.
The company would like to improve their ability to select clients who will successfully repay loans, so that additional money can be loaned to future borrowers. Here are the variables available for the analysis (Variable: Description):
    Application_id: ID of the loan application
    Loan_type: Cash, Consumer or Revolving (see description above)
    Loan_term_months: Number of months until loan maturity (pay off due date)
    Education_level: Highest education level (None, High School, 2-Year College, etc.)
    Own_car_flag: Client owns a car (true/false)
    Own_home_flag: Client owns a home (true/false)
    Months_in_current_residence: Number of months residing at current apartment or home
    Monthly_income_amount: Average total monthly income for the household, including tips and informal payments (ex. Venmo)
    Total_consumer_debt: Total debt for the household, including home, car and credit card debt
    Credit_bureau_score: Credit rating score (FICO)
    Cash_savings_total: Total amount of money available as cash
    Cell_phone_payments_last_12_months: Number of completed cell phone payments in the last year
    Profession: Description of the employment of the primary borrower
    Loan_amount: Amount of the requested loan
    Default: Final outcome of the loan account ('true' indicates that the account was not paid in full by the end of the loan term)
    Underwriter_notes: Text notes from interviews with the prospective client
    Loan_purpose: Description of the reason for the loan
Predicting loan_amount requires what algorithm? (We are not predicting a yes or no outcome; rather, we are predicting a numeric value.)
How many neurons are necessary in a neural network to solve a linear regression task in one dimension?
Recall that a single-neuron network for linear regression with two features looks as follows. Let's assume I have already trained this network, and I ended up with the following optimal values: w1=1, w2=1, b=0. Now let's say I have a test sample that I'd like to make a prediction on. It has feature values x1=2, x2=3, and target value 10.
    (i) What is z for this test sample? Choices: 20, 10, 15, 5, 25
    (ii) What is g(z) for this test sample? Choices: 25, 10, 15, 20, 5
    (iii) What is yhat for this test sample? Choices: 25, 15, 5, 20, 10
    (iv) What is y for this test sample? Choices: 25, 20, 5, 10, 15
    (v) What is J for this test sample? Choices: 15, 5, 25, 20, 10
    (vi) I now have another test sample I'd like to make a prediction on. Which of these is a complete list of all the values that might change? Choices: "x1, x2, y" | "x1, x2, w1, w2, b, z, g(z), yhat, y, J" | "x1, x2, z, g(z), yhat, y, J" | "w1, w2, b" | "z, g(z), yhat, J"