Question
Artificial Intelligence Lecture 3 quiz
Short-answer question
In Lecture 2, we built a classifier that distinguishes human-written passwords (e.g., WinterDragon99!) from random passwords (e.g., 2@*7N!bx?2c). We designed features such as the number of consecutive letters and numbers. Now you need to work on a modified problem: we removed all numbers and obtained a new dataset: https://github.com/liususan091219/cs541/blob/main/lectures/lecture3/. However, the old feature only achieves an error rate of 0.36 on this new dataset. Inspect the new dataset and design features to improve this error rate. Start by reproducing this error rate in the notebook below, then revise featureExtractor to reduce the error rate to below 0.2: https://colab.research.google.com/drive/16MFcWCs7H44lVSjzAf8y3PhqHvm8xfMB?usp=sharing Note: You must have entered the correct answer before 6:50 to receive the bonus points; no bonus points are awarded for a correct answer after 6:50. 1.5 bonus points if error rate < 0.2. Raise your hand if you achieved an error rate < 0.2.

Analysis
The prompt describes a machine learning task rather than a multiple-choice question, so we’ll reason through the problem setup and potential feature-design directions step by step.
First, restating the task in my own words: you had a classifier that distinguished human-written passwords from random passwords using features like the number of consecutive letters and numbers. On a modified dataset where all numbers were removed, the old feature set yielded an error rate of 0.36. The goal is to inspect the new dataset, reproduce that 0.36 error rate, and then engineer features to push the error rate down to below 0.2, ideally via a revised featureExtractor.
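To make the modified setup concrete, here is a minimal sketch that strips digits from the two example passwords in the question. It assumes the new dataset was produced by deleting digit characters from each password string, which the question implies but does not spell out; `strip_digits` is a hypothetical helper, not part of the course notebook.

```python
import re


def strip_digits(password: str) -> str:
    """Remove every digit character (assumed to mirror how the new dataset was built)."""
    return re.sub(r"\d", "", password)


# The two example passwords quoted in the question.
examples = {"human": "WinterDragon99!", "random": "2@*7N!bx?2c"}
stripped = {label: strip_digits(pw) for label, pw in examples.items()}

for label, pw in stripped.items():
    print(f"{label}: {pw}")
```

Under this assumption, the human-written example keeps one long run of letters, while the random example becomes a short mix of isolated letters and symbols.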
Next, consider why removing all digits from passwords could degrade the original feature effectiveness: features that relied on numeric patterns (such as runs of digits, digit-place usage, or digit-letter transitions) become less informative or even misleading when digits are absent in the data. Consequently, the model may rely on less discrimin...
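As one possible direction, features that ignore digits entirely can still separate the two classes. The sketch below is illustrative only: `candidate_features` and its feature names are hypothetical, the dictionary return format is an assumption about what the notebook's featureExtractor expects, and nothing here is claimed to reach an error rate below 0.2.

```python
import re
from typing import Dict

VOWELS = set("aeiouAEIOU")


def candidate_features(password: str) -> Dict[str, float]:
    """Hypothetical digit-free features for a human-vs-random password classifier."""
    # Maximal runs of ASCII letters; human-written passwords tend to contain words.
    letter_runs = re.findall(r"[A-Za-z]+", password)
    longest_run = max((len(run) for run in letter_runs), default=0)

    letters = [c for c in password if c.isalpha()]
    symbols = [c for c in password if not c.isalnum()]
    vowels = [c for c in letters if c in VOWELS]

    return {
        "longest_letter_run": float(longest_run),
        # Guard against division by zero on empty input.
        "symbol_fraction": len(symbols) / max(len(password), 1),
        "vowel_fraction": len(vowels) / max(len(letters), 1),
        "bias": 1.0,
    }


print(candidate_features("WinterDragon!"))
print(candidate_features("@*N!bx?c"))
```

On the two digit-stripped examples, the longest letter run and the symbol fraction differ sharply, which is the kind of signal worth checking against the full dataset before committing to a feature.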