An exhausted TA for CMU’s Introduction to Deep Learning (11-785) has reached their limit. After weeks of spelunking through Piazza threads filled with cryptic stack traces, malformed attention plots, and variable names like dWhx_v3_final_fr, they’ve decided enough is enough. Rather than manually traversing the combinatorial space of student mistakes in HWP2, the TA sets out to automate away some of the pain.

To do so, they scrape every relevant Piazza post from the past N semesters (where N ≈ too many), including:

Descriptions of bugs (both coherent and not)
Fragments of semi-functional student code
TA replies ranging from thoughtful diagnostics to "try turning it off and on again"
Post metadata (HW category, resolution status, etc.)

Now armed with this glorious mess of historical bug data, the TA wants to train a deep learning system capable of:

Bug Detection: Detecting whether a post involves a bug
Bug Classification: Predicting the likely type of bug (e.g., “you forgot to mask” or “why is your loss negative?”)
Bug Retrieval: Retrieving similar past bug reports

The dream? Never having to answer “Why is my CER 600?” at 2am again.

For the following question, assume the TA is able to construct some representative dataset with the following properties:

Each Piazza post is labeled as either bug or not bug.
For posts labeled as bug, one mutually exclusive bug type is provided.

For each bug-labeled post, one or more aligned fix explanations are available.

Which of the following modeling approaches is best suited to detect whether a Piazza post describes a bug or not? (Select all that apply)

BlackTom AI · Accepted Answer

The question asks which modeling approaches are best suited to detect whether a Piazza post describes a bug, with the additional setup that each post is labeled bug or not bug, and for bug posts a specific bug type is provided.

Option 1: An Encoder-Decoder Transformer, trained on aligned (post → fix) examples where posts that don’t trigger generation are assumed to be bug-free. This approach resembles a generative retrieval/generation setup rather than a discriminative binary detector. It presumes you’ll generate a fix for every post, and uses the generation signal to infer bugs, which is indirect and can be brittle for detection. It also relies on a problematic assumption that non-generation implies bug-free, which can lead to false negatives/positives if the model is uncertain or if non-bug posts resemble fixable content.

Option 2: A pre-trained bidirectional Transformer encoder (e.g., BERT) trained on a large corpus of unlabeled text via masked language modeling, with a classification head fine-tuned on the labeled posts using binary cross-entropy. This is a standard, robust setup for text classification: leverage powerful contextual representations from BERT, then train a lightweight classifier to distinguish bug vs not-bug posts using binary cross-entropy loss. It benefits from unlabeled pretraining, handles long-range dependencies in text, and adapts well to the task with relatively modest labeled data.
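The binary cross-entropy objective on the classification head can be sketched in plain Python. This is a minimal illustration, not the course's or BERT's actual implementation: the encoder is omitted entirely, the head is assumed to emit a single scalar logit per post, and the names bce_with_logits, predict_bug, and the 0.5 threshold are hypothetical choices made here.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce_with_logits(logit: float, label: int) -> float:
    """Binary cross-entropy for one post, computed from the raw logit.

    label is 1 for "bug" and 0 for "not bug" (an assumed encoding).
    Uses the stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def predict_bug(logit: float, threshold: float = 0.5) -> bool:
    """Flag a post as a bug when the predicted probability reaches the threshold."""
    return sigmoid(logit) >= threshold
```

A confident, correct logit gives a loss near zero, while the same logit with the opposite label is penalized roughly linearly in its magnitude, which is what pushes the head toward a clean yes/no boundary.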

Option 3: An RNN encoder with max-pooling across time and softmax classifier trained over C known bug categories using cross-entropy loss. While this could be used for multiclass bug-type classification, it is not ideal for the fundamental detection task (bug vs not bug). Because the softmax covers only the C bug categories, the model has no output for "not bug", so every post is assigned some bug type. RNNs also generally lag behind transformers on many text tasks due to less efficient context capture and training challenges, and training over C bug categories adds unnecessary complexity for the binary detection step.

Option 4: A pairwise CNN encoder architecture trained with contrastive loss to minimize distance between similar bug embeddings. This setup is tailored for learning embedding spaces where similar posts are close, which is great for retrieval or clustering of bug reports, but it is not a direct, straightforward discriminator for bug vs non-bug classification. Without an explicit classifier head or a calibrated decision boundary, it’s harder to obtain reliable binary decisions.
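To make the point about Option 4 concrete, here is a minimal sketch of a pairwise contrastive loss in plain Python. It assumes the common margin-based formulation (squared distance for similar pairs, squared hinge on the margin for dissimilar pairs); the option text does not say which variant is meant, and the function names and the margin of 1.0 are hypothetical.

```python
import math


def euclidean(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a: list[float], b: list[float], similar: bool,
                     margin: float = 1.0) -> float:
    """Margin-based contrastive loss for one pair of embeddings.

    Similar pairs are pulled together (loss = d^2); dissimilar pairs are
    pushed apart until they are at least `margin` away (loss = max(0, m - d)^2).
    """
    d = euclidean(a, b)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

Note that the loss only shapes distances between pairs. Nothing in it yields a bug vs not-bug score for a single post, which is why a separate classifier or threshold would still be needed for detection.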

Option 5: A reconstruction autoencoder trained on all posts labeled as a bug, where thresholding on reconstruction loss is used as a proxy for bug detection. Training solely on bug posts and using reconstruction loss to detect bugs is problematic: posts that are not buggy might still reconstruct well, and some buggy posts could reconstruct easily too, leading to poor discrimination. This approach lacks a direct mechanism to separate bug from non-bug posts unless you impose a separate calibration that may not generalize well.

Option 6: A clustering algorithm applied to the latent space of a Variational Autoencoder (VAE) trained on bug-labeled posts, where each resulting cluster is interpreted as a latent bug category. This focuses on modeling bug types and their structure, not on the binary detection task. Since non-bug posts are not modeled in the same latent space, you’d still need a separate decision rule to flag non-bug posts, making this suboptimal for direct detection.

Option 7: An MLP-based single softmax classifier over C+1 classes where the C bug classes are augmented with a “no bug” class. This is a straightforward multiclass detector that includes a dedicated no-bug class. While workable, it can be less flexible than a binary classifier when the data are imbalanced or when generalizing across unseen bug types; it also forces a fixed mapping between bugs and classes rather than learning a simple yes/no decision boundary.
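For reference, a C+1 softmax can still be read as a binary detector by treating everything outside the “no bug” class as bug mass. The sketch below assumes the “no bug” class sits at index 0 of the logits; that index and the function names are hypothetical, not part of the quiz.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Softmax with the usual max-subtraction for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bug_probability(logits: list[float], no_bug_index: int = 0) -> float:
    """P(bug) = 1 - P(no bug) under a C+1 class softmax."""
    probs = softmax(logits)
    return 1.0 - probs[no_bug_index]
```

With uniform logits over one “no bug” class and three bug classes, the bug probability is 0.75, which shows how the detection decision depends on the number of bug classes as well as on the evidence in the post.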

In summary, multiple approaches exist for the detection task, but the most direct, well-supported, and typically effective choice is to use a strong contextual encoder (like BERT) fine-tuned with a binary cross-entropy objective for bug vs not-bug classification. This leverages rich contextual representations and a simple, calibrated decision threshold, aligning well with the labeled data setup described.

11785/11685/11485 Quiz-14
