Machine Learning Exam 1 CS 4375 Part 2

33 Cards · Created 3 months ago

This deck covers key concepts and definitions related to machine learning, including error metrics, optimization techniques, regression analysis, and model evaluation.



Key Terms

MSE or RMSE
Mean squared error (MSE) and its square root (RMSE) are useful for comparing two models built on the same training set. RMSE is in the same units as the target Y.
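
A minimal Python sketch of both metrics, assuming NumPy arrays of true values and predictions (the function names are illustrative):

    import numpy as np

    def mse(y_true, y_pred):
        # Mean squared error: average squared difference between
        # predictions and true values.
        return np.mean((y_true - y_pred) ** 2)

    def rmse(y_true, y_pred):
        # Root mean squared error: same units as the target Y.
        return np.sqrt(mse(y_true, y_pred))
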
Loss function
Describes how much accuracy we lose in our model; it quantifies the error between the model's predictions and the true values.
Gradient Descent
One of the most commonly used optimization techniques, although it can get bogged down by large datasets. The algorithm starts with some value for the parameter w and keeps changing it in an iterative loop until it finds a minimum. If the step size is too small, it will take too long to converge; if it is too large, we might step over the minimum.
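
A minimal sketch of gradient descent for a one-variable linear model minimizing MSE (the learning rate and epoch count are illustrative choices):

    import numpy as np

    def gradient_descent(x, y, lr=0.01, epochs=1000):
        # Start with some value for the parameters and update them
        # in an iterative loop, moving toward a minimum of the loss.
        w, b = 0.0, 0.0
        n = len(x)
        for _ in range(epochs):
            y_pred = w * x + b
            # Gradients of the MSE loss with respect to w and b.
            grad_w = (-2 / n) * np.sum(x * (y - y_pred))
            grad_b = (-2 / n) * np.sum(y - y_pred)
            # lr is the step size: too small converges slowly,
            # too large may step over the minimum.
            w -= lr * grad_w
            b -= lr * grad_b
        return w, b
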
Residuals vs. Fitted
Plots the residuals with a red trend line. You want to see a fairly horizontal red line. Otherwise the plot is showing some variation in the data that your model did not capture.
Normal Q - Q
If the residuals are normally distributed, you will see a fairly straight diagonal line following the dashed line.
Scale-Location
You want to see a fairly horizontal line with points distributed equally around it. If not, your data may not have constant variance.
Residuals vs. Leverage
This plot will indicate leverage points which are influencing the regression line.
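
The four cards above describe R's standard diagnostic plots for a linear model (produced by plot(model)). A minimal matplotlib sketch of the first one, with toy values standing in for a real fit:

    import numpy as np
    import matplotlib.pyplot as plt

    # Toy fitted values and residuals standing in for a real model.
    fitted = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    residuals = np.array([0.2, -0.1, 0.05, -0.3, 0.15])

    # Residuals vs. Fitted: we want a roughly horizontal trend.
    plt.scatter(fitted, residuals)
    plt.axhline(0, color="red")
    plt.xlabel("Fitted values")
    plt.ylabel("Residuals")
    plt.show()
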
Outlier
a data point with an unusual y value
Leverage Point
a data point with an unusual x value
Occam's Razor
When choosing between two likely explanations, pick the simpler one.
High bias, low variance model
Is likely to underfit and not capture the shape of the data. Happens with simpler models such as linear and logistic regression.
Low bias, high variance model
Captures too much complexity and noise in the data and may not generalize well to new data.
Confounding variable (interaction effect)
A variable that correlates with both the target and a predictor.
additive assumption
each predictor contributes to the model independently of other predictors.
Regularization
Adds a penalty term to the loss function that penalizes large coefficients. It helps prevent overfitting when complex models are used on small datasets.
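
A minimal sketch of one common form, an L2 (ridge) penalty added to MSE loss; alpha is an illustrative penalty-strength parameter:

    import numpy as np

    def ridge_loss(y_true, y_pred, w, alpha=1.0):
        # MSE plus an added term that penalizes large coefficients.
        # Larger alpha shrinks coefficients more aggressively.
        mse = np.mean((y_true - y_pred) ** 2)
        return mse + alpha * np.sum(w ** 2)
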
Deviance residual
A mathematical transformation of the loss function that quantifies a given point's contribution to the overall likelihood (similar to the RSS statistic in linear regression).
Null deviance
measures the lack of fit of the model, considering only the intercept.
Residual deviance
Measures the lack of fit of the entire model (we want the residual deviance to be much smaller than the null deviance).
AIC
Akaike Information Criterion is useful for comparing models. It penalizes overly complex models.
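
The underlying formula is AIC = 2k - 2 ln(L), where k is the number of parameters and L is the maximized likelihood; a minimal sketch:

    def aic(log_likelihood, k):
        # 2k penalizes complexity; higher log-likelihood rewards fit.
        # When comparing models, lower AIC is better.
        return 2 * k - 2 * log_likelihood
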
Difference between linear and logistic regression
Whereas the coefficient of a linear regression predictor quantifies the difference in the target variable as the predictor changes, in logistic regression, the coefficient quantifies the difference in the log odds of the target variable.
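
A short sketch of what that means in practice: exponentiating a logistic coefficient turns the log-odds difference into an odds ratio (the coefficient value here is hypothetical):

    import numpy as np

    coef = 0.8  # hypothetical fitted logistic regression coefficient
    # A one-unit increase in the predictor adds 0.8 to the log odds,
    # i.e., multiplies the odds of the target by exp(0.8), about 2.23.
    odds_ratio = np.exp(coef)
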
Problem with a log reg model
Not enough data (e.g., only 8 observations) => gather more data. Unbalanced dataset (e.g., a 1-to-7 class split rather than 4-to-4) => oversample the minority class or undersample the majority class.
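
A minimal sketch of random oversampling, assuming NumPy arrays (the helper name is illustrative):

    import numpy as np

    def oversample_minority(X, y):
        # Randomly duplicate minority-class rows until the classes
        # balance (e.g., turning a 1-vs-7 split into 7-vs-7).
        classes, counts = np.unique(y, return_counts=True)
        minority = classes[np.argmin(counts)]
        idx = np.where(y == minority)[0]
        extra = np.random.choice(idx, size=counts.max() - counts.min(),
                                 replace=True)
        return np.concatenate([X, X[extra]]), np.concatenate([y, y[extra]])
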
Kappa
A statistic that attempts to adjust accuracy by accounting for the possibility of a correct prediction by chance alone. The closer to 1, the better the agreement.
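
The formula is kappa = (p_o - p_e) / (1 - p_e), where p_o is observed accuracy and p_e is the accuracy expected by chance; a minimal sketch:

    def kappa(p_observed, p_expected):
        # Adjusts observed accuracy for agreement by chance alone.
        return (p_observed - p_expected) / (1 - p_expected)

    # e.g., 80% accuracy where chance alone gives 50%:
    # kappa(0.8, 0.5) -> 0.6
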
ROC curve
Shows the tradeoff between correctly predicting true positives and avoiding false positives.
AUC
Area under the ROC curve quantifies the model's predictive value. A value of 0.5 means no predictive value; 1.0 is a perfect classifier.
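
A minimal scikit-learn sketch (the toy labels and probabilities are illustrative):

    from sklearn.metrics import roc_curve, roc_auc_score

    y_true = [0, 0, 1, 1]            # toy true labels
    y_prob = [0.1, 0.4, 0.35, 0.8]   # toy predicted probabilities

    # Points of the ROC curve (false vs. true positive rates).
    fpr, tpr, thresholds = roc_curve(y_true, y_prob)
    # Single-number summary: 0.5 = no predictive value, 1.0 = perfect.
    auc = roc_auc_score(y_true, y_prob)
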
Stochastic Gradient Descent
Used when there is a large amount of data, which would bog down ordinary gradient descent. Processes the data in randomly selected batches.
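
A mini-batch variant of the earlier gradient descent sketch; the batch size and learning rate are illustrative:

    import numpy as np

    def sgd(x, y, lr=0.01, epochs=100, batch_size=32):
        # Each update uses a random mini-batch instead of the
        # full dataset, which keeps updates cheap on large data.
        w, b = 0.0, 0.0
        n = len(x)
        for _ in range(epochs):
            order = np.random.permutation(n)
            for start in range(0, n, batch_size):
                batch = order[start:start + batch_size]
                xb, yb = x[batch], y[batch]
                y_pred = w * xb + b
                grad_w = (-2 / len(xb)) * np.sum(xb * (yb - y_pred))
                grad_b = (-2 / len(xb)) * np.sum(yb - y_pred)
                w -= lr * grad_w
                b -= lr * grad_b
        return w, b
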
likelihood
Quantifies how likely it is that we would see the data given the class (e.g., given the Survived instances).
prior
The prior distribution of the classes; it is learned from the dataset.
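
A minimal sketch tying the two cards above together: the prior comes from class frequencies in the training data, and Naive Bayes multiplies it by the likelihood of the observed features (the toy labels are illustrative):

    import numpy as np

    y = np.array([1, 0, 1, 1, 0])  # toy class labels (e.g., Survived)

    # Prior: the class distribution learned from the dataset.
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / counts.sum()  # here P(0) = 0.4, P(1) = 0.6
    # Prediction picks the class maximizing prior * likelihood,
    # where the likelihood is P(features | class).
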
Strengths of Naive Bayes
• Works well with small data sets • Easy to implement • Easy to interpret • Handles high dimensions well
Weaknesses of Naive Bayes
• May be outperformed by other classifiers for larger data sets • Guesses are made for values in the test set that did not occur in the training data • If the predictors are not independent, the naive assumption that they are may limit the performance of the algorithm
Logistic Regression Strengths
• Separates classes well if they are linearly separable • Computationally inexpensive • Nice probabilistic output
Logistic Regression Weaknesses
• Prone to underfitting • Not flexible enough to capture complex non-linear decision boundaries
Linear Regression Strengths
Relatively simple algorithm with an intuitive explanation because the coefficients quantify the effect of predictors on the target variable. Works well when the data follows a linear pattern. Has low variance.
Linear Regression Weaknesses
High bias because it assumes a linear shape to the data.