Machine Learning Exam 1 CS 4375 Part 2
This deck covers key concepts and definitions related to machine learning, including error metrics, optimization techniques, regression analysis, and model evaluation.
Key Terms
Term | Definition |
---|---|
MSE or RMSE | Mean squared error and root mean squared error are useful for comparing two models built on the same training set. RMSE is in the same units as Y (see the formulas after the table). |
Loss function | describes how much accuracy we lose in our model, i.e., how far the model's predictions fall from the true values. |
Gradient Descent | One of the most commonly used optimization techniques, though it can get bogged down on large datasets. The algorithm starts with some value for the parameter w and keeps changing it in an iterative loop until it finds a minimum. If the step size is too small it will take too long to converge; if too large, we might step over the minimum (see the gradient descent sketch after the table). |
Residuals vs. Fitted | Plots the residuals against the fitted values with a red trend line. You want to see a fairly horizontal red line; otherwise the plot is showing some variation in the data that your model did not capture (see the plotting sketch after the table). |
Normal Q-Q | If the residuals are normally distributed, you will see a fairly straight diagonal line following the dashed line. |
Scale-Location | You want to see a fairly horizontal line with points distributed equally around it. If not, your data may not have constant variance. |
Residuals vs. Leverage | This plot will indicate leverage points which are influencing the regression line. |
Outlier | a data point with an unusual y value |
Leverage Point | a data point with an unusual x value |
Occam's Razor | When choosing between two likely explanations, pick the simpler one. |
High bias, low variance model | Is likely to underfit and not capture the shape of the data. Happens with simpler models such as linear and logistic regression. |
Low bias, high variance model | captures too much complexity and noise in the data, so it may not generalize well to new data. |
Confounding variable (interaction effect) | a variable that correlates with both the target and a predictor. |
additive assumption | each predictor contributes to the model independently of other predictors. |
Regularization | The added term penalizes large coefficients. It helps prevent overfitting when complex models are fit to small datasets (see the penalty form after the table). |
Deviance residual | a mathematical transformation of the loss function that quantifies a given point's contribution to the overall likelihood (similar to the RSS statistic in linear regression). |
Null deviance | measures the lack of fit of the model, considering only the intercept. |
Residual deviance | measures the lack of fit of the entire model (we want to see Residual Dev <<< Null Dev). |
AIC | Akaike Information Criterion is useful for comparing models. It penalizes overly complex models; lower values are better (see the formula after the table). |
Difference between linear and logistic regression | Whereas a linear regression coefficient quantifies the change in the target variable as the predictor changes, a logistic regression coefficient quantifies the change in the log odds of the target variable (see the log-odds form after the table). |
Problem with a log reg model | Not enough data (e.g., only 8 observations) => feed the model more data. Unbalanced dataset (e.g., a 1-vs-7 class split instead of 4-vs-4) => oversample the minority class or undersample the majority class. |
Kappa | is a statistic that attempts to adjust accuracy by accounting for the possibility of a correct prediction by chance alone. The closer to 1, the better the agreement (see the formula after the table). |
ROC curve | Shows the tradeoff between predicting true positives and avoiding false positives. |
AUC | The area under the ROC curve quantifies predictive value: 0.5 means no predictive value, 1.0 a perfect classifier (see the ROC/AUC sketch after the table). |
Stochastic Gradient Descent | Used when there is a large amount of data, which would bog down ordinary gradient descent. Processes the data in small, randomly chosen batches (see the mini-batch sketch after the table). |
likelihood | It quantifies how likely it is that we would see the data given a class value (e.g., given the Survived instances). |
prior | the prior distribution is learned from the dataset, e.g., the proportion of each class in the training data (see the Naive Bayes sketch after the table). |
Strengths of Naive Bayes | • Works well with small data sets • Easy to implement • Easy to interpret • Handles high dimensions well |
Weaknesses of Naive Bayes | • May be outperformed by other classifiers for larger data sets • Guesses are made for values in the test set that did not occur in the training data • If the predictors are not independent, the naive assumption that they are may limit the performance of the algorithm |
Logistic Regression Strengths | • Separates classes well if they are linearly separable • Computationally inexpensive • Nice probabilistic output |
Logistic Regression Weaknesses | Prone to underfitting; not flexible enough to capture complex non-linear decision boundaries |
Linear Regression Strengths | Relatively simple algorithm with an intuitive explanation because the coefficients quantify the effect of predictors on the target variable. Works well when the data follows a linear pattern. Has low variance. |
Linear Regression Weaknesses | High bias because it assumes a linear shape to the data. |
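To make the MSE / RMSE entry concrete, these are the standard definitions for n observations with predictions ŷᵢ; RMSE is just the square root of MSE, which is why it ends up in the same units as Y:

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i\bigr)^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
```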
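A minimal sketch of the Gradient Descent card, using linear regression with an MSE loss; the function name, learning rate, and toy data are illustrative assumptions, not anything fixed by the deck.

```python
import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=1000):
    """Batch gradient descent for linear regression with an MSE loss (a sketch)."""
    n, d = X.shape
    w = np.zeros(d)                           # start with some value for w
    for _ in range(epochs):                   # iterative loop
        grad = (2.0 / n) * X.T @ (X @ w - y)  # gradient of MSE with respect to w
        w -= lr * grad                        # lr is the step size: too small = slow,
                                              # too large = we may step over the minimum
    return w

# Toy usage: y = 3x, so the learned weight should come out close to 3.
X = np.arange(1.0, 6.0).reshape(-1, 1)
y = 3.0 * X.ravel()
print(gradient_descent(X, y))
```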
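The four diagnostic-plot cards (Residuals vs. Fitted through Residuals vs. Leverage) describe the panels R's plot() produces for an lm fit. Here is a rough Python/statsmodels analogue of the first two panels on made-up data, in case you want to reproduce them outside R; the data and variable names are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2 * x + rng.normal(0, 1, 100)            # roughly linear data with noise

model = sm.OLS(y, sm.add_constant(x)).fit()

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(model.fittedvalues, model.resid)          # Residuals vs. Fitted
ax1.axhline(0, color="red")                           # want residuals hugging this line
ax1.set(title="Residuals vs. Fitted", xlabel="Fitted values", ylabel="Residuals")
sm.qqplot(model.resid, line="45", fit=True, ax=ax2)   # Normal Q-Q
ax2.set_title("Normal Q-Q")
plt.show()
```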
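The "added term" in the Regularization card, written out for ridge (L2) regression as one common choice; λ controls how strongly large coefficients are penalized, and larger λ trades a little extra bias for lower variance:

```latex
\min_{w}\;\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i\bigr)^2 \;+\; \lambda\sum_{j=1}^{d} w_j^{2}
```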
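The usual form of the Akaike Information Criterion, where k is the number of estimated parameters and L̂ is the maximized likelihood; the 2k term is the complexity penalty, and lower AIC is better:

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```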
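The log-odds form behind the linear vs. logistic regression card: logistic regression models the log odds of the target as a linear function of the predictors, so each coefficient wⱼ is the change in log odds per unit change in xⱼ (and exponentiating a coefficient gives the corresponding odds ratio):

```latex
\log\frac{P(y=1 \mid x)}{1 - P(y=1 \mid x)} = w_0 + w_1 x_1 + \cdots + w_d x_d
```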
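Assuming the Kappa card refers to Cohen's kappa (the usual classification-agreement statistic), its formula, where p_o is the observed accuracy and p_e is the accuracy expected by chance alone:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
```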
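A small scikit-learn sketch of the ROC curve and AUC cards; the synthetic dataset and the logistic regression classifier are just placeholders for whatever model is being evaluated.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data; any probabilistic classifier would do.
X, y = make_classification(n_samples=500, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probs = clf.predict_proba(X_test)[:, 1]            # P(class = 1) for each test row

fpr, tpr, thresholds = roc_curve(y_test, probs)    # points tracing the ROC curve
print("AUC:", roc_auc_score(y_test, probs))        # 0.5 = no predictive value, 1.0 = perfect
```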
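A mini-batch variant of the earlier gradient descent sketch, matching the Stochastic Gradient Descent card: each epoch shuffles the rows and updates w on small random batches instead of the whole dataset. Again a sketch with assumed names and toy data.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd(X, y, lr=0.01, epochs=200, batch_size=2):
    """Mini-batch stochastic gradient descent with an MSE loss (a sketch)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)                  # random order each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            grad = (2.0 / len(idx)) * X[idx].T @ (X[idx] @ w - y[idx])
            w -= lr * grad                          # one cheap step per batch
    return w

X = np.arange(1.0, 6.0).reshape(-1, 1)
y = 3.0 * X.ravel()
print(sgd(X, y))   # should again land near 3
```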
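A short scikit-learn sketch tying together the likelihood, prior, and Naive Bayes cards. GaussianNB is used here because the iris predictors are continuous; the course examples may use a different Naive Bayes variant, so treat this as an illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB().fit(X_train, y_train)   # class priors come from the class frequencies,
print(nb.class_prior_)                    # i.e., the prior distribution learned from the data
print(nb.score(X_test, y_test))           # accuracy on held-out data
```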