Machine Learning Exam 1 CS 4375 Part 1
This deck covers key concepts in machine learning, including types of learning, data structures, statistical measures, and differences from traditional programming.
Key Terms
Term | Definition |
---|---|
Machine Learning | Trains computers to accurately recognize patterns in data for the purposes of data analysis, prediction, and/or action selection by autonomous agents. |
Fields that make up machine learning | Artificial intelligence (AI), statistics, computer science (CS), and probability. |
Supervised Learning | Seeks to learn about one feature based on combinations of other features. |
batch learning | Train the algorithm on a portion of the data, then use the remaining data to test and evaluate how well the trained model performs on previously unseen data. Named "batch" because the data is fed in as one batch. |
Informative algorithm | Provides information in the form of data analysis or prediction. |
Active algorithm | Reinforcement learning teaches active agents to identify optimal actions given the current environment and what has been learned from past experience. |
Supervised Learning | Refers to scenarios where each data instance has a label. This label is used to train the algorithm so that labels can be predicted for future data items. |
Unsupervised Learning | Refers to scenarios where the data does not have labels and the goal is simply to learn more about the data. |
Regression | The target (or label) is a real-valued quantitative value, such as predicting the market value of a home given its square footage. A form of supervised learning. |
Classification | The target is qualitative (a class), such as predicting whether a borrower is a good credit risk given their income, outstanding credit balance, and other predictors. |
Machine Learning vs. Traditional Programming | In traditional programming, programmers explicitly encode all knowledge in the algorithm, so that knowledge must be known beforehand. In machine learning, knowledge is inferred from data, and the output is a model rather than results. |
Two situations where traditional programming falls short | The first is when it is not possible to encode all the rules needed to solve a problem (for example, recognizing human faces). The second is when the scale of the problem is too large. |
Qualitative feature | Takes on only a finite set of values. Also called a factor or categorical data. |
Quantitative feature | Takes on numeric values. |
target/response | The value to be predicted, modeled as a function of the features/predictors. |
Vector | A sequential structure with one or more elements of the same type. |
List | An ordered collection of objects, not necessarily of the same type. Can contain other lists as elements. |
Matrix | A 2D object with elements of the same type. |
Array | An n-dimensional object with elements of the same type. |
Dataframe | A 2D structure where each column may be a different type. |
R-squared | A measure of goodness of fit that ranges from 0 to 1, computed as 1 - RSS/TSS. The closer it is to 1, the more of the variance in the target is explained by the predictors. Used because RSE, being in units of y, is hard to interpret. |
Standard error | Gives an estimate of the variation in the coefficient estimate and can be used to construct a confidence interval for the coefficient. |
t statistic | Measures the number of standard deviations the estimated coefficient is from 0, computed as the coefficient estimate divided by its standard error. |
p-value | If the p-value is small, we can reject the null hypothesis. Typical cutoffs are 0.05 and 0.01. The p-value is more reliable when the sample size is larger than about 30. |
null hypothesis | There is no relationship between the predictor and the target variable. |
RSS | The residual sum of squares: the sum of the squared errors (residuals). |
RSE | The residual standard error: measures how far off the model is from the data, that is, the lack of fit of the model. Measured in units of y. |
TSS | The total sum of squares: a measure of how far the y values tend to be from their mean. |
F-statistic | Takes all of the predictors into account to determine whether they are significant predictors of Y. It provides evidence against the null hypothesis that the predictors are not really predictors. |
Residuals | The differences between the actual and predicted observations. They correspond to vertical lines drawn from the data points to the regression line. They are squared to remove negatives and then averaged, giving the mean squared error. |
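
The regression terms in the table (residuals, RSS, TSS, R-squared, RSE, standard error, t statistic) can be tied together with a small worked example. The following is a minimal sketch in plain Python for a single predictor. The function name `regression_summary` and the five data points are hypothetical and chosen only for illustration. It assumes more than two observations, predictor values that are not all identical, and a fit that is not perfect (so the standard error is nonzero).

```python
import math


def regression_summary(x, y):
    """Fit y = intercept + slope * x by least squares and report fit statistics.

    Hypothetical helper for illustration; assumes len(x) > 2, x is not
    constant, and the fit is not perfect (RSS > 0).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least-squares estimates for one predictor.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residuals: actual value minus predicted value.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    rss = sum(r ** 2 for r in residuals)        # residual sum of squares
    tss = sum((yi - mean_y) ** 2 for yi in y)   # total sum of squares
    mse = rss / n                               # squared first, then averaged
    rse = math.sqrt(rss / (n - 2))              # in units of y
    r_squared = 1 - rss / tss                   # share of variance explained
    se_slope = rse / math.sqrt(sxx)             # standard error of the slope
    t_stat = slope / se_slope                   # distance of the slope from 0

    return {
        "slope": slope,
        "intercept": intercept,
        "residuals": residuals,
        "rss": rss,
        "tss": tss,
        "mse": mse,
        "rse": rse,
        "r_squared": r_squared,
        "se_slope": se_slope,
        "t_stat": t_stat,
    }


# Made-up data, for illustration only.
stats = regression_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name in ("slope", "intercept", "rss", "tss", "r_squared", "rse", "t_stat"):
    print(name, round(stats[name], 4))
```

For this made-up data, RSS is about 2.4 and TSS is 6, so R-squared is about 0.6: the predictor explains roughly 60% of the variance in y. Turning the t statistic into a p-value additionally requires the t distribution with n - 2 degrees of freedom, which is left out of this sketch.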