Wednesday, October 23, 2024

Log 5

 

This week I want to talk about two things: how machine learning and statistics differ in the way they approach regression, and two considerations to keep in mind when doing machine learning.

First of all, what’s the difference between these two seemingly similar analyses? Linear regression in statistics typically uses a method called ordinary least squares (OLS) estimation. It finds the regression coefficients by minimizing the sum of squared distances between each observation and the fitted line (a hyperplane when there are several independent variables), and then tests whether those coefficients are statistically significant. The equation was quite overwhelming to me at first, but I found that it’s not that hard to understand once I looked up the definition and how it’s worked out. The first and foremost step is understanding what a regression coefficient is: a number indicating how much influence the independent variable has on the dependent variable. It’s the most crucial part of making a prediction, because until we have this coefficient we cannot use the independent variable to calculate the dependent variable. For a single independent variable, the equation is $\hat{y} = \beta_0 + \beta_1 x$. Here $\hat{y}$ is the dependent variable we want; the small hat above the y marks it as a value predicted by the regression model rather than an actual observation. $\beta_1$ is the regression coefficient, and it is attached to $x$, the independent variable. Together they say that $\beta_1$ is the expected change in $\hat{y}$ when the independent variable is increased by one unit. $\beta_0$ is the intercept, the predicted value when $x$ is zero.

Now we have finished the statistics part, but what about our main character, machine learning? It can also use least squares estimation; however, as computing has advanced, people often prefer gradient descent for faster analysis on large datasets. Gradient descent starts from an arbitrary weight, feeds in the training data, and keeps updating the weight until it reaches the minimum of the error surface. At that point the model is ready, and we can use it to predict results. As far as I can tell, though I haven’t explained it thoroughly, the differences between the two come down to processing speed and terminology. For example, in the linear regression equation of machine learning, $H(x) = Wx + b$, W is the slope on the independent variable x, and it is the value that gradient descent keeps updating. The same W is called a regression coefficient in statistics but a weight in machine learning.
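To make the comparison concrete, here is a minimal sketch in plain Python that fits the same line both ways. The toy data, the function names, the learning rate, and the number of steps are all my own assumptions for illustration, not something from a particular library or course material.

```python
# Hypothetical toy data generated from y = 2x + 1 (no noise), for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_ols(xs, ys):
    """Closed-form least squares for a single independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx            # slope (regression coefficient)
    b = mean_y - w * mean_x  # intercept
    return w, b


def fit_gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Start from arbitrary weights and repeatedly step downhill on the mean squared error."""
    n = len(xs)
    w, b = 0.0, 0.0  # arbitrary starting point
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w_ols, b_ols = fit_ols(xs, ys)
w_gd, b_gd = fit_gradient_descent(xs, ys)
print("OLS:             ", w_ols, b_ols)
print("Gradient descent:", w_gd, b_gd)
```

Both approaches should land on essentially the same slope and intercept here. The difference is in how they get there: OLS solves for the answer in one shot, while gradient descent creeps toward it step by step.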

There are two main considerations when doing machine learning: one is underfitting, and the other is overfitting. Underfitting means that the model is too simple to adequately explain the input data and is thus less predictive. I consider it quite severe, because you simply can’t predict much with a model that captures so little of the data. We judge whether a model is underfitted by examining the cost, which is the error between the model’s predicted values and the actual values. The lower the cost, the better the predictions. Adding training data is one suggested remedy for underfitting, though since the root problem is a model that is too simple, making the model more flexible is usually the more direct fix.

Overfitting, on the other hand, is the case in which the model is fitted too closely to the training data. You might consider that a good thing, and so did I when I first looked at the concept: high predictive power on the training data sounds pretty appealing. However, when the model is applied in the field, to data it has not seen, its predictive power is greatly reduced. I have come up with a simple example of the concept: an overfitted model is the kind of player who always does well in practice but always puts up poor numbers in actual games. The reason is quite easy to understand: major machine learning algorithms build models in an inductive manner, generalizing from the examples they are given, so they can end up learning the quirks and noise of those particular examples as well. So how should we fix the problem? The solution is to go through a series of model validation procedures, such as checking the cost on data held out from training, to verify the validity and feasibility of the model.
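Here is a small sketch of what checking the cost on training data versus held-out data can look like. Everything in it is an assumption for illustration: the numbers are made up, the "memorizing" model is just a stand-in I chose for an overfitted model, and the constant model is a stand-in for an underfitted one.

```python
# Hypothetical data: roughly y = 2x + 1, with noise added to the training values only.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [1.5, 2.5, 5.5, 6.5, 9.5, 10.5]
val_x = [0.4, 1.4, 2.4, 3.4, 4.4]   # held-out points the models never train on
val_y = [1.8, 3.8, 5.8, 7.8, 9.8]


def cost(model, xs, ys):
    """Mean squared error between predicted and actual values."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Too simple (underfitting stand-in): ignores x and always predicts the training mean.
mean_y = sum(train_y) / len(train_y)


def constant_model(x):
    return mean_y


# Reasonable: a straight line fitted to the training data by least squares.
mean_x = sum(train_x) / len(train_x)
w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
     / sum((x - mean_x) ** 2 for x in train_x))
b = mean_y - w * mean_x


def linear_model(x):
    return w * x + b


# Too closely fitted (overfitting stand-in): memorizes the training set and
# returns the y of whichever training x is nearest.
def memorizing_model(x):
    nearest = min(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))
    return nearest[1]


for name, model in [("constant", constant_model),
                    ("linear", linear_model),
                    ("memorizing", memorizing_model)]:
    print(f"{name:10s} train cost = {cost(model, train_x, train_y):.3f}  "
          f"validation cost = {cost(model, val_x, val_y):.3f}")
```

The constant model has a high cost on both sets, which is the underfitting signature. The memorizing model is the practice-field star: its training cost is zero, but on the held-out points it does worse than the plain straight line.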

