# Maximum Likelihood Estimation

**Introduction**

Maximum likelihood is a widely used estimation technique with applications in many fields, including time series modeling, panel data, discrete data, and even machine learning. This blog post covers the fundamentals of maximum likelihood estimation.

**Description**

Maximum likelihood estimation is an approach for determining values for a model’s parameters. The parameter values are chosen to maximise the likelihood that the process defined by the model produced the observed data.

The above concept may seem a little confusing, so let’s look at an example to better grasp it.

Suppose we’ve collected ten data points from a process. Each data point could, for example, represent the time in seconds a student takes to answer a given exam question. The figure below depicts these ten data points.

We must first decide which model we believe best describes the data-generating process. This is a crucial step. We should, at the very least, have a reasonable idea of which model to use. This normally comes from domain knowledge, which we won’t go into here.

We’ll assume that the data-generating process is well characterised by a Gaussian (normal) distribution. The figure above supports this choice: most of the ten points cluster in the middle, with only a few spread out to the left and right.

(It’s probably not a good idea to make such a choice on the spur of the moment with only 10 data points, but since I generated them, we’ll go with it.)
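For concreteness, such a sample could be simulated as follows. This is a minimal sketch; the mean of 30 seconds and standard deviation of 5 seconds are my own illustrative choices, not values from the original figure:

```python
import numpy as np

# Hypothetical illustration: draw 10 "answer times" from a normal
# distribution with an assumed mean of 30 s and std dev of 5 s.
rng = np.random.default_rng(seed=42)
times = rng.normal(loc=30.0, scale=5.0, size=10)
print(times.round(1))
```

Running this gives ten values scattered around 30, with most falling near the centre, which is the shape the Gaussian assumption is meant to capture.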

As you may recall, the Gaussian distribution has two parameters: the mean μ and the standard deviation σ. Different values of these parameters produce different curves (just as different slopes and intercepts produce different straight lines). We want to know which curve was most likely responsible for the data points we observed. Maximum likelihood estimation is a method for finding the values of μ and σ that give the best-fitting curve.


## Maximum likelihood estimator

A maximum likelihood estimator $\widehat{\theta}$ of $\theta_0$ is obtained by solving a maximisation problem:

$$\widehat{\theta} = \operatorname*{arg\,max}_{\theta \in \Theta} L(\theta; \xi)$$

In other words, $\widehat{\theta}$ is the parameter value that maximises the likelihood of the sample $\xi$. $\widehat{\theta}$ is called the maximum likelihood estimator of $\theta$.

In what follows, the symbol $\widehat{\theta}$ will denote both a maximum likelihood estimator (a random variable) and a maximum likelihood estimate (a realisation of that random variable): the meaning will be clear from context.

The same estimator is obtained by maximising the natural logarithm of the likelihood function:

$$\widehat{\theta} = \operatorname*{arg\,max}_{\theta \in \Theta} \ln L(\theta; \xi)$$

Because the logarithm is a strictly increasing function, solving this problem is equivalent to solving the previous one. The logarithm of the likelihood is known as the log-likelihood and is denoted by $\ell(\theta; \xi) = \ln L(\theta; \xi)$.

**Calculating the Maximum Likelihood Estimates**

Now that we have an intuitive idea of what maximum likelihood estimation is, we can move on to learning how to calculate the parameter values. The values we find are known as maximum likelihood estimates (MLE).

We’ll use an example to show this once more. Suppose this time we have three data points, 9, 9.5, and 11, and we assume they were generated by a process well described by a Gaussian distribution. How do we calculate the maximum likelihood estimates of the Gaussian parameters μ and σ?
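One standard route to the answer is to write down the Gaussian log-likelihood of the sample and set its partial derivatives with respect to μ and σ to zero. The result, sketched in symbols, is the familiar pair of closed-form estimates:

```latex
\ell(\mu, \sigma)
  = -\frac{n}{2}\ln\bigl(2\pi\sigma^2\bigr)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2,
\qquad
\widehat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\widehat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\bigl(x_i - \widehat{\mu}\bigr)^2
```

That is, under a Gaussian model the MLE of the mean is the sample mean, and the MLE of the variance is the mean squared deviation (dividing by $n$, not $n-1$).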

Maximum Likelihood Estimation, or MLE for short, is one solution to probability density estimation.

Maximum Likelihood Estimation treats the problem as an optimisation or search problem, in which we seek the set of parameters under which the joint probability of the observed data sample $X$ is greatest.

To begin, we define a parameter, called theta, that determines both the choice of probability density function and the parameters of that distribution. In general, theta is a vector of values whose different settings correspond to different probability distributions and their parameters.

In Maximum Likelihood Estimation we want to maximise the probability of observing the data $X$ under the joint probability distribution for a given choice of parameters, expressed formally as:

- P(X | theta)

so the MLE of θ is the value that maximises this quantity:

**$\widehat{\theta} = \operatorname*{arg\,max}_{\theta} P(X \mid \theta)$**
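To make this concrete, here is a minimal Python sketch (variable names are my own) that computes the closed-form Gaussian MLEs for the three data points above, namely the sample mean and the biased sample standard deviation:

```python
import numpy as np

# The three observed data points from the text.
data = np.array([9.0, 9.5, 11.0])

# Closed-form Gaussian MLEs: the sample mean, and the square root of
# the *biased* variance (divide by n, not n - 1).
mu_hat = data.mean()
sigma_hat = np.sqrt(((data - mu_hat) ** 2).mean())

print(f"mu_hat    = {mu_hat:.4f}")     # ≈ 9.8333
print(f"sigma_hat = {sigma_hat:.4f}")  # ≈ 0.8498
```

Note the division by `n` rather than `n - 1`: the MLE of the variance is the biased estimator, which differs from the usual sample variance.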

## The log likelihood

The expression for the total probability above is a product, which is awkward to differentiate, so its natural logarithm is almost always used to simplify it. This is perfectly valid because the natural logarithm is a monotonically increasing function: as the value on the x-axis increases, so does the value on the y-axis (see the figure below). This is significant because it guarantees that the maximum of the log of the probability occurs at the same parameter values as the maximum of the original probability function. We can therefore work with the simpler log-likelihood instead of the original likelihood.
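As a sketch of how this plays out in practice, one can minimise the *negative* log-likelihood numerically and recover the same estimates as the closed form. This assumes SciPy is available; the function and variable names are my own:

```python
import numpy as np
from scipy.optimize import minimize

data = np.array([9.0, 9.5, 11.0])

def neg_log_likelihood(params):
    """Negative Gaussian log-likelihood (optimisers minimise, so we negate)."""
    mu, log_sigma = params           # optimise log(sigma) to keep sigma > 0
    sigma = np.exp(log_sigma)
    n = data.size
    return (0.5 * n * np.log(2 * np.pi * sigma ** 2)
            + ((data - mu) ** 2).sum() / (2 * sigma ** 2))

result = minimize(neg_log_likelihood, x0=[10.0, 0.0])
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])
print(mu_hat, sigma_hat)
```

The optimiser converges to the same μ̂ and σ̂ as the analytic solution, which is exactly what the monotonicity argument above guarantees.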

## Example Applications of Maximum Likelihood Estimation

Maximum likelihood estimation is effective in a wide range of empirical applications due to its adaptability. It can be used in a wide range of models, from simple linear regression to complex choice models.

We cover two applications here:

- The linear regression model
- The probit model
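As an illustrative sketch of the first application (the data and coefficients here are hypothetical, chosen only for the example): under the assumption of Gaussian errors, maximising the likelihood of a linear regression model is equivalent to ordinary least squares, so the MLE of the coefficients can be obtained with a least-squares solver:

```python
import numpy as np

# Hypothetical data: y = 2 + 3x + Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, size=100)

# Under Gaussian errors, maximising the likelihood over the coefficients
# is equivalent to minimising the sum of squared residuals (OLS).
X = np.column_stack([np.ones_like(x), x])        # add intercept column
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]  # MLE of (intercept, slope)

# MLE of the error variance: the mean squared residual (divide by n).
residuals = y - X @ beta_hat
sigma2_hat = (residuals ** 2).mean()

print(beta_hat)     # estimates should be close to (2, 3)
print(sigma2_hat)   # estimate should be close to the true variance, 1
```

The probit model has no closed-form solution, so its MLE is typically found by numerically maximising the log-likelihood, just as in the SciPy example above.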

## Conclusion

Congratulations! After reading today’s blog you should have a better understanding of the foundations of maximum likelihood estimation. In particular, we’ve covered the concept of maximum likelihood estimation and how to calculate the estimates.