Univariate Outlier Detection Based On Normal Distribution ~ Research Mining

Saturday, 3 May 2014


Detection of Univariate Outlier Based On Normal Distribution
Data involving only one attribute or variable are called univariate data. For simplicity, we often assume that such data are generated from a normal distribution. We can then learn the parameters of the normal distribution from the input data and identify the points with low probability as outliers.
Univariate outlier detection using maximum likelihood:

Suppose a city’s average temperature values in July in the last 10 years are, in value-ascending order, 24.0°C, 28.9°C, 28.9°C, 29.0°C, 29.1°C, 29.1°C, 29.2°C, 29.2°C, 29.3°C and 29.4°C. Let’s assume that the average temperature follows a normal distribution, which is determined by two parameters: the mean, μ, and the standard deviation, σ.
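We can use the maximum likelihood method to estimate the parameters μ and σ. That is, we maximize the log-likelihood function

$$\ln L(\mu, \sigma^2) = \sum_{i=1}^{n} \ln f(x_i \mid \mu, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2,$$

where n is the total number of samples, which is 10 in this example.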
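To make the log-likelihood concrete, here is a minimal Python sketch that evaluates it for the July temperatures. The function name `normal_log_likelihood` and the two candidate parameter settings are illustrative assumptions, not part of the original post.

```python
import math

def normal_log_likelihood(data, mu, sigma2):
    """Log-likelihood of i.i.d. normal samples with mean mu and variance sigma2."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - squared_error / (2 * sigma2))

# July average temperatures from the example, in degrees Celsius.
temperatures = [24.0, 28.9, 28.9, 29.0, 29.1, 29.1, 29.2, 29.2, 29.3, 29.4]

# Two hypothetical candidate means with the same variance:
# the candidate closer to the data should score a higher log-likelihood.
ll_near = normal_log_likelihood(temperatures, mu=28.6, sigma2=2.4)
ll_far = normal_log_likelihood(temperatures, mu=27.0, sigma2=2.4)
print(ll_near > ll_far)  # True
```

Maximum likelihood estimation simply looks for the pair (μ, σ²) that makes this value as large as possible.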
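Taking derivatives with respect to μ and σ² and solving the resulting system of first-order conditions leads to the following maximum likelihood estimates:

$$\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2.$$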
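For readers who want the intermediate step, the first-order conditions can be written out explicitly. This is the standard derivation for a normal sample, sketched here with ln L denoting the log-likelihood function and x̄ the sample mean.

```latex
\begin{aligned}
\frac{\partial \ln L}{\partial \mu}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0
  &&\Longrightarrow\quad \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}, \\
\frac{\partial \ln L}{\partial \sigma^2}
  &= -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0
  &&\Longrightarrow\quad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 .
\end{aligned}
```

Note that the variance estimate divides by n rather than n − 1, which is what distinguishes the maximum likelihood estimate from the usual unbiased sample variance.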
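In this example, we have

$$\hat{\mu} = \frac{24.0 + 28.9 + 28.9 + 29.0 + 29.1 + 29.1 + 29.2 + 29.2 + 29.3 + 29.4}{10} = 28.61$$

$$\hat{\sigma}^2 = \frac{(24.0 - 28.61)^2 + (28.9 - 28.61)^2 + \cdots + (29.4 - 28.61)^2}{10} \approx 2.38$$

Accordingly, we have $\hat{\sigma} = \sqrt{2.38} \approx 1.54$.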
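The estimates for this example can be reproduced in a few lines of Python. This is a small sketch using only the standard library; the variable names are chosen here for illustration.

```python
import math

# July average temperatures from the example, in degrees Celsius.
temperatures = [24.0, 28.9, 28.9, 29.0, 29.1, 29.1, 29.2, 29.2, 29.3, 29.4]
n = len(temperatures)

# Maximum likelihood estimate of the mean: the sample average.
mu_hat = sum(temperatures) / n

# Maximum likelihood estimate of the variance: divide by n, not n - 1.
sigma2_hat = sum((x - mu_hat) ** 2 for x in temperatures) / n
sigma_hat = math.sqrt(sigma2_hat)

print(round(mu_hat, 2), round(sigma2_hat, 2), round(sigma_hat, 2))
# prints: 28.61 2.38 1.54
```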

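The most deviating value, 24.0°C, is 4.61°C away from the estimated mean. We know that the region μ ± 3σ contains 99.7% of the data under the assumption of a normal distribution. Because

$$\frac{4.61}{\hat{\sigma}} = \frac{4.61}{1.54} \approx 2.99,$$

the value lies at the very edge of this region. The probability that the normal distribution generates a value this low is about 0.14%, which is less than the 0.15% left in each tail outside μ ± 3σ, and thus 24.0°C can be identified as an outlier.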
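The final check can be sketched in Python as well. The helper `normal_cdf` and the 0.15% cutoff variable `tail_cutoff` are assumptions introduced for this illustration; the standard normal CDF is computed from `math.erf` in the standard library.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# July average temperatures from the example, in degrees Celsius.
temperatures = [24.0, 28.9, 28.9, 29.0, 29.1, 29.1, 29.2, 29.2, 29.3, 29.4]
n = len(temperatures)

# Maximum likelihood estimates (variance divides by n).
mu_hat = sum(temperatures) / n
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in temperatures) / n)

# The most deviating value is the smallest one, 24.0.
suspect = min(temperatures)
z = (suspect - mu_hat) / sigma_hat   # roughly -2.99
lower_tail = normal_cdf(z)           # probability of a value this low or lower

tail_cutoff = 0.0015                 # assumed cutoff: 0.15% in one tail
is_outlier = lower_tail < tail_cutoff

print(f"z = {z:.2f}, lower-tail probability = {lower_tail:.2%}, outlier = {is_outlier}")
```

Comparing the tail probability against the cutoff is used here instead of testing |z| > 3 directly. With the maximum likelihood σ̂ and only 10 samples, no single value can be more than √(n − 1) = 3 standard deviations from the mean, so a strict 3σ test would never fire on a sample of this size.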

