An important aspect of an outlier detection technique is the nature of the desired outlier. Outliers can be classified into the following three categories:
- Point Outliers
- Contextual Outliers
- Collective Outliers
Point Outliers:
If an individual data instance can be considered anomalous with respect to the rest of the data, the instance is termed a point outlier. This is the simplest type of outlier and is the focus of the majority of research on outlier detection. For example, in Figure 1, points o1 and o2, as well as the points in region O3, lie outside the boundary of the normal regions and hence are point outliers, since they differ from the normal data points. As a real-life example, consider credit card fraud detection, where the data set corresponds to an individual's credit card transactions and, for simplicity, each transaction is described by a single feature: the amount spent. A transaction for which the amount spent is very high compared to that person's normal range of expenditure is a point outlier.
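The credit card example can be sketched in a few lines. The post does not prescribe a scoring rule, so this sketch assumes a modified z-score based on the median absolute deviation, with the commonly used cutoff of 3.5. The function name `point_outliers` and the transaction amounts are made up for illustration.

```python
from statistics import median

def point_outliers(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. This scoring rule is an assumption of the
    sketch, not something taken from the post.
    """
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        return []  # no spread in the data, so nothing can be scored
    return [i for i, a in enumerate(amounts)
            if 0.6745 * abs(a - med) / mad > threshold]

# Hypothetical amounts spent by one card holder; one of them is very high.
transactions = [42.0, 55.5, 38.2, 61.0, 47.3, 52.8, 950.0, 44.1]
flagged = point_outliers(transactions)
print(flagged)
```

Only the single feature "amount spent" is used here, which matches the simplification made in the example above.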
Contextual Outliers:
If a data instance is anomalous in a specific context (but not otherwise), it is termed a contextual outlier (also referred to as a conditional outlier [1]). The notion of a context is induced by the structure in the data set and has to be specified as part of the problem formulation. Each data instance is defined using two sets of attributes:
Contextual attributes. The contextual attributes are used to determine the context (or neighborhood) for that instance. For example, in spatial data sets, the longitude and latitude of a location are the contextual attributes. In time-series data, time is a contextual attribute that determines the position of an instance in the entire sequence.
Behavioral attributes. The behavioral attributes define the non-contextual characteristics of an instance. For example, in a spatial data set describing the average rainfall of the entire world, the amount of rainfall at any location is a behavioral attribute.
The anomalous behavior is determined using the values of the behavioral attributes within a specific context. A data instance might be a contextual outlier in a given context, but an identical data instance (in terms of behavioral attributes) could be considered normal in a different context. This property is key in identifying contextual and behavioral attributes for a contextual outlier detection technique.
Figure 3: Contextual outlier t2 in a temperature time series. The temperature at time t1 is the same as that at time t2 but occurs in a different context and hence is not considered an outlier.
Contextual outliers have been most commonly explored in time-series data [2] and spatial data [3]. Figure 3 shows one such example: a temperature time series giving the monthly temperature of an area over the last few years. A temperature of 35°F might be normal during the winter (at time t1) at that place, but the same value during the summer (at time t2) would be an outlier. Similarly, a six-foot-tall adult is unremarkable, but a six-foot-tall child, viewed in the context of age, would be an outlier.
A similar example can be found in credit card fraud detection, with the time of purchase as the contextual attribute. Suppose an individual usually has a weekly shopping bill of $100 except during Christmas week, when it reaches $1000. A new purchase of $1000 in a week in July will be considered a contextual outlier, since it does not conform to the normal behavior of the individual in the context of time (even though the same amount spent during Christmas week would be considered normal).
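A minimal sketch of the shopping-bill example follows. It assumes that each weekly bill has already been labeled with a context ("ordinary" or "christmas"), and it compares every bill only with the median of its own context. The labels, the 50% tolerance, the function name `contextual_outliers`, and the bill history are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median

def contextual_outliers(records, tolerance=0.5):
    """Flag records whose value is far from the median of their own context.

    records is a list of (context, value) pairs: the context plays the role
    of the contextual attribute and the value is the behavioral attribute.
    A record is flagged when it deviates from its context's median by more
    than tolerance times that median.
    """
    by_context = defaultdict(list)
    for context, value in records:
        by_context[context].append(value)
    baseline = {c: median(values) for c, values in by_context.items()}
    return [i for i, (c, v) in enumerate(records)
            if abs(v - baseline[c]) > tolerance * baseline[c]]

# Hypothetical weekly bills; the last one is a $1000 week in July.
history = [
    ("ordinary", 100), ("ordinary", 95), ("ordinary", 110), ("ordinary", 105),
    ("christmas", 1000), ("christmas", 980), ("christmas", 1020),
    ("ordinary", 1000),
]
flagged = contextual_outliers(history)
print(flagged)
```

The $1000 bill in an ordinary week is flagged, while the identical amount during Christmas week is not, because each value is judged only against its own context.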
The choice of applying a contextual outlier detection technique is determined by the meaningfulness of the contextual outliers in the target application domain. Applying such a technique makes sense if contextual attributes are readily available, so that defining a context is straightforward. It becomes difficult to apply when a context is not easy to define.
Collective Outliers:
If a collection of related data instances is anomalous with respect to the entire data set, it is termed a collective outlier. The individual data instances in a collective outlier may not be outliers by themselves, but their occurrence together as a collection is anomalous. Figure 4 illustrates an example that shows a human electrocardiogram output [4]. The highlighted region denotes an outlier because the same low value persists for an abnormally long time (corresponding to an atrial premature contraction). Note that the low value by itself is not an outlier, but its successive occurrence over a long time is.
Figure 4: Collective outlier in a human ECG output corresponding to an atrial premature contraction.
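The idea that a low value is normal on its own but anomalous when it persists can be sketched as a run-length check. This is only an illustration on a made-up signal, not a real ECG analysis. The function name `long_low_runs`, the `low` cutoff, and the `max_run` limit are assumptions.

```python
def long_low_runs(signal, low=0.1, max_run=3):
    """Return (start, end) index pairs, end exclusive, for every run of
    consecutive values <= low that is longer than max_run samples."""
    runs = []
    start = None  # index where the current low run began, if any
    for i, x in enumerate(signal):
        if x <= low:
            if start is None:
                start = i
        else:
            if start is not None and i - start > max_run:
                runs.append((start, i))
            start = None
    # A low run may extend to the very end of the signal.
    if start is not None and len(signal) - start > max_run:
        runs.append((start, len(signal)))
    return runs

# Made-up signal: short low stretches are normal, the six-sample one is not.
signal = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
runs = long_low_runs(signal)
print(runs)
```

No single sample in the flagged run differs from the other low samples in the signal; only the length of the run makes the collection anomalous.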
As another illustrative example, consider a sequence of actions occurring in a computer, as shown below:
... http-web, buffer-overflow, http-web, http-web, smtp-mail, ftp, http-web, ssh, smtp-mail, http-web, ssh, buffer-overflow, ftp, http-web, ftp, smtp-mail, http-web ...
The three consecutive events ssh, buffer-overflow, ftp correspond to a typical web-based attack by a remote machine, followed by the copying of data from the host computer to a remote destination via ftp. Note that this collection of events is an outlier, but the individual events are not outliers when they occur at other locations in the sequence.
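One way to make this concrete is a sliding-window check against a log of normal activity. The sketch below flags a window of three events when none of its adjacent event pairs was ever seen in the reference log. The rule, the function name `unseen_windows`, and the `reference` log are all assumptions made for illustration. The `observed` list is the sequence from the example above, in which the events ssh, buffer-overflow, ftp appear consecutively.

```python
def unseen_windows(events, reference, width=3):
    """Return (start, window) for every window of `width` events in which no
    adjacent pair of events occurs anywhere in the reference log."""
    if width < 2:
        raise ValueError("width must be at least 2")
    seen = set(zip(reference, reference[1:]))  # transitions seen in normal use
    flagged = []
    for start in range(len(events) - width + 1):
        window = tuple(events[start:start + width])
        if all(pair not in seen for pair in zip(window, window[1:])):
            flagged.append((start, window))
    return flagged

# Hypothetical log of normal activity (not from the post).
reference = ["http-web", "buffer-overflow", "http-web", "http-web", "smtp-mail",
             "ftp", "http-web", "ssh", "smtp-mail", "http-web", "ftp",
             "smtp-mail", "http-web"]

# The sequence of actions from the example.
observed = ["http-web", "buffer-overflow", "http-web", "http-web", "smtp-mail",
            "ftp", "http-web", "ssh", "smtp-mail", "http-web", "ssh",
            "buffer-overflow", "ftp", "http-web", "ftp", "smtp-mail", "http-web"]

flagged = unseen_windows(observed, reference)
print(flagged)
```

Every individual event in the flagged window also occurs elsewhere in `observed` without being flagged, which mirrors the point that only the collection is anomalous.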
Collective outliers have been explored for sequence data [5,6], graph data [7], and spatial data [8]. While point outliers can occur in any data set, collective outliers can occur only in data sets in which the data instances are related. In contrast, the occurrence of contextual outliers depends on the availability of contextual attributes in the data. A point outlier or a collective outlier can also be a contextual outlier if analyzed with respect to a context. Thus, a point outlier detection problem or a collective outlier detection problem can be transformed into a contextual outlier detection problem by incorporating the context information.
References:
Forrest, S., Warrender, C., and Pearlmutter, B. 1999. Detecting intrusions using system calls: Alternate data models. In Proceedings of the 1999 IEEE ISRSP. IEEE Computer Society, Washington, DC, USA, 133-145.
Goldberger, A. L., Amaral, L. A. N., Glass, L., Hausdorff, J. M., Ivanov, P. C., Mark, R. G., Mietus, J. E., Moody, G. B., Peng, C.-K., and Stanley, H. E. 2000. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation 101, 23, e215-e220. Circulation Electronic Pages: http://circ.ahajournals.org/cgi/content/full/101/23/e215.
Kou, Y., Lu, C.-T., and Chen, D. 2006. Spatial weighted outlier detection. In Proceedings of SIAM Conference on Data Mining.
Noble, C. C. and Cook, D. J. 2003. Graph-based outlier detection. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press, 631-636.
Sekar, R., Bendre, M., Dhurjati, D., and Bollineni, P. 2001. A fast automaton-based method for detecting anomalous program behaviors. In Proceedings of the IEEE Symposium on Security and Privacy. IEEE Computer Society, 144.
Song, X., Wu, M., Jermaine, C., and Ranka, S. 2007. Conditional outlier detection. IEEE Transactions on Knowledge and Data Engineering 19, 5, 631-645.
Sun, P., Chawla, S., and Arunasalam, B. 2006. Mining for outliers in sequential databases. In SIAM International Conference on Data Mining.
Weigend, A. S., Mangeas, M., and Srivastava, A. N. 1995. Nonlinear gated experts for time-series – discovering regimes and avoiding overfitting. International Journal of Neural Systems 6, 4, 373-399.