A Genetic Algorithm-based Local Outlier Factor for Efficient Big Data Stream Processing

Citation

Alghushairy, Omar Saleh. (2021-05). A Genetic Algorithm-based Local Outlier Factor for Efficient Big Data Stream Processing. Theses and Dissertations Collection, University of Idaho Library Digital Collections. https://www.lib.uidaho.edu/digital/etd/items/alghushairy_idaho_0089e_12040.html

Title:
A Genetic Algorithm-based Local Outlier Factor for Efficient Big Data Stream Processing
Author:
Alghushairy, Omar Saleh
Date:
2021-05
Keywords:
Data Science; Outlier Detection
Program:
Computer Science
Subject Category:
Computer science
Abstract:

Interest in outlier detection methods is increasing because detecting outliers is an important operation in many applications, such as credit card fraud detection, network intrusion detection, and data analysis in different domains. We are now in the big data era, and data streams are an important type of big data. With the increasing need to analyze high-velocity data streams, it has become difficult to apply older outlier detection methods efficiently. Local Outlier Factor (LOF) is a well-known outlier detection algorithm. A major challenge of LOF is that it requires the entire dataset and the distance values to be stored in memory. Another issue is that LOF must be recalculated from the beginning whenever the dataset changes. This research proposes a novel local outlier detection algorithm for data streams, called the Genetic-based Incremental Local Outlier Factor (GILOF). We further improve the performance of GILOF in data streams by proposing a new calculation method for LOF, called Local Outlier Factor by Reachability distance (LOFR). The improved algorithm for local outlier detection in data streams is called the Genetic-based Incremental Local Outlier Factor by Reachability distance (GILOFR). The GILOF and GILOFR algorithms work without any prior knowledge of the data distribution and can execute in limited memory. Experiments with various real-world datasets demonstrate that the proposed algorithms perform very well in both execution time and outlier detection accuracy.
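
For readers unfamiliar with LOF, the memory cost described in the abstract can be illustrated with a minimal sketch of the classic batch LOF computation. This is not the GILOF, LOFR, or GILOFR method from the dissertation; the function name `lof_scores`, the parameter `k`, and the sample points are assumptions made for illustration only. The sketch assumes all points are distinct and ignores distance ties when selecting neighbors.

```python
import math


def lof_scores(points, k):
    """Return the classic batch LOF score of every point (illustrative sketch).

    Assumes distinct points and 1 <= k < len(points); ties are ignored.
    """
    n = len(points)
    # Full pairwise distance matrix: this O(n^2) storage is the memory
    # burden that makes batch LOF hard to apply to data streams.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors = []  # indices of the k nearest neighbors of each point
    k_dist = []     # distance from each point to its k-th nearest neighbor
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance
    # from a point to its k nearest neighbors.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbors' density to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


# Hypothetical data: four clustered points and one distant point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = lof_scores(sample, k=2)
```

Scores close to 1 indicate points whose density matches that of their neighbors, while scores well above 1 indicate likely outliers. Adding or removing a single point can change the neighbor sets and densities of other points, so this batch version has to be rerun over the whole dataset after any change.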

Description:
doctoral, Ph.D., Computer Science -- University of Idaho - College of Graduate Studies, 2021-05
Major Professor:
Ma, Xiaogang
Committee:
Soule, Terence; Sheldon, Frederick; Song, Jia
Defense Date:
2021-05
Identifier:
Alghushairy_idaho_0089E_12040
Type:
Text
Format Original:
PDF
Format:
application/pdf

Rights
Rights:
In Copyright - Educational Use Permitted. For more information, please contact University of Idaho Library Special Collections and Archives Department at libspec@uidaho.edu.
Standardized Rights:
http://rightsstatements.org/vocab/InC-EDU/1.0/