TY - JOUR
T1 - Big Cyber Security Data Analysis with Apache Mahout
AU - Adekanmbi, Omotola
AU - Wimmer, Hayden
AU - Kim, Jongyeop
PY - 2023/6/30
Y1 - 2023/6/30
N2 - Machine learning classifiers are well-known algorithms for network intrusion detection. Because of the drastic growth of data, new tools are required to handle such large volumes within a short time frame. In this paper, we present a model that uses the Apache Mahout framework to train the machine learning classifiers Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB) on the CSE-CIC-IDS2018 dataset, using Chi-Square and ANOVA F-test filter-based feature selection techniques on an Apache Hadoop framework. The performance of the classifiers is measured in terms of Accuracy, Kappa, Precision, Recall, and F1-Score for a comparative analysis of the various machine learning classifiers.
AB - Machine learning classifiers are well-known algorithms for network intrusion detection. Because of the drastic growth of data, new tools are required to handle such large volumes within a short time frame. In this paper, we present a model that uses the Apache Mahout framework to train the machine learning classifiers Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB) on the CSE-CIC-IDS2018 dataset, using Chi-Square and ANOVA F-test filter-based feature selection techniques on an Apache Hadoop framework. The performance of the classifiers is measured in terms of Accuracy, Kappa, Precision, Recall, and F1-Score for a comparative analysis of the various machine learning classifiers.
KW - HDFS
KW - Hadoop
KW - Intrusion Detection System (IDS)
KW - Logistic Regression
KW - Mahout
KW - MapReduce
KW - Naïve Bayes
KW - Random Forest
UR - https://digitalcommons.georgiasouthern.edu/information-tech-facpubs/174
UR - https://doi.org/10.1109/SERA54885.2022.9806807
U2 - 10.1109/SERA54885.2022.9806807
DO - 10.1109/SERA54885.2022.9806807
M3 - Article
JO - IEEE/ACIS 20th International Conference on Software Engineering Research, Management and Applications (SERA) Proceedings
JF - IEEE/ACIS 20th International Conference on Software Engineering Research, Management and Applications (SERA) Proceedings
ER -