Abstract
Machine learning classifiers are widely used for network intrusion detection. Because of the drastic growth of data, new tools are required to handle such large volumes within a short time frame. In this paper, we present a model that uses the Apache Mahout framework to train three machine learning classifiers, Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB), on the CSE-CIC-IDS2018 dataset, using Chi-Square and ANOVA F-test filter-based feature selection techniques on the Apache Hadoop framework. Classifier performance is measured in terms of Accuracy, Kappa, Precision, Recall, and F1-Score to support a comparative analysis of the classifiers.
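The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary (attack vs. benign) confusion matrix and uses the standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper, which runs its evaluation through Apache Mahout rather than plain Python.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, Cohen's kappa, precision, recall, and F1.

    Inputs are counts from a binary confusion matrix (hypothetical
    helper for illustration; not the paper's implementation).
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the row and column marginals.
    p_pos = ((tp + fp) / total) * ((tp + fn) / total)
    p_neg = ((fn + tn) / total) * ((fp + tn) / total)
    p_chance = p_pos + p_neg
    kappa = ((accuracy - p_chance) / (1 - p_chance)
             if p_chance != 1 else 0.0)

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up counts, used only to exercise the function.
    for name, value in binary_metrics(tp=40, fp=10, fn=20, tn=30).items():
        print(f"{name}: {value:.4f}")
```

Kappa is reported alongside accuracy because it discounts chance agreement, which matters when attack and benign traffic are heavily imbalanced.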
Original language | American English |
---|---|
Pages (from-to) | 83-90 |
Number of pages | 8 |
Journal | IEEE/ACIS 20th International Conference on Software Engineering Research, Management and Applications (SERA) Proceedings |
State | Published - May 25 2022 |
Scopus Subject Areas
- Management of Technology and Innovation
- Computer Networks and Communications
- Computer Science Applications
- Software
- Safety, Risk, Reliability and Quality
Keywords
- HDFS
- Hadoop
- Intrusion Detection System (IDS)
- Logistic Regression
- Mahout
- MapReduce
- Naive Bayes
- Random Forest