Big Cyber Security Data Analysis with Apache Mahout

Omotola Adekanmbi, Hayden Wimmer, Jongyeop Kim

Research output: Contribution to journal, Article, peer-review

3 Scopus citations

Abstract

Machine learning classifiers are well-known algorithms for network intrusion detection. Because of the drastic growth of data, new tools are required to handle such large volumes within a short time frame. In this paper, we present a model that uses the Apache Mahout framework to train three machine learning classifiers, Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB), on the CSE-CIC-IDS2018 dataset, applying Chi-Square and ANOVA F-test filter-based feature selection techniques on the Apache Hadoop framework. The performance of the classifiers is measured in terms of Accuracy, Kappa, Precision, Recall, and F1-Score to support a comparative analysis.
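
As a rough illustration of the five evaluation measures named in the abstract, the sketch below computes them from a binary confusion matrix. It is not the authors' Mahout pipeline: the function name `binary_metrics` and the example counts are hypothetical, and treating the task as binary (attack vs. benign) is an assumption made only to keep the example short.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, Cohen's kappa, precision, recall, and F1
    from the four cells of a binary confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives (hypothetical inputs).
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa compares observed agreement (accuracy) with the
    # agreement expected by chance from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (total ** 2)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not results from the paper).
example = binary_metrics(tp=40, fp=10, fn=20, tn=30)
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when attack and benign traffic are imbalanced.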

Disciplines

  • Computer Sciences

Keywords

  • HDFS
  • Hadoop
  • Intrusion Detection System (IDS)
  • Logistic Regression
  • Mahout
  • MapReduce
  • Naïve Bayes
  • Random Forest
