Optimized common parameter set extraction framework by multiple benchmarking applications on a big data platform

Jongyeop Kim, Abhilash Kancharla, Jongho Seol, Indy N. Park, Nohpill Park

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

This research proposes a methodology for extracting a common configuration parameter set by applying multiple benchmarking applications, including TeraSort, TestDFSIO, and MrBench, on the Hadoop distributed file system. The parameter search space is conceptually represented as Ω(x), which holds the status of all parameter values and their evaluation results at every stage in order to reduce the benchmarking cost. In determining the parameter set for each stage, one parameter and its associated value are selected that reduce the overall execution time, as measured by the execution time difference across multiple applications running on a Hadoop cluster. The experimental results demonstrate that the proposed extended greedy approach provides a feasible benchmark model for multiple MapReduce tasks. The model identified several candidate parameter value sets that reduce the overall execution time by 27% relative to the Hadoop default settings. Moreover, we propose an e-heuristic greedy model with alternative parameter selection, which evaluates the second candidate parameter value by returning to the previous stage if no local minimum is found at the current stage compared with the previous ones, with the aim of reaching the global optimum.
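
The stage-wise greedy selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The parameter names (`sort_buffer_mb`, `block_size_mb`), the candidate values, and the two synthetic cost functions standing in for real benchmark runs are all hypothetical, and the alternative-selection (backtracking) step is not covered.

```python
from typing import Callable, Dict, List, Optional, Tuple

Config = Dict[str, int]
Benchmark = Callable[[Config], float]


def total_time(config: Config, benchmarks: List[Benchmark]) -> float:
    """Overall execution time: the sum over all benchmark applications."""
    return sum(bench(config) for bench in benchmarks)


def greedy_common_parameters(
    defaults: Config,
    search_space: Dict[str, List[int]],
    benchmarks: List[Benchmark],
) -> Tuple[Config, float, List[Tuple[Optional[str], Optional[int], float]]]:
    """At each stage, fix the single (parameter, value) pair that lowers
    the overall execution time the most; stop when nothing improves."""
    current = dict(defaults)
    best_time = total_time(current, benchmarks)
    remaining = set(search_space)
    # The history plays the role of a per-stage record of evaluation results.
    history: List[Tuple[Optional[str], Optional[int], float]] = [(None, None, best_time)]

    while remaining:
        candidates = []
        for name in sorted(remaining):
            for value in search_space[name]:
                trial = {**current, name: value}
                candidates.append((total_time(trial, benchmarks), name, value))
        stage_time, name, value = min(candidates)
        if stage_time >= best_time:
            break  # no candidate improves on the previous stage
        current[name] = value
        best_time = stage_time
        remaining.remove(name)
        history.append((name, value, stage_time))

    return current, best_time, history


# Hypothetical stand-ins for benchmark runs (assumed cost shapes, not measurements).
def sort_like(config: Config) -> float:
    return abs(config["sort_buffer_mb"] - 200) / 10 + 50


def io_like(config: Config) -> float:
    return abs(config["block_size_mb"] - 256) / 16 + 30


defaults = {"sort_buffer_mb": 100, "block_size_mb": 128}
space = {"sort_buffer_mb": [100, 200, 400], "block_size_mb": [64, 128, 256]}
best, best_time, history = greedy_common_parameters(defaults, space, [sort_like, io_like])
```

In a real setting each cost function would be replaced by an actual benchmark run on the cluster, which is why recording every stage's results matters: it avoids re-running configurations that have already been evaluated.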

Original language: English
Pages (from-to): 195-203
Number of pages: 9
Journal: International Journal of Networked and Distributed Computing
Volume: 6
Issue number: 4
State: Published - Sep 2018

Keywords

  • Big data
  • Configuration
  • Hadoop
  • Performance tuning
