A neural network for discovery of record layouts

Azita Bahrami, Ray R. Hashemi, John R. Talburt, Zhebnya Burachevsky

Research output: Contribution to book or proceeding › Conference article › peer-review

1 Scopus citation

Abstract

In order for automated systems to interact with ASCII files, they must be able to discover the layout of records. There are three formats of record layouts: fixed, delimited, and mixed. The goal of this paper is to discover the record layout of fixed format. Each record of such files is considered to be a character string in which the fields' startings and endings are unknown and two adjacent fields may or may not be separated by a space. A new neural network was devised to accept a random sample of the file's records as a working set to discover the record layout for the file. The validity of the methodology was established through using 20 different synthesized files with regard to number of fields, length of fields, order of fields, content of fields, or any combination of them. The methodology's ability to discover the record layouts has 95% accuracy.

Original language: English
Title of host publication: Proceedings of the IADIS International Conference WWW/Internet 2009, ICWI 2009
Editors: Patricia Barbosa, Miguel Baptista Nunes, Pedro Isaias, Bebo White, Luis Rodrigues
Publisher: IADIS
Pages: 319-323
Number of pages: 5
ISBN (Electronic): 9789728924935
State: Published - 2009
Event: IADIS International Conference WWW/Internet 2009, ICWI 2009 - Rome, Italy
Duration: Nov 19 2009 → Nov 22 2009

Publication series

Name: Proceedings of the IADIS International Conference WWW/Internet 2009, ICWI 2009
Volume: 2

Conference

Conference: IADIS International Conference WWW/Internet 2009, ICWI 2009
Country/Territory: Italy
City: Rome
Period: 11/19/09 → 11/22/09

Scopus Subject Areas

  • Signal Processing
  • Computer Networks and Communications

Keywords

  • Record layout discovery
  • Knowledge discovery
  • Neural network
  • Record layout
