A data-driven methodology to assess text complexity based on syntactic and semantic measurements

Diego Palma, Christian Soto, Mónica Veliz, Bernardo Riffo, Antonio Gutiérrez

Research output: Contribution to book or proceeding; Conference article; peer-reviewed

4 Scopus citations

Abstract

In this paper we propose a data-driven methodology to assess the text complexity of Spanish school texts. We model the problem as a classification task that can be solved in a data-driven fashion using machine learning techniques. We show empirically that the discriminative power of the classifier depends on the school grade level. Our proposal includes multiple predictors that capture different dimensions of text complexity, such as coherence and cohesion. We provide an importance analysis of the predictors across several complexity levels. Finally, we assess the model's performance using accuracy and correlation measurements. The proposed model achieves accuracies of 0.7.
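
As an illustration of the evaluation step mentioned in the abstract, the following minimal sketch computes accuracy and a correlation coefficient between true and predicted complexity levels. It is not the authors' code: the abstract does not say which correlation measure was used, so Pearson's coefficient is an assumption here, and the function names and the toy grade levels are hypothetical and not taken from the paper.

```python
from math import sqrt


def accuracy(true_levels, predicted_levels):
    """Fraction of texts whose predicted level equals the true level."""
    if len(true_levels) != len(predicted_levels) or not true_levels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_levels, predicted_levels))
    return hits / len(true_levels)


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences.

    Assumption: the paper only says "correlation measurements";
    Pearson is used here purely for illustration.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy labels (grade levels), NOT data from the paper.
true_levels = [1, 2, 3, 4, 5, 6, 7, 8]
predicted_levels = [1, 2, 4, 4, 5, 5, 7, 8]

print("accuracy:", accuracy(true_levels, predicted_levels))
print("pearson r:", round(pearson(true_levels, predicted_levels), 3))
```

Reporting both measures is useful when the classes are ordered: accuracy counts only exact matches, while a correlation coefficient also gives credit to predictions that miss by a single grade level.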

Original language: English
Title of host publication: Human Interaction and Emerging Technologies - Proceedings of the 1st International Conference on Human Interaction and Emerging Technologies, IHIET 2019
Editors: Tareq Ahram, Redha Taiar, Serge Colson, Arnaud Choplin
Publisher: Springer Verlag
Pages: 509-515
Number of pages: 7
ISBN (Print): 9783030256289
State: Published - 2020
Event: 1st International Conference on Human Interaction and Emerging Technologies, IHIET 2019 - Nice, France
Duration: Aug 22 2019 to Aug 24 2019

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 1018
ISSN (Print): 2194-5357
ISSN (Electronic): 2194-5365

Conference

Conference: 1st International Conference on Human Interaction and Emerging Technologies, IHIET 2019
Country/Territory: France
City: Nice
Period: 08/22/19 to 08/24/19

Keywords

  • Artificial intelligence
  • Educational systems
  • Machine learning
  • Natural language processing
  • Text difficulty assessment
