Developing a GPT-Based Text Extraction Model for Cancer Information

Yong Jeong Yi, Jaemin Jo, Beom Jun Bae, Hyunwoo Moon, June Yoon, Sanghyuk Lee

Research output: Contribution to book or proceeding › Conference article › peer-review

Abstract

Using Aristotle's rhetoric as the theoretical framework, the present study aims to develop a model that automatically extracts the three key components of persuasive strategies, namely ethos (authority), pathos (emotional appeal), and logos (logic), from answers to pertinent cancer questions on Quora, a social question-and-answer platform. Furthermore, we apply the model to two separate groups, the most upvoted answers and random (non-upvoted) answers, to compare differences in the three persuasive components. The full dataset consists of 103 questions and their corresponding answers, including both upvoted and random answers. For the preliminary findings, 33 questions and their answers were used, with answers to 19 questions serving as training data and answers to 14 questions serving as test data. We annotated sentences in the answers according to the three types of rhetoric employed. We then fine-tuned models based on Generative Pretrained Transformers (GPT) to classify the phrases, achieving an average F1 score of 0.84. Paired-sample t-tests confirmed our research hypotheses regarding ethos and logos, while our hypothesis about pathos was not confirmed. The results suggest that ethos and logos are effective in communicating cancer information to consumers, but that pathos is not.
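
The two evaluation steps named in the abstract, an average F1 score over the three rhetoric labels and a paired-sample t-test, can be illustrated with a minimal sketch. The abstract does not say how the F1 scores were averaged or how observations were paired, so this sketch assumes macro averaging across the labels and one (upvoted, random) pair of counts per question. The function names and any data passed to them are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, stdev

# Assumed label set, following the three components named in the abstract.
LABELS = ("ethos", "pathos", "logos")


def f1_for_label(gold, pred, label):
    """F1 score for one label, treating it as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of the per-label F1 scores (assumed averaging scheme)."""
    return mean(f1_for_label(gold, pred, label) for label in labels)


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for a paired-sample t-test.

    Raises ZeroDivisionError if all paired differences are identical.
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1
```

The t statistic would still need to be compared against a t distribution with the returned degrees of freedom to obtain a p-value; that step is left out here to keep the sketch dependency-free.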

Original language: English
Title of host publication: Proceedings of the 14th International Conference on Cloud Computing, Data Science and Engineering, Confluence 2024
Editors: Sanjeev Thakur, Rakesh Garg, Abhishek Singhal, Sumit Kumar, Sumit Kumar, Renuka Arora, Rajni Sehgal Kaushik
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 165-169
Number of pages: 5
ISBN (Electronic): 9798350344837
State: Published - 2024
Event: 14th International Conference on Cloud Computing, Data Science and Engineering, Confluence 2024 - Noida, India
Duration: Jan 18 2024 → Jan 19 2024

Publication series

Name: Proceedings of the 14th International Conference on Cloud Computing, Data Science and Engineering, Confluence 2024

Conference

Conference: 14th International Conference on Cloud Computing, Data Science and Engineering, Confluence 2024
Country/Territory: India
City: Noida
Period: 01/18/24 → 01/19/24

Keywords

  • Aristotle's rhetoric
  • artificial intelligence
  • cancer information
  • ChatGPT
  • machine learning
  • persuasion
  • social Q&A
