Year of Graduation

2020

Level of Access

Open Access Thesis

Embargo Period

5-13-2020

Department or Program

Computer Science

First Advisor

Fernando Nascimento

Second Advisor

Stephen Majercik

Abstract

During the course of research, scholars often explore large textual databases for segments of text relevant to their conceptual analyses. This study proposes, develops, and evaluates two algorithms for automated concept detection in theoretical corpora: ACS and WMD retrieval. Both novel algorithms are compared to keyword retrieval, using a test set from the Digital Ricoeur corpus tagged by scholarly experts. WMD retrieval outperforms keyword search on the concept detection task. Thus, WMD retrieval is a promising tool for concept detection and information retrieval systems focused on theoretical corpora.
