Please use this identifier to cite or link to this item:
https://doi.org/10.21256/zhaw-30944
Publication type: | Article in scientific journal |
Type of review: | Peer review (publication) |
Title: | Automated extraction and analysis of sentences under production : a theoretical framework and its evaluation |
Authors: | Ulasik, Malgorzata Anna; Miletić, Aleksandra |
et al.: | No |
DOI: | 10.3390/languages9030071; 10.21256/zhaw-30944 |
Published in: | Languages |
Volume(Issue): | 9 |
Issue: | 3 |
Page(s): | 71 |
Issue Date: | 22-Feb-2024 |
Publisher / Ed. Institution: | MDPI |
ISSN: | 2226-471X |
Language: | English |
Subjects: | Writing process; Keystroke logging; Sentence production; Text history; Sentence history; Linguistic modeling |
Subject (DDC): | 808: Rhetoric and writing |
Abstract: | Sentences are generally understood to be essential communicative units in writing that are built to express thoughts and meanings. Studying sentence production provides a valuable opportunity to shed new light on the writing process itself and on the underlying cognitive processes. Nevertheless, research on the production of sentences in writing remains scarce. We propose a theoretical framework and an open-source implementation that aim to facilitate the study of sentence production based on keystroke logs. We centre our approach around the notion of sentence history: all the versions of a given sentence during the production of a text. The implementation takes keystroke logs as input and extracts sentence versions, aggregates them into sentence histories and evaluates the sentencehood of each sentence version. We provide a detailed evaluation of the implementation based on a manually annotated corpus of texts in French, German and English. The implementation yields strong results on the three processing aspects. |
Further description: | Data will be made available later on GitHub: https://github.com/mulasik/wta |
URI: | https://digitalcollection.zhaw.ch/handle/11475/30944 |
Related research data: | https://github.com/mulasik/wta |
Fulltext version: | Published version |
License (according to publishing contract): | CC BY 4.0: Attribution 4.0 International |
Department: | Applied Linguistics |
Organisational Unit: | Institute of Language Competence (ILC) |
Published as part of the ZHAW project: | SPPC: Swiss Process–Product Corpus of Student Writing Development |
Appears in collections: | Publikationen Angewandte Linguistik |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
2024_Ulasik-Miletic_Automated-extraction-and-analysis-of-sentences-under-production.pdf | | 807.63 kB | Adobe PDF |
Ulasik, M. A., & Miletić, A. (2024). Automated extraction and analysis of sentences under production : a theoretical framework and its evaluation. Languages, 9(3), 71. https://doi.org/10.3390/languages9030071
Ulasik, M.A. and Miletić, A. (2024) ‘Automated extraction and analysis of sentences under production : a theoretical framework and its evaluation’, Languages, 9(3), p. 71. Available at: https://doi.org/10.3390/languages9030071.
M. A. Ulasik and A. Miletić, “Automated extraction and analysis of sentences under production : a theoretical framework and its evaluation,” Languages, vol. 9, no. 3, p. 71, Feb. 2024, doi: 10.3390/languages9030071.
ULASIK, Malgorzata Anna and Aleksandra MILETIĆ, 2024. Automated extraction and analysis of sentences under production : a theoretical framework and its evaluation. Languages. 22 February 2024. Vol. 9, no. 3, p. 71. DOI 10.3390/languages9030071
Ulasik, Malgorzata Anna, and Aleksandra Miletić. 2024. “Automated Extraction and Analysis of Sentences under Production : A Theoretical Framework and Its Evaluation.” Languages 9 (3): 71. https://doi.org/10.3390/languages9030071.
Ulasik, Malgorzata Anna, and Aleksandra Miletić. “Automated Extraction and Analysis of Sentences under Production : A Theoretical Framework and Its Evaluation.” Languages, vol. 9, no. 3, Feb. 2024, p. 71, https://doi.org/10.3390/languages9030071.