Publication type: | Conference paper |
Type of review: | Peer review (abstract) |
Title: | Manual and semi-automatic normalization of historical spelling : case studies from Early New High German |
Authors: | Bollmann, Marcel; Dipper, Stefanie; Krasselt, Julia; Petran, Florian |
Proceedings: | Proceedings of the 11th Edition of the Conference on Natural Language Processing (KONVENS). Vienna, September 19-21, 2012 |
Page(s): | 342 |
Pages to: | 350 |
Conference details: | Conference on Natural Language Processing (KONVENS 2012), Vienna, Austria, 21 September 2012 |
Issue Date: | 2012 |
Series: | Schriftenreihe der Österreichischen Gesellschaft für Artificial Intelligence (ÖGAI) |
Series volume: | 5 |
Publisher / Ed. Institution: | Eigenverlag ÖGAI, Wien |
ISBN: | 3-85027-005-X |
Language: | English |
Subject (DDC): | 410.285: Computational linguistics |
Abstract: | This paper presents work on manual and semi-automatic normalization of historical language data. We first address the guidelines that we use for mapping historical to modern word forms. The guidelines distinguish between normalization (preferring forms close to the original) and modernization (preferring forms close to modern language). Average inter-annotator agreement is 88.38% on a set of data from Early New High German. We then present Norma, a semi-automatic normalization tool. It integrates different modules (lexicon lookup, rewrite rules) for normalizing words in an interactive way. The tool dynamically updates the set of rule entries, given new input. Depending on the text and training settings, normalizing 1,000 tokens results in overall accuracies of 61.78–79.65% (baseline: 24.76–59.53%). |
URI: | http://www.oegai.at/konvens2012/proceedings/51_bollmann12w/51_bollmann12w.pdf https://digitalcollection.zhaw.ch/handle/11475/4045 |
Fulltext version: | Published version |
License (according to publishing contract): | Licence according to publishing contract |
Department: | Applied Linguistics |
Appears in collections: | Publikationen Angewandte Linguistik |
Bollmann, M., Dipper, S., Krasselt, J., & Petran, F. (2012). Manual and semi-automatic normalization of historical spelling : case studies from Early New High German [Conference paper]. Proceedings of the 11th Edition of the Conference on Natural Language Processing (KONVENS). Vienna, September 19-21, 2012, 342–350. http://www.oegai.at/konvens2012/proceedings/51_bollmann12w/51_bollmann12w.pdf