A Reliability Study of Automatic Real-Time Speech Transcription (Captioning) in Distance Teaching

Author: Baltzi Sofia
Μπαλτζή Σοφία
Publication date: 01/01/2021

Using automatic speech recognition (ASR) to transcribe a lecturer's speech into captions in real time allows deaf and hard-of-hearing students to attend distance learning online. The goal of this thesis is to conduct a systematic study of the reliability of a web-based transcription/captioning tool, Web Captioner, for the Greek language in the context of distance teaching in higher education, using objective metrics to evaluate the tool's performance.
The experimental evaluation involved 26 speakers and used texts selected from a corpus representative of five different university departments/faculties. We analyzed the errors produced during live captioning in detail by comparing the captions with the original text. Our main conclusions are as follows: the Word Error Rate (WER) was below ~2.5% for 50% of the users and reached 5-6% for 90% of the users. There was also a statistically significant correlation between the user and the error level across the different texts used. Examining the words that Web Captioner most often transcribed incorrectly, we observed that many are single-letter articles; others are quite rare words for which similar-sounding but much more frequent words happen to exist in Greek; and others are often used as prefixes but are not recognized as conjunctions (e.g., "εκ").
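For readers unfamiliar with the metric, WER is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between the recognized text and the reference text, divided by the number of words in the reference. The abstract does not state how the thesis tokenized or normalized the texts, so the sketch below is only an illustration of that standard definition, assuming simple whitespace tokenization and no case or punctuation normalization. The function name and the two sample sentences are made up for illustration and are not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no lowercasing, accent stripping, or punctuation removal).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sentences: the short word "εκ" is replaced by "και".
    reference = "το μάθημα αρχίζει εκ νέου"
    hypothesis = "το μάθημα αρχίζει και νέου"
    print(word_error_rate(reference, hypothesis))  # one substitution in five words
```

Because the denominator is the reference length, a single misrecognized short word such as an article weighs as much as a long content word, which is one reason frequent errors on one-letter articles can influence the overall figure.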

Pergamos : Unified Institutional Repository / Digital Library Platform of the National and Kapodistrian University of Athens