8 research outputs found

    Attributing Authorship in the Noisy Digitized Correspondence of Jacob and Wilhelm Grimm

    This article presents the results of a multidisciplinary project aimed at better understanding the impact of different digitization strategies in computational text analysis. More specifically, it describes an effort to automatically discern the authorship of Jacob and Wilhelm Grimm in a body of uncorrected correspondence processed by HTR (Handwritten Text Recognition) and OCR (Optical Character Recognition), reporting on the effect this noise has on the analyses necessary to computationally identify the distinct writing styles of the two brothers. In summary, our findings show that OCR digitization serves as a reliable proxy for the more painstaking process of manual digitization, at least when it comes to authorship attribution. Our results suggest that attribution is viable even when using training and test sets from different digitization pipelines. With regard to HTR, this research demonstrates that even though automated transcription significantly increases the risk of text misclassification when compared to OCR, a cleanliness above ≈ 20% is already sufficient to achieve a higher-than-chance probability of correct binary attribution.

    The Grimm Brothers : a stylometric network analysis

    Stylometric methods can reveal similarities between texts and, combined with network analysis, depict the stylistic relations between them. The research presented here focuses on a corpus of letters written by Jacob and Wilhelm Grimm. Using stylometric analysis, we model the brothers' writing styles as a function of addressee and chronology. The brothers have individual styles: Wilhelm has a friendlier, more personal tone regardless of addressee, while Jacob has a more impersonal style, except when writing to Wilhelm. Their styles converge at the points where their careers or personal development intersect.

    Chistul brahiogen cervical lateral (Lateral cervical branchial cyst)

    Catedra de chirurgie OMF şi implantologie orală „Arsenie Guţan”, USMF „Nicolae Testemiţanu”, Clinica „Emilian Coțaga”, IMSP Institutul Mamei și Copilului. Background. Branchial cysts are rare congenital malformations that can become apparent at birth or later in life; they arise due to the partial or complete involution of the branchial apparatus during the development of the human embryo. They cause the patient considerable discomfort. Objective of the study. Early diagnosis as a basis for choosing the treatment approach for lateral branchial cysts. Material and Methods. Case presentation. The 40-year-old patient was admitted to the Oral and Maxillofacial Surgery Department of the Institute of Emergency Medicine with a diagnosis of a left lateral branchial cyst. A clinical and paraclinical examination (USG, laboratory tests) was performed. Surgical removal of the mass was recommended. Results. The patient reported a voluminous swelling in the upper left lateral cervical region that had appeared 2 years earlier, with frequent episodes of superinfection. On examination, the mass was partially mobile, not adherent to the adjacent tissues, and elastic in consistency. The surgery to remove the cystic formation was performed under general anesthesia. The postoperative period was uneventful, with no complications. The histopathological examination confirmed the preoperative diagnosis, showing features characteristic of the pathology: the cyst wall is lined with stratified squamous epithelium and contains lymphoid tissue with follicles. Conclusion. Early diagnosis will limit the growth of branchial cysts, preventing inflammatory complications.

    Szerzőazonosítás Jacob és Wilhelm Grimm zajos, digitalizált levelezésében (Authorship Attribution in the Noisy, Digitized Correspondence of Jacob and Wilhelm Grimm)

    This article presents the results of a multidisciplinary project that examines the usability of different digitization strategies in computational text analysis. More specifically, we attempted to automatically distinguish the authorship of Jacob and Wilhelm Grimm in a corpus of correspondence processed by HTR (Handwritten Text Recognition) and OCR (Optical Character Recognition), without correction, assessing the effect the resulting noise has on identifying the brothers' different writing styles. In summary, it appears that OCR can be a reliable substitute for manual transcription, at least as far as authorship attribution is concerned. Our results further suggest that even training and test sets derived from different digitization procedures can be used in authorship attribution. As for HTR, the research demonstrates that although this automated transcription significantly increases the risk of text misclassification compared to OCR, a cleanliness above approximately 20% is already sufficient for correct binary attribution to have a better-than-chance probability.

    Attributing Authorship in the Noisy Digitized Correspondence of Jacob and Wilhelm Grimm

    This article presents the results of a multidisciplinary project aimed at better understanding the impact of different digitization strategies in computational text analysis. More specifically, it describes an effort to automatically discern the authorship of Jacob and Wilhelm Grimm in a body of uncorrected correspondence processed by HTR (Handwritten Text Recognition) and OCR (Optical Character Recognition), reporting on the effect this noise has on the analyses necessary to computationally identify the distinct writing styles of the two brothers. In summary, our findings show that OCR digitization serves as a reliable proxy for the more painstaking process of manual digitization, at least when it comes to authorship attribution. Our results suggest that attribution is viable even when using training and test sets from different digitization pipelines. With regard to HTR, this research demonstrates that even though automated transcription significantly increases the risk of text misclassification when compared to OCR, a cleanliness above ≈ 20% is already sufficient to achieve a higher-than-chance probability of correct binary attribution.

    Table_2.csv

    This article presents the results of a multidisciplinary project aimed at better understanding the impact of different digitization strategies in computational text analysis. More specifically, it describes an effort to automatically discern the authorship of Jacob and Wilhelm Grimm in a body of uncorrected correspondence processed by HTR (Handwritten Text Recognition) and OCR (Optical Character Recognition), reporting on the effect this noise has on the analyses necessary to computationally identify the distinct writing styles of the two brothers. In summary, our findings show that OCR digitization serves as a reliable proxy for the more painstaking process of manual digitization, at least when it comes to authorship attribution. Our results suggest that attribution is viable even when using training and test sets from different digitization pipelines. With regard to HTR, this research demonstrates that even though automated transcription significantly increases the risk of text misclassification when compared to OCR, a cleanliness above ≈ 20% is already sufficient to achieve a higher-than-chance probability of correct binary attribution.

    Table_1.csv

    This article presents the results of a multidisciplinary project aimed at better understanding the impact of different digitization strategies in computational text analysis. More specifically, it describes an effort to automatically discern the authorship of Jacob and Wilhelm Grimm in a body of uncorrected correspondence processed by HTR (Handwritten Text Recognition) and OCR (Optical Character Recognition), reporting on the effect this noise has on the analyses necessary to computationally identify the distinct writing styles of the two brothers. In summary, our findings show that OCR digitization serves as a reliable proxy for the more painstaking process of manual digitization, at least when it comes to authorship attribution. Our results suggest that attribution is viable even when using training and test sets from different digitization pipelines. With regard to HTR, this research demonstrates that even though automated transcription significantly increases the risk of text misclassification when compared to OCR, a cleanliness above ≈ 20% is already sufficient to achieve a higher-than-chance probability of correct binary attribution.