7 research outputs found

    A revised modified Bentall's procedure using aorto-prosthetic hemostatic suture

    Get PDF

    SCyDia – OCR For Serbian Cyrillic with Diacritics

    Get PDF
    In the ongoing retro-digitization of Serbian dialectal dictionaries, the biggest obstacle is the lack of machine-readable versions of the paper editions. One essential step is therefore needed before venturing into the dictionary-making process in the digital environment: OCRing the pages with the highest possible accuracy. OCR processing is not a new technology, as many open-source and commercial software solutions can reliably convert scanned images of paper documents into digital documents. Available solutions are usually efficient enough to process scanned contracts, invoices, financial statements, newspapers, and books. However, when documents contain accented text and each character with diacritics must be extracted precisely, such solutions are not efficient enough. This paper presents the OCR software “SCyDia”, developed to overcome this issue. We describe the organizational structure of “SCyDia” and its first results. “SCyDia” is a web-based software solution that relies on the open-source engine “Tesseract” in the background and also contains a module for semi-automatic text correction. We have already processed over 15,000 pages: 13 dialectal dictionaries and five dialectal monographs. At this point in the project, we have analyzed the accuracy of “SCyDia” on the 13 dialectal dictionaries. The results were checked manually by an expert who examined a number of randomly selected pages from each dictionary. The preliminary results show great promise, with accuracy spanning from 97.19% to 99.87%.
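The accuracy figures reported above (97.19% to 99.87%) imply a character-level comparison of OCR output against manually corrected reference pages. As a minimal sketch, per-page character accuracy could be computed with an edit-distance-based similarity; the abstract does not specify SCyDia's exact metric, so the measure below is an illustrative assumption:

```python
from difflib import SequenceMatcher

def char_accuracy(ocr_text: str, reference: str) -> float:
    """Character-level similarity between OCR output and a manually
    corrected reference page, as a percentage (0-100)."""
    if not reference:
        return 0.0
    return SequenceMatcher(None, ocr_text, reference).ratio() * 100

# Illustrative example: a single diacritic lost during OCR
reference = "ŕđa"   # hypothetical dialectal form with a diacritic
ocr_out   = "rđa"   # the accented character was misread
print(char_accuracy(ocr_out, reference))
```

In practice the reference text would come from the semi-automatic correction module, and per-page scores would be aggregated per dictionary.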

    The number of senses effect in polysemous adjective recognition

    No full text
    Previous research revealed a significant polysemy effect: words with multiple related senses (polysemous words) are recognised faster than words with multiple unrelated meanings (homonymous words) and words with only one meaning/sense (unambiguous words; Rodd et al., 2002). The measure of ambiguity in polysemous words was the number of senses (NoS), derived from the meanings/senses provided by native speakers. NoS was a significant predictor of reaction time in visual lexical decision task (VLDT) experiments (Filipović Đurđević, 2007). This is in accordance with various models of lexical ambiguity processing: some attribute the effect to increased semantic activation due to facilitation among the related senses (Armstrong & Plaut, 2016; Rodd et al., 2004), whereas others attribute it to differences at the level of responding (Hino & Lupker, 1996), but they agree in predicting a processing advantage in polysemous word recognition. Research in Serbian revealed this effect in noun and verb processing (Filipović Đurđević & Kostić, 2008; Mišić & Filipović Đurđević, 2019; 2020). The aim of this research was to further generalize these findings and to test whether the NoS effect is present in polysemous adjective recognition. The prediction was that an increase in NoS and word frequency would be followed by faster adjective recognition. The participants were presented with a VLDT consisting of 107 polysemous Serbian adjectives. The adjectives were presented in all three grammatical genders using a Latin square design between participants, so that each participant saw only one form of each adjective. Multiple regression revealed that NoS and frequency were significant predictors of reaction time: polysemous adjectives with higher NoS and higher frequency were processed faster (NoS: β = -.199, S.E. = .093, df = 106, t = -2.143, p < .05; frequency: β = -.281, S.E. = .093, df = 106, t = -3.036, p < .05). These findings are in accordance with our hypothesis and concur with previous findings from experiments with nouns and verbs (Filipović Đurđević & Kostić, 2008; Mišić & Filipović Đurđević, 2019; 2020), as well as with various models of word ambiguity processing (Armstrong & Plaut, 2016; Hino & Lupker, 1996; Rodd et al., 2004). Together they converge on the conclusion that NoS facilitates recognition of polysemous words in the VLDT across different parts of speech.
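The analysis described above is a standardized multiple regression of reaction time on NoS and frequency. A minimal sketch with synthetic data follows; the data, effect sizes, and variable names are illustrative, not the study's materials:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 107  # number of adjectives in the study

# Synthetic predictors (illustrative): number of senses and log frequency
nos = rng.normal(size=n)
freq = rng.normal(size=n)
# Synthetic RTs constructed so that both predictors speed up recognition
rt = -0.4 * nos - 0.5 * freq + rng.normal(scale=0.5, size=n)

def zscore(x):
    return (x - x.mean()) / x.std()

# Standardize predictors and outcome, then fit by ordinary least squares
X = np.column_stack([np.ones(n), zscore(nos), zscore(freq)])
y = zscore(rt)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta[1], beta[2])  # both negative: higher NoS/frequency -> faster RT
```

Standardizing both sides makes the fitted coefficients directly comparable to the reported β values.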

    The number of senses effect in polysemous noun recognition: expanding the database

    No full text
    Words with multiple related senses (polysemous words) are recognised faster than words with multiple unrelated meanings (homonymous words) and unambiguous words (Rodd et al., 2002). The measures of ambiguity in polysemous words were the number of senses (NoS), derived from the meanings/senses provided by native speakers, and the information-theoretic measures of entropy (sense uncertainty) and redundancy (the balance of sense probabilities). These measures were significant predictors of reaction time in visual lexical decision task (VLDT) experiments (Filipović Đurđević & Kostić, 2021). In spite of their differences, multiple models agree in predicting the observed facilitation. Research in Serbian revealed these effects in noun, adjective, and verb processing (Anđelić, Ilić, Mišić, & Filipović Đurđević, 2021; Filipović Đurđević & Kostić, 2008; 2021; Mišić & Filipović Đurđević, 2021). The aim of this research was to conceptually replicate and further generalise the NoS effect in noun processing, and to collect native speakers' intuitions about the senses of a novel set of Serbian nouns, thus expanding the existing database (Filipović Đurđević & Kostić, 2016). A novel set of 100 polysemous nouns was selected from the dictionary and included in a normative study in which 36 participants were instructed to write down all of the senses they could recall. The senses obtained from the participants were categorised according to the dictionary, and the NoS, together with the entropy and redundancy of senses, was calculated. The same nouns were then presented in a visual lexical decision task to a novel group of 87 native speakers. The results indicated that polysemous nouns with a higher number of senses were processed faster (β = -.02, CI [-.03, -.00], t = -2.78, p = .005), which is in accordance with our hypothesis. The results regarding the information-theoretic measures revealed that the effects of entropy (H) and redundancy (T) showed a non-significant trend in the predicted direction (H: β = -.00, CI [-.02, .01], t = -.597, p = .557; T: β = .01, CI [-.00, .03], t = 1.66, p = .097). These findings concur with the previous findings from the noun, adjective, and verb experiments and with the SSD model (Armstrong & Plaut, 2016); together they converge on the conclusion that the number of senses facilitates recognition of polysemous words in the visual lexical decision task.
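The information-theoretic measures mentioned above are Shannon entropy over a word's sense probability distribution (sense uncertainty) and redundancy, i.e. the distribution's departure from perfect balance. A minimal sketch of the standard definitions follows; the exact normalization used in the cited studies may differ:

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a sense probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def redundancy(probs):
    """1 - H/Hmax: 0 for perfectly balanced senses, increasing
    toward 1 as one sense dominates the distribution."""
    n = len(probs)
    if n < 2:
        return 0.0
    return 1 - entropy(probs) / math.log2(n)

balanced = [0.25, 0.25, 0.25, 0.25]   # four equiprobable senses
skewed   = [0.85, 0.05, 0.05, 0.05]   # one dominant sense

print(entropy(balanced))    # 2.0 bits
print(redundancy(balanced)) # 0.0
print(redundancy(skewed))   # higher than for the balanced word
```

Under this definition, two words with the same NoS can still differ in redundancy, which is exactly the balance effect the study probes.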

    Open database of polysemous senses of 308 Serbian polysemous nouns, verbs, and adjectives

    No full text
    The majority of words can denote multiple related objects/phenomena, i.e. can have multiple related senses – so-called polysemes. Understanding this linguistic phenomenon is therefore of high importance both for linguistic inquiry and for psychological studies of cognitive mechanisms. Previous research demonstrated that, in addition to the number of senses, processing is also influenced by the balance of sense probabilities (Filipović Đurđević & Kostić, 2021). However, resources for the study of lexical ambiguity are very sparse (e.g. a database of 150 polysemous Serbian nouns; Filipović Đurđević & Kostić, 2017). Additionally, most of these effects were demonstrated either within a single part-of-speech category (typically nouns) or for ambiguous words with senses that span various parts of speech (e.g. a record / to record; as pointed out by Eddington & Tokowicz, 2015). Therefore, the goal of this paper is to present a new open database containing raw and categorized native speakers' semantic intuitions for 308 Serbian polysemous nouns (100), verbs (100), and adjectives (108), together with multiple quantifications representing an array of ambiguity indices. For each of the polysemous words, we collected the semantic intuitions of native speakers using the total meaning metric (Azuma, 1997). We then categorized the collected descriptions using three strategies: a) relying solely on semantic intuition, b) relying solely on dictionary descriptions, and c) combining semantic intuitions and dictionary descriptions. Within each strategy, we also monitored and investigated the effect of the coder (the researcher performing the categorization) in order to explore the robustness of each approach. We then generated the sense probability distribution for each word by counting response frequencies across the created categories. To quantify the level of ambiguity, we calculated the number of senses, redundancy, and entropy of the obtained sense probability distributions (Shannon, 1948; Filipović Đurđević & Kostić, 2017). Each measure, within each approach, was also corrected for the effects of idiosyncratic senses, reflexive verbs, etc. The database will be openly available and will provide a useful resource for ambiguity research. In the future, it should be expanded with measures derived from word embeddings that separate different word senses (i.e. BERT; Wiedemann et al., 2019). This would allow the level of ambiguity to be quantified on large-scale samples of text, which may yield a more precise estimation of sense numbers and sense probabilities and would allow the counting-of-senses approach to be abandoned (as suggested by Filipović Đurđević et al., 2009). Adding such measures to the database and comparing them to the existing ones may provide another validation point for measures derived from human participants.
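The counting and correction steps described above (from categorized participant responses to a sense probability distribution and a corrected number of senses) could be sketched as follows; the category labels and the idiosyncrasy threshold are illustrative assumptions, not the database's actual coding scheme:

```python
from collections import Counter

def sense_distribution(coded_responses, min_count=2):
    """Turn coded sense labels into a sense probability distribution.
    Senses produced by fewer than `min_count` participants are treated
    as idiosyncratic and dropped before normalization."""
    counts = Counter(coded_responses)
    kept = {sense: c for sense, c in counts.items() if c >= min_count}
    total = sum(kept.values())
    if not total:
        return {}
    return {sense: c / total for sense, c in kept.items()}

# Hypothetical coded responses for one polysemous noun
responses = ["organ", "organ", "organ", "language", "language", "flap"]
dist = sense_distribution(responses)
print(dist)       # {'organ': 0.6, 'language': 0.4}
print(len(dist))  # NoS after the idiosyncrasy correction
```

Entropy and redundancy can then be computed over the values of the resulting distribution, one word at a time.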

    Zoonotic nematodes of wild carnivores

    No full text