20 research outputs found

    A scientific-research activities information system

    Purpose - This research develops a model, implements a software prototype, and verifies a system for identifying methodology mentions in scientific publications from Computer Science, specifically in the subdomain of automatic term extraction. To give scientists better insight into the methodologies used in their fields, the extracted methodologies need to be connected with the metadata of the publications from which they are extracted. For this reason, the research also aims to develop a system for the automatic extraction of metadata from scientific publications.
    Design/methodology/approach - Methodology mentions are categorized into four semantic categories: Task, Method, Resource/Feature and Implementation. The system comprises two layers: the first automatically identifies methodological sentences; the second recognizes methodological phrases (segments). Extraction and classification of the segments were formalized as a sequence tagging problem, and four separate phrase-based Conditional Random Fields models were used to accomplish the task. The system was evaluated on a manually annotated corpus of 45 full-text articles from the field of automatic term extraction. The metadata extraction system is based on classification. Metadata are classified into eight predefined categories: Title, Authors, Affiliation, Address, Email, Abstract, Keywords and Publication Note. Experiments were performed with standard classification models: Decision Tree, Naive Bayes, K-nearest Neighbours and Support Vector Machines.
    Findings - The methodology extraction system achieved an F-measure of 53% for identifying both Task and Method mentions (with 70% precision), whereas the F-measures for Resource/Feature and Implementation identification were 60% (with 67% precision) and 75% (with 85% precision), respectively. For metadata extraction, Support Vector Machines provided the best performance: the F-measure was over 85% for almost all categories and over 90% for most of them.
    Research limitations/implications - Both the methodology extraction system and the metadata extraction system are applicable only to scientific papers written in English.
    Practical implications - The proposed models can be used to gain insight into the development of a scientific discipline and to create semantically richer research activity information systems.
    Originality/value - The main original contributions are: a novel model for the extraction and semantic categorization of methodology mentions from scientific publications in Computer Science, which has not been described in the existing literature; an analysis of the impact of different types of features on the extraction of methodological phrases; and a fully automated system for extracting metadata for research activity information systems.
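
    The abstract above reports precision and F-measure for the methodology extraction system but not recall. As a rough cross-check, the sketch below back-computes the implied recall, assuming the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used. The function name recall_from_f1 and the dictionary keys are illustrative, not taken from the thesis.

```python
def recall_from_f1(f1: float, precision: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given F1 and P (balanced F1 assumed)."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("F1 must be smaller than twice the precision")
    return f1 * precision / denominator


# (F-measure, precision) pairs as reported in the abstract.
reported = {
    "Task / Method": (0.53, 0.70),
    "Resource/Feature": (0.60, 0.67),
    "Implementation": (0.75, 0.85),
}

# Implied recall per category, valid only under the balanced-F1 assumption.
implied_recall = {
    name: recall_from_f1(f1, precision)
    for name, (f1, precision) in reported.items()
}

for name, recall in implied_recall.items():
    print(f"{name}: implied recall ~ {recall:.1%}")
```

    Under that assumption the implied recalls come out at roughly 43%, 54% and 67%, which suggests that in all three cases the system is more precise than it is complete.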

    Lambda-calculus and formal language theory

    Formal and symbolic approaches have offered computer science many application fields. The rich and fruitful connection between logic, automata and algebra is one such approach. It has been used to model natural languages as well as in program verification. In the mathematics of language it can model phenomena ranging from syntax to phonology, while in verification it yields model-checking algorithms for a wide family of programs. This thesis extends this approach to the simply typed lambda-calculus by providing a natural extension of recognizability to programs that are representable by simply typed terms. This notion is then applied to both the mathematics of language and program verification. In the mathematics of language, it is used to generalize parsing algorithms and to propose high-level methods for describing languages. In program verification, it is used to describe methods for verifying the behavioral properties of higher-order programs. In both cases, the link drawn between finite-state methods and denotational semantics provides the means to combine powerful tools from the two worlds.

    Language and Linguistics in a Complex World: Data, Interdisciplinarity, Transfer, and the Next Generation. ICAME41 Extended Book of Abstracts

    This is a collection of papers, work-in-progress reports, and other contributions that were part of the ICAME41 digital conference.

    Mind the gap: gap factors in intercultural business communication: a study of German-Indian semi-virtual tech/engineering teams

    While the affordances of technology have facilitated virtual modes of global collaboration, cultural variances and a geographically dispersed environment can also impair group communication in team interaction. This qualitative study draws on data gathered from four organizations to investigate the miscommunication and cognitive dissonances reported by virtual German-Indian engineering/tech communities of practice. The study argues that it is not so much the performance or doing of a communicative act that creates dissonances, but the gaps, i.e., the absence or not-doing of certain communicative actions expected in a collaborative context. The gap factors are experienced as unfulfilled reciprocal expectations, and are classified and explored against three parameters: 1) the culture of a technological community of practice, 2) the power relations between the interactants, and 3) the consequences of virtual communication. The findings indicate a complementary divergence between the two groups regarding the nature of gaps. While the German teams report gaps in communicative efficiency and content caused, e.g., by non-disclosure, euphemistic language and a deficiency in push communication, the Indian teams perceive gaps in relationality and affective signaling. At the same time, these gaps are two sides of the same coin, with the divergences arising from the way in which the intersecting structural parameters are viewed as being salient in interaction. The study concludes with implications and suggestions for organizational practice.