4 research outputs found

    A Taxonomy of Academic Abstract Sentence Classification Modelling

    Background: Abstract sentence classification modelling has the potential to advance literature discovery across the array of academic literature information systems; however, no artefact exists that categorises known models and identifies their key characteristics. Aims: To systematically categorise known abstract sentence classification models and make this knowledge readily available to future researchers and professionals concerned with developing and deploying such models. Method: Following a literature review, an information systems taxonomy development methodology was adopted to categorise 23 abstract sentence classification models identified from the literature. The corresponding dimensions and characteristics were derived from this process, and the resulting taxonomy is presented. Results: Abstract sentence classification modelling has evolved significantly, with state-of-the-art models now leveraging neural networks to achieve high-performance sentence classification. The resulting taxonomy provides a novel means of observing the development of this research field and enables us to consider how such models can be further improved or deployed in real-world applications.

    GRADUATES' UNDERSTANDINGS, ACTUAL WRITING, AND CHALLENGES IN CONSTRUCTING RESEARCH ARTICLE ABSTRACTS

    Studies on research article abstracts have been conducted extensively, ranging from investigations of their constituent move-steps, linguistic realization, and cultural diversity to the challenges of writing them. However, only a few focus on exploring students' understandings and how these relate to their actual writing. This study therefore aims to compare the students' statements about abstract move-steps with their actual writing and to explore the challenges they face in writing abstracts. The study employed a qualitative design and used an online open-ended questionnaire administered to ten graduate students. The move-steps were analyzed using Hyland's (2000) model, while the challenges were classified based on Ferguson's (2011) theory. The findings revealed that the students' understandings are not reflected in their writing. The differences between the students' understandings and their actual writing appeared mostly at the step level of Move 1 and Move 5. This is due to reasons such as level of familiarity, the absence of specific requirements from the publisher, a lack of linguistic skills, and differing perspectives and preferences. In terms of challenges, the students stated that they encountered six: writing the main points of the abstract, writing an informative abstract, using academic vocabulary, writing coherent and clear paragraphs, finding good sample abstracts on the internet, and finding suitable publishers. It is therefore suggested that academic writing courses elaborate further on the elements of move-steps in abstract writing. In addition, students should be given ample opportunity to write abstracts based on the theories presented in class. Keywords: challenges, move-step in abstract writing, students' actual writing, students' understanding

    An Exploration of the Generic Structures of Problem Statements in Research Article Abstracts

    Studies on research article abstracts have examined the abstracts in their entirety. Moreover, while some of these works concentrate on conference abstracts, most analyse a combination of research abstracts from a variety of disciplines outside the arts. The problem statement segments of abstracts are yet to be studied exclusively. Motivated by the paucity of work of this kind, this article explores the generic structures of problem statements in arts-based research article abstracts. The data comprise three hundred purposively selected arts-based research article abstracts published in learned journals in the inner circle between 2001 and 2010. The data were analysed using insights from the generic structure potential, mood, and modality aspects of SFG. Of the five generic structural features found to characterise the abstracts, only two, namely Picking Out Inexistent Works (PIW) and Picking Out Inadequacy of Existing Works (PIEW), were found to be obligatory, while the rest are optional. Variants of gap identification mood categories (e.g. gap identification moods that pick out inexistent work and those that pick out the inadequacy of existing works) and modality categories (possibility modals) were also found in the data. These enhance the effective statement of the communicative goals of research problems in the abstracts. The article concludes that studying the generic structure of problem statements in abstracts has the potential to provide useful insights into how, in what form, and where research problems are stated in the abstracts. Keywords: Research Article Abstracts, Problem Statements, Generic Structural Potential (GSP), Mood, Modality

    A scientific-research activities information system

    Purpose - The purpose of this research is the model development, software prototype implementation, and verification of a system for identifying methodology mentions in scientific publications in a subdomain of automatic terminology extraction. To give scientists better insight into the methodologies in their fields, the extracted methodologies should be connected with the metadata of the publication from which they are extracted. For this reason, a further purpose of this research was the development of a system for the automatic extraction of metadata from scientific publications. Design/methodology/approach - Methodology mentions are categorized into four semantic categories: Task, Method, Resource/Feature, and Implementation. The system comprises two major layers: the first layer automatically identifies methodological sentences; the second layer highlights methodological phrases (segments). Extraction and classification of the segments was formalized as a sequence tagging problem, and four separate phrase-based Conditional Random Fields models were used to accomplish the task. The system was evaluated on a manually annotated corpus comprising 45 full-text articles. The system for the automatic extraction of metadata from scientific publications is based on classification. The metadata are classified into eight pre-defined categories: Title, Authors, Affiliation, Address, Email, Abstract, Keywords, and Publication Note. Experiments were performed with standard classification models: Decision Tree, Naive Bayes, K-nearest Neighbours, and Support Vector Machines. Findings - The system for methodology extraction achieved an F-measure of 53% for the identification of both Task and Method mentions (with 70% precision), whereas the F-measures for Resource/Feature and Implementation identification were 60% (with 67% precision) and 75% (with 85% precision) respectively. For the automatic extraction of metadata, Support Vector Machines provided the best performance. The F-measure was over 85% for almost all of the categories and over 90% for most of them. Research limitations/implications - Both the system for the extraction of methodologies and the system for the extraction of metadata are applicable only to scientific papers in English. Practical implications - The proposed models can be used to gain insight into the development of a scientific discipline and to create semantically rich research activity information systems. Originality/Value - The main original contributions are as follows. A novel model was developed for the extraction of methodology mentions from scientific publications. The impact of various types of features on the performance of the system was determined and presented. A fully automated system was developed for the extraction of metadata for rich research activity information systems.
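The abstract above reports an F-measure and a precision for each category but no recall. As a rough sketch, and assuming the reported F-measure is the balanced F1 score (the abstract does not say so explicitly), the implied recall can be recovered by solving F1 = 2PR / (P + R) for R. The function name `recall_from_f1` and the dictionary layout are illustrative and do not come from the original work.

```python
def recall_from_f1(f1: float, precision: float) -> float:
    """Solve F1 = 2*P*R / (P + R) for R, assuming a balanced F-measure."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall exists for this F1/precision pair")
    return f1 * precision / denom


# (F-measure, precision) pairs as reported in the abstract.
reported = {
    "Task/Method": (0.53, 0.70),
    "Resource/Feature": (0.60, 0.67),
    "Implementation": (0.75, 0.85),
}

# Recall is derived here, not reported by the authors.
implied_recall = {
    name: recall_from_f1(f1, p) for name, (f1, p) in reported.items()
}

for name, r in implied_recall.items():
    print(f"{name}: implied recall ~ {r:.2f}")
```

Under that assumption the implied recalls come out at roughly 0.43, 0.54, and 0.67, which suggests that in every category the extraction system was more precise than it was complete.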