8 research outputs found

    Development of linguistic linked open data resources for collaborative data-intensive research in the language sciences

    Making diverse data in linguistics and the language sciences open, distributed, and accessible: perspectives from language and language acquisition researchers and technical LOD (linked open data) researchers. This volume examines the challenges inherent in making diverse data in linguistics and the language sciences open, distributed, integrated, and accessible, thus fostering wide data sharing and collaboration. It is unique in integrating the perspectives of language researchers and technical LOD (linked open data) researchers. Reporting on both active research needs in the field of language acquisition and technical advances in the development of data interoperability, the book demonstrates the advantages of an international infrastructure for scholarship in the field of language sciences. With contributions by researchers who produce complex data content and scholars involved in both the technology and the conceptual foundations of LLOD (linguistics linked open data), the book focuses on the area of language acquisition because it involves complex and diverse data sets, cross-linguistic analyses, and urgent collaborative research. The contributors discuss a variety of research methods, resources, and infrastructures. Contributors: Isabelle Barrière, Nan Bernstein Ratner, Steven Bird, Maria Blume, Ted Caldwell, Christian Chiarcos, Cristina Dye, Suzanne Flynn, Claire Foley, Nancy Ide, Carissa Kang, D. Terence Langendoen, Barbara Lust, Brian MacWhinney, Jonathan Masci, Steven Moran, Antonio Pareja-Lora, Jim Reidy, Oya Y. Rieger, Gary F. Simons, Thorsten Trippel, Kara Warburton, Sue Ellen Wright, Claus Zin

    Development of Linguistic Linked Open Data Resources for Collaborative Data-Intensive Research in the Language Sciences

    This book is the product of an international workshop dedicated to addressing data accessibility in the linguistics field. It is therefore vital to the book’s mission that its content be open access. Linguistics as a field lags behind many others in data management and accessibility strategies. The problem is particularly acute in the subfield of language acquisition, where international linguistic sound files are needed for reference. Linguists' concerns are very much tied to the amount of information accumulated by individual researchers over the years that remains fragmented and inaccessible to the larger community. These concerns are shared by other fields, but linguistics to date has seen few efforts at addressing them. This collection, undertaken by a range of leading experts in the field, represents a big step forward. Its international scope and interdisciplinary combination of scholars, librarians, and data consultants will provide an important contribution to the field.

    A model for automated topic spotting in a mobile chat based mathematics tutoring environment

    Systems of writing have existed for thousands of years. The history of civilisation and the history of writing are so intertwined that it is hard to separate the one from the other. These systems of writing, however, are not static. They change. One of the latest developments in systems of writing is short electronic messages such as those seen on Twitter and in MXit. One novel application which uses these short electronic messages is the Dr Math® project. Dr Math is a mobile online tutoring system where pupils can use MXit on their cell phones and receive help with their mathematics homework from volunteer tutors around the world. These conversations between pupils and tutors are held in MXit lingo or MXit language – this cryptic, abbreviated system 0f ryting w1ch l0ks lyk dis. Project μ (pronounced mu and indicating MXit Understander) investigated how topics could be determined in MXit lingo, and Project μ's research outputs spot mathematics topics in conversations between Dr Math tutors and pupils. Once the topics are determined, supporting documentation can be presented to the tutors to assist them in helping pupils with their mathematics homework. Project μ made the following contributions to new knowledge: a statistical and linguistic analysis of MXit lingo provides letter frequencies, word frequencies, message length statistics as well as linguistic bases for new spelling conventions seen in MXit-based conversations; a post-stemmer for use with MXit lingo removes suffixes from the ends of words taking into account MXit spelling conventions, allowing words such as equashun and equation to be reduced to the same root stem; a list of over ten thousand stop words for MXit lingo appropriate for the domain of mathematics; a misspelling corrector for MXit lingo which corrects words such as acount and equates it to account; and a model for spotting mathematical topics in MXit lingo. The model was instantiated and integrated into the Dr Math tutoring platform.
Empirical evidence as to the effectiveness of the μ Topic Spotter and the other contributions is also presented. The empirical evidence includes specific statistical tests with MXit lingo, specific tests of the misspelling corrector, stemmer, and feedback mechanism, and an extensive exercise of content analysis with respect to mathematics topics.
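
To make the post-stemmer and misspelling corrector described in this abstract more concrete, here is a minimal Python sketch. It is not Project μ's implementation: the suffix table, the strippable suffixes, the tiny vocabulary, and the function names `post_stem` and `correct_spelling` are all hypothetical, and the corrector simply relies on the standard library's `difflib` similarity matching as a stand-in technique.

```python
import difflib

# Hypothetical mapping from MXit-style suffix spellings to standard ones.
# Illustrative only; not taken from Project μ.
SUFFIX_VARIANTS = {"shun": "tion", "shn": "tion"}

# Hypothetical list of suffixes to strip once the spelling is normalised.
STRIP_SUFFIXES = ("tion", "ing")

# Hypothetical vocabulary for the spelling corrector.
VOCABULARY = ["account", "amount", "angle", "equation"]


def post_stem(word):
    """Normalise a MXit-style suffix, then strip a known suffix."""
    word = word.lower()
    for variant, standard in SUFFIX_VARIANTS.items():
        if word.endswith(variant):
            word = word[: -len(variant)] + standard
            break
    for suffix in STRIP_SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def correct_spelling(word, vocabulary=VOCABULARY):
    """Return the closest vocabulary word, or the input if none is close."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else word


print(post_stem("equashun"), post_stem("equation"))  # both reduce to "equa"
print(correct_spelling("acount"))                    # "account"
```

In this sketch both spellings of the abstract's example word reach the same stem because the variant suffix is rewritten before any stripping happens. A real system would need rules and vocabulary derived from actual MXit data.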

    An ontology for accessing transcription systems


    An Ontology for Accessing Transcription Systems (OATS)

    This paper presents the Ontology for Accessing Transcription Systems (OATS), a knowledge base that supports interoperation over disparate transcription systems and practical orthographies. The knowledge base includes an ontological description of writing systems and relations for mapping transcription system segments to an interlingua pivot, the IPA. It includes orthographic and phonemic inventories from 203 African languages. OATS is motivated by the desire to query data in the knowledge base via IPA or native orthography, to check digitized data for errors, and to convert between transcription systems. The model in this paper implements these goals.
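
The idea of mapping transcription segments to an IPA pivot can be illustrated with a short Python sketch. It does not reproduce OATS or its data: the two grapheme tables (`SYSTEM_A`, `SYSTEM_B`) and the function names are hypothetical, and plain dictionaries stand in for the ontology. Segmentation uses greedy longest-match, which is an assumption of this sketch.

```python
# Two hypothetical transcription systems, each mapping graphemes to IPA.
# Toy inventories for illustration; not drawn from the OATS knowledge base.
SYSTEM_A = {"sh": "ʃ", "ch": "tʃ", "ny": "ɲ", "n": "n", "a": "a", "i": "i"}
SYSTEM_B = {"š": "ʃ", "č": "tʃ", "ñ": "ɲ", "n": "n", "a": "a", "i": "i"}


def to_ipa(text, table):
    """Segment text by greedy longest match and map each grapheme to IPA."""
    graphemes = sorted(table, key=len, reverse=True)
    segments = []
    position = 0
    while position < len(text):
        for grapheme in graphemes:
            if text.startswith(grapheme, position):
                segments.append(table[grapheme])
                position += len(grapheme)
                break
        else:
            # An unknown symbol is reported, which doubles as error checking.
            raise ValueError(f"unknown symbol {text[position]!r} at index {position}")
    return segments


def from_ipa(segments, table):
    """Map IPA segments back to graphemes of the target system."""
    inverse = {ipa: grapheme for grapheme, ipa in table.items()}
    return "".join(inverse[segment] for segment in segments)


def convert(text, source, target):
    """Convert between two systems by pivoting through IPA."""
    return from_ipa(to_ipa(text, source), target)


print(to_ipa("chini", SYSTEM_A))               # ['tʃ', 'i', 'n', 'i']
print(convert("shanya", SYSTEM_A, SYSTEM_B))   # šaña
```

With a pivot, each system needs only one table to and from IPA, so adding a new system does not require a separate converter for every existing pair.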

    An ontology for accessing transcription systems (OATS)
