Automating Metadata Extraction: Genre Classification

Authors: Yunhyong Kim, Seamus Ross
Publication date: 01/01/2006

A problem that frequently arises in the management and integration of scientific data is the lack of context and semantics that would link data encoded in disparate ways. To bridge the discrepancy, it often helps to mine scientific texts to aid the understanding of the database. Mining relevant text can be significantly aided by the availability of descriptive and semantic metadata. The Digital Curation Centre (DCC) has undertaken research to automate the extraction of metadata from documents in PDF ([22]). Documents may include scientific journal papers, lab notes or even emails. We suggest genre classification as a first step toward automating metadata extraction. The classification method will be built on looking at the documents from five directions: as an object of specific visual format, a layout of strings with characteristic grammar, an object with stylometric signatures, an object with meaning and purpose, and an object linked to previously classified objects and external sources. Some results of experiments in relation to the first two directions are described here; they are meant to be indicative of the promise underlying this multi-faceted approach.

Enlightened Romanticism: Mary Gartside’s colour theory in the age of Moses Harris, Goethe and George Field

Author: Alexandra Loske
Publication date: 01/01/2012

The aim of this paper is to evaluate the work of Mary Gartside, a British female colour theorist, active in London between 1781 and 1808. She published three books between 1805 and 1808. In chronological and intellectual terms Gartside can cautiously be regarded as an exemplary link between Moses Harris, who published a short but important theory of colour in the second half of the eighteenth century, and J.W. von Goethe’s highly influential Zur Farbenlehre, published in Germany in 1810. Gartside’s colour theory was published privately under the disguise of a traditional water colouring manual, illustrated with stunning abstract colour blots (see example above). Until well into the twentieth century, she remained the only woman known to have published a theory of colour. In contrast to Goethe and other colour theorists in the late 18th and early 19th century, Gartside was less inclined to follow the anti-Newtonian attitudes of the Romantic movement.

Sussex Research Online

The Distributional Hypothesis

Author: Magnus Sahlgren
Publication date: 01/01/2008

CiteSeerX

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

The Mirror DBMS at TREC-8

Authors: Djoerd Hiemstra, Arjen P. de Vries
Publication venue: National Institute of Standards and Technology (NIST)
Publication date: 01/01/1999

The database group at the University of Twente participates in TREC-8 using the Mirror DBMS, a prototype database system especially designed for multimedia and web retrieval. From a database perspective, the purpose has been to check whether we can get sufficient performance, and to prepare for the very large corpus track in which we plan to participate next year. From an IR perspective, the experiments have been designed to learn more about the effect of the global statistics on the ranking.

CiteSeerX

CWI's Institutional Repository

Radboud Repository

University of Twente Research Information

Bilingual episodic memory: an introduction

Authors: Aneta Pavlenko, Jean-Marc Dewaele, Robert W. Schrauf
Publication venue: SAGE Publications
Publication date: 01/09/2003

Our current models of bilingual memory are essentially accounts of semantic memory whose goal is to explain bilingual lexical access to underlying imagistic and conceptual referents. While this research has included episodic memory, it has focused largely on recall for words, phrases, and sentences in the service of understanding the structure of semantic memory. Building on the four papers in this special issue, this article focuses on larger units of episodic memory (from quotidian events with simple narrative form to complex autobiographical memories) in the service of developing a model of bilingual episodic memory. This requires integrating theory and research on how culture-specific narrative traditions inform encoding and retrieval with theory and research on the relation between (monolingual) semantic and episodic memory (Schank, 1982; Schank & Abelson, 1995; Tulving, 2002). Then, taking a cue from memory-based text processing studies in psycholinguistics (McKoon & Ratcliff, 1998), we suggest that as language forms surface in the progressive retrieval of features of an event, they trigger further forms within the same language, serving to guide a within-language/within-culture retrieval

Crossref

Birkbeck Institutional Research Online

Speech rhythm: a metaphor?

Authors: Francis Nolan, Hae-Sung Jeon
Publication venue: The Royal Society
Publication date: 11/11/2014

Is speech rhythmic? In the absence of evidence for a traditional view that languages strive to coordinate either syllables or stress-feet with regular time intervals, we consider the alternative that languages exhibit contrastive rhythm subsisting merely in the alternation of stronger and weaker elements. This is initially plausible, particularly for languages with a steep ‘prominence gradient’, i.e. a large disparity between stronger and weaker elements; but we point out that alternation is poorly achieved even by a ‘stress-timed’ language such as English, and, historically, languages have conspicuously failed to adopt simple phonological remedies that would ensure alternation. Languages seem more concerned to allow ‘syntagmatic contrast’ between successive units and to use durational effects to support linguistic functions than to facilitate rhythm. Furthermore, some languages (e.g. Tamil, Korean) lack the lexical prominence which would most straightforwardly underpin prominence alternation. We conclude that speech is not incontestably rhythmic, and may even be antirhythmic. However, its linguistic structure and patterning allow the metaphorical extension of rhythm in varying degrees and in different ways depending on the language, and it is this analogical process which allows speech to be matched to external rhythms.

CLoK

Crossref

PubMed Central

Improving Statistical Language Model Performance with Automatically Generated Word Hierarchies

Authors: John McMahon, F. J. Smith
Publication date: 01/01/1995

An automatic word classification system has been designed which processes word unigram and bigram frequency statistics extracted from a corpus of natural language utterances. The system implements a binary top-down form of word clustering which employs an average class mutual information metric. Resulting classifications are hierarchical, allowing variable class granularity. Words are represented as structural tags: unique n-bit numbers, the most significant bit-patterns of which incorporate class information. Access to a structural tag immediately provides access to all classification levels for the corresponding word. The classification system has successfully revealed some of the structure of English, from the phonemic to the semantic level. The system has been compared, directly and indirectly, with other recent word classification systems. Class-based interpolated language models have been constructed to exploit the extra information supplied by the classifications, and some experiments have shown that the new models improve model performance. Comment: 17 Page Paper. Self-extracting PostScript Fil
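
As a rough illustration of the structural tags described in this abstract, the Python sketch below treats a tag as an n-bit integer whose leading bits name the word's class at each level of a binary hierarchy. The function names, the 4-bit width, and the example words and tag values are assumptions made for illustration; they are not taken from the paper, and the sketch does not implement the clustering itself.

```python
def class_at_level(tag: int, n_bits: int, level: int) -> int:
    """Return the class id at a given depth of the hierarchy.

    The class at depth `level` is read off as the `level` most
    significant bits of the n-bit tag (an assumed encoding).
    """
    if not 0 <= level <= n_bits:
        raise ValueError("level must be between 0 and n_bits")
    if not 0 <= tag < (1 << n_bits):
        raise ValueError("tag does not fit in n_bits")
    return tag >> (n_bits - level)


def same_class(tag_a: int, tag_b: int, n_bits: int, level: int) -> bool:
    """True if two tags fall in the same class at the given depth."""
    return class_at_level(tag_a, n_bits, level) == class_at_level(tag_b, n_bits, level)


def shared_depth(tag_a: int, tag_b: int, n_bits: int) -> int:
    """Number of leading bits two tags share, i.e. the deepest common class."""
    differing = tag_a ^ tag_b  # bits set where the two tags disagree
    return n_bits - differing.bit_length()


# Hypothetical 4-bit tags; the words and values are made up for illustration.
N_BITS = 4
tags = {"cat": 0b1010, "dog": 0b1011, "run": 0b0110}

if __name__ == "__main__":
    print(class_at_level(tags["cat"], N_BITS, 2))
    print(shared_depth(tags["cat"], tags["dog"], N_BITS))
    print(same_class(tags["cat"], tags["run"], N_BITS, 1))
```

Under this assumed encoding, a single shift recovers the class at any granularity, which is one way to read the claim that access to a tag gives access to all classification levels for the word.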

arXiv.org e-Print Archive

CiteSeerX

Questioning short-term memory and its measurement: why digit span measures long-term associative learning

Authors: Gary Jones, Bill Macken
Publication venue: Elsevier BV
Publication date: 01/01/2015

Traditional accounts of verbal short-term memory explain differences in performance for different types of verbal material by reference to inherent characteristics of the verbal items making up memory sequences. The role of previous experience with sequences of different types is ostensibly controlled for either by deliberate exclusion or by presenting multiple trials constructed from different random permutations

Elsevier - Publisher Connector

Crossref

Online Research @ Cardiff

Nottingham Trent Institutional Repository (IRep)

A network model of interpersonal alignment in dialog

Authors: Alexander Mehler, Andy Lücking, Petra Weiß
Publication date: 01/01/2010

In dyadic communication, both interlocutors adapt to each other linguistically, that is, they align interpersonally. In this article, we develop a framework for modeling interpersonal alignment in terms of the structural similarity of the interlocutors’ dialog lexica. This is done by means of so-called two-layer time-aligned network series, that is, a time-adjusted graph model. The graph model is partitioned into two layers, so that the interlocutors’ lexica are captured as subgraphs of an encompassing dialog graph. Each constituent network of the series is updated utterance-wise. Thus, both the inherent bipartition of dyadic conversations and their gradual development are modeled. The notion of alignment is then operationalized within a quantitative model of structure formation based on the mutual information of the subgraphs that represent the interlocutors’ dialog lexica. By adapting and further developing several models of complex network theory, we show that dialog lexica evolve as a novel class of graphs that have not been considered before in the area of complex (linguistic) networks. Additionally, we show that our framework allows for classifying dialogs according to their alignment status. To the best of our knowledge, this is the first approach to measuring alignment in communication that explores the similarities of graph-like cognitive representations. Keywords: alignment in communication; structural coupling; linguistic networks; graph distance measures; mutual information of graphs; quantitative network analysis.

Crossref

Directory of Open Access Journals

Publications at Bielefeld University

Hochschulschriftenserver - Universität Frankfurt am Main

Designing Women: Essentializing Femininity in AI Linguistics

Author: Ellianie S. Vega
Publication venue: The Cupola: Scholarship at Gettysburg College
Publication date: 01/10/2019

Since the eighties, feminists have considered technology a force capable of subverting sexism because of technology’s ability to produce unbiased logic. Most famously, Donna Haraway’s “A Cyborg Manifesto” posits that the cyborg has the inherent capability to transcend gender because of its removal from social construct and lack of loyalty to the natural world. But while humanoids and artificial intelligence have been imagined as inherently subversive to gender, current artificial intelligence perpetuates gender divides in labor and language as their programmers imbue them with traits considered “feminine.” A majority of 21st-century AI and humanoids are programmed to fit female stereotypes as they fulfill emotional labor and perform pink-collar tasks, whether through roles as therapists, query-fillers, or companions. This paper examines four specific chat-based AI (ELIZA, XiaoIce, Sophia, and Erica) and how their feminine linguistic patterns are used to maintain the illusion of emotional understanding with regard to the tasks that they perform. Overall, chat-based AI fails to subvert gender roles, as feminine AI are relegated to the realm of emotional intelligence and labor.

Gettysburg College