11 research outputs found

    Pliers and snowball at CLEF 2002

    We test the utility of European language stemmers created using the Snowball language [1]. This allows us to experiment with PLIERS in languages other than English. We also report on BM25 tuning constant experiments conducted to find the best settings for our searches.

    Combining Multiple Strategies for Effective Monolingual and Cross-Language Retrieval

    Normalização automática de descrições de hotéis

    Descriptions of tourism products in the hotel, aviation, car-rental and holiday-package sectors rely mainly on highly heterogeneous natural-language text, with styles, presentations and content that differ widely from one another. Because the tourism sector is highly dynamic and its products and offers change constantly, normalizing all of this information manually is not feasible. This work built a prototype that automatically classifies and extracts information from descriptions of tourism products. The information is first classified by type. The relevant elements of each type are then extracted and turned into easily computable objects. From the extracted objects, the prototype uses text and image templates to automatically generate normalized descriptions targeted at a given market. This versatility enables a new set of services for promoting and selling the products that would be impossible to implement with the original information. Although the prototype could be applied to other domains, it was evaluated on the normalization of hotel descriptions. The descriptive sentences for each hotel are classified by type (Location, Services and/or Equipment) using a machine learning algorithm that achieves an average recall of 96% and precision of 72%. Recall was considered the most important measure, since maximizing it ensures that no sentences are lost to later processing stages. This work also led to the construction and population of a hotel database that allows hotels to be searched by their characteristics. This functionality would not be possible using the original content.
    ABSTRACT: The description of tourism products, such as hotel, aviation, rent-a-car and holiday packages, relies heavily on natural language expressions. Given the extent of tourism offers and the high dynamics of the tourism sector, manual data management is not a reliable or scalable solution. Offer descriptions - in the order of thousands - are structured in different ways, possibly comprising different languages, complementing and/or overlapping one another. This work aims at creating a prototype for the automatic classification and extraction of relevant knowledge from tourism-related text expressions. Captured knowledge is represented in a normalized/standard format to enable new services, based on this information, for promoting and selling tourism products that would be impossible to implement with the raw information. Although it could be applied to other areas, this prototype was evaluated on the normalization of hotel descriptions. Hotel descriptive sentences are classified according to their type (Location, Services and/or Equipment) using a machine learning algorithm. The resulting setup obtained an average recall of 96% and precision of 72%. Recall was considered the most important measure of performance, since maximizing it ensures that sentences are not lost in later processing. As a side product, a database of hotels was built and populated, with search facilities over hotel characteristics. This would not be possible using the original contents.

    The Birmingham group: reading the second city in the 1930s

    'Proletarian' or 'Working-Class' fiction has been described politically as propagandistic (the imposition of political dogma on creativity, the literature of a party disguised as the literature of a class) and often dismissed as conservative, lacking in invention, or simply the naive emulation of bourgeois realism. Attempts to define it and to vouchsafe the 'authenticity' of its creators have continually shown it to be resistant to any single or easy definition. This thesis will argue that the narratives of the Birmingham group, rather than being constrained by such narrow and negative assessments, present a direct challenge to and refutation of them. Departing from traditional views of working-class writing as a genre informed by male-oriented notions of class solidarity, and from contemporary critiques which, during a period of representational experimentation, had somewhat perfunctorily seen working-class literature as indebted to the more individualistic concerns of bourgeois realism, this thesis will suggest that the narratives of the Birmingham group are more accurately characterised by the diversity of their innovative and formal approach. Far from politically quiescent, they operate in the liminal space between overt propaganda and addressedness to reveal how intersections of class, gender and sexual identity, frequently overlooked due to the critical legacy of patriarchal and workerist assumptions, were from the outset not only present in their narratives but also prescient of political and formal issues raised in more recent discussion of working-class literature.

    The Free Press : February 11, 2010

    Space & Distance As I Require: The Journals and Prose Fragments of Philip Whalen 1950-1966

    Space & Distance As I Require: The Journals & Prose Fragments of Philip Whalen 1950 - 1966 presents the early journals, prose fragments, and a few unpublished poems and essays by San Francisco Renaissance and Beat Generation poet Philip Whalen (1923-2002). This work includes a scholarly apparatus with both general literary and textual introductions, a critical bibliography that reflects my literary-historical concerns, brief section introductions, annotations, and an informal concordance with Whalen's poetry utilizing The Collected Poems of Philip Whalen (ed. Rothenberg, 2007) as a reference work. Philip Whalen was an Irish-American writer with roots in small-town Oregon, a poet who was, as Kenneth Rexroth once said, as intensely Northwestern in sensibility as the painters Morris Graves and Mark Tobey. Whalen was a poet of complex sources and influences, extraordinarily well-read in Elizabethan and 18th-century English literature, in particular the satiric gestures of Sterne, Pope, Johnson, and Swift. During his lifetime Whalen produced a remarkable oeuvre of close to twenty collections of verse, twenty broadsides, two novels, eight or nine works of experimental prose, plus several dozen critical essays, lectures, commentaries, introductions, prefaces, and interviews, an extensive literary correspondence, and forty years of carefully written literary journals, ranging from roughly 1952 to 1992. Like two of his favorite 18th-century novelists, Laurence Sterne and Jonathan Swift, Whalen lived the second half of his life as an ordained cleric within a formal religious setting, a new religion for the West, Zen Buddhism, a spiritual tradition founded in India at least a thousand years before the birth of Christ. Whalen began his study of Buddhism at Reed College in Portland, Oregon, having served in the Army Air Force as a radio repairman during the final years of WWII. At Reed, Whalen's interest in Asian culture was encouraged and augmented by his roommate Gary Snyder, the Pulitzer Prize-winning poet who blazed a circuitous trail around Ezra Pound, bypassing Fascism and Confucianism to forge a link between Zen Buddhism, Northwestern Wobbly unionism, and Marxist economic theory. He and Whalen remained close friends throughout Whalen's life. It was Snyder who probably first taught Whalen how to sit still in the Zen meditation posture, a fundamentally ungraspable, trans-rational, non-discursive, and deconstructive form of introspection that influenced Whalen's writing and played a decisive role in his poetics. Shortly after the landmark Six Gallery poetry reading in San Francisco in October 1955, Snyder moved to Japan to study Zen, leaving Whalen to fend for himself in an apartment he shared in Berkeley with Allen Ginsberg and Jack Kerouac. The journals show that Whalen was clear but shy about his bisexuality. For a period of time he was deeply involved in a love triangle, or rather a pentangle, with two married people, one of them a man, the other Gary Snyder's wife, the poet Joanne Kyger. He remained in the U.S. during the late fifties and early to mid-1960s, a tumultuous six or seven years during which he was unable to support himself financially, alternately couch-surfing with friends, inhabiting a shack in the woods on Mt. Tamalpais, and bumming free rooms from friends in San Francisco, Berkeley, and Marin County. He also attempted a 'straight' job and career in Newport, Oregon, and lived in San Francisco for over two years with his companion and lover Leslie Thompson. Finally, in February 1966, at Snyder's behest, Whalen moved to Japan. He taught English for a regular weekly salary in the ancient capital city of Kyoto, spending his spare time reading, writing, and studying Japanese culture, religion, art, theater, and literature. I am presenting here the poet's 'pre-Kyoto' journals and fragments.

    Maritime expressions: a corpus-based exploration of maritime metaphors

    This study uses a purpose-built corpus to explore the linguistic legacy of Britain’s maritime history found in the form of hundreds of specialised ‘Maritime Expressions’ (MEs), such as TAKEN ABACK, ANCHOR and ALOOF, that permeate modern English. Selecting just those expressions commencing with ‘A’, it analyses 61 MEs in detail and describes the processes by which these technical expressions, from a highly specialised occupational discourse community, have made their way into modern English. The Maritime Text Corpus (MTC) comprises 8.8 million words, encompassing a range of text types and registers, selected to provide a cross-section of ‘maritime’ writing. It is analysed using WordSmith analytical software (Scott, 2010), with the 100 million-word British National Corpus (BNC) as a reference corpus. Using the MTC, a list of keywords of specific salience within the maritime discourse has been compiled and, using frequency data, concordances and collocations, these MEs are described in detail and their use and form in the MTC and the BNC are compared. The study examines the transformation from ME to figurative use in the general discourse, in terms of form and metaphoricity. MEs are classified according to their metaphorical strength and their transference from maritime usage into new registers and domains such as business, politics, sports and reportage. A revised model of metaphoricity is developed and a new category of figurative expression, the ‘resonator’, is proposed. Additionally, developing the work of Lakoff and Johnson, Kövecses and others on Conceptual Metaphor Theory (CMT), a number of Maritime Conceptual Metaphors are identified and their cultural significance is discussed.

    Massachusetts Domestic and Foreign Corporations Subject to an Excise: For the Use of Assessors (2004)
