Search CORE

161 research outputs found

Text Mining Infrastructure in R

Author: David Meyer
Ingo Feinerer
Kurt Hornik

During the last decade text mining has become a widely used discipline utilizing statistical and machine learning methods. We present the tm package which provides a framework for text mining applications within R. We give a survey on text mining facilities in R and explain how typical application tasks can be carried out using our framework. We present techniques for count-based analysis methods, text clustering, text classification and string kernels.

Research Papers in Economics

Extracting predictive models from marked-up free-text documents at the Royal Botanic Gardens, Kew, London

Author: A. Tucker
E. Steele
I. Feinerer
M.R. Evans
R. Feldman
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

In this paper we explore the combination of text mining, unsupervised and supervised learning to extract predictive models from a corpus of digitised historical floras. These documents deal with the nomenclature, geographical distribution, ecology and comparative morphology of the species of a region. Here we exploit the fact that portions of text in the floras are marked up as different types of trait and habitat. We infer models from these different texts that can predict different habitat-types based upon the traits of plant species. We also integrate plant taxonomy data in order to assist in the validation of our models. We have shown that by clustering text describing the habitat of different floras we can identify a number of important and distinct habitats that are associated with particular families of species along with statistical significance scores. We have also shown that by using these discovered habitat-types as labels for supervised learning we can predict them based upon a subset of traits, identified using wrapper feature selection.

Crossref

British Library (BL) Shared Research Repository

Brunel University Research Archive

Identifying patient experience from online resources via sentiment analysis and topic modelling

Author: Andrew S.
Choices NHS
Feinerer I.
Greaves F.
Jason W.
Manary M.P.
Meyer D.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/12/2016

Crossref

Royal Holloway - Pure

Lossless Selection Views under Constraints

Author: Feinerer Ingo
Franconi Enrico
Guagliardo Paolo
Publication date: 01/01/2014

The problem of updating a database through a set of views consists in propagating updates of the views to the base relations over which the view relations are defined, so that the changes to the database reflect exactly those to the views. This is a classical problem in database research, known as the view update prob

CiteSeerX

Edinburgh Research Explorer

A hashtag worth a thousand words: Discursive strategies around #JeNeSuisPasCharlie after the 2015 Charlie Hebdo shooting

Author: Badouard R.
Bruns A.
Feinerer I.
Freelon D.
Giglietto F.
Hampton K.
James N. A.
Mehan H.
Morozov E.
Morstatter F.
Walter N.
Publication venue: 'SAGE Publications'
Publication date: 01/01/2017

Following a shooting attack by two self-proclaimed Islamist gunmen at the offices of French satirical weekly Charlie Hebdo on 7 January 2015, there emerged the hashtag #JeSuisCharlie on Twitter as an expression of solidarity and support for the magazine’s right to free speech. Almost simultaneously, however, there was also #JeNeSuisPasCharlie explicitly countering the former, affirmative hashtag. Based on a multimethod analysis of 74,047 tweets containing #JeNeSuisPasCharlie posted between 7 and 11 January, this article reveals that users of the hashtag under study employed various discursive strategies and tactics to challenge the mainstream framing of the shooting as the universal value of freedom of expression being threatened by religious extremism, while protecting themselves from the risk of being viewed as disrespecting victims or endorsing the violence committed. The significance of this study is twofold. First, it extends the literature on strategic speech acts by examining how such acts take place in a social media context. Second, it highlights the need for a multidimensional and reflective methodology when dealing with data mined from social media.

Archivio istituzionale della ricerca - Università di Urbino

Crossref

Directory of Open Access Journals

SOAS Research Online

A comparison of tools for teaching formal software verification

Author: D Gries
DA Patterson
E Dijkstra
E Stiller
EM Clarke
Gernot Salzer
Ingo Feinerer
MRA Huth
PJ Denning
S Owre
W Ahrendt
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Optimal and Automated Deployment for Microservices

Author: A Bergmayr
A Brogi
D Merkel
E Ábrahám
EB Johnsen
F Durán
I Feinerer
J Humble
J Mauro
K Hightower
M Bravetti
N Dragoni
R Di Cosmo
S de Gouw
Publication date: 01/01/2019

Microservices are highly modular and scalable Service Oriented Architectures. They underpin automated deployment practices like Continuous Deployment and Autoscaling. In this paper, we formalize these practices and show that automated deployment - proven undecidable in the general case - is algorithmically treatable for microservices. Our key assumption is that the configuration life-cycle of a microservice is split into two phases: (i) creation, which entails establishing initial connections with already available microservices, and (ii) subsequent binding/unbinding with other microservices. To illustrate the applicability of our approach, we implement an automatic optimal deployment tool and compute deployment plans for a realistic microservice architecture, modeled in the Abstract Behavioral Specification (ABS) language.

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

The Opinion Management Framework: Identifying and addressing customer concerns extracted from online product reviews

Author: Blei
Bruce Spencer
Chang
Eleanna Kafeza
Feinerer
Feras Al-Obeidat
Go
Gottschalk
Grün
Guo
Hopcroft
King
Kolisch
Li
Lin
Liu
Loughran
Mei
Mohammad
Ong
Phan
Purnawirawan
Rana
Stone
Sun
Słowiński
Wu
Yan
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

Online product reviews appear in many e-commerce websites and help merchants understand any obstacles experienced by existing customers. Negative reviews can discourage potential customers, especially when such reviews appear with no response from the merchant. After the appearance of an unfavourable review, the merchant is at risk of incurring negative impact on the community of present and future customers, which can harm the business. He or she may be able to deflect this by promptly communicating any planned actions, completing them, and reporting that they are complete. The initial communication is the most urgent. When presented with a set of online reviews, a merchant's predicament is to quickly decide what tasks need to be done, which are the most important, and when each can be completed. In this paper, we describe our Opinion Management Framework that assists a merchant to quickly identify, select, and schedule tasks that can rectify issues mentioned in online reviews. We also describe an interactive web-based prototype that helps the business owner (1) to select a set of tasks with an optimal cost/benefit tradeoff, (2) to ensure that all tasks can be completed within a specific time limit, and (3) to conservatively estimate a completion date for each issue's resolution.

ZU Scholars (Zayed University)

Crossref

How Digital Are the Digital Humanities? An Analysis of Two Scholarly Blogging Platforms

Author: A Gruzd
B Grün
C Puschmann
C Ross
CL Borgman
Cornelius Puschmann
D Berry
DM Blei
ET Meyer
F Moretti
H Shema
H Wickham
I Feinerer
I Rowlands
J Bar-Ilan
J Moody
JA Evans
JB Kruskal
L Leydesdorff
M Callon
M Mahrt
M Nentwich
M Taddy
Marco Bastos
MG Kirschenbaum
MK Gold
P Juola
S Schreibman
SR Lipsitz
T McPherson
TJ Pinch
Vincent Larivière
WH Dutton
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2015

In this paper we compare two academic networking platforms, HASTAC and Hypotheses, to show the distinct ways in which they serve specific communities in the Digital Humanities (DH) in different national and disciplinary contexts. After providing background information on both platforms, we apply co-word analysis and topic modeling to show thematic similarities and differences between the two sites, focusing particularly on how they frame DH as a new paradigm in humanities research. We encounter a much higher ratio of posts using humanities-related terms compared to their digital counterparts, suggesting a one-way dependency of digital humanities-related terms on the corresponding unprefixed labels. The results also show that the terms digital archive, digital literacy, and digital pedagogy are relatively independent from the respective unprefixed terms, and that digital publishing, digital libraries, and digital media show considerable cross-pollination between the specialization and the general noun. The topic modeling reproduces these findings and reveals further differences between the two platforms. Our findings also indicate local differences in how the emerging field of DH is conceptualized and show dynamic topical shifts inside these respective contexts.

Public Library of Science (PLOS)

City Research Online

Crossref

Directory of Open Access Journals

PubMed Central

Zeppelin Universität (ZU)

DataSHIELD – new directions and dimensions

Author: Andrew Turner
Avraam
Berg
Boulton
Budin-Ljøsne
Burton
Butters
Cai
Carter
Coffey
Dehghan
Demetris Avraam
Doiron
Donalek
Feinerer
Ford
Fortier
Gaye
Howard
Hundepool
Iruthayarajah
James Baker
Jonathan A. Tedds
Jones
Kamel Boulos
Karr
Kratz
Lappalainen
Lyons
Madeleine Murtagh
McGready
Meystre
Miwa
Murtagh
Ohno-Machado
Oliver W. Butters
Olshannikova
Paul R. Burton
Platt
Power
Rak
Rebecca C. Wilson
Ross
Sastry
Schendel
Seth
Shlomo
Sudlow
Suissa
Sweeney
Wallace
Wilson
Wolfson
Wu
Yuan
Zhou
Zijlema
Publication venue: 'Ubiquity Press, Ltd.'
Publication date: 01/01/2017

In disciplines such as biomedicine and social sciences, sharing and combining sensitive individual-level data is often prohibited by ethical-legal or governance constraints and other barriers such as the control of intellectual property or the huge sample sizes. DataSHIELD (Data Aggregation Through Anonymous Summary-statistics from Harmonised Individual-levEL Databases) is a distributed approach that allows the analysis of sensitive individual-level data from one study, and the co-analysis of such data from several studies simultaneously without physically pooling them or disclosing any data. Following initial proof of principle, a stable DataSHIELD platform has now been implemented in a number of epidemiological consortia. This paper reports three new applications of DataSHIELD including application to post-publication sensitive data analysis, text data analysis and privacy protected data visualisation. Expansion of DataSHIELD analytic functionality and application to additional data types demonstrate the broad applications of the software beyond biomedical sciences.

University of Liverpool Repository

Crossref

Directory of Open Access Journals

Oxford University Research Archive

Sussex Research Online

Explore Bristol Research

Leicester Research Archive