Dispersion chromium product Summary report
Vapor deposition and precipitation hardening methods for producing dispersion chromium alloy
Identifying Retweetable Tweets with a Personalized Global Classifier
In this paper we present a method to identify tweets that a user may find
interesting enough to retweet. The method is based on a global, but
personalized classifier, which is trained on data from several users,
represented in terms of user-specific features. Thus, the method is trained on
a sufficient volume of data, while also being able to make personalized
decisions, i.e., the same post received by two different users may lead to
different classification decisions. Experimenting with a collection of approx.
130K tweets received by 122 journalists, we train a logistic regression
classifier, using a wide variety of features: the content of each tweet, its
novelty, its text similarity to tweets previously posted or retweeted by the
recipient or sender of the tweet, the network influence of the author and
sender, and their past interactions. Our system obtains F1 approx. 0.9 using
only 10 features and 5K training instances.
Comment: This is a long paper version of the extended abstract titled "A
Personalized Global Filter To Predict Retweets", of the same authors, which
was published in the 25th ACM UMAP conference in Bratislava, Slovakia, in
July 201
A HMM POS Tagger for Micro-blogging Type Texts
The high volume of communication via micro-blogging type messages has created an increased demand for text processing tools customised for the unstructured text genre. The performance of available text processing tools developed on structured texts has been shown to deteriorate significantly when they are used on unstructured, micro-blogging type texts. In this paper, we present the results of testing an HMM-based POS (Part-Of-Speech) tagging model customized for unstructured texts. We also evaluated the tagger against published CRF-based state-of-the-art POS tagging models customized for Tweet messages using three publicly available Tweet corpora. Finally, we ran cross-validation tests with both taggers by training them on one Tweet corpus and testing them on another.
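As a rough illustration of how an HMM-based tagger picks a tag sequence, the sketch below implements standard Viterbi decoding. The abstract does not describe the authors' decoder, so the function name `viterbi`, the `unk` fallback probability for unseen words, and the toy probability tables in the usage note are all assumptions made for illustration.

```python
def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most probable tag sequence for `words` under a toy HMM.

    start_p[t]     : probability that a sentence starts with tag t
    trans_p[p][t]  : probability of tag t following tag p
    emit_p[t][w]   : probability of tag t emitting word w
    unk            : assumed fallback emission probability for unseen words
    """
    if not words:
        return []
    # Each column maps tag -> (best path probability, previous tag on that path).
    columns = [{t: (start_p[t] * emit_p[t].get(words[0], unk), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            column[t] = max(
                (columns[-1][p][0] * trans_p[p][t] * emit_p[t].get(word, unk), p)
                for p in tags
            )
        columns.append(column)
    # Backtrace from the best final tag to the start of the sentence.
    path = [max(tags, key=lambda t: columns[-1][t][0])]
    for column in reversed(columns[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path
```

With hypothetical tables such as `start_p = {"PRON": 0.8, "VERB": 0.2}`, `trans_p = {"PRON": {"PRON": 0.1, "VERB": 0.9}, "VERB": {"PRON": 0.6, "VERB": 0.4}}` and `emit_p = {"PRON": {"i": 0.9, "u": 0.1}, "VERB": {"luv": 0.5, "run": 0.5}}`, calling `viterbi(["i", "luv", "u"], ["PRON", "VERB"], start_p, trans_p, emit_p)` tags the informal spelling "u" as a pronoun. A real tagger would estimate these tables from an annotated corpus and work in log space to avoid underflow.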
Self-Control in Cyberspace: Applying Dual Systems Theory to a Review of Digital Self-Control Tools
Many people struggle to control their use of digital devices. However, our
understanding of the design mechanisms that support user self-control remains
limited. In this paper, we make two contributions to HCI research in this
space: first, we analyse 367 apps and browser extensions from the Google Play,
Chrome Web, and Apple App stores to identify common core design features and
intervention strategies afforded by current tools for digital self-control.
Second, we adapt and apply an integrative dual systems model of self-regulation
as a framework for organising and evaluating the design features found. Our
analysis aims to help the design of better tools in two ways: (i) by
identifying how, through a well-established model of self-regulation, current
tools overlap and differ in how they support self-control; and (ii) by using
the model to reveal underexplored cognitive mechanisms that could aid the
design of new tools.
Comment: 11.5 pages (excl. references), 6 figures, 1 table
Deep Memory Networks for Attitude Identification
We consider the task of identifying attitudes towards a given set of entities
from text. Conventionally, this task is decomposed into two separate subtasks:
target detection that identifies whether each entity is mentioned in the text,
either explicitly or implicitly, and polarity classification that classifies
the exact sentiment towards an identified entity (the target) into positive,
negative, or neutral.
Instead, we show that attitude identification can be solved with an
end-to-end machine learning architecture, in which the two subtasks are
interleaved by a deep memory network. In this way, signals produced in target
detection provide clues for polarity classification, and reversely, the
predicted polarity provides feedback to the identification of targets.
Moreover, the treatments for the set of targets also influence each other --
the learned representations may share the same semantics for some targets but
vary for others. The proposed deep memory network, the AttNet, outperforms
methods that do not consider the interactions between the subtasks or those
among the targets, including conventional machine learning methods and the
state-of-the-art deep learning models.
Comment: Accepted to WSDM'1
Berkeley Accelerator Space Effects (BASE) Light Ion Facility Upgrade
The BASE Light Ion Facility upgrades have been completed. All proton beams are now delivered to Cave 4A. New control software, a larger diameter beam window, and improved quality assurance measures have been added.
From unlabelled tweets to Twitter-specific opinion words
In this article, we propose a word-level classification model for automatically generating a Twitter-specific opinion lexicon from a corpus of unlabelled tweets. The tweets from the corpus are represented by two vectors: a bag-of-words vector and a semantic vector based on word-clusters. We propose a distributional representation for words by treating them as the centroids of the tweet vectors in which they appear. The lexicon generation is conducted by training a word-level classifier using these centroids to form the instance space and a seed lexicon to label the training instances. Experimental results show that the two types of tweet vectors complement each other in a statistically significant manner and that our generated lexicon produces significant improvements for tweet-level polarity classification
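The centroid representation described above can be sketched in a few lines. This is a minimal illustration that covers only the bag-of-words vectors (the cluster-based semantic vectors are omitted), and the helper names `tweet_vector` and `word_centroids` are hypothetical rather than taken from the article.

```python
from collections import defaultdict


def tweet_vector(tokens, index):
    """Bag-of-words count vector for one tokenized tweet."""
    vec = [0.0] * len(index)
    for tok in tokens:
        if tok in index:
            vec[index[tok]] += 1.0
    return vec


def word_centroids(tweets):
    """Represent each word as the mean of the vectors of the tweets containing it."""
    vocabulary = sorted({tok for tweet in tweets for tok in tweet})
    index = {word: i for i, word in enumerate(vocabulary)}
    sums = {}
    counts = defaultdict(int)
    for tweet in tweets:
        vec = tweet_vector(tweet, index)
        for word in set(tweet):  # count each tweet once per word
            previous = sums.get(word, [0.0] * len(vocabulary))
            sums[word] = [s + v for s, v in zip(previous, vec)]
            counts[word] += 1
    centroids = {w: [s / counts[w] for s in sums[w]] for w in sums}
    return vocabulary, centroids
```

In the setting the abstract describes, centroids like these would form the instance space for a word-level classifier, with a seed lexicon supplying the training labels.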
Emerging Approaches to DNA Data Storage: Challenges and Prospects
With the total amount of worldwide data skyrocketing, the global data storage demand is predicted to grow to 1.75 × 10^14 GB by 2025. Traditional storage methods have difficulties keeping pace given that current storage media have a maximum density of 10^3 GB/mm^3. As such, data production will far exceed the capacity of currently available storage methods. The costs of maintaining and transferring data, as well as the limited lifespans and significant data losses associated with current technologies also demand advanced solutions for information storage. Nature offers a powerful alternative through the storage of information that defines living organisms in unique orders of four bases (A, T, C, G) located in molecules called deoxyribonucleic acid (DNA). DNA molecules as information carriers have many advantages over traditional storage media. Their high storage density, potentially low maintenance cost, ease of synthesis, and chemical modification make them an ideal alternative for information storage. To this end, rapid progress has been made over the past decade by exploiting user-defined DNA materials to encode information. In this review, we discuss the most recent advances of DNA-based data storage with a major focus on the challenges that remain in this promising field, including the current intrinsic low speed in data writing and reading and the high cost per byte stored. Alternatively, data storage relying on DNA nanostructures (as opposed to DNA sequence) as well as on other combinations of nanomaterials and biomolecules are proposed with promising technological and economic advantages. In summarizing the advances that have been made and underlining the challenges that remain, we provide a roadmap for the ongoing research in this rapidly growing field, which will enable the development of technological solutions to the global demand for superior storage methodologies.
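To make the idea of encoding information in the order of four bases concrete, here is a minimal sketch of a naive mapping that stores two bits per base. The mapping (`A`=00, `C`=01, `G`=10, `T`=11) and the function names `encode` and `decode` are assumptions chosen for illustration; the review does not prescribe this scheme, and practical schemes typically add error correction and constrain the sequences they produce.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def encode(data: bytes) -> str:
    """Encode bytes as a DNA strand, four bases per byte, most significant bits first."""
    strand = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Invert encode(); the strand length must be a multiple of four."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)  # raises ValueError on a non-ACGT symbol
        out.append(byte)
    return bytes(out)
```

For example, `encode(b"A")` yields `"CAAC"`, because the byte 0x41 is `01 00 00 01` in binary, and `decode` recovers the original bytes.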
Evaluation of an interactive, case-based review session in teaching medical microbiology
Background: Oklahoma State University-Center for Health Sciences (OSU-CHS) has replaced its microbiology wet laboratory with a variety of tutorials including a case-based interactive session called Microbial Jeopardy!. The question remains whether the time spent by students and faculty in the interactive case-based tutorial is worthwhile. This study was designed to address this question by analyzing student performance data and assessing students' perceptions regarding the tutorial.
Methods: Both quantitative and qualitative data were used in the current study. Part One of the study involved assessing student performance using archival records of seven case-based exam questions used in the 2004, 2005, 2006, and 2007 OSU-CHS Medical Microbiology course. Two-sample t-tests for proportions were used to test for significant differences related to tutorial usage. Part Two used both quantitative and qualitative means to assess students' perceptions of the Microbial Jeopardy! session. First, a retrospective survey was administered to students who were enrolled in Medical Microbiology in 2006 or 2007. Second, responses to open-ended items from the 2008 course evaluations were reviewed for comments regarding the Microbial Jeopardy! session.
Results: Both student performance and student perception data support continued use of the tutorials. Quantitative and qualitative data converge to suggest that students like and learn from the interactive, case-based session.
Conclusion: The case-based tutorial appears to improve student performance on case-based exam questions. Additionally, students perceived the tutorial as helpful in preparing for exam questions and reviewing the course material. The time commitment for use of the case-based tutorial appears to be justified.
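A comparison of two proportions, such as the share of correct answers with and without the tutorial, can be sketched as follows. The abstract reports t-tests for proportions; this sketch swaps in the closely related pooled two-proportion z-test, and both the function name `two_proportion_z_test` and the counts in the usage note are hypothetical, not data from the study.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-sided z-test for the difference between two proportions.

    Returns (z statistic, two-sided p-value under the normal approximation).
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    variance = pooled * (1 - pooled) * (1 / n_a + 1 / n_b)
    if variance == 0:
        raise ValueError("test is undefined when the pooled proportion is 0 or 1")
    z = (p_a - p_b) / math.sqrt(variance)
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With made-up counts, `two_proportion_z_test(80, 100, 60, 100)` compares 80% correct against 60% correct. The normal approximation is only reasonable when each group has enough correct and incorrect answers.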