Text segmentation on multilabel documents: A distant-supervised approach
Segmenting text into semantically coherent segments is an important task with
applications in information retrieval and text summarization. Developing
accurate topical segmentation requires the availability of training data with
ground truth information at the segment level. However, generating such labeled
datasets, especially for applications in which the meaning of the labels is
user-defined, is expensive and time-consuming. In this paper, we develop an
approach that, instead of using segment-level ground truth information, uses
the set of labels associated with a document. These labels are easier to
obtain, as the training data essentially corresponds to a multilabel
dataset. Our method, which can be thought of as an instance of distant
supervision, improves upon the previous approaches by exploiting the fact that
consecutive sentences in a document tend to talk about the same topic, and
hence, probably belong to the same class. Experiments on the text segmentation
task on a variety of datasets show that the segmentation produced by our method
beats the competing approaches on four out of five datasets and performs on par
on the fifth dataset. On the multilabel text classification task, our method
performs on par with the competing approaches, while requiring significantly
less time to estimate.
Comment: Accepted in 2018 IEEE International Conference on Data Mining (ICDM)
Representation learning of drug and disease terms for drug repositioning
Drug repositioning (DR) refers to the identification of novel indications for
approved drugs. The huge investment of time and money required, together with
the risk of failure in clinical trials, has led to a surge in interest in drug
repositioning. DR exploits two major aspects associated with drugs and
diseases: the existence of similarity among drugs and among diseases due to
their shared genes or pathways, or their common biological effects. Existing
methods of identifying drug-disease associations rely mainly on the information
available in structured databases. On the other hand, the abundant
information available as free text in biomedical research articles is
not being fully exploited. Word embedding, that is, obtaining vector
representations of words from a large corpus of free text using neural network
methods, has been shown to perform well on several natural language processing
tasks. In this work we propose a novel way of representation learning to obtain
features of drugs and diseases by combining complementary information available
in unstructured texts and structured datasets. Next, we use a matrix completion
approach on these feature vectors to learn a projection matrix between the drug
and disease vector spaces. The proposed method has shown competitive performance
with state-of-the-art methods. Further, case studies on Alzheimer's disease and
hypertension have shown that the predicted associations match existing
knowledge.
Comment: Accepted to appear in 3rd IEEE International Conference on
Cybernetics (Spl Session: Deep Learning for Prediction and Estimation
Scaling behaviour in probabilistic neuronal cellular automata
We study a neural network model of interacting stochastic discrete two-state
cellular automata on a regular lattice. The system is externally tuned to a
critical point which varies with the degree of stochasticity (or the effective
temperature). There are avalanches of neuronal activity, namely spatially and
temporally contiguous sites of activity; a detailed numerical study of these
activity avalanches is presented, and single, joint and marginal probability
distributions are computed. At the critical point, we find that the scaling
exponents for the variables are in good agreement with a mean-field theory.
Comment: 7 pages, 4 figures. Accepted for publication in Physical Review
Invasion from the skies: the impact of foreign television on India
Increased competition and shrinking budgets have forced public service broadcasters around the world to reconsider their role. Doordarshan, India's public service television network, shares the problems faced by its counterparts in more developed countries. Although it continues to enjoy the luxury of being the only television network broadcasting its programs from within national boundaries, it has had to change its policies and programming to compete with foreign television channels including Murdoch's Star TV. However, it is the Indian audience that has benefited most from this competition from the skies in the form of improved quality and quantity of programs. This paper reports on an audience survey carried out in India earlier this year to gauge television viewers' perception of these benefits. The paper also gives background on the developments in the television industry in India.
Rendering Afghanistan legible: borders, frontiers and the ‘state’ of Afghanistan
The aim of this article is to show how the partial colonisation of Afghanistan and its ‘frontier status’ have generated discourses of state failure, which have led to the construal of Afghanistan as a zone of exception and of permanent crisis. The main argument is that colonial spatialisations have an enduring legacy that continues to structure the ways in which we experience and think about the Afghan state today. The construction of Afghanistan today as a ‘failed state’ has emerged through a historical (Anglophone) discourse that has relied heavily on the trope of the ‘frontier’ to make sense of the place between India and Central Asia. Thus, the ‘frontier’ has played a formative role in defining Afghanistan as a state and space and this plays out in how we interact – through representation, policies, and intervention – with the state in the global realm today. The import of this extends far and wide and has ramifications for our understanding of coloniality and liminality in contemporary international relations (IR), including scholarship on sovereignty, statehood, and borders. It also has implications for a range of states and places that are considered ‘fragile’, ‘failing’, or ‘failed’.
Special Issue “Gynaecological Cancers Risk: Breast Cancer, Ovarian Cancer and Endometrial Cancer”
Over the last decade there have been significant advances and developments in our understanding of factors affecting women’s cancer risk, our ability to identify individuals at increased risk and risk-stratify populations, as well as to implement and evaluate strategies for screening and prevention.
Consumer Search Behavior on the Mobile Internet: An Empirical Analysis
The increasing diffusion of smartphones and tablet computers has facilitated access to product information by providing Internet access anywhere and at any time. As a result, consumers are increasingly using the mobile Internet to search for product information to help them in their purchase decisions. However, there is very little documentation of how, where and when consumers actually carry out such search. Using location-based data from a leading European product information and barcode-scanning app that contains more than 80 million observations, this study provides insights using actual consumer search behavior. The results show that consumer search on the mobile Internet is not bound to store opening hours and is likely to happen to a large extent as ongoing search during consumption. Furthermore, consumers’ geographic mobility is positively correlated and previous search experience is negatively correlated with their search intensity. Finally, access to more types of information via search results, especially product-related information, reduces further search on price information, suggesting that product information content can lower price sensitivity.
http://deepblue.lib.umich.edu/bitstream/2027.42/111728/1/1275_Manchanda.pd