The Hidden Web, XML and Semantic Web: A Scientific Data Management Perspective
The World Wide Web no longer consists just of HTML pages. Our work sheds
light on a number of trends on the Internet that go beyond simple Web pages.
The hidden Web provides a wealth of data in semi-structured form, accessible
through Web forms and Web services. These services, as well as numerous other
applications on the Web, commonly use XML, the eXtensible Markup Language. XML
has become the lingua franca of the Internet that allows customized markups to
be defined for specific domains. On top of XML, the Semantic Web grows as a
common structured data source. In this work, we first explain each of these
developments in detail. Using real-world examples from scientific domains of
great interest today, we then demonstrate how these new developments can assist
the managing, harvesting, and organization of data on the Web. On the way, we
also illustrate the current research avenues in these domains. We believe that
this effort would help bridge multiple database tracks, thereby attracting
researchers with a view to extend database technology.
Comment: EDBT - Tutorial (2011)
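To make the abstract's point about domain-specific XML markup concrete, here is a minimal sketch in Python. The element and attribute names (`observations`, `star`, `magnitude`) and the helper `brightest` are hypothetical, invented for this illustration; they do not come from the tutorial.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain-specific markup for astronomy data.
# The vocabulary below is an assumption made for illustration only.
DOC = """
<observations>
  <star name="Vega"><magnitude>0.03</magnitude></star>
  <star name="Sirius"><magnitude>-1.46</magnitude></star>
</observations>
"""


def brightest(xml_text):
    """Return the name of the star with the lowest (brightest) magnitude."""
    root = ET.fromstring(xml_text.strip())
    stars = [
        (star.get("name"), float(star.findtext("magnitude")))
        for star in root.iter("star")
    ]
    # Lower apparent magnitude means a brighter object.
    return min(stars, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(brightest(DOC))
```

Because the markup names its own fields, a consumer can query the data by meaning rather than by position in a page, which is the property the abstract relies on for harvesting and organizing Web data.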
Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time
Dynamic topic modeling facilitates the identification of topical trends over
time in temporal collections of unstructured documents. We introduce a novel
unsupervised neural dynamic topic model, the Recurrent Neural
Network-Replicated Softmax Model (RNN-RSM), where the discovered topics at each
time influence the topic discovery in the subsequent time steps. We account for
the temporal ordering of documents by explicitly modeling a joint distribution
of latent topical dependencies over time, using distributional estimators with
temporal recurrent connections. Applying RNN-RSM to 19 years of articles on NLP
research, we demonstrate that, compared to state-of-the-art topic models, RNN-RSM
shows better generalization, topic interpretation, evolution and trends. We
also introduce a metric (named SPAN) to quantify the capability of a dynamic
topic model to capture word evolution in topics over time.
Comment: In Proceedings of the 16th Annual Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language
Technologies (NAACL-HLT 2018)
Basic tasks of sentiment analysis
Subjectivity detection is the task of identifying objective and subjective
sentences. Objective sentences are those which do not exhibit any sentiment.
A sentiment analysis engine should therefore identify and set aside the
objective sentences before further analysis, e.g., polarity detection. In
subjective sentences, opinions can often be expressed on one or multiple
topics. Aspect extraction is a subtask of sentiment analysis that consists in
identifying opinion targets in opinionated text, i.e., in detecting the
specific aspects of a product or service the opinion holder is either praising
or complaining about.
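The three tasks named in the abstract above (subjectivity detection, polarity detection, aspect extraction) can be sketched as a small pipeline. This is a toy, lexicon-based illustration, not the method of the cited work: the word lists `POSITIVE`, `NEGATIVE`, and `ASPECTS` and all function names are assumptions invented for the example.

```python
import re

# Toy lexicons, invented for illustration only; real systems learn these
# signals from data or use far larger curated resources.
POSITIVE = {"great", "excellent", "love"}
NEGATIVE = {"poor", "terrible", "slow"}
ASPECTS = {"battery", "screen", "service"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def is_subjective(sentence):
    """Subjectivity detection: does the sentence contain any sentiment word?"""
    return any(t in POSITIVE or t in NEGATIVE for t in tokenize(sentence))


def polarity(sentence):
    """Polarity detection: compare counts of positive and negative words."""
    tokens = tokenize(sentence)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def aspects(sentence):
    """Aspect extraction: list the known opinion targets that are mentioned."""
    return [t for t in tokenize(sentence) if t in ASPECTS]


def analyze(sentences):
    """Drop objective sentences, then label polarity and aspects of the rest."""
    return [(s, polarity(s), aspects(s)) for s in sentences if is_subjective(s)]


if __name__ == "__main__":
    reviews = [
        "The phone has a 6 inch screen.",
        "The battery is great but the service is terrible.",
        "I love the screen.",
    ]
    for row in analyze(reviews):
        print(row)
```

The second review shows why aspect extraction matters: its sentence-level polarity cancels out to neutral, even though it praises one aspect and criticizes another.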
- …