On Identifying Hashtags in Disaster Twitter Data
Tweet hashtags have the potential to improve the search for information
during disaster events. However, there is a large number of disaster-related
tweets that do not have any user-provided hashtags. Moreover, only a small
number of tweets that contain actionable hashtags are useful for disaster
response. To facilitate progress on automatic identification (or extraction) of
disaster hashtags for Twitter data, we construct a unique dataset of
disaster-related tweets annotated with hashtags useful for filtering actionable
information. Using this dataset, we further investigate Long Short-Term
Memory-based models within a Multi-Task Learning framework. The best-performing
model achieves an F1-score as high as 92.22%. The dataset, code, and other
resources are available on GitHub.
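For readers unfamiliar with the F1-score reported above, the following is a minimal sketch of one way such a score could be computed for hashtag extraction. It assumes a micro-averaged F1 over per-tweet hashtag sets; the abstract does not say how the authors computed the metric, and the function name `micro_f1` and the sample tweets are hypothetical.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-tweet hashtag sets.

    Assumption: this evaluation setup is illustrative only and is not
    taken from the paper.
    """
    tp = fp = fn = 0
    for gold_tags, pred_tags in zip(gold, predicted):
        gold_tags, pred_tags = set(gold_tags), set(pred_tags)
        tp += len(gold_tags & pred_tags)   # hashtags correctly predicted
        fp += len(pred_tags - gold_tags)   # predicted but not annotated
        fn += len(gold_tags - pred_tags)   # annotated but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for three tweets.
gold = [{"#flood", "#rescue"}, {"#earthquake"}, set()]
pred = [{"#flood"}, {"#earthquake", "#help"}, set()]
score = micro_f1(gold, pred)
```

Here two hashtags are matched, one is spurious and one is missed, so precision and recall are both 2/3.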
Concept-based Interactive Query Expansion Support Tool (CIQUEST)
This report describes a three-year project (2000-03) undertaken in the Information Studies
Department at The University of Sheffield and funded by Resource, The Council for
Museums, Archives and Libraries. The overall aim of the research was to provide user
support for query formulation and reformulation in searching large-scale textual resources
including those of the World Wide Web. More specifically the objectives were: to investigate
and evaluate methods for the automatic generation and organisation of concepts derived from
retrieved document sets, based on statistical methods for term weighting; and to conduct
user-based evaluations on the understanding, presentation and retrieval effectiveness of
concept structures in selecting candidate terms for interactive query expansion.
The TREC test collection formed the basis for the seven evaluative experiments conducted in
the course of the project. These formed four distinct phases in the project plan. In the first
phase, a series of experiments was conducted to investigate further techniques for concept
derivation and hierarchical organisation and structure. The second phase was concerned with
user-based validation of the concept structures. The results of phases 1 and 2 informed the
design of the test system, and the user interface was developed in phase 3. The final phase
entailed a user-based summative evaluation of the CiQuest system.
The main findings demonstrate that concept hierarchies can effectively be generated from
sets of retrieved documents and displayed to searchers in a meaningful way. The approach
provides the searcher with an overview of the contents of the retrieved documents, which in
turn facilitates the viewing of documents and selection of the most relevant ones. Concept
hierarchies are a good source of terms for query expansion and can improve precision. The
extraction of descriptive phrases as an alternative source of terms was also effective. With
respect to presentation, cascading menus were easy to browse for selecting terms and for
viewing documents. In conclusion, the project dissemination programme and future work are
outlined.
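As a rough illustration of how statistical term weighting can surface candidate terms for query expansion from a retrieved document set, here is a minimal sketch. It uses a plain tf-idf weighting as a stand-in; the report does not specify which weighting scheme CiQuest used, and the function `candidate_terms` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def candidate_terms(retrieved, collection, top_k=3):
    """Rank terms from the retrieved documents by tf * idf.

    Assumptions: whitespace tokenisation, idf computed over the whole
    collection, and every retrieved document is part of the collection.
    """
    n_docs = len(collection)
    doc_freq = Counter()
    for doc in collection:
        doc_freq.update(set(doc.split()))   # count each term once per document
    term_freq = Counter()
    for doc in retrieved:
        term_freq.update(doc.split())       # raw frequency in the retrieved set
    scores = {
        term: term_freq[term] * math.log(n_docs / doc_freq[term])
        for term in term_freq
        if doc_freq[term]
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


collection = [
    "query expansion terms",
    "concept hierarchy terms",
    "museum archive library",
    "library terms search",
]
retrieved = collection[:2]
suggestions = candidate_terms(retrieved, collection)
```

In this toy collection the word "terms" appears in most documents, so it is ranked below rarer words even though it occurs twice in the retrieved set.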
Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages
Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages publishes 17 papers that were presented at the conference organised in Dubrovnik, Croatia, 4-6 October 2010.
Enhancing the interactivity of a clinical decision support system by using knowledge engineering and natural language processing
Mental illness is a serious health problem that affects many people. Increasingly, Clinical Decision Support Systems (CDSS) are being used for diagnosis, and it is important to improve the reliability and performance of these systems. Missing a potential clue or making a wrong diagnosis can have a detrimental effect on the patient's quality of life and could lead to a fatal outcome. The context of this research is the Galatean Risk and Safety Tool (GRiST), a mental-health-risk assessment system. Previous research has shown that the success of a CDSS depends on its ease of use, reliability and interactivity. This research addresses these concerns for GRiST by deploying data mining techniques. Clinical narratives and numerical data have both been analysed for this purpose. Clinical narratives have been processed by natural language processing (NLP) technology to extract knowledge from them. SNOMED-CT was used as a reference ontology and the performance of the different extraction algorithms has been compared. A new Ensemble Concept Mining (ECM) method has been proposed, which may eliminate the need for domain-specific phrase annotation. Word embedding has been used to filter phrases semantically and to build a semantic representation of each of the GRiST ontology nodes. The Chi-square and FP-growth methods have been used to find relationships between GRiST ontology nodes. Interesting patterns have been found that could be used to provide real-time feedback to clinicians. Information gain has been used effectively to explain the differences between the clinicians and the consensus risk. A new risk management strategy has been explored by analysing repeat assessments. A few novel methods have been proposed to perform automatic background analysis of the patient data and improve the interactivity and reliability of GRiST and similar systems.
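To make the mention of information gain concrete, the sketch below computes the standard entropy-based information gain of a categorical feature with respect to a class label. It is not the thesis's actual analysis: the feature values, the risk labels, and the function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical data: one answer per assessment, with an assigned risk level.
risk = ["high", "high", "low", "low"]
informative = ["yes", "yes", "no", "no"]      # separates the risk levels exactly
uninformative = ["a", "b", "a", "b"]          # tells us nothing about risk

gain_informative = information_gain(informative, risk)
gain_uninformative = information_gain(uninformative, risk)
```

A feature that separates the labels perfectly recovers the full label entropy (1 bit here), while a feature that is independent of the labels yields a gain of zero.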
Recent Advances in Social Data and Artificial Intelligence 2019
The importance and usefulness of subjects and topics involving social data and artificial intelligence are becoming widely recognized. This book contains invited review, expository, and original research articles dealing with, and presenting state-of-the-art accounts of, the recent advances in the subjects of social data and artificial intelligence, and potentially their links to Cyberspace.