4,254 research outputs found
UPI: A Primary Index for Uncertain Databases
Uncertain data management has received growing attention from industry and academia. Many efforts have been made to optimize uncertain databases, including the development of special index data structures. However, none of these efforts have explored primary (clustered) indexes for uncertain databases, despite the fact that clustering has the potential to offer substantial speedups for non-selective analytic queries on large uncertain databases. In this paper, we propose a new index called a UPI (Uncertain Primary Index) that clusters heap files according to uncertain attributes with both discrete and continuous uncertainty distributions.
Because uncertain attributes may have several possible values, a UPI on an uncertain attribute duplicates tuple data once for each possible value. To prevent the size of the UPI from becoming unmanageable, its size is kept small by placing low-probability tuples in a special Cutoff Index that is consulted only when queries for low-probability values are run. We also propose several other optimizations, including techniques to improve secondary index performance and techniques to reduce maintenance costs and fragmentation by buffering changes to the table and writing updates in sequential batches. Finally, we develop cost models for UPIs to estimate query performance in various settings to help automatically select tuning parameters of a UPI.
We have implemented a prototype UPI and experimented on two real datasets. Our results show that UPIs can significantly (up to two orders of magnitude) improve the performance of uncertain queries both over clustered and unclustered attributes. We also show that our buffering techniques mitigate table fragmentation and keep the maintenance cost as low as or even lower than using an unclustered heap file.National Science Foundation (U.S.) (Grant IIS-0448124)National Science Foundation (U.S.) (Grant IIS-0905553)National Science Foundation (U.S.) (Grant IIS-0916691
Doctor of Philosophy
dissertationX-ray computed tomography (CT) is a widely popular medical imaging technique that allows for viewing of in vivo anatomy and physiology. In order to produce high-quality images and provide reliable treatment, CT imaging requires the precise knowledge of t
RECAP: Towards Precise Radiology Report Generation via Dynamic Disease Progression Reasoning
Automating radiology report generation can significantly alleviate
radiologists' workloads. Previous research has primarily focused on realizing
highly concise observations while neglecting the precise attributes that
determine the severity of diseases (e.g., small pleural effusion). Since
incorrect attributes will lead to imprecise radiology reports, strengthening
the generation process with precise attribute modeling becomes necessary.
Additionally, the temporal information contained in the historical records,
which is crucial in evaluating a patient's current condition (e.g., heart size
is unchanged), has also been largely disregarded. To address these issues, we
propose RECAP, which generates precise and accurate radiology reports via
dynamic disease progression reasoning. Specifically, RECAP first predicts the
observations and progressions (i.e., spatiotemporal information) given two
consecutive radiographs. It then combines the historical records,
spatiotemporal information, and radiographs for report generation, where a
disease progression graph and dynamic progression reasoning mechanism are
devised to accurately select the attributes of each observation and
progression. Extensive experiments on two publicly available datasets
demonstrate the effectiveness of our model.Comment: Accepted by Findings of EMNLP 202
Recommended from our members
Automatic message annotation and semantic interface for context aware mobile computing
This thesis was submitted for the degree of Docter of Philosophy and awarded by Brunel University.In this thesis, the concept of mobile messaging awareness has been investigated by designing and implementing a framework which is able to annotate the short text messages with context ontology for semantic reasoning inference and classification purposes. The annotated metadata of text message keywords are identified and annotated with concepts, entities and knowledge that drawn from ontology without the need of learning process and the proposed framework supports semantic reasoning based messages awareness for categorization purposes. The first stage of the research is developing the framework of facilitating mobile communication with short text annotated messages (SAMS), which facilitates annotating short text message with part of speech tags augmented with an internal and external metadata. In the SAMS framework the annotation process is carried out automatically at the time of composing a message. The obtained metadata is collected from the device’s file system and the message header information which is then accumulated with the message’s tagged keywords to form an XML file, simultaneously. The significance of annotation process is to assist the proposed framework during the search and retrieval processes to identify the tagged keywords and The Semantic Web Technologies are utilised to improve the reasoning mechanism. Later, the proposed framework is further improved “Contextual Ontology based Short Text Messages reasoning (SOIM)”. SOIM further enhances the search capabilities of SAMS by adopting short text message annotation and semantic reasoning capabilities with domain ontology as Domain ontology is modeled into set of ontological knowledge modules that capture features of contextual entities and features of particular event or situation. Fundamentally, the framework SOIM relies on the hierarchical semantic distance to compute an approximated match degree of new set of relevant keywords to their corresponding abstract class in the domain ontology. Adopting contextual ontology leverages the framework performance to enhance the text comprehension and message categorization. Fuzzy Sets and Rough Sets theory have been integrated with SOIM to improve the inference capabilities and system efficiency. Since SOIM is based on the degree of similarity to choose the matched pattern to the message, the issue of choosing the best-retrieved pattern has arisen during the stage of decision-making. Fuzzy reasoning classifier based rules that adopt the Fuzzy Set theory for decision making have been applied on top of SOIM framework in order to increase the accuracy of the classification process with clearer decision. The issue of uncertainty in the system has been addressed by utilising the Rough Sets theory, in which the irrelevant and indecisive properties which affect the framework efficiency negatively have been ignored during the matching process.The Ministry of Higher Education and Scientific Research (IRAQ
- …