Supporting Data mining of large databases by visual feedback queries
In this paper, we describe a query system that provides visual relevance feedback in querying large databases. Our goal is to support the process of data mining by representing as many data items as possible on the display. Arranging and coloring the data items as pixels according to their relevance to the query gives the user a visual impression of the resulting data set. Using an interactive query interface, the user may change the query dynamically and receives immediate feedback through the visual representation of the resulting data set. Furthermore, by using multiple windows for different parts of a complex query, the user gets visual feedback for each part of the query and may therefore understand the overall result more easily. Our system can represent the largest amount of data that can be visualized on current display technology, provides valuable feedback in querying the database, and allows the user to find results that would otherwise remain hidden in the database.
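The relevance-ordered pixel arrangement described in the abstract above can be illustrated with a minimal sketch. It assumes a single numeric attribute, a range query, a row-major layout, and a grayscale mapping; the names `relevance_distance` and `pixel_grid` and the distance measure are hypothetical choices for illustration, not the system's actual method.

```python
from typing import List


def relevance_distance(value: float, low: float, high: float) -> float:
    """Distance of a value from the query range [low, high]; 0 means it matches."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def pixel_grid(values: List[float], low: float, high: float, width: int) -> List[List[int]]:
    """Sort items by relevance and map each to a grayscale pixel (255 = exact match)."""
    distances = sorted(relevance_distance(v, low, high) for v in values)
    # Avoid dividing by zero when the list is empty or every item matches exactly.
    max_dist = max(distances) if distances and max(distances) > 0 else 1.0
    shades = [round(255 * (1 - d / max_dist)) for d in distances]
    # Row-major layout (an assumption): the most relevant items fill the top rows first.
    return [shades[i:i + width] for i in range(0, len(shades), width)]
```

Each data item occupies exactly one pixel, so the number of items shown is bounded only by the display resolution, which is the property the abstract emphasizes.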
Using Visualization to Support Data Mining of Large Existing Databases
In this paper, we present ideas on how visualization technology can be used to improve the difficult process of querying very large databases. With our VisDB system, we try to provide visual support not only for the query specification process, but also for evaluating query results and, thereafter, refining the query accordingly. The main idea of our system is to represent as many data items as possible by the pixels of the display device. Arranging and coloring the pixels according to their relevance to the query gives the user a visual impression of the resulting data set and of its relevance to the query. Using an interactive query interface, the user may change the query dynamically and receives immediate feedback through the visual representation of the resulting data set. By using multiple windows for different parts of the query, the user gets visual feedback for each part of the query and may therefore understand the overall result more easily. To support complex queries, we introduce the notion of approximate joins, which allow the user to find data items that only approximately fulfill join conditions. We also present ideas on how our technique may be extended to support the interoperation of heterogeneous databases. Finally, we discuss the performance problems caused by interfacing to existing database systems and present ideas for solving these problems by using data structures that support a multidimensional search of the database.
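The notion of an approximate join mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: records are plain dictionaries, the join attribute is numeric, and "approximately fulfill" is read as an absolute difference within a tolerance. The function name `approximate_join` and its parameters are hypothetical and are not taken from VisDB.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def approximate_join(
    left: List[Record], right: List[Record], key: str, tolerance: float
) -> List[Tuple[float, Record, Record]]:
    """Pair records whose `key` values differ by at most `tolerance`.

    Results are ordered by how closely the join condition is met,
    so exact matches (gap 0) come first.
    """
    matches = []
    for l_rec in left:
        for r_rec in right:
            gap = abs(l_rec[key] - r_rec[key])
            if gap <= tolerance:
                matches.append((gap, l_rec, r_rec))
    # Sort on the gap only, so the dictionaries themselves are never compared.
    matches.sort(key=lambda m: m[0])
    return matches
```

With `tolerance=0` the sketch reduces to an ordinary equi-join; the nested loop is quadratic and is meant for clarity, not for the large databases the paper targets.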
The relationship between IR and multimedia databases
Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient.
Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval.
Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database.
First, we introduce a concept layer to enable reasoning over low-level concepts in the database.
Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer.
Third, we add the functionality to process the users' relevance feedback.
We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing.
We conclude with an outline for implementation of miRRor on top of the Monet extensible database system.
Portinari: A Data Exploration Tool to Personalize Cervical Cancer Screening
Socio-technical systems play an important role in public health screening
programs to prevent cancer. Cervical cancer incidence has significantly
decreased in countries that developed systems for organized screening engaging
medical practitioners, laboratories and patients. The system automatically
identifies individuals at risk of developing the disease and invites them for a
screening exam or a follow-up exam conducted by medical professionals. A triage
algorithm in the system aims to reduce unnecessary screening exams for
individuals at low risk while detecting and treating individuals at high risk.
Despite the general success of screening, the triage algorithm is a
one-size-fits-all approach that is not personalized to a patient. This can
easily be observed in historical data from screening exams. Often patients rely
on personal factors to determine that they are either at high risk or not at
risk at all and take action at their own discretion. Can exploring patient
trajectories help hypothesize personal factors leading to their decisions? We
present Portinari, a data exploration tool to query and visualize future
trajectories of patients who have undergone a specific sequence of screening
exams. The web-based tool contains (a) a visual query interface, (b) a backend
graph database of events in patients' lives, and (c) trajectory visualization using
Sankey diagrams. We use Portinari to explore diverse trajectories of patients
following the Norwegian triage algorithm. The trajectories demonstrated
variable degrees of adherence to the triage algorithm and allowed
epidemiologists to hypothesize about the possible causes.
Comment: Conference paper published at ICSE 2017 Buenos Aires, at the Software
Engineering in Society Track. 10 pages, 5 figures
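The trajectory query described in the Portinari abstract above (find patients who have undergone a given sequence of exams, then look at what happened next) can be sketched as follows. This is a simplified in-memory illustration, not the tool's graph-database backend: the function name `future_links`, the `steps` parameter, and the event labels in the usage example are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def future_links(histories: Dict[str, List[str]], query: List[str], steps: int = 2) -> Counter:
    """Count (step, source, target) transitions that follow the first match of `query`.

    Each count corresponds to the width of one link in a Sankey diagram.
    """
    links: Counter = Counter()
    n = len(query)
    if n == 0:
        return links
    for events in histories.values():
        for start in range(len(events) - n + 1):
            if events[start:start + n] == query:
                # Path from the last queried event through the next `steps` events.
                path = events[start + n - 1:start + n + steps]
                for step, (src, dst) in enumerate(zip(path, path[1:])):
                    links[(step, src, dst)] += 1
                break  # count only the first match per patient
    return links


histories = {
    "p1": ["normal", "low-grade", "normal"],
    "p2": ["normal", "low-grade", "high-grade", "treatment"],
    "p3": ["normal", "normal"],
}
links = future_links(histories, ["normal", "low-grade"])
```

Here the query matches two of the three hypothetical patients, and `links` holds one transition from "low-grade" to "normal" and a two-step path through "high-grade" to "treatment".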
Interactive retrieval of video using pre-computed shot-shot similarities
A probabilistic framework for content-based interactive video retrieval is described. The developed indexing of video fragments originates from the probability of the user's positive judgment about key-frames of video shots. Initial estimates of the probabilities are obtained from low-level feature representation. Only statistically significant estimates are picked out; the rest are replaced by an appropriate constant, which allows efficient access at search time without loss of search quality and leads to improvement in most experiments. Over time, these probability estimates are updated from the relevance judgments of users performing searches, resulting in further substantial increases in mean average precision.
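The abstract above does not state how the probability estimates are updated from user judgments. One common way to combine an initial estimate with accumulated feedback is a pseudo-count (Beta-Bernoulli style) update, sketched here purely as an assumption; the function name `updated_estimate` and the `prior_weight` parameter are hypothetical and are not taken from the paper.

```python
def updated_estimate(prior: float, prior_weight: float, positives: int, negatives: int) -> float:
    """Blend a feature-based prior probability with observed relevance judgments.

    `prior_weight` is the number of pseudo-observations the prior is worth:
    a larger value makes the estimate move more slowly as feedback arrives.
    """
    if prior_weight <= 0:
        raise ValueError("prior_weight must be positive")
    total = prior_weight + positives + negatives
    return (prior * prior_weight + positives) / total
```

With no feedback the function returns the prior unchanged; as judgments accumulate, the estimate approaches the observed fraction of positive judgments.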