Bibliography Data Mining and Data Visualization
Data mining is the discovery of meaningful patterns in large data repositories, and data visualization is the graphical representation of data using shapes, colors, and images to aid understanding. The two techniques have long been used together in a number of fields to gain better insight into data. Bibliographic data is widely used in academic and scientific literature, and this project deals with data mining and data visualization of bibliographic data downloaded from the CiteSeer citation indexing system. The downloaded metadata is extracted into a database and examined for patterns. The extracted data is then queried for a search string and presented to the user through an interactive visualization in which the user can navigate through the records. The data is further color coded to indicate the importance of each extracted record.
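The pipeline described in this abstract (load metadata into a database, query it for a search string, color code the hits by importance) can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the sample records, the `papers` table layout, the use of citation count as the importance measure, and the color thresholds are all assumptions made for the example.

```python
import sqlite3

# Hypothetical sample records (title, author, year, citations).
# The project itself used metadata downloaded from CiteSeer.
RECORDS = [
    ("Visual data mining survey", "A. Author", 2002, 120),
    ("Mining citation graphs", "B. Author", 2004, 60),
    ("Compiler design notes", "C. Author", 1999, 80),
    ("Data mining for e-commerce", "D. Author", 2001, 5),
]


def build_db(records):
    """Load bibliographic records into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE papers (title TEXT, author TEXT, year INTEGER, citations INTEGER)"
    )
    conn.executemany("INSERT INTO papers VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def color_for(citations, max_citations):
    """Map an assumed importance measure (citation share) to a display color."""
    if max_citations == 0:
        return "gray"
    ratio = citations / max_citations
    if ratio >= 0.66:
        return "red"
    if ratio >= 0.33:
        return "orange"
    return "gray"


def search(conn, term):
    """Return (title, citations, color) for titles containing the search term."""
    rows = conn.execute(
        "SELECT title, citations FROM papers WHERE title LIKE ? ORDER BY citations DESC",
        (f"%{term}%",),
    ).fetchall()
    top = max((c for _, c in rows), default=0)
    return [(title, c, color_for(c, top)) for title, c in rows]


conn = build_db(RECORDS)
results = search(conn, "mining")
for title, citations, color in results:
    print(f"{color:>6}  {citations:>4}  {title}")
```

The colors are computed relative to the most cited hit in the current result set, so the same record may be colored differently under different queries. A global scale would be an equally plausible choice.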
Information Visualization and Visual Data Mining
Data visualization is the graphical display of abstract information for two purposes: sense-making (also called data analysis) and communication. Important stories live in our data, and data visualization is a powerful means to discover and understand these stories and then to present them to others. In this paper, we propose a classification of information visualization and visual data mining techniques based on the data type to be visualized, the visualization technique, and the interaction and distortion technique. We illustrate the classification with a few examples, most of them referring to techniques and systems presented in this special issue.
New data analytics and visualization methods in personal data mining, cancer data analysis and sports data visualization
In this dissertation, we discuss a reading profiling system, a biological data visualization system, and a sports visualization system. Self-tracking is becoming increasingly popular in the field of personal informatics, and reading profiling can be used as a personal data collection method. We present UUAT, an unintrusive user attention tracking system. In UUAT, we used user interaction data to develop technologies that help pinpoint a user's reading region (RR). Based on the computed RR and user interaction data, UUAT can identify a reader's reading struggle or interest. A biomarker is a measurable substance that may be used as an indicator of a particular disease. We developed CancerVis for visual and interactive analysis of cancer data and demonstrate how to apply this platform in cancer biomarker research. CancerVis provides multiple interactive views of a dataset from different perspectives. The views are synchronized so that users can easily link them to the same data entry. Furthermore, CancerVis supports data mining practice in cancer biomarker research, such as visualization of optimal cutpoints and cutthrough exploration. Tennis match summarization helps after-live sports consumers assimilate a match of interest. We developed TennisVis, a comprehensive match summarization and visualization platform. TennisVis offers charts and graphs so that a client can quickly get match facts. TennisVis also offers various queries over tennis points (such as volley shots and many-shot rallies) to satisfy the diverse preferences of tennis fans. Furthermore, TennisVis offers video clips for every single tennis point, and a recommendation rating is computed for each tennis play. A case study shows that TennisVis identifies more than 75% of the tennis points in a full-length match.
Integrating E-Commerce and Data Mining: Architecture and Challenges
We show that the e-commerce domain can provide all the right ingredients for
successful data mining and claim that it is a killer domain for data mining. We
describe an integrated architecture, based on our experience at Blue Martini
Software, for supporting this integration. The architecture can dramatically
reduce the pre-processing, cleaning, and data understanding effort often
documented to take 80% of the time in knowledge discovery projects. We
emphasize the need for data collection at the application server layer (not the
web server) in order to support logging of data and metadata that is essential
to the discovery process. We describe the data transformation bridges required
from the transaction processing systems and customer event streams (e.g.,
clickstreams) to the data warehouse. We detail the mining workbench, which
needs to provide multiple views of the data through reporting, data mining
algorithms, visualization, and OLAP. We conclude with a set of challenges.
Comment: KDD workshop: WebKDD 200
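As a rough illustration of the "transformation bridge" idea in this abstract, the sketch below turns a customer event stream logged at the application layer into one summary row per session, the kind of record a data warehouse table might hold. The `Event` fields, the event kinds, and the `to_session_facts` helper are hypothetical names chosen for the example and are not taken from the Blue Martini architecture.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One application-layer event (hypothetical schema)."""
    session_id: str
    kind: str          # assumed kinds: "view", "add_to_cart", "purchase"
    product_id: str
    amount: float = 0.0


def to_session_facts(events):
    """Aggregate a raw event stream into one summary row per session."""
    facts = defaultdict(lambda: {"views": 0, "cart_adds": 0, "revenue": 0.0})
    for e in events:
        row = facts[e.session_id]
        if e.kind == "view":
            row["views"] += 1
        elif e.kind == "add_to_cart":
            row["cart_adds"] += 1
        elif e.kind == "purchase":
            row["revenue"] += e.amount
    return dict(facts)


events = [
    Event("s1", "view", "p10"),
    Event("s1", "view", "p11"),
    Event("s1", "add_to_cart", "p11"),
    Event("s1", "purchase", "p11", 19.99),
    Event("s2", "view", "p10"),
]
facts = to_session_facts(events)
for session_id, row in facts.items():
    print(session_id, row)
```

Logging typed events such as these at the application layer, rather than reconstructing them from web server logs, is what would let the bridge see cart additions and purchases directly. That reading follows the abstract's argument but is simplified here.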
Towards a Domain-Specific Comparative Analysis of Data Mining Tools
Advances in technology have brought widespread adoption and use of data mining tools. Successful implementation of data mining requires a careful assessment of the available tools. Although several works have compared data mining tools on the basis of usability, open-source availability, integrated statistical analysis, big or small scale, and data visualization, none of them has recommended tools for specific industry sectors. This paper attempts to provide a comparative study of data mining tools based on their popularity and usage in industry sectors such as business, education, and healthcare. The factors used in the comparison are performance and scalability, data access, data preparation, data exploration and visualization, advanced modeling capabilities, programming language, operating system, interfaces, ease of use, and price/license. The following popular data mining tools are assessed: SAS Enterprise Miner, KNIME, and R for business; Moodle Learning Analytics, Blackboard Analytics, and Canvas for education; and RapidMiner, IBM Watson Health, and Tableau for healthcare. The paper also discusses the critical issues and challenges associated with the adoption of data mining tools. Furthermore, it suggests possible solutions to help industries choose the data mining tool that best covers their respective requirements.