3,624 research outputs found

    Brain electrical activity discriminant analysis using Reproducing Kernel Hilbert spaces

    A deep and adequate understanding of human brain function has been an objective for interdisciplinary teams of scientists. Different acquisition technologies make it possible to capture data related to brain activity. The most commonly used strategies rely on the brain's electrical activity, where neuronal interactions are reflected on the scalp and recorded via electrode arrays as time series. Processing this type of brain electrical activity (BEA) data poses challenges that must be addressed carefully because of its intrinsic properties. BEA is known to exhibit nonstationary behavior and a high degree of variability depending on the stimuli or responses being studied.

    Clustering Customer Shopping Trips With Network Structure

    Moving objects can be tracked with sensors such as RFID tags or GPS devices. Their movement can be represented as sequences of time-stamped locations. Studying such spatio-temporal movement sequences to discover spatial sequential patterns holds promise in many real-world settings. A few interesting applications are customer shopping traverse pattern discovery, vehicle traveling pattern discovery, and route prediction. Traditional spatial data mining algorithms suitable for the Euclidean space are not directly applicable in these settings. We propose a new algorithm to cluster movement paths such as shopping trips for pattern discovery. In our work, we represent the spatio-temporal series as sequences of discrete locations following a pre-defined network. We incorporate a modified version of the Longest Common Subsequence (LCS) algorithm with the network structure to measure the similarity of movement paths. With such spatial networks we implicitly address the existence of spatial obstructions as well. Experiments were performed on both hand-collected real-life trips and simulated trips in grocery shopping. The initial evaluation results show that our proposed approach, called Net-LCSS, can be used to support effective and efficient clustering for shopping trip pattern discovery.
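
As a rough illustration of the similarity measure mentioned in this abstract, the sketch below computes a plain Longest Common Subsequence score between two trips represented as sequences of discrete locations. It is not the Net-LCSS method itself: the abstract does not say how the LCS algorithm is modified or how the network structure enters, so that part is left out. The function names, the normalisation by the longer path, and the location labels are all assumptions made for the example.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two location sequences."""
    # prev[j] is the LCS length of the already-processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching location: extend the best subsequence ending before both.
                curr.append(prev[j - 1] + 1)
            else:
                # No match: carry over the better of skipping x or skipping y.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def path_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """LCS length divided by the longer path length (assumed normalisation)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


# Hypothetical shopping trips as ordered visits to store zones.
trip_1 = ["entrance", "produce", "dairy", "bakery", "checkout"]
trip_2 = ["entrance", "dairy", "snacks", "bakery", "checkout"]
print(path_similarity(trip_1, trip_2))  # 0.8: four of five visits agree in order
```

A pairwise similarity of this kind can be fed into any clustering method that accepts a similarity or distance matrix. Which clustering method the authors use is not stated in the abstract.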

    Slicer

    Explorative data visualization is a widespread tool for gaining insights from datasets. Investigating data in linked visualizations lets users explore potential relationships in their data at will. Furthermore, this type of analysis does not require any technical knowledge, widening the userbase from developers to anyone. Implementing explorative data visualizations in web browsers makes data analysis accessible to anyone with a PC. In addition to accessibility, the available types of visualizations and their interactive latency are essential for the utility of data exploration. Available visualizations limit the number of datasets eligible for use in the application, and latency limits how much exploring the users are willing to do. Existing solutions often do all the computation involved either in the client application or on a backend server. However, using the client limits performance and data size, since hardware resources in web browsers are scarce and sending large datasets over a network is not feasible. Server-based computation, in turn, often comes with high requirements for server hardware and is limited by network latency and bandwidth on each interaction. This thesis presents Slicer, a framework for creating explorative data visualizations in web browsers. Applications can be created with minimal developer effort, requiring only a description of the visualizations. Slicer implements bar charts and choropleth maps. The visualizations are linked and can be filtered either by brushing or by clicking on single targets. To overcome the hurdles of pure client- and server-reliant solutions, Slicer uses a hybrid approach, where prioritized interactions are handled client-side. Recognizing that different types of interactions have different latency thresholds, we trade the cost of switching views for low latency on filtering.
To achieve real-time filtering performance, we follow the principle that the chosen resolution of the visualizations, not data size, should limit interactive scalability. We describe the use of data tiles accommodating more interactions than shown in earlier work, using an approach based on delta differencing, which ensures constant time complexity when filtering. For computing data tiles, we present techniques for efficient computation on consumer hardware. Our results show that Slicer can offer real-time interactivity on latency-sensitive interactions regardless of data size, averaging above 150Hz on a consumer laptop. For less sensitive interactions, acceptable latency is shown for datasets with tens of millions of records, depending on the resolution of the visualizations.
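
One possible reading of the data-tile idea in this abstract is sketched below. This is an assumption for illustration, not Slicer's actual implementation. Counts are precomputed per (active bin, passive bin) pair and stored cumulatively along the active dimension, so a brush over the active view is answered by differencing two precomputed rows. The cost then depends on the number of bins in the passive view (its resolution) and not on the number of records. All names and the toy data are hypothetical.

```python
from typing import List, Sequence


def build_cumulative_tile(
    active_bins: Sequence[int],
    passive_bins: Sequence[int],
    n_active: int,
    n_passive: int,
) -> List[List[int]]:
    """Build a tile where row i holds passive-view counts for active bins < i."""
    cum = [[0] * n_passive for _ in range(n_active + 1)]
    # First pass over the records: per-bin counts, shifted down by one row.
    for a, p in zip(active_bins, passive_bins):
        cum[a + 1][p] += 1
    # Second pass over the tile: turn per-bin counts into running totals.
    for i in range(1, n_active + 1):
        for j in range(n_passive):
            cum[i][j] += cum[i - 1][j]
    return cum


def filter_by_brush(cum: List[List[int]], lo: int, hi: int) -> List[int]:
    """Passive-view counts for records whose active bin lies in [lo, hi)."""
    # Differencing two rows never touches the raw records.
    return [upper - lower for upper, lower in zip(cum[hi], cum[lo])]


# Toy data: each record has a bin index in the brushed (active) view
# and a bin index in the linked (passive) view.
active = [0, 0, 1, 2, 2, 2, 3]
passive = [0, 1, 1, 0, 1, 1, 0]
tile = build_cumulative_tile(active, passive, n_active=4, n_passive=2)

print(filter_by_brush(tile, 1, 3))  # [1, 3]
print(filter_by_brush(tile, 0, 4))  # [3, 4]
```

The tile is built once per pair of views, which fits the trade-off the abstract describes: switching views pays the precomputation cost, while each brush update stays cheap.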

    Differentiable Pattern Set Mining


    Feature Extraction and Duplicate Detection for Text Mining: A Survey

    Text mining, also known as Intelligent Text Analysis, is an important research area. It is very difficult to focus on the most appropriate information due to the high dimensionality of data. Feature extraction is one of the important techniques in data reduction to discover the most important features. Processing massive amounts of data stored in an unstructured form is a challenging task. Several pre-processing methods and algorithms are needed to extract useful features from huge amounts of data. The survey covers different text summarization, classification, and clustering methods to discover useful features, as well as the discovery of query facets, which are multiple groups of words or phrases that explain and summarize the content covered by a query, thereby reducing the time taken by the user. When dealing with a collection of text documents, it is also very important to filter out duplicate data. Once duplicates are deleted, it is recommended to replace the removed duplicates. Hence we also review the literature on duplicate detection and data fusion (removing and replacing duplicates). The survey presents existing text mining techniques to extract relevant features, detect duplicates, and replace the duplicate data in order to deliver fine-grained knowledge to the user.

    Mining frequent sequential patterns in data streams using SSM-algorithm.

    Frequent sequential mining is the process of discovering frequent sequential patterns in data sequences, as found in applications like web log access sequences. In data stream applications, data arrive at high speed in a continuous flow. Data stream mining is an online process different from traditional mining. Traditional mining algorithms work on an entire static dataset in order to obtain results, while data stream mining algorithms work with continuously arriving data streams. With rapid change in technology, there are many applications that take data as continuous streams. Examples include stock tickers, network traffic measurements, click stream data, data feeds from sensor networks, and telecom call records. Mining frequent sequential patterns in data stream applications must contend with many challenges, such as limited memory for unlimited data, the inability of algorithms to scan an infinitely flowing original dataset more than once, and the need to deliver current and accurate results on demand. This thesis proposes SSM-Algorithm (sequential stream mining-algorithm), which delivers frequent sequential patterns in data streams. The concept of this work came from the FP-Stream algorithm, which delivers time-sensitive frequent patterns. The proposed SSM-Algorithm outperforms the FP-Stream algorithm by the use of a hash-based and two efficient tree-based data structures. All incoming streams are handled dynamically to improve memory usage. SSM-Algorithm maintains frequent sequences incrementally and delivers the most current result on demand. The introduced algorithm can be deployed to analyze e-commerce data where the primary source of the data is click stream data. (Abstract shortened by UMI.) Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .M668. Source: Masters Abstracts International, Volume: 44-03, page: 1409. Thesis (M.Sc.)--University of Windsor (Canada), 2005.

    Knowledge Extraction in Video Through the Interaction Analysis of Activities

    Video is a massive amount of data that contains complex interactions between moving objects. The extraction of knowledge from this type of information creates a demand for video analytics systems that uncover statistical relationships between activities and learn the correspondence between content and labels. However, those are open research problems that have high complexity when multiple actors simultaneously perform activities, videos contain noise, and streaming scenarios are considered. The techniques introduced in this dissertation provide a basis for analyzing video. The primary contributions of this research consist of providing new algorithms for the efficient search of activities in video, scene understanding based on interactions between activities, and the prediction of labels for new scenes.