
    Multi-Level Visual Alphabets

    A central debate in visual perception theory is the argument for indirect versus direct perception; i.e., the use of intermediate, abstract, and hierarchical representations versus direct semantic interpretation of images through interaction with the outside world. We present a content-based representation that combines both approaches. The previously developed Visual Alphabet method is extended with a hierarchy of representations, each level feeding into the next, yet based on features that are not abstract but directly relevant to the task at hand. Explorative benchmark experiments are carried out on face images to investigate and explain the impact of key parameters such as pattern size, number of prototypes, and the distance measures used. Results show that adding a middle layer improves performance by encoding the spatial co-occurrence of lower-level pattern prototypes.
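
    The abstract leaves the implementation at a high level; a minimal sketch of the idea, assuming k-means patch prototypes and a simple horizontal/vertical co-occurrence window (choices not taken from the paper), could look like this:

```python
# A minimal sketch of a two-level "visual alphabet": patch prototypes learned by
# k-means, plus a middle layer that counts spatial co-occurrences of prototype
# labels in neighbouring patches. Patch size, number of prototypes, and the
# neighbourhood scheme are illustrative choices, not the paper's settings.
import numpy as np
from sklearn.cluster import KMeans

def extract_patches(img, size=8):
    """Split a grayscale image (H x W array) into a grid of size x size patches."""
    h, w = img.shape
    rows, cols = h // size, w // size
    patches = [img[r * size:(r + 1) * size, c * size:(c + 1) * size].ravel()
               for r in range(rows) for c in range(cols)]
    return np.array(patches), (rows, cols)

def fit_alphabet(images, n_prototypes=32, size=8):
    """Level 1: learn patch prototypes (the visual alphabet) with k-means."""
    all_patches = np.vstack([extract_patches(img, size)[0] for img in images])
    return KMeans(n_clusters=n_prototypes, n_init=10).fit(all_patches)

def cooccurrence_features(img, alphabet, size=8):
    """Level 2: encode how often prototype i appears next to prototype j."""
    patches, (rows, cols) = extract_patches(img, size)
    labels = alphabet.predict(patches).reshape(rows, cols)
    k = alphabet.n_clusters
    co = np.zeros((k, k))
    for r in range(rows):
        for c in range(cols - 1):              # horizontal neighbours
            co[labels[r, c], labels[r, c + 1]] += 1
    for r in range(rows - 1):
        for c in range(cols):                  # vertical neighbours
            co[labels[r, c], labels[r + 1, c]] += 1
    return co.ravel()                          # feature vector for a higher-level classifier
```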

    Feature Extraction and Duplicate Detection for Text Mining: A Survey

    Text mining, also known as Intelligent Text Analysis, is an important research area. It is very difficult to focus on the most appropriate information due to the high dimensionality of data. Feature Extraction is one of the important techniques in data reduction to discover the most important features. Processing massive amounts of data stored in an unstructured form is a challenging task. Several pre-processing methods and algorithms are needed to extract useful features from huge amounts of data. The survey covers different text summarization, classification, and clustering methods to discover useful features, and also covers discovering query facets, which are multiple groups of words or phrases that explain and summarize the content covered by a query, thereby reducing the time taken by the user. When dealing with collections of text documents, it is also very important to filter out duplicate data. Once duplicates are deleted, it is recommended to replace the removed duplicates. Hence we also review the literature on duplicate detection and data fusion (remove and replace duplicates). The survey provides existing text mining techniques to extract relevant features, detect duplicates, and replace the duplicate data, delivering fine-grained knowledge to the user.
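
    As an illustration of the duplicate-detection step discussed above, a minimal sketch based on word shingles and Jaccard similarity follows; the shingle length, similarity threshold, and toy corpus are illustrative assumptions rather than values from the surveyed methods.

```python
# A minimal sketch of near-duplicate detection via word shingles and Jaccard
# similarity; the 3-word shingle size and the 0.8 threshold are illustrative
# assumptions, not values taken from the surveyed methods.
def shingles(text, k=3):
    """Return the set of k-word shingles of a document."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def find_duplicates(docs, threshold=0.8):
    """Return index pairs of documents whose shingle sets overlap strongly."""
    sets = [shingles(d) for d in docs]
    return [(i, j)
            for i in range(len(docs))
            for j in range(i + 1, len(docs))
            if jaccard(sets[i], sets[j]) >= threshold]

if __name__ == "__main__":
    corpus = ["the quick brown fox jumps over the lazy dog",
              "the quick brown fox jumps over the lazy dog today",
              "an entirely different sentence about text mining"]
    print(find_duplicates(corpus))   # -> [(0, 1)] with this toy corpus
```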

    SeLINA: a Self-Learning Insightful Network Analyzer

    Understanding the behavior of a network from a large scale traffic dataset is a challenging problem. Big data frameworks offer scalable algorithms to extract information from raw data, but often require sophisticated fine-tuning and detailed knowledge of machine learning algorithms. To streamline this process, we propose SeLINA (Self-Learning Insightful Network Analyzer), a generic, self-tuning, simple tool to extract knowledge from network traffic measurements. SeLINA adds self-learning capabilities and automatic parameter selection to state-of-the-art scalable data analytics techniques, relieving the network expert of parameter tuning. We combine unsupervised and supervised approaches to mine data in a scalable way. SeLINA embeds mechanisms to check whether new data fits the model, to detect possible changes in the traffic, and to trigger model rebuilding, possibly automatically. The result is a system that offers human-readable models of the data with minimal user intervention, supporting domain experts in extracting actionable knowledge and highlighting possibly meaningful interpretations. SeLINA's current implementation runs on Apache Spark. We tested it on large collections of real-world passive network measurements from a nationwide ISP, investigating YouTube and P2P traffic. The experimental results confirm the ability of SeLINA to provide insights and detect changes in the data that suggest further analyses.
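
    A minimal single-node sketch of the pipeline described above, assuming scikit-learn as a stand-in for the Apache Spark implementation and illustrative features and thresholds:

```python
# A minimal single-node sketch of the self-learning idea described above:
# cluster traffic records without labels, train an interpretable classifier on
# the cluster labels, and flag model rebuilding when new data no longer fits
# the clusters. Features, thresholds, and the scikit-learn stand-in for the
# Apache Spark implementation are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

def build_model(X, n_clusters=4, quantile=0.95):
    """Unsupervised step: summarize traffic into clusters, then a readable tree."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(X)
    tree = DecisionTreeClassifier(max_depth=3).fit(X, km.labels_)
    threshold = np.quantile(km.transform(X).min(axis=1), quantile)
    return km, tree, threshold

def needs_rebuild(km, threshold, X_new, max_outlier_fraction=0.2):
    """Self-checking step: rebuild if too many new points lie far from every centroid."""
    new_dist = km.transform(X_new).min(axis=1)
    return np.mean(new_dist > threshold) > max_outlier_fraction

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))              # stand-in for traffic feature vectors
    km, tree, thr = build_model(X)
    print(export_text(tree))                   # human-readable model of the clusters
    print(needs_rebuild(km, thr, X + 3.0))     # simulated traffic change -> True
```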

    "CHARACTERIZATION OF SLAUGHTERED AND NON-SLAUGHTERED GOAT MEAT AT LOW FREQUENCIES"

    The electrical stimulation of meat has shown high potential for use in the quality control of meat tissues over the past two decades. Dielectric spectroscopy is the technique most commonly used to measure the electrical properties of tissues. Open-ended coaxial cables or two parallel plates integrated with a network analyzer, impedance analyzer, or LCZ meter have been used to measure the dielectric properties of meat for different purposes. The purpose of this research is to construct a capacitive device capable of differentiating slaughtered and non-slaughtered goat meats by determining the dielectric properties of goat meat at various frequencies and storage times. The detector cell has two circular platinum plates assembled on a micrometer barrel encased within a Perspex box to form the capacitor. The test rig was validated to ensure it works correctly. Two goats were slaughtered in the same environment. One of the goats was slaughtered properly (Islamic method) and the second one was killed by garrote. The measurements were done on the hindlimb muscles. The samples were 2 cm in diameter and 5 mm thick. The slaughtered and non-slaughtered meat samples were separately placed between the capacitor plates. The capacitance and dissipation factor were measured across the capacitor device, which was connected to an LCR meter. The experiment was repeated at various frequencies (from 100 Hz to 2 kHz) and at different storage times (from 1 day to 10 days after slaughtering). The Maxwell Garnett mixing rule was applied to obtain the theoretical value of the effective permittivity by using goat muscle and blood permittivity. The results show that the device is able to differentiate slaughtered and non-slaughtered goat meat. At all applied frequencies, the relative permittivity of the non-slaughtered meat was clearly higher than that of the slaughtered meat, which agrees with the simulation results. The dissipation factor of the non-slaughtered meat was less than that of the slaughtered meat.
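
    The Maxwell Garnett mixing rule cited above has a standard closed form; the sketch below computes it with placeholder complex permittivities and blood volume fractions, not the measured goat-tissue values.

```python
# A sketch of the Maxwell Garnett mixing rule mentioned above, giving the
# effective permittivity of inclusions with volume fraction f in a host medium.
# The complex permittivities and volume fractions below are placeholders, not
# the measured goat muscle and blood values.
def maxwell_garnett(eps_host, eps_incl, f):
    """Effective complex permittivity for spherical inclusions in a host."""
    return eps_host + 3 * f * eps_host * (eps_incl - eps_host) / (
        eps_incl + 2 * eps_host - f * (eps_incl - eps_host))

if __name__ == "__main__":
    eps_muscle = 8000 - 5000j          # placeholder low-frequency permittivity of muscle
    eps_blood = 5000 - 3000j           # placeholder low-frequency permittivity of blood
    for frac in (0.05, 0.10, 0.20):    # assumed blood volume fractions
        print(frac, maxwell_garnett(eps_muscle, eps_blood, frac))
```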

    A survey of kernel and spectral methods for clustering

    Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem, with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM, and neural gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. Since it has been shown that these two seemingly different approaches share the same mathematical foundation, an explicit proof that the two paradigms optimize the same objective is reported. In addition, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm.
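
    As a concrete illustration of the spectral side of the survey, a minimal normalized spectral clustering routine (RBF affinity, symmetric normalized Laplacian, k-means on the leading eigenvectors) might look like the following; the kernel width and toy data are illustrative choices.

```python
# A minimal sketch of normalized spectral clustering: build an RBF affinity
# graph, take the leading eigenvectors of the symmetric normalized Laplacian,
# and run k-means in that spectral embedding. The kernel width gamma and the
# toy data are illustrative choices.
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(X, n_clusters=2, gamma=1.0):
    # Affinity (kernel) matrix: W_ij = exp(-gamma * ||x_i - x_j||^2), zero diagonal
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-gamma * sq_dists)
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # Spectral embedding: eigenvectors for the smallest eigenvalues, rows renormalized
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :n_clusters]
    U /= np.linalg.norm(U, axis=1, keepdims=True)

    # Cluster the embedded points with ordinary k-means
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(U)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    blob = rng.normal(scale=0.3, size=(50, 2))                 # compact inner cluster
    angles = rng.uniform(0.0, 2.0 * np.pi, 50)
    ring = np.c_[np.cos(angles), np.sin(angles)] * 3.0 \
         + rng.normal(scale=0.1, size=(50, 2))                 # surrounding ring
    print(spectral_clustering(np.vstack([blob, ring]), n_clusters=2, gamma=2.0))
```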

    CAS-MINE: Providing personalized services in context-aware applications by means of generalized rules

    Context-aware systems acquire and exploit information on the user context to tailor services to a particular user, place, time, and/or event. Hence, they allow service providers to adapt their services to actual user needs by offering personalized services depending on the current user context. Service providers are usually interested in profiling users both to increase client satisfaction and to broaden the set of offered services. Novel and efficient techniques are needed to tailor service supply to the user (or the user category) and to the situation in which he/she is involved. This paper presents the CAS-Mine framework to efficiently discover relevant relationships between user context data and currently requested services, for both user and service profiling. CAS-Mine efficiently extracts generalized association rules, which provide a high-level abstraction of both user habits and service characteristics depending on the context. A lazy (analyst-provided) taxonomy evaluation performed on different attributes (e.g., a geographic hierarchy on spatial coordinates, a classification of provided services) drives the rule generalization process. Extracted rules are classified into groups according to their semantic meaning and ranked by means of quality indices, thus allowing a domain expert to focus on the most relevant patterns. Experiments performed on three context-aware datasets, obtained by logging user requests and context information for three real applications, show the effectiveness and efficiency of the CAS-Mine framework in mining different valuable types of correlations between user habits, context information, and provided services.
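
    A minimal sketch of generalized association rule mining in this setting, assuming a toy analyst-provided taxonomy, toy request logs, and illustrative support/confidence thresholds (none of which are CAS-Mine's actual settings):

```python
# A minimal sketch of mining generalized association rules from context/service
# logs: raw items are lifted to higher taxonomy levels before counting support
# and confidence of pairwise rules. The taxonomy, thresholds, and log entries
# are illustrative assumptions, not CAS-Mine's actual data or parameters.
from collections import Counter
from itertools import combinations

# Analyst-provided taxonomy: specific item -> generalized item (hypothetical)
TAXONOMY = {
    "lat=45.07,lon=7.69": "city=Turin",
    "lat=45.06,lon=7.66": "city=Turin",
    "service=bus_line_58": "service=public_transport",
    "service=bus_line_10": "service=public_transport",
}

def generalize(transaction):
    """Lift each item of a request log entry to its taxonomy ancestor, if any."""
    return frozenset(TAXONOMY.get(item, item) for item in transaction)

def mine_rules(logs, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for generalized pairs."""
    txns = [generalize(t) for t in logs]
    n = len(txns)
    item_count = Counter(i for t in txns for i in t)
    pair_count = Counter(p for t in txns for p in combinations(sorted(t), 2))
    rules = []
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue
        for ante, cons in ((a, b), (b, a)):
            confidence = c / item_count[ante]
            if confidence >= min_confidence:
                rules.append((ante, cons, support, confidence))
    return rules

logs = [
    {"lat=45.07,lon=7.69", "time=morning", "service=bus_line_58"},
    {"lat=45.06,lon=7.66", "time=morning", "service=bus_line_10"},
    {"lat=45.07,lon=7.69", "time=evening", "service=weather_forecast"},
]
print(mine_rules(logs))
```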

    Case-Based-Reasoning System for Feature Selection and Diagnosing Disease; Case Study: Asthma

    Asthma is a chronic inflammatory disease of the respiratory tract, and the reason behind the reported increase in asthma prevalence has not yet become clear. The purpose of the present research is to design a case-based reasoning (CBR) model to assist a physician in diagnosing the type of disease and the needed therapy. To design this system, the disease variables were first identified and put at the patients' disposal as a questionnaire; after gathering the relevant data, the CBR algorithm was run on the data, which led to the asthma diagnosis. The system was tested on 325 asthmatic and non-asthmatic adult cases and was assessed at eighty percent accuracy. The results were promising. Since the factors of the disease differ across countries, this study was performed to determine risk factors for asthma in Iranian society, and the results showed that the most important variables of asthma in Iran are hyperresponsivity symptoms, frequency of cough, and cough. Key words: data mining, case based reasoning, asthma, diagnosis
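
    A minimal sketch of the retrieval-and-reuse step of such a CBR diagnosis aid, with hypothetical questionnaire features and a toy case base rather than the study's actual variables:

```python
# A minimal sketch of the retrieval step of a case-based reasoning diagnosis aid:
# a new patient questionnaire is compared against stored cases and the majority
# label of the k most similar cases is proposed. The feature names and example
# cases are hypothetical, not the questionnaire variables used in the study.
from collections import Counter

CASE_BASE = [
    # (feature vector: [hyperresponsivity, cough_frequency, wheeze], diagnosis)
    ([1, 3, 1], "asthma"),
    ([1, 2, 1], "asthma"),
    ([0, 1, 0], "non-asthma"),
    ([0, 0, 0], "non-asthma"),
]

def similarity(a, b):
    """Simple inverse-distance similarity between two questionnaire vectors."""
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(a, b)))

def diagnose(new_case, k=3):
    """Retrieve the k most similar stored cases and reuse their majority diagnosis."""
    ranked = sorted(CASE_BASE, key=lambda c: similarity(new_case, c[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

print(diagnose([1, 3, 0]))   # -> "asthma" with this toy case base
```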