12 research outputs found

    Noninvasive methods for children's cholesterol level determination

    Get PDF
    Today there is controversy about the role of cholesterol in infants and about the measurement and management of blood cholesterol in children. Several lines of scientific evidence support a relationship between elevated blood cholesterol in childhood, high cholesterol in adulthood, and the development of adult arteriosclerotic diseases such as cardiovascular and cerebrovascular disease. Controlling the level of blood cholesterol in children is therefore important for the health of the whole population. Non-invasive methods are much more convenient for children, who are often anxious about blood examinations. In this paper we present a new attempt to find non-invasive methods for determining the level of blood cholesterol in children with the use of an intelligent system.
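The abstract does not describe the intelligent system itself, so the following is only a minimal sketch of the general idea: fitting a model that maps noninvasive measurements to a cholesterol estimate. The feature names and all data below are invented for illustration and are not from the paper.

```python
import numpy as np

# Hypothetical illustration: learn a mapping from noninvasive
# measurements (here: age, body-mass index, a skinfold thickness) to a
# cholesterol estimate.  Features and data are synthetic stand-ins.
rng = np.random.default_rng(0)

n = 200
age = rng.uniform(5, 15, n)        # years
bmi = rng.uniform(14, 25, n)       # kg/m^2
skinfold = rng.uniform(5, 30, n)   # mm

# Synthetic target: cholesterol as a noisy linear mix of the features.
chol = 120 + 2.0 * age + 1.5 * bmi + 0.8 * skinfold + rng.normal(0, 5, n)

# Ordinary least squares as the simplest possible "intelligent system".
X = np.column_stack([np.ones(n), age, bmi, skinfold])
coef, *_ = np.linalg.lstsq(X, chol, rcond=None)
pred = X @ coef

rmse = np.sqrt(np.mean((pred - chol) ** 2))
print(f"RMSE on training data: {rmse:.1f} mg/dL")
```

A real system would of course be trained and validated against laboratory blood measurements; this only shows the regression framing.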

    Surface EMG decomposition using a novel approach for blind source separation

    Get PDF
    We introduce a new method to perform blind deconvolution of the surface electromyogram (EMG) signals generated by isometric muscle contractions. The method extracts information from the raw EMG signals detected only on the skin surface, enabling long-term noninvasive monitoring of electromuscular properties. Preliminary results show that surface EMG signals can be used to determine the number of active motor units, the motor unit firing rate, and the shape of the average action potential of each motor unit.

    A Clustering Comparison Measure Using Density Profiles and its Application to the Discovery of Alternate Clusterings

    Get PDF
    Data clustering is a fundamental and very popular method of data analysis. Its subjective nature, however, means that different clustering algorithms or different parameter settings can produce widely varying and sometimes conflicting results. This has led to the use of clustering comparison measures to quantify the degree of similarity between alternative clusterings. Existing measures, though, can be limited in their ability to assess similarity and sometimes generate unintuitive results. They also cannot be applied to compare clusterings which contain different data points, an activity that is important for scenarios such as data stream analysis. In this paper, we introduce a new clustering similarity measure, known as ADCO, which aims to address some limitations of existing measures by allowing greater flexibility of comparison via the use of density profiles to characterize a clustering. In particular, it adopts a ‘data mining style’ philosophy to clustering comparison, whereby two clusterings are considered to be more similar if they are likely to give rise to similar types of prediction models. Furthermore, we show that this new measure can be applied as a highly effective objective function within a new algorithm, known as MAXIMUS, for generating alternate clusterings.
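The exact definition of ADCO is given in the paper; as a rough illustration of the density-profile idea only, the sketch below builds per-cluster histograms over attribute bins and scores two clusterings by the best-matching dot product of their profiles. The binning scheme, the normalisation, and the assumption that both clusterings have the same number of clusters are simplifications of this sketch, not the paper's definition.

```python
import numpy as np
from itertools import permutations

def density_profile(X, labels, n_bins=3):
    """Per-cluster histogram over each attribute's bins (the 'density
    profile' idea); rows are clusters, columns are attribute-bin pairs."""
    n_clusters = labels.max() + 1
    profiles = []
    for a in range(X.shape[1]):
        edges = np.linspace(X[:, a].min(), X[:, a].max(), n_bins + 1)
        bins = np.clip(np.digitize(X[:, a], edges[1:-1]), 0, n_bins - 1)
        counts = np.zeros((n_clusters, n_bins))
        for b, c in zip(bins, labels):
            counts[c, b] += 1
        profiles.append(counts)
    return np.hstack(profiles)  # shape: (n_clusters, n_attrs * n_bins)

def adco_similarity(X, labels1, labels2, n_bins=3):
    """Simplified ADCO-style similarity: best cluster matching of the
    dot products between the two clusterings' density profiles,
    normalised by the larger self-similarity."""
    P = density_profile(X, labels1, n_bins)
    Q = density_profile(X, labels2, n_bins)
    k = P.shape[0]  # assumes both clusterings have k clusters
    best = max(sum(P[i] @ Q[p[i]] for i in range(k))
               for p in permutations(range(k)))
    norm = max(np.sum(P * P), np.sum(Q * Q))
    return best / norm

X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
a = np.array([0, 0, 1, 1])
b = np.array([1, 1, 0, 0])          # same partition, labels swapped
print(adco_similarity(X, a, b))     # → 1.0
```

Because the comparison is done through the profiles rather than point-by-point label agreement, identical partitions score 1.0 regardless of how the cluster labels are numbered.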

    Report on a visit to the Information Department of the University Hospital in Tokyo

    Get PDF

    Combining classification algorithms

    Get PDF
    Doctoral dissertation in Computer Science presented to the Faculdade de Ciências da Universidade do Porto. The ability of a learning algorithm to induce a good generalization for a given problem depends on the representation language used to generalize the examples. Since different algorithms use different representation languages and search strategies, different spaces are explored and different results are obtained. Finding the most suitable representation for the problem at hand is a very active research area. In this dissertation, instead of seeking methods that fit the data using a single representation language, we present a family of algorithms, under the generic designation of Cascade Generalization, in which the search space contains models that use different representation languages. The basic idea of the method is to use the learning algorithms in sequence. At each iteration a two-step process takes place. In the first step, a classifier builds a model. In the second step, the attribute space is extended by inserting new attributes generated using this model. This attribute-construction process builds attributes in the representation language of the classifier used to build the model. If, later in the sequence, a classifier uses one of these new attributes to build its model, its representational power has been extended. In this way, the restrictions of the representation language of the classifiers used at the higher levels of the sequence are relaxed by incorporating terms from the representation language of the base classifiers. This is the basic methodology underlying the Ltree system and the Cascade Generalization architecture. The method is presented from two perspectives.
In the first part, it is presented as a strategy for building multivariate decision trees. The Ltree system, which uses a linear discriminant as its attribute-construction operator, is presented. ..
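Cascade Generalization trains classifiers in sequence, with each level's predictions appended as new attributes for the next level. The sketch below illustrates that two-step loop for a single cascade level on synthetic data; note that the dissertation's Ltree uses a linear discriminant as its attribute constructor, whereas this sketch substitutes a naive-Bayes level-0 learner and a nearest-centroid level-1 learner for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two Gaussian classes in 2-D (a synthetic stand-in for a real dataset).
X0 = rng.normal([0, 0], 1.0, (100, 2))
X1 = rng.normal([3, 3], 1.0, (100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

def gaussian_nb_proba(Xtr, ytr, Xte):
    """Level-0 learner: naive-Bayes class probabilities, which become
    the new constructed attributes."""
    logps = []
    for c in (0, 1):
        mu = Xtr[ytr == c].mean(axis=0)
        var = Xtr[ytr == c].var(axis=0) + 1e-9
        logps.append(-0.5 * np.sum((Xte - mu) ** 2 / var
                                   + np.log(2 * np.pi * var), axis=1))
    logp = np.stack(logps, axis=1)
    logp -= logp.max(axis=1, keepdims=True)
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)

# Step 1: the level-0 model produces new attributes (its probabilities).
extra = gaussian_nb_proba(X, y, X)
X_ext = np.hstack([X, extra])       # extended representation

# Step 2: the level-1 learner (here a nearest-centroid rule) is trained
# on the extended space, so its hypothesis language now contains terms
# expressed in the level-0 learner's representation language.
centroids = np.stack([X_ext[y == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(((X_ext[:, None, :] - centroids) ** 2).sum(axis=2), axis=1)

accuracy = (pred == y).mean()
print(f"training accuracy of the cascade: {accuracy:.2f}")
```

Repeating the two steps with further classifiers yields the full cascade described in the abstract.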

    New rule induction algorithms with improved noise tolerance and scalability

    Get PDF
    As data storage capacities continue to increase due to rapid advances in information technology, there is a growing need for scalable data mining algorithms able to sift through large volumes of data in a short amount of time. Moreover, real-world data, as opposed to artificially prepared data, is inherently imperfect due to the presence of noise. Consequently, there is also a need for robust algorithms capable of handling noise, so that the discovered patterns are reliable and have good predictive performance on future data. This has led to ongoing research in the field of data mining intended to develop algorithms that are both scalable and robust. The most straightforward approach to the issue of scalability is to develop efficient algorithms that can process large datasets in a relatively short time. Efficiency may be achieved by employing suitable rule mining constraints that can drastically cut down the search space. The first part of this thesis focuses on the improvement of a state-of-the-art rule induction algorithm, RULES-6, which incorporates certain search space pruning constraints in order to scale to large datasets. However, these constraints are insufficient and have not been exploited to the full, resulting in the generation of overly specific rules, which increases not only the learning time but also the length of the rule set. To address these issues, a new algorithm, RULES-7, is proposed, which uses deep rule mining constraints from association learning. This results in a significant drop in execution time for large datasets while boosting the classification accuracy of the model on future data. A novel comparison heuristic is also proposed for the algorithm, which improves classification accuracy while maintaining the execution time.
Since an overwhelming majority of induction algorithms are unable to handle the continuous data that is ubiquitous in the real world, it is also necessary to develop an efficient discretisation procedure whereby continuous attributes can be treated as discrete. By generalizing the raw continuous data, discretisation helps to speed up the induction process and results in a simpler and more intelligible model that is also more accurate on future data. Many preprocessing discretisation techniques have been proposed to date, of which the entropy-based technique has by far been accepted as the most accurate. However, that technique is suboptimal for classification because it fails to identify the cut points within the value range of each class for a continuous attribute, which deteriorates classification accuracy. The second part of this thesis presents a new discretisation technique which applies the entropy-based principle but takes a class-centered approach to discretisation. The proposed technique not only increases the efficiency of rule induction but also improves the classification accuracy of the induced model. Another issue with existing induction algorithms relates to the way covered examples are dealt with when a new rule is formed. To avoid problems such as fragmentation and small disjuncts, the RULES family of algorithms marks the examples instead of removing them. This tends to increase overlapping between rules. The third part of this thesis proposes a new hybrid pruning technique that addresses the overlapping issue so as to reduce the rule set size. It also proposes an incremental post-pruning technique designed specifically to handle noisy data. This leads to improved induction performance as well as better classification accuracy.
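The thesis's class-centered discretisation is not specified in the abstract, but the classical entropy-based principle it builds on is standard: choose cut points that minimise the class-weighted entropy of the resulting intervals. The sketch below shows that core step for a single binary cut; the data is invented for illustration.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (bits) of a class-label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_cut(values, labels):
    """Single best binary cut point for one continuous attribute,
    chosen to minimise the class-weighted entropy of the two sides --
    the core step of entropy-based discretisation."""
    order = np.argsort(values)
    v, y = values[order], labels[order]
    best_e, best_c = np.inf, None
    for i in range(1, len(v)):
        if v[i] == v[i - 1]:
            continue                 # no cut between equal values
        cut = (v[i] + v[i - 1]) / 2  # midpoint candidate
        w = i / len(v)
        e = w * entropy(y[:i]) + (1 - w) * entropy(y[i:])
        if e < best_e:
            best_e, best_c = e, cut
    return best_c, best_e

values = np.array([1.0, 1.2, 1.5, 4.0, 4.2, 4.5])
labels = np.array([0, 0, 0, 1, 1, 1])
cut, e = best_cut(values, labels)
print(f"best cut at {cut}")          # → best cut at 2.75
```

Full entropy-based discretisation applies this step recursively to each interval, with a stopping criterion; a class-centered variant such as the one the thesis proposes would choose its candidate cuts differently.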

    Dynamic Discretization of Continuous Attributes

    No full text