715 research outputs found

    Practical tools for exploring data and models

    A Novel Scheme for Intelligent Recognition of Pornographic Images

    Harmful content on the internet is increasing day by day, which motivates further research into fast and reliable filtering of obscene and immoral material. Pornographic image recognition is an important component of any such filtering system. In this paper, a new approach for detecting pornographic images is introduced. Two new features are suggested which, in combination with other simple traditional features, discriminate well between pornographic and non-pornographic images. In addition, we applied fuzzy-integral-based information fusion to combine the MLP (Multi-Layer Perceptron) and NF (Neuro-Fuzzy) outputs. To test the proposed method, the system was evaluated on 18354 images downloaded from the internet. It attained 93% TP (true positives) and 8% FP (false positives) on the training dataset, and 87% and 5.5% respectively on the test dataset. These results verify the performance of the proposed system against other related works.

    Visualization of Neural Networks

    This bachelor thesis deals with the graphical visualization of high-dimensional data in a low-dimensional space. It focuses on the implementation of at least three visualization methods of the Self-Organizing Map type. The individual methods will be implemented as a DLL library that allows a host application to access them. 460 - Department of Computer Science. Grade: very good.
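    As a rough illustration of what a Self-Organizing Map does when it projects high-dimensional data onto a low-dimensional grid, the following is a minimal NumPy sketch. It is not the thesis's DLL implementation; the function names `train_som` and `project`, the grid size, and the linear decay schedules are all assumptions chosen for brevity.

```python
import numpy as np

def train_som(data, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM (hypothetical helper, online updates)."""
    rng = np.random.default_rng(seed)
    n, dim = data.shape
    weights = rng.random((rows, cols, dim))
    # Grid coordinates of every unit, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    total = epochs * n
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(n)]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)    # neighbourhood radius, floored at 0.5
            # Best-matching unit: the unit whose weight vector is closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighbourhood measured on the 2-D grid, not in data space.
            grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def project(data, weights):
    """Map each sample to the (row, col) of its best-matching unit."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    idx = np.argmin(d, axis=1)
    return np.stack(np.unravel_index(idx, weights.shape[:2]), axis=1)
```

    Calling `project(data, train_som(data))` yields one 2-D grid coordinate per sample, which is the low-dimensional view that SOM-type visualizations draw.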

    Visualising software in cyberspace

    The problems of maintaining software systems are well documented. The increasing size and complexity of modern software serves only to worsen matters. Software maintainers are typically confronted with very large and very complex software systems, of which they may have little or no prior knowledge. At this stage they will normally have some maintenance task to perform, though possibly little indication of where or how to start. They need to investigate and understand the software to some extent in order to begin maintenance. This understanding process is termed program comprehension. There are various theories on program comprehension, many of which put emphasis on the construction of a mental model of the software within the mind of the maintainer. These same theories hypothesise a number of techniques employed by the maintainer for the creation and revision of this mental model. Software visualisation attempts to provide tool support for generating, supplementing and verifying the maintainer’s mental model. The majority of software visualisations to date have concentrated on producing two-dimensional representations and animations of various aspects of a software system. Very little work has been performed previously regarding the issues involved in visualising software within a virtual reality environment. This research represents a significant first step into this exciting field and offers insight into the problems posed by this new medium. This thesis provides an identification of the possibilities afforded by 3D graphics for software visualisation and program comprehension. It begins by defining seven key areas of 3D software visualisation, followed by the definition of two terms, visualisation and representation. These two terms provide a conceptual division between a visualisation and the elements of which it is comprised.
This division enables improved discussion of the properties of a 3D visualisation and particularly the identification of properties that are desirable for a successful visualisation. A number of such desirable properties are suggested for both visualisations and representations, providing support for the design and evaluation of a 3D software visualisation system. Also presented are a number of prototype visualisations, each providing a different approach to the visualisation of a software system. The prototypes help demonstrate the practicalities and feasibility of 3D software visualisation. Evaluation of these prototypes is performed using a variety of techniques, the results of which emphasise the fact that there is substantial potential for the application of 3D graphics and virtual reality to software visualisation.

    Applications of multivariate statistics in honey bee research, analysis of metabolomics data from samples of honey bee propolis

    This thesis was previously held under moratorium from 20/04/2020 to 20/04/2022. Honey bees play a significant role both ecologically and economically, through the pollination of flowering plants and crops. Additionally, honey is an ancient food source that is highly valued by different religions and cultures and has been shown to possess a wide range of beneficial uses, including cosmetic treatment and the treatment of eye disease, bronchial asthma and hiccups. In addition to honey, honey bees also produce beeswax, pollen, royal jelly and propolis. In this thesis, we study data from samples of propolis from various geographical locations. Propolis is a resinous product, which consists of a combination of beeswax, saliva and resins that have been gathered by honey bees from the exudates of various surrounding plants. It is used by the bees to seal small gaps and maintain the hives, but is also an anti-microbial substance that may protect them against disease. The appearance and consistency of propolis change depending on the temperature; it becomes elastic and sticky when warm, but hard and brittle when cold. Furthermore, its composition varies, and its colour ranges from yellowish-green to dark brown, depending on its age and the sources of resin in the environment. Propolis is a highly biochemically active substance with many potential benefits in health care, which have attracted much attention. Biochemical analysis of propolis leads to highly multivariate metabolomics data. The main benefit of metabolomics is that it generates a spectrum in which peaks correspond to different chemical components, making possible the detection of multiple substances simultaneously. Relevant spectral features may be used for pattern recognition. The purpose of this research is to study methods used for statistical analysis of biochemical data arising from propolis samples.
We investigate the use of different statistical methods for metabolomics data from chemical analysis of propolis samples using Mass Spectrometry (MS). Methods studied will include pre-treatment methods and multivariate analysis techniques including principal component analysis (PCA), multidimensional scaling (MDS), and clustering methods including hierarchical cluster analysis (HCA), k-means clustering and self organising maps (SOMs). Background material and results of data analysis will be presented from samples of propolis from beehives in Scotland, Libya and Europe. Conclusions are drawn in terms of the data sets themselves as well as the properties of the different methods studied for analysing such metabolomics data.
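    To make one of the listed techniques concrete, here is a minimal sketch of principal component analysis computed via the singular value decomposition, using NumPy only. It does not reproduce the thesis's pre-treatment or data; the function name `pca` and the samples-by-peaks layout of the input matrix are assumptions for illustration.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA of a (samples x features) matrix via SVD (illustrative helper).

    Returns the sample scores, the principal axes (one per row), and the
    fraction of total variance explained by each retained component.
    """
    Xc = X - X.mean(axis=0)                      # mean-centre every feature (peak)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)        # variance ratio per component
    return scores, Vt[:n_components], explained[:n_components]
```

    With a matrix of peak intensities, plotting the first two columns of `scores` gives the usual PCA scatterplot in which samples with similar spectra lie close together.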

    Unsupervised machine learning clustering and data exploration of radio-astronomical images

    In this thesis, I demonstrate a novel and efficient unsupervised clustering and data exploration method combining a Self-Organising Map (SOM) and a Convolutional Autoencoder, applied to radio-astronomical images from the Radio Galaxy Zoo (RGZ) dataset. The rapidly increasing volume and complexity of radio-astronomical data have ushered in a new era of big-data astronomy which has increased the demand for Machine Learning (ML) solutions. In this era, the sheer amount of image data produced by modern instruments has resulted in a significant data deluge. Furthermore, the morphologies of objects captured in these radio-astronomical images are highly complex and challenging to classify conclusively due to their intricate and indiscrete nature. Additionally, major radio-astronomical discoveries are unplanned and found in the unexpected, making unsupervised ML highly desirable because it operates with few assumptions and without labelled training data. In this thesis, I developed a novel unsupervised ML approach as a practical solution to these astronomy challenges. Using this system, I demonstrated the use of convolutional autoencoders and SOMs as a dimensionality reduction method to delineate the complexity and volume of astronomical data. My optimised system shows that the coupling of these methods is a powerful method of data exploration and unsupervised clustering of radio-astronomical images. The results of this thesis show this approach is capable of accurately separating features by complexity on a SOM manifold and unified distance matrix with neighbourhood similarity and hierarchical clustering of the mapped astronomical features. This method provides an effective means to explore the high-level topological relationships of image features and morphology in large datasets automatically with minimal processing time and computational resources.
I achieved these capabilities with a new and innovative method of SOM training using the autoencoder-compressed latent feature vector representations of radio-astronomical data, rather than raw images. Using this system, I successfully investigated SOM affine transformation invariance and analysed the true nature of rotational effects on this manifold using autoencoder random rotation training augmentations. Throughout this thesis, I present my method as a powerful new data exploration technique and a contribution to the field. The speed and effectiveness of this method indicate excellent scalability and hold implications for use on large future surveys, large-scale instruments such as the Square Kilometre Array and in other big-data and complexity analysis applications.

    An algorithmic framework for visualising and exploring multidimensional data

    To help understand multidimensional data, information visualisation techniques are often applied to take advantage of human visual perception in exposing latent structure. A popular means of presenting such data is via two-dimensional scatterplots where the inter-point proximities reflect some notion of similarity between the entities represented. This can result in potentially interesting structure becoming almost immediately apparent. Traditional algorithms for carrying out this dimension reduction tend to have different strengths and weaknesses in terms of run times and layout quality. However, it has been found that the combination of algorithms can produce hybrid variants that exhibit significantly lower run times while maintaining accurate depictions of high-dimensional structure. The author's initial contribution in the creation of such algorithms led to the design and implementation of a software system (HIVE) for the development and investigation of new hybrid variants and the subsequent analysis of the data they transform. This development was motivated by the fact that there are potentially many hybrid algorithmic combinations to explore and therefore an environment that is conducive to their development, analysis and use is beneficial not only in exploring the data they transform but also in exploring the growing number of visualisation tools that these algorithms beget. This thesis describes three areas of the author's contribution to the field of information visualisation. Firstly, work on hybrid algorithms for dimension reduction is presented and their analysis shows their effectiveness. Secondly, the development of a framework for the creation of tailored hybrid algorithms is illustrated. Thirdly, a system embodying the framework, providing an environment conducive to the development, evaluation and use of the algorithms, is described.
Case studies are provided to demonstrate how the author and others have used and found value in the system across areas as diverse as environmental science, social science and investigative psychology, where multidimensional data are in abundance.

    Strategies for image visualisation and browsing

    PhD thesis. The exploration of large information spaces has remained a challenging task even though database management systems and state-of-the-art retrieval algorithms are becoming pervasive. Significant research attention in the multimedia domain is focused on finding automatic algorithms for organising digital image collections into meaningful structures and providing high-semantic image indices. On the other hand, the utilisation of graphical and interactive methods from the information visualisation domain provides a promising direction for creating efficient user-oriented systems for image management. Methods such as exploratory browsing and query, as well as intuitive visual overviews of an image collection, can assist users in finding patterns and developing an understanding of the structures and content in complex image data-sets. The focus of the thesis is combining the features of automatic data processing algorithms with information visualisation. The first part of this thesis focuses on the layout method for displaying a collection of images indexed by low-level visual descriptors. The proposed solution generates a graphical overview of the data-set as a combination of similarity-based visualisation and a random layout approach. The second part of the thesis deals with the problem of visualisation and exploration for hierarchical organisation of images. Due to the absence of semantic information, the images are considered the only source of high-level information. The content preview and display of hierarchical structure are combined in order to support image retrieval. In addition to this, novel exploration and navigation methods are proposed to enable the user to find the way through the database structure and retrieve the content. On the other hand, semantic information is available in cases where automatic or semi-automatic image classifiers are employed. The automatic annotation of image items provides what is referred to as higher-level information.
This type of information is a cornerstone of the multi-concept visualisation framework which is developed as the third part of this thesis. This solution enables dynamic generation of user queries by combining semantic concepts, supported by content overview and information filtering. Comparative analysis and user tests, performed for the evaluation of the proposed solutions, focus on the ways information visualisation affects image content exploration and retrieval; how efficient and comfortable users are when using different interaction methods; and the ways users seek information through different types of database organisation.