
    Compact bilinear pooling via kernelized random projection for fine-grained image categorization on low computational power devices

    Bilinear pooling is one of the most popular and effective methods for fine-grained image recognition. However, a major drawback of bilinear pooling is the dimensionality of the resulting descriptors, which typically consist of several hundred thousand features. Even when generating the descriptor is tractable, its dimension makes subsequent operations impractical and often incurs huge computational and storage costs. We introduce a novel method that efficiently reduces the dimension of bilinear pooling descriptors by performing a random projection. Conveniently, this is achieved without ever computing the high-dimensional descriptor explicitly. Our experimental results show that our method outperforms existing compact bilinear pooling algorithms in most cases, while running faster on low computational power devices, where efficient extensions of bilinear pooling are most useful.
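
    To illustrate the core idea, the sketch below compresses a bilinear descriptor with rank-one random projection rows so that the c x c outer-product matrix is never materialised. This is a minimal illustration of random-projection-based compact bilinear pooling, not the paper's exact kernelized construction; all function names and dimensions are chosen for the example.

```python
import numpy as np

def bilinear_pool(X):
    """Full bilinear pooling: sum of outer products of local features.

    X: (c, s) array of c-dimensional features at s spatial positions.
    Returns the c*c-dimensional descriptor vec(X @ X.T).
    """
    return (X @ X.T).reshape(-1)

def rp_bilinear_pool(X, W, V):
    """Random projection of the bilinear descriptor without forming it.

    Each projection row is the rank-one matrix w_i v_i^T, and
    <w_i v_i^T, sum_s x_s x_s^T> = sum_s (w_i . x_s)(v_i . x_s),
    so two (d, c) matrix products suffice.  The 1/sqrt(d) scaling makes
    inner products between compact descriptors unbiased estimates of
    inner products between the full descriptors.
    """
    d = W.shape[0]
    return ((W @ X) * (V @ X)).sum(axis=1) / np.sqrt(d)

rng = np.random.default_rng(0)
c, s, d = 512, 49, 4096                   # channels, spatial positions, target dim
X = rng.standard_normal((c, s))
W = rng.standard_normal((d, c))
V = rng.standard_normal((d, c))

full = bilinear_pool(X)                   # 262,144-dimensional
compact = rp_bilinear_pool(X, W, V)       # 4,096-dimensional
print(full.shape, compact.shape)          # (262144,) (4096,)
```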

    Low Computational Cost Machine Learning: Random Projections and Polynomial Kernels

    According to recent reports, over the course of 2018 the volume of data generated, captured and replicated globally was 33 Zettabytes (ZB), and it is expected to reach 175 ZB by the year 2025. Managing this impressive increase in the volume and variety of data represents a great challenge, but it also provides organizations with a precious opportunity to support their decision-making processes with insights and knowledge extracted from massive collections of data, and to automate tasks, leading to important savings. In this context, the field of machine learning has attracted a notable level of attention, and recent breakthroughs in the area have enabled the creation of predictive models of unprecedented accuracy. However, with the emergence of new computational paradigms, the field is now faced with the challenge of creating more efficient models, capable of running in low computational power environments while maintaining a high level of accuracy. This thesis focuses on the design and evaluation of new algorithms for the generation of useful data representations, with special attention to the scalability and efficiency of the proposed solutions. In particular, the proposed methods make intensive use of randomization in order to map data samples to the feature spaces of polynomial kernels and then condense the useful information present in those feature spaces into a more compact representation. The resulting algorithmic designs are easy to implement and require little computational power to run. As a consequence, they are perfectly suited for applications in environments where computational resources are scarce and data needs to be analyzed with little delay. The two major contributions of this thesis are: (1) we present and evaluate efficient, data-independent algorithms that perform random projections from the feature spaces of polynomial kernels of different degrees, and (2) we demonstrate how these techniques can be used to accelerate machine learning tasks where polynomial interaction features are used, focusing particularly on bilinear models in deep learning.
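
    As a hedged illustration of contribution (1), the following sketch builds data-independent randomized features that approximate a homogeneous polynomial kernel. The Rademacher construction, function name and dimensions are assumptions made for the example, not code from the thesis.

```python
import numpy as np

def poly_random_features(X, degree, dim, rng):
    """Data-independent randomized features approximating the homogeneous
    polynomial kernel k(x, y) = (x . y) ** degree.

    X: (n, c) data matrix.  Returns Z of shape (n, dim) such that
    Z @ Z.T approximates (X @ X.T) ** degree in expectation.
    """
    n, c = X.shape
    Z = np.ones((n, dim))
    for _ in range(degree):
        S = rng.choice([-1.0, 1.0], size=(dim, c))  # Rademacher directions
        Z *= X @ S.T                                # elementwise product of projections
    return Z / np.sqrt(dim)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))
K_exact = (X @ X.T) ** 3                            # degree-3 polynomial kernel
Z = poly_random_features(X, degree=3, dim=8192, rng=rng)
K_approx = Z @ Z.T

rel_err = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
print(f"relative approximation error: {rel_err:.3f}")
```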

    Learning Deep SPD Visual Representation for Image Classification

    Symmetric positive definite (SPD) visual representations are effective due to their ability to capture high-order statistics to describe images. Reliable and efficient calculation of an SPD matrix representation from small-sized feature maps with a high number of channels in a CNN is a challenging issue. This thesis presents three novel methods to address the above challenge. The first method, called Relation Dropout (ReDro), is inspired by the fact that the eigen-decomposition of a block diagonal matrix can be obtained efficiently by eigen-decomposing each block separately. Thus, instead of using a full covariance matrix as in the literature, this thesis randomly groups the channels and forms a covariance matrix per group. ReDro is inserted as an additional layer preceding the matrix normalisation step, and the random grouping is made transparent to all subsequent layers. ReDro can be seen as a dropout-related regularisation which discards some pair-wise channel relationships across each group. The second method, called FastCOV, exploits the intrinsic connection between the eigen-systems of XX^T and X^TX. Specifically, it computes a position-wise covariance matrix upon convolutional feature maps instead of the typical channel-wise covariance matrix. As the spatial size of feature maps is usually much smaller than the number of channels, conducting the eigen-decomposition of the position-wise covariance matrix avoids rank deficiency and is faster than decomposing the channel-wise covariance matrix. The eigenvalues and eigenvectors of the normalised channel-wise covariance matrix can then be retrieved through the connection between the XX^T and X^TX eigen-systems. The third method, iSICE, deals with reliable covariance estimation from small-sized and high-dimensional CNN feature maps. It exploits the prior structure of the covariance matrix to estimate a sparse inverse covariance, an approach developed in the literature to deal with the covariance matrix's small-sample issue. Given a covariance matrix, this thesis iteratively minimises its log-likelihood penalised by a sparsity term, using gradient descent. The resulting representation characterises partial correlation instead of the indirect correlation characterised by the covariance representation. As experimentally demonstrated, all three proposed methods improve image classification performance, and the first two also reduce the computational cost of learning large SPD visual representations.
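
    The identity behind FastCOV can be checked numerically: for a feature matrix X with far fewer spatial positions than channels, the eigen-system of the large channel-wise matrix XX^T is recoverable from the small position-wise matrix X^TX. The snippet below is an illustrative verification only; the dimensions and variable names are assumptions, not the thesis implementation.

```python
import numpy as np

# Toy check of the XX^T / X^TX connection: c channels, n spatial positions,
# with n much smaller than c (e.g. a 7x7 feature map with 256 channels).
rng = np.random.default_rng(0)
c, n = 256, 49
X = rng.standard_normal((c, n))
X -= X.mean(axis=1, keepdims=True)          # centre, so X @ X.T is a scaled covariance

# Eigen-decompose the small n x n matrix instead of the large c x c one.
vals, V = np.linalg.eigh(X.T @ X)           # eigenvalues in ascending order
keep = vals > 1e-10                         # nonzero spectrum shared by both matrices

# Eigenvectors of X @ X.T are recovered as U = X V / sqrt(eigenvalue).
U = (X @ V[:, keep]) / np.sqrt(vals[keep])

# Verify against the large matrix (built here only for the check).
residual = np.linalg.norm((X @ X.T) @ U - U * vals[keep])
print(f"residual of the recovered eigen-system: {residual:.2e}")
```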

    Efficient Image and Video Representations for Retrieval

    Image (video) retrieval is the problem of retrieving images (videos) similar to a query. Images (videos) are represented in an input (feature) space, and similar images (videos) are obtained by finding nearest neighbors in that representation space. Numerous input representations, in both real-valued and binary spaces, have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is the well-known problem of retrieving images of the same class as the query. In the first part, we address the practical aspects of achieving faster retrieval with binary codes as input representations in the supervised setting, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all images of the same class to a unique binary code. We refer to the binary codes of the images as 'Semantic Binary Codes' and to the unique code for all images of the same class as the 'Class Binary Code'. We also propose a new class-based Hamming metric that dramatically reduces retrieval times for larger databases, as the Hamming distance is computed only to the class binary codes. We further propose a deep semantic binary code model, obtained by replacing the output layer of a popular convolutional neural network (AlexNet) with the class binary codes, and show that the hashing functions learned in this way outperform the state of the art while providing fast retrieval times. In the second part, we address supervised retrieval while taking into account the relationships between classes. For a given query image, we want to retrieve images that preserve the relative order, i.e., we want to retrieve all same-class images first, and then images of related classes before images of different classes. We learn such relationship-aware binary codes by minimizing the difference between the inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of the classes. Our method deviates from other supervised binary encoding schemes, as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account related-class retrieval results and show significant gains over the state of the art. High-dimensional descriptors such as Fisher Vectors or Vectors of Locally Aggregated Descriptors have been shown to improve the performance of many computer vision applications, including retrieval. In the third part, we discuss an unsupervised technique for compressing high-dimensional vectors into high-dimensional binary codes to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm, which is intractable for compressing high-dimensional vectors. We present a practical hierarchical model that compresses such high-dimensional vectors with a divide-and-conquer strategy based on the Random Select and Adjust (RSA) procedure. We show that the proposed high-dimensional binary codes outperform binary codes obtained with traditional hyperplane methods at higher compression ratios. In the last part of the thesis, we propose a retrieval-based solution to the zero-shot event classification problem, a setting in which no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute the similarity between the query event and each video in the concept space, and videos similar to the query event are classified as belonging to the event. We show that concept features from other modalities significantly boost performance.
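
    The class-based Hamming metric can be sketched as follows: instead of comparing a query code against every database item, it is compared only against one class binary code per class. This is an illustrative toy example; the function name, code length and class count are assumptions, not taken from the thesis.

```python
import numpy as np

def rank_classes(query_code, class_codes):
    """Rank classes by Hamming distance between the query's binary code and
    one representative 'class binary code' per class (toy illustration).

    query_code:  (b,)   0/1 vector for the query image.
    class_codes: (k, b) 0/1 matrix, one row per class.
    """
    dists = np.count_nonzero(class_codes != query_code, axis=1)
    return np.argsort(dists), dists

rng = np.random.default_rng(0)
b, k = 64, 10                                    # code length, number of classes
class_codes = rng.integers(0, 2, size=(k, b))

query = class_codes[3].copy()                    # pretend the query belongs to class 3
flipped = rng.choice(b, size=5, replace=False)   # ...with a few corrupted bits
query[flipped] ^= 1

order, dists = rank_classes(query, class_codes)
print(order[0], dists[order[0]])                 # expected: class 3 ranked first at distance 5
```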

    Patient Dropout Prediction in Virtual Health: A Multimodal Dynamic Knowledge Graph and Text Mining Approach

    Virtual health has been acclaimed as a transformative force in healthcare delivery. Yet patient dropout is a critical issue that leads to poor health outcomes and increased health, societal, and economic costs. Timely prediction of patient dropout enables stakeholders to take proactive steps to address patients' concerns, potentially improving retention rates. In virtual health, the information asymmetries inherent in its delivery format, between different stakeholders, and across different healthcare delivery systems hinder the performance of existing predictive methods. To resolve these information asymmetries, we propose a Multimodal Dynamic Knowledge-driven Dropout Prediction (MDKDP) framework that learns implicit and explicit knowledge from doctor-patient dialogues and from the dynamic and complex networks of the various stakeholders in both online and offline healthcare delivery systems. We evaluate MDKDP in partnership with one of the largest virtual health platforms in China. MDKDP improves the F1-score by 3.26 percentage points relative to the best benchmark. Comprehensive robustness analyses show that integrating stakeholder attributes, knowledge dynamics, and compact bilinear pooling significantly improves performance. Our work has significant implications for healthcare IT, revealing the value of mining relations and knowledge across different service modalities. Practically, MDKDP offers a novel design artifact for patient dropout management on virtual health platforms.
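
    Of the components named in the robustness analysis, compact bilinear pooling is the most self-contained; a common way to realise it is the Count Sketch construction shown below, used here to fuse two hypothetical modality vectors (a dialogue-text embedding and a knowledge-graph embedding). The sketch is illustrative and is not code from MDKDP.

```python
import numpy as np

def count_sketch(x, h, s, d):
    """Count Sketch of x into d dimensions using hash indices h and signs s."""
    out = np.zeros(d)
    np.add.at(out, h, s * x)
    return out

def compact_bilinear(x, y, hx, sx, hy, sy, d):
    """Compact bilinear pooling of two vectors via Count Sketch: the sketch of
    the outer product of x and y equals the circular convolution of the two
    individual sketches, computed here with the FFT.
    """
    cx = count_sketch(x, hx, sx, d)
    cy = count_sketch(y, hy, sy, d)
    return np.fft.irfft(np.fft.rfft(cx) * np.fft.rfft(cy), n=d)

rng = np.random.default_rng(0)
d = 1024
text_vec = rng.standard_normal(300)    # hypothetical dialogue-text embedding
graph_vec = rng.standard_normal(128)   # hypothetical knowledge-graph embedding

# Hash indices and signs are sampled once and reused for every sample.
hx = rng.integers(0, d, size=text_vec.size)
sx = rng.choice([-1.0, 1.0], size=text_vec.size)
hy = rng.integers(0, d, size=graph_vec.size)
sy = rng.choice([-1.0, 1.0], size=graph_vec.size)

fused = compact_bilinear(text_vec, graph_vec, hx, sx, hy, sy, d)
print(fused.shape)                     # (1024,) joint representation
```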

    Methodologies for innovation and best practices in Industry 4.0 for SMEs

    Today, cyber-physical systems are transforming the way in which industries operate; we call this Industry 4.0, or the fourth industrial revolution. Industry 4.0 involves the use of technologies such as Cloud Computing, Edge Computing, the Internet of Things, Robotics and, most of all, Big Data. Big Data is the very basis of the Industry 4.0 paradigm, because it can provide crucial information on all the processes that take place within manufacturing (which helps optimize processes and prevent downtime), on employees (performance, individual needs, safety in the workplace) and on clients/customers (their needs and wants, trends, opinions), which helps businesses become competitive and expand in the international market. Current processing capabilities, enabled by technologies such as the Internet of Things, Cloud Computing and Edge Computing, mean that data can be processed much faster and with greater security. The implementation of Artificial Intelligence techniques, such as Machine Learning, can help machines take certain decisions autonomously, or help humans make decisions much faster. Furthermore, data can be used to feed predictive models, which can help businesses and manufacturers anticipate future changes and needs, and address problems before they cause tangible harm.

    Intelligent Models in Complex Problem Solving

    Artificial Intelligence has experienced a revival in the last decade. The need for progress, the growing processing capacity and the low cost of the Cloud have facilitated the development of new, powerful algorithms. The efficiency of these algorithms in Big Data processing, Deep Learning and Convolutional Networks is transforming the way we work and opening new horizons.

    Managing smart cities with deepint.net

    In this keynote, the evolution of intelligent computer systems will be examined. The need for human capital will be emphasised, as well as the need to follow one’s “gut instinct” in problem-solving. We will look at the benefits of combining information and knowledge to solve complex problems and examine how knowledge engineering facilitates the integration of different algorithms. Furthermore, we will analyse the importance of complementary technologies, such as IoT and Blockchain, in the development of intelligent systems. It will be shown how tools like "Deep Intelligence" make it possible to create computer systems efficiently and effectively. "Smart" infrastructures need to incorporate all added-value resources so they can offer useful services to society while reducing costs, ensuring reliability and improving the quality of life of citizens. The combination of AI with IoT and Blockchain offers a world of possibilities and opportunities.

    Learning AI with deepint.net

    This keynote will examine the evolution of intelligent computer systems over recent years, underscoring the need for human capital in this field so that further progress can be made. In this regard, learning about AI through experience is a big challenge, but it is possible thanks to tools such as deepint.net, which enable anyone to develop AI systems; knowledge of programming is no longer necessary.

    Building Efficient Smart Cities

    Current technological developments offer promising solutions to the challenges faced by cities, such as crowding, pollution, housing, the search for greater comfort, better healthcare, optimized mobility and other urban services that must be adapted to the fast-paced life of citizens. Cities that deploy technology to optimize their processes and infrastructure fit under the concept of a smart city. An increasing number of cities strive to become smart, and some are already recognized as such, including Singapore, London and Barcelona. Our society has an ever-greater reliance on technology for its sustenance. This will continue into the future, as technology is rapidly penetrating all facets of human life, from daily activities to the workplace and industries. A myriad of data is generated by all these digitized processes, which can be used to further enhance all smart services, increasing their adaptability, precision and efficiency. However, dealing with large amounts of data coming from different types of sources is a complex process; this impedes many cities from taking full advantage of their data, or, even worse, a lack of control over the data sources may lead to serious security issues, leaving cities vulnerable to cybercrime. Given that smart city infrastructure is largely digitized, a cyberattack would have fatal consequences for the city’s operation, leading to economic loss, citizen distrust and the shutdown of essential city services and networks. This is a threat to the efficiency that smart cities strive for.