    Automated Feature Engineering for Deep Neural Networks with Genetic Programming

    Feature engineering is a process that augments the feature vector of a machine learning model with calculated values that are designed to enhance the accuracy of a model’s predictions. Research has shown that the accuracy of models such as deep neural networks, support vector machines, and tree/forest-based algorithms sometimes benefits from feature engineering. Engineered features are usually created by expressions that combine one or more of the original features. The choice of the exact structure of an engineered feature depends on the type of machine learning model in use. Previous research demonstrated that different model families benefit from different types of engineered features. Random forests, gradient-boosting machines, or other tree-based models might not see the same accuracy gain that an engineered feature allowed neural networks, generalized linear models, or other dot-product-based models to achieve on the same data set. This dissertation presents a genetic programming-based algorithm that automatically engineers features that increase the accuracy of deep neural networks for some data sets. For a genetic programming algorithm to be effective, it must prioritize the search space and efficiently evaluate what it finds. The dissertation’s algorithm faced a potential search space composed of all possible mathematical combinations of the original feature vector. Five experiments were designed to guide the search process to efficiently evolve good engineered features. The result of this dissertation is an automated feature engineering (AFE) algorithm that is computationally efficient, even though a neural network is used to evaluate each candidate feature. This approach gave the algorithm a greater opportunity to specifically target deep neural networks in its search for engineered features that improve accuracy. Finally, a sixth experiment empirically demonstrated the degree to which this algorithm improved the accuracy of neural networks on data sets augmented by the algorithm’s engineered features.

    The Data Science Design Manual


    Self-Organizing Teams in Online Work Settings

    As the volume and complexity of distributed online work increase, collaboration among people who have never worked together before is becoming increasingly necessary. Recent research has proposed algorithms to maximize the performance of such teams by grouping workers according to a set of predefined decision criteria. This approach micro-manages workers, who have no say in the team formation process. Depriving users of control over who they will work with stifles creativity, causes psychological discomfort and results in less-than-optimal collaboration outcomes. In this work, we propose an alternative model, called Self-Organizing Teams (SOTs), which relies on the crowd of online workers itself to organize into effective teams. Supported but not guided by an algorithm, SOTs are a new human-centered computational structure, which enables participants to control, correct and guide the output of their collaboration as a collective. Experimental results, comparing SOTs to two benchmarks that do not offer user agency over the collaboration, reveal that participants in the SOTs condition produce results of higher quality and report higher teamwork satisfaction. We also find that, similarly to machine learning-based self-organization, human SOTs exhibit emergent collective properties, including the presence of an objective function and the tendency to form more distinct clusters of compatible teammates.

    The Application of Machine Learning to At-Risk Cultural Heritage Image Data

    This project investigates the application of Convolutional Neural Network (CNN) methods and technologies to problems related to At-Risk cultural heritage object recognition. The primary aim of this work is the use of developmental software combining the disciplines of computer vision and artefact studies, developing applications in the field of heritage protection specifically related to the illegal antiquities market. To accomplish this, digital image data provided by the Durham University Oriental Museum was used in conjunction with several different implementations of pre-trained CNN software models, for the purposes of artefact Classification and Identification. Testing focused on data capture using a variety of digital recording devices, guided by the developmental needs of a heritage programme seeking to create software solutions to heritage threats in the Middle East and North Africa (MENA) region. Quantitative results using information retrieval metrics are reported for all models and test sets, and have been used to evaluate the models’ predictive results.

    Computational Methods for Medical and Cyber Security

    Over the past decade, computational methods, including machine learning (ML) and deep learning (DL), have been exponentially growing in their development of solutions in various domains, especially medicine, cybersecurity, finance, and education. While these applications of machine learning algorithms have proven beneficial in various fields, many shortcomings have also been highlighted, such as the lack of benchmark datasets, the inability to learn from small datasets, the cost of architecture, adversarial attacks, and imbalanced datasets. On the other hand, new and emerging algorithms, such as deep learning, one-shot learning, continuous learning, and generative adversarial networks, have successfully solved various tasks in these fields. Therefore, applying these new methods to life-critical missions is crucial, as is measuring these less-traditional algorithms' success when used in these fields.

    21st Century Cottage Industry - A cross-case synthesis of freelancer intermediary platforms

    The purpose of this study was to identify possible archetypes of freelancer intermediary platforms. Though there is growing interest in platforms, their classification stops once a platform is labelled as a transaction, innovation, integrated or some other platform. However, this approach does not account for the variation within these categories. Given the young population's interest in freelancing, the estimated size of the platform economy as a whole ($4300 Bn.) and the number of freelancer intermediaries (250-300), attempting to identify the subtypes of freelancer intermediary platforms was deemed a worthy endeavor. Finding these subtypes of intermediary platforms, or archetypes of freelancer intermediaries, has both academic and practical implications. For academics, these archetypes will contribute to the growing body of platform literature by giving it new units of analysis and by creating reasonable categorization. For people interested in utilizing a freelancer intermediary platform either as a seller or a buyer, this thesis offers solid knowledge of the intermediary platforms’ functions and features as well as what to expect when joining one. The research design is built on principles of embedded and flexible multiple-case study and cross-case synthesis. When describing a contemporary phenomenon, a multiple-case study produces more robust results because the weight of any one case decreases. The cross-case synthesis was one of the few viable options given the study’s lack of dependent and independent variables. These variables were unavailable because no prior information on what the archetypes could be was available. For this reason, this study adapted analytical methods of grounded theory. The study identified four archetypes of freelancer intermediary platforms: the locals, two for the price of one, the middle child and the global juggernauts. Locals focus on physical services that are dependent on freelancers’ location. Two for the price of one are small platforms that charge only one side, be it seller or buyer. The middle child is very similar to global juggernauts in all aspects but size and is a necessary phase in the platform’s maturation. Global juggernauts are the biggest platforms and the industry leaders that have significant network and trust management systems in place. Archetypes form a solid foundation on which future research on freelancer intermediaries can be based.

    Always One Bit More, Computing and the Experience of Ambiguity

    Fun is often understood to be non-conceptual and indeed without rigour, without relation to formal processes of thought, yielding an intense and joyous informality, a release from procedure. Yet, as this book argues, fun may also be found, alongside other kinds of pleasure, in the generation, iteration and imagination of operations and procedures. This chapter aims to develop a means of drawing out an understanding of fun in relation to concepts of experience in the culture of mathematics and in the machinic fun of certain computer games. Mathematical concepts of experience, as something to be effaced, in terms of the grind of churning out calculations, understood as an acme of human knowledge bordering on the mystical or something both prosaic, peculiar and thrillingly abstract, have been crucial to the motivation and genesis of computing. Experience may be figured as something innate to the computing person, or that is abstractable and thus mobile, shifting heterogeneously from one context to another, producing strange affinities between scales – residues and likeness among computational forms that can occasionally link the most austere and mundane or cacophonous of aesthetics. Among such, the fine and perplexing fun of paradox and ambiguity arises not simply in the interplay between formalisms and other kinds of life but as formalisms interweave, releasing and congealing further dynamics. There are many ways in which mathematics has been linked to culture as a means of ordering, describing, inspiring or explaining ways of being in the world, but it is less often that mathematics thinks about itself as producing figurations of existence, and such moments are useful to turn to in gaining a sense of some of the patternings of computational culture.