
    One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era

    OpenAI has recently released GPT-4 (a.k.a. ChatGPT plus), which is demonstrated to be one small step for generative AI (GAI), but one giant leap for artificial general intelligence (AGI). Since its official release in November 2022, ChatGPT has quickly attracted numerous users with extensive media coverage. Such unprecedented attention has also motivated numerous researchers to investigate ChatGPT from various aspects. According to Google Scholar, there are more than 500 articles with ChatGPT in their titles or mentioning it in their abstracts. Considering this, a review is urgently needed, and our work fills this gap. Overall, this work is the first to survey ChatGPT with a comprehensive review of its underlying technology, applications, and challenges. Moreover, we present an outlook on how ChatGPT might evolve to realize general-purpose AIGC (a.k.a. AI-generated content), which will be a significant milestone for the development of AGI. Comment: A Survey on ChatGPT and GPT-4, 29 pages.

    Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules

    We target the problem of automatically synthesizing proofs of semantic equivalence between two programs made of sequences of statements. We represent programs using abstract syntax trees (AST), where a given set of semantics-preserving rewrite rules can be applied on a specific AST pattern to generate a transformed and semantically equivalent program. In our system, two programs are equivalent if there exists a sequence of applications of these rewrite rules that rewrites one program into the other. We propose a neural network architecture based on a transformer model to generate proofs of equivalence between program pairs. The system outputs a sequence of rewrites, and the validity of the sequence is checked simply by verifying that it can be applied. If no valid sequence is produced by the neural network, the system reports the programs as non-equivalent, ensuring by design that no programs may be incorrectly reported as equivalent. Our system is fully implemented for a given grammar which can represent straight-line programs with function calls and multiple types. To efficiently train the system to generate such sequences, we develop an original incremental training technique, named self-supervised sample selection. We extensively study the effectiveness of this novel training approach on proofs of increasing complexity and length. Our system, S4Eq, achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent programs. Comment: 30 pages including appendi
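    The checking step described in this abstract (a proposed rewrite sequence is accepted only if every step can actually be applied and the result matches the target program) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tuple-based expression encoding, the two toy rules `commute` and `mul_one`, and the helper names are hypothetical and are not S4Eq's grammar, rule set, or code. The sketch also handles single expressions rather than full straight-line programs.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple, Union

# Hypothetical encoding: an expression is a name/constant (str)
# or a tuple (operator, left, right).
Expr = Union[str, Tuple[str, "Expr", "Expr"]]
# A rule returns the rewritten node, or None when its pattern does not match.
Rule = Callable[[Expr], Optional[Expr]]


def commute(node: Expr) -> Optional[Expr]:
    """Toy rule: a + b -> b + a, a * b -> b * a."""
    if isinstance(node, tuple) and node[0] in ("+", "*"):
        op, left, right = node
        return (op, right, left)
    return None


def mul_one(node: Expr) -> Optional[Expr]:
    """Toy rule: a * 1 -> a."""
    if isinstance(node, tuple) and node[0] == "*" and node[2] == "1":
        return node[1]
    return None


RULES: Dict[str, Rule] = {"commute": commute, "mul_one": mul_one}


def apply_at(expr: Expr, path: Sequence[int], rule: Rule) -> Optional[Expr]:
    """Apply `rule` at the node reached by `path` (child indices 1 or 2).

    Returns None if the path is invalid or the rule does not match there.
    """
    if not path:
        return rule(expr)
    if not isinstance(expr, tuple) or path[0] not in (1, 2):
        return None
    child = apply_at(expr[path[0]], path[1:], rule)
    if child is None:
        return None
    rebuilt = list(expr)
    rebuilt[path[0]] = child
    return tuple(rebuilt)


def check_proof(source: Expr, target: Expr,
                steps: Sequence[Tuple[str, Sequence[int]]]) -> bool:
    """Accept only if every step applies and the final tree equals `target`."""
    current: Optional[Expr] = source
    for rule_name, path in steps:
        rule = RULES.get(rule_name)
        if rule is None:
            return False
        current = apply_at(current, tuple(path), rule)
        if current is None:
            return False
    return current == target


source = ("+", ("*", "x", "1"), "y")   # (x * 1) + y
target = ("+", "y", "x")               # y + x
proof = [("mul_one", (1,)), ("commute", ())]
print(check_proof(source, target, proof))
```

    Because a sequence is accepted only when it replays successfully, a wrong or malformed proposal can at worst cause a pair to be reported as non-equivalent, which mirrors the one-sided guarantee the abstract describes.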

    Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review

    In this paper, a critical bibliometric analysis study is conducted, coupled with an extensive literature survey on recent developments and associated applications in machine learning research with a perspective on Africa. The presented bibliometric analysis study consists of 2761 machine learning-related documents, of which 98% were articles with at least 482 citations published in 903 journals during the past 30 years. Furthermore, the collated documents were retrieved from the Science Citation Index EXPANDED, comprising research publications from 54 African countries between 1993 and 2021. The bibliometric study visualizes the current landscape and future trends in machine learning research and its applications, to facilitate future collaborative research and knowledge exchange among authors from different research institutions scattered across the African continent.

    Innovative Hybrid Approaches for Vehicle Routing Problems

    This thesis deals with the efficient resolution of Vehicle Routing Problems (VRPs). The first chapter addresses the archetype of all VRPs: the Capacitated Vehicle Routing Problem (CVRP). Despite having been introduced more than 60 years ago, it remains an extremely challenging problem. In this chapter I design a Fast Iterated-Local-Search Localized Optimization algorithm for the CVRP, shortened to FILO. The simplicity of the CVRP definition allowed me to experiment with advanced local search acceleration and pruning techniques that eventually became the core optimization engine of FILO. FILO is experimentally shown to be extremely scalable and able to solve very large scale instances of the CVRP in a fraction of the computing time required by existing state-of-the-art methods, while still obtaining solutions of competitive quality. The second chapter deals with an extension of the CVRP called the Extended Single Truck and Trailer Vehicle Routing Problem, or simply XSTTRP. The XSTTRP models a broad class of VRPs in which a single vehicle, composed of a truck and a detachable trailer, has to serve a set of customers with accessibility constraints that make some of them unreachable by the entire vehicle. This problem moves towards VRPs with more realistic constraints, and it models scenarios such as parcel deliveries in crowded city centers or rural areas, where maneuvering a large vehicle is forbidden or dangerous. The XSTTRP generalizes several well-known VRPs such as the Multiple Depot VRP and the Location Routing Problem. For its solution I developed a hybrid metaheuristic which combines a fast heuristic optimization with a polishing phase based on the resolution of a limited set partitioning problem. Finally, the thesis includes a chapter aimed at guiding the computational evaluation of new approaches to VRPs proposed by the machine learning community.

    Growth trends and site productivity in boreal forests under management and environmental change: insights from long-term surveys and experiments in Sweden

    Under a changing climate, current tree and stand growth information is indispensable to the carbon sink strength of boreal forests. Important questions regarding tree growth are to what extent management and environmental change have influenced it, and how it might respond in the future. In this thesis, results from five studies (Papers I-V) covering growth trends, site productivity, heterogeneity in managed forests and potentials for carbon storage in forests and harvested wood products via differing management strategies are presented. The studies were based on observations from national forest inventories and long-term experiments in Sweden. The annual height growth of Scots pine (Pinus sylvestris) and Norway spruce (Picea abies) has increased, especially after the millennium shift, while the basal area growth has remained stable during the last 40 years (Papers I-II). A positive response of height growth to increasing temperature was observed. The results generally imply changing growing conditions and stand composition. In Paper III, the yield capacity of conifers was analysed and compared with existing functions. The results showed that there is a bias in site productivity estimates and that the new functions give better predictions of the yield capacity in Sweden. In Paper IV, the variability in stand composition was modelled as indices of heterogeneity to calibrate the relationship between basal area and leaf area index in managed stands of Norway spruce and Scots pine. The results show that the effects of stand structural heterogeneity are of such a magnitude that they cannot be neglected in the implementation of hybrid growth models, especially those based on light interception and light-use efficiency.
In the long term, the net climate benefits in Swedish forests may be maximized through active forest management with high harvest levels and efficient product utilization, compared to increasing carbon storage in standing forests through land set-asides for nature conservation (Paper V). In conclusion, this thesis offers support for the development of evidence-based policy recommendations for site-adapted and sustainable management of Swedish forests in a changing climate.

    The Role of Transient Vibration of the Skull on Concussion

    Concussion is a traumatic brain injury usually caused by a direct or indirect blow to the head that affects brain function. The maximum mechanical impedance of the brain tissue occurs at 450±50 Hz and may be affected by the skull resonant frequencies. After an impact to the head, vibration resonance of the skull damages the underlying cortex. The skull deforms and vibrates like a bell for 3 to 5 milliseconds, bruising the cortex. Furthermore, the deceleration forces the frontal and temporal cortex against the skull, eliminating a layer of cerebrospinal fluid. When the skull vibrates, the force spreads directly to the cortex, with no layer of cerebrospinal fluid to reflect the wave or cushion its force. To date, few studies have investigated the effect of transient vibration of the skull. Therefore, the overall goal of the proposed research is to gain a better understanding of the role of transient vibration of the skull in concussion. This goal will be achieved by addressing three research objectives. First, an automatic MRI skull and brain segmentation technique is developed. Due to the weak magnetic resonance signal of bone, MRI scans struggle to differentiate bone tissue from other structures. One of the most important components of a successful segmentation is high-quality ground truth labels. Therefore, we introduce a deep learning framework for skull segmentation in which the ground truth labels are created from CT imaging using the standard tessellation language (STL). Furthermore, because the brain region will be important for future work, we explore a new concept for initializing the convolutional neural network (CNN) with orthogonal moments to improve brain segmentation in MRI. Second, a novel 2D and 3D Automatic Method to Align the Facial Skeleton is introduced. An important aspect for further impact analysis is the ability to precisely simulate the same point of impact on multiple bone models.
To perform this task, the skull must be precisely aligned in all anatomical planes. Therefore, we introduce a 2D/3D technique to align the facial skeleton that was initially developed for automatically calculating the craniofacial symmetry midline. In the 2D version, the concept of using cephalometric landmarks and manual image grid alignment to construct the training dataset was introduced. Then, this concept was extended to a 3D version in which the coronal and transverse planes are aligned using a CNN approach. As the alignment in the sagittal plane is still undefined, a new alignment based on these techniques will be created to align the sagittal plane using the Frankfort plane as a framework. Third, the resonant frequencies of multiple skulls are assessed to determine how the skull resonant frequency vibrations propagate into the brain tissue. After applying material properties and a mesh to the skull, modal analysis is performed to assess the skull natural frequencies. Finally, theories will be raised regarding the relation between skull geometry (such as shape and thickness), vibration, and brain tissue injury, which may result in concussive injury.

    Machine learning for managing structured and semi-structured data

    As the digitalization of private, commercial, and public sectors advances rapidly, an increasing amount of data is becoming available. In order to gain insights or knowledge from these enormous amounts of raw data, a deep analysis is essential. The immense volume requires highly automated processes with minimal manual interaction. In recent years, machine learning methods have taken on a central role in this task. In addition to the individual data points, their interrelationships often play a decisive role, e.g. whether two patients are related to each other or whether they are treated by the same physician. Hence, relational learning is an important branch of research, which studies how to harness this explicitly available structural information between different data points. Recently, graph neural networks have gained importance. These can be considered an extension of convolutional neural networks from regular grids to general (irregular) graphs. Knowledge graphs play an essential role in representing facts about entities in a machine-readable way. While great efforts are made to store as many facts as possible in these graphs, they often remain incomplete, i.e., true facts are missing. Manual verification and expansion of the graphs is becoming increasingly difficult due to the large volume of data and must therefore be assisted or substituted by automated procedures which predict missing facts. The field of knowledge graph completion can be roughly divided into two categories: Link Prediction and Entity Alignment. In Link Prediction, machine learning models are trained to predict unknown facts between entities based on the known facts. Entity Alignment aims at identifying shared entities between graphs in order to link several such knowledge graphs based on some provided seed alignment pairs. In this thesis, we present important advances in the field of knowledge graph completion. 
For Entity Alignment, we show how to reduce the number of required seed alignments while maintaining performance by means of novel active learning techniques. We also discuss the power of textual features and show that graph-neural-network-based methods have difficulties with noisy alignment data. For Link Prediction, we demonstrate how to improve the prediction for entities unknown at training time by exploiting additional metadata on individual statements, often available in modern graphs. Supported by results from a large-scale experimental study, we present an analysis of the effect of individual components of machine learning models, e.g., the interaction function or loss criterion, on the task of link prediction. We also introduce a software library that simplifies the implementation and study of such components and makes them accessible to a wide research community, ranging from relational learning researchers to applied fields, such as life sciences. Finally, we propose a novel metric for evaluating ranking results, as used for both completion tasks. It allows for easier interpretation and comparison, especially in cases with different numbers of ranking candidates, as encountered in the de-facto standard evaluation protocols for both tasks.

    Unraveling the effect of sex on human genetic architecture

    Sex is arguably the most important differentiating characteristic in most mammalian species, separating populations into different groups, with varying behaviors, morphologies, and physiologies based on their complement of sex chromosomes, amongst other factors. In humans, despite males and females sharing nearly identical genomes, there are differences between the sexes in complex traits and in the risk of a wide array of diseases. Sex provides the genome with a distinct hormonal milieu, differential gene expression, and environmental pressures arising from societal gender roles. This raises the possibility of gene by sex (GxS) interactions that may contribute to some of the phenotypic differences observed. In recent years, there has been growing evidence of GxS, with common genetic variation having different effects in males and females. These studies have, however, been limited with regard to the number of traits studied and/or statistical power. Understanding sex differences in genetic architecture is of great importance, as this could lead to improved understanding of potential differences in underlying biological pathways and disease etiology between the sexes and in turn help inform personalised treatments and precision medicine. In this thesis we provide insights into both the scope and mechanism of GxS across the genome of circa 450,000 individuals of European ancestry and 530 complex traits in the UK Biobank. We found small yet widespread differences in genetic architecture across traits through the calculation of sex-specific heritability, genetic correlations, and sex-stratified genome-wide association studies (GWAS). We further investigated whether sex-agnostic (non-stratified) efforts could potentially be missing information of interest, including sex-specific trait-relevant loci and increased phenotype prediction accuracies.
Finally, we studied the potential functional role of sex differences in genetic architecture through sex-biased expression quantitative trait loci (eQTL) and gene-level analyses. Overall, this study marks a broad examination of the genetics of sex differences. Our findings parallel previous reports, suggesting the presence of sexual genetic heterogeneity, generally of modest magnitude, across complex traits. Furthermore, our results suggest the need to consider sex-stratified analyses in future studies in order to shed light on possible sex-specific molecular mechanisms.

    A game theory approach for estimating reliability of crowdsourced relevance assessments

    In this article, we propose an approach to improve quality in crowdsourcing (CS) tasks using Task Completion Time (TCT) as a source of information about the reliability of workers in a game-theoretical competitive scenario. Our approach is based on the hypothesis that some workers are more risk-inclined and tend to gamble with their use of time when made to compete with other workers. This hypothesis is supported by our previous simulation study. We test our approach with 35 topics from experiments on the TREC-8 collection being assessed as relevant or non-relevant by crowdsourced workers both in a competitive (referred to as "Game") and non-competitive (referred to as "Base") scenario. We find that competition changes the distributions of TCT, making them sensitive to the quality (i.e., wrong or right) and outcome (i.e., relevant or non-relevant) of the assessments. We also test an optimal function of TCT as weights in a weighted majority voting scheme. From probabilistic considerations, we derive a theoretical upper bound for the weighted majority performance of cohorts of 2, 3, 4, and 5 workers, which we use as a criterion to evaluate the performance of our weighting scheme. We find that our approach achieves a remarkable performance, significantly closing the gap between the accuracy of the obtained relevance judgements and the upper bound. Since our approach takes advantage of TCT, which is an available quantity in any CS task, we believe it is cost-effective and, therefore, can be applied for quality assurance in crowdsourcing for micro-tasks.
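    The weighted majority voting scheme mentioned in this abstract can be sketched as follows. The abstract does not state the weighting function, so `tct_weight` below is a placeholder assumption (a saturating function of completion time with a hypothetical `scale` parameter), not the optimal function derived in the article; only the voting step itself is generic.

```python
from typing import Sequence


def tct_weight(tct_seconds: float, scale: float = 30.0) -> float:
    """Placeholder weight in [0, 1) that grows with task completion time.

    Assumption for illustration only: the article's actual function of TCT
    is not given in the abstract.
    """
    if tct_seconds < 0:
        raise ValueError("task completion time must be non-negative")
    return tct_seconds / (tct_seconds + scale)


def weighted_majority(votes: Sequence[bool], weights: Sequence[float]) -> bool:
    """Return True (relevant) if the weighted votes for 'relevant' outweigh
    those for 'non-relevant'. Ties resolve to non-relevant."""
    if len(votes) != len(weights):
        raise ValueError("votes and weights must have the same length")
    score = sum(w if v else -w for v, w in zip(votes, weights))
    return score > 0


# Three workers judge one document; times are in seconds (made-up values).
votes = [True, False, False]
tcts = [90.0, 5.0, 5.0]
weights = [tct_weight(t) for t in tcts]
print(weighted_majority(votes, weights))        # weighted decision
print(weighted_majority(votes, [1.0] * 3))      # plain majority for comparison
```

    With equal weights the scheme reduces to simple majority voting, so any gain from the weights comes entirely from how well the chosen function of TCT tracks worker reliability.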