
    Multivariate Modeling of Quasar Variability with an Attention-based Variational Autoencoder

    This thesis applied HeTVAE, an attention-based VAE neural network capable of multivariate modeling of time series, to a dataset of several thousand multi-band AGN light curves from ZTF, in one of the first attempts to use a neural network to model these stochastic light curves in their multivariate form. Whereas standard models of AGN variability make prior assumptions, HeTVAE uses no prior knowledge: it learns the data distribution in a regularized latent space, capturing semantic information via its self-supervised training regimen. We created a dataset class that preprocesses the irregular multivariate time series and interfaces conveniently with the quasi-off-the-shelf network. We also trained several model iterations using one, two, or all three of the ZTF filter dimensions on Durham's NCC compute cluster, configuring hyperparameter choices that work robustly for the astronomical dataset. For training, we employed the Adam optimizer with a reduce-on-plateau learning rate schedule and a KL-annealing schedule to optimize the VAE's performance. In our experiments, we show that the VAE has learned the data distribution of the light curves by generating simulated light curves, and we demonstrate its interpretability by visualizing attention scores and by using PCA to visualize how the light curves are distributed across the continuous latent space. In this condensed space, the model orders the light curves along a smooth gradient: from those with low-amplitude short-term variation and high-amplitude long-term variation, to those with little variability, to those with high-amplitude variation on both short and long timescales. We also use PCA to demonstrate a potential filtering algorithm that enables intuitive parsing of large datasets, and we discuss some pitfalls of algorithmic bias in anomaly detection. Finally, we fine-tuned the structurally correct but imprecise multivariate interpolations output by HeTVAE on three objects to show how they could improve constraints on time-delay estimates for reverberation mapping with the relatively poorly cadenced ZTF data. In short, HeTVAE's use cases are wide-ranging, and it is a step toward organizing and processing the millions of AGN light curves incoming from the Vera C. Rubin Observatory's Legacy Survey of Space and Time in their full six-filter optical broadband multivariate form.
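    As an illustration of the training setup named in this abstract (Adam, a reduce-on-plateau learning-rate schedule, and KL annealing), here is a minimal PyTorch-style sketch. The model interface and all hyperparameter values are assumptions for the example, not the thesis's code.

```python
from torch.optim import Adam
from torch.optim.lr_scheduler import ReduceLROnPlateau

def kl_weight(epoch, anneal_epochs=50):
    """Linearly ramp the weight of the KL term from 0 to 1 (KL annealing)."""
    return min(1.0, epoch / anneal_epochs)

def train(model, train_loader, epochs=200):
    # Adam with a reduce-on-plateau learning-rate schedule, as in the
    # abstract; the learning rate, factor, and patience are assumptions.
    opt = Adam(model.parameters(), lr=1e-3)
    sched = ReduceLROnPlateau(opt, mode="min", factor=0.5, patience=10)
    for epoch in range(epochs):
        total = 0.0
        for batch in train_loader:
            # Hypothetical model API: returns reconstruction and KL terms.
            recon_loss, kl_div = model(batch)
            loss = recon_loss + kl_weight(epoch) * kl_div
            opt.zero_grad()
            loss.backward()
            opt.step()
            total += loss.item()
        # Lower the learning rate when the mean epoch loss stops improving.
        sched.step(total / len(train_loader))
```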

    Guided rewriting and constraint satisfaction for parallel GPU code generation

    Graphics Processing Units (GPUs) are notoriously hard to optimise for manually due to their scheduling and memory hierarchies. What is needed are good automatic code generators and optimisers for such parallel hardware. Functional approaches such as Accelerate, Futhark and LIFT leverage a high-level algorithmic Intermediate Representation (IR) to expose parallelism and abstract the implementation details away from the user. However, producing efficient code for a given accelerator remains challenging. Existing code generators depend either on user input to choose among a subset of hard-coded optimisations, or on automated exploration of the implementation search space. The former suffers from a lack of extensibility, while the latter is too costly due to the size of the search space. A hybrid approach is needed, where a space of valid implementations is built automatically and explored with the aid of human expertise. This thesis presents a solution combining user-guided rewriting and automatically generated constraints to produce high-performance code. The first contribution is an automatic tuning technique to find a balance between performance and memory consumption. Leveraging its functional patterns, the LIFT compiler is empowered to infer tuning constraints and limit the search to valid tuning combinations only. Next, the thesis reframes parallelisation as a constraint satisfaction problem. Parallelisation constraints are extracted automatically from the input expression, and a solver is used to identify valid rewritings. The constraints truncate the search space to valid parallel mappings only by capturing the scheduling restrictions of the GPU in the context of a given program. A synchronisation barrier insertion technique is proposed to prevent data races and improve the efficiency of the generated parallel mappings. The final contribution of this thesis is the guided rewriting method, where the user encodes a design space of structural transformations using high-level IR nodes called rewrite points. These strongly typed pragmas express macro rewrites and expose design choices as explorable parameters. The thesis proposes a small set of reusable rewrite points to achieve tiling, cache locality, data reuse and memory optimisation. A comparison with the vendor-provided handwritten kernels of the ARM Compute Library and the TVM code generator demonstrates the effectiveness of this thesis' contributions. With convolution as a use case, LIFT-generated direct and GEMM-based convolution implementations are shown to perform on par with the state-of-the-art solutions on a mobile GPU. Overall, this thesis demonstrates that a functional IR lends itself well to user-guided and automatic rewriting for high-performance code generation.
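    To make the constraint-satisfaction framing concrete, here is a toy Python sketch that enumerates mappings of nested parallel maps to GPU scheduling levels and prunes invalid ones. The rules below are simplified assumptions for illustration; they are not LIFT's actual constraint system.

```python
from itertools import product

LEVELS = ("workgroup", "local", "sequential")

def valid(assignment):
    """Check a mapping of nested maps (outermost first) against simplified
    GPU scheduling rules (assumptions for this sketch, not LIFT's rules):
      - at most one workgroup level per nest
      - local parallelism only nested inside the workgroup level"""
    first_wg = assignment.index("workgroup") if "workgroup" in assignment else None
    for i, level in enumerate(assignment):
        if level == "local" and (first_wg is None or i < first_wg):
            return False
        if level == "workgroup" and i != first_wg:
            return False  # a second workgroup level deeper in the nest
    return True

def solve(depth):
    """Brute-force 'solver': enumerate all valid parallel mappings."""
    return [a for a in product(LEVELS, repeat=depth) if valid(a)]

print(solve(2))  # e.g. ('workgroup', 'local'), ('sequential', 'sequential'), ...
```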

    Fine Tuning Transformer Models for Domain Specific Feature Extraction

    The nature of Natural Language Processing has changed drastically in recent years. The introduction of Large Language Models pre-trained on vast amounts of unlabelled data has opened the door to a new level of text comprehension. This has shifted research in the area toward exploiting these large models to obtain better results on smaller tasks. By fine-tuning the different large language models with context- and task-specific data, these models quickly learn to follow patterns and generalize to new concepts. They understand natural language to a great extent and can capture relationships between words, phrases, and paragraphs. Fine-tuning has become an increasingly important task for simplifying the use of machine learning solutions with few resources. The proliferation of pre-trained transformer models for Natural Language Processing has complicated the selection of and experimentation with these models, increasing research and experimentation time. This study goes through the current state of the art of transformer models and attempts to study the scope and applicability of these models. From this initial work, the paper produces a comprehensive model fine-tuning pipeline that allows the user to easily obtain a ready-to-use model for a natural language task. To test this approach, the pipeline is evaluated on the automatic extraction of features (i.e. functionalities) from mobile applications using available natural language documents, such as descriptions.
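    As a hedged illustration of such a fine-tuning pipeline, the sketch below fine-tunes a Hugging Face transformer to classify whether an app-description sentence mentions a feature. The base model, labels, and toy dataset are assumptions for the example, not the study's actual pipeline.

```python
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "bert-base-uncased"  # assumed base model, not the study's choice
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

class ToyAppSentences(torch.utils.data.Dataset):
    """Tiny invented dataset: does a sentence describe an app feature?"""
    data = [("Share photos with your friends", 1),
            ("Best app of 2023 according to users", 0),
            ("Track your daily step count", 1),
            ("Download now for free", 0)]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, i):
        text, label = self.data[i]
        enc = tokenizer(text, truncation=True, padding="max_length",
                        max_length=32, return_tensors="pt")
        return {"input_ids": enc["input_ids"].squeeze(0),
                "attention_mask": enc["attention_mask"].squeeze(0),
                "labels": torch.tensor(label)}

args = TrainingArguments(output_dir="feature-clf", num_train_epochs=1,
                         per_device_train_batch_size=2, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=ToyAppSentences()).train()
```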

    Simulating substrate binding sites in the S. aureus Type II NADH Dehydrogenase

    "Type II NADH Oxidoreductase (NDH-2) from Staphylococcus aureus was established as a therapeutic target against the virulency of this bacterium and an alternative to treat Complex I-derived diseases. To accurately model interactions of NDH-2 with its substrates such as menaquinones and NADH, Coarse-Grain (CG) simulations were employed. "N/

    Classification of Corn Seed Quality Using Convolutional Neural Network with Region Proposal and Data Augmentation

    Corn is an agricultural commodity essential to human food and animal feed. All components of corn can be utilized for human benefit. One supporting component is the quality of corn seeds, since seeds from specific sources have the physiological properties needed to survive. The problem is how to obtain information on the quality of corn seeds at agricultural sites, where such information currently comes from direct visual observation. This research seeks a solution for classifying corn kernels with high accuracy using a convolutional neural network, which performs in-depth feature learning. The drawback of convolutional neural networks is that the training process takes a long time, depending on the number of layers in the architecture. The research contribution is the addition of a Convex Hull step: this method finds edge points on an object and forms a polygon enclosing those points. It sharpens the focus of the convolution operations by removing the image background. The 34-layer architecture maintains the feature maps and uses dropout layers to save computation time. The dataset used is primary data with six classes: AR21, Pioner_P35, BISI_18, NK212, Pertiwi, and Betras1. Data augmentation techniques are applied to overcome data limitations and prevent overfitting. The classification of corn kernels yielded a model with an average accuracy of 99.33%, precision of 99.33%, recall of 99.33%, and F1 score of 99.36%. The training time to obtain the model was 2 minutes 30 seconds. The average error values are an MSE of 0.0125, an RMSE of 0.118, and an MAE of 0.0108. Testing on experimental data yields accuracies ranging from 77% to 99%. In conclusion, using the region proposal improves accuracy by about 0.3%, because focusing on the object helps the convolution process.
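    The convex-hull preprocessing described above can be sketched in a few lines of OpenCV. The thresholding choices below are assumptions for illustration, not the paper's exact procedure: the hull of the largest contour is filled as a mask and the background is removed before the image is passed to the network.

```python
import cv2
import numpy as np

def hull_crop(image_bgr):
    """Mask out everything outside the convex hull of the largest object."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding to separate seed from background (assumed choice).
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return image_bgr
    largest = max(contours, key=cv2.contourArea)
    hull = cv2.convexHull(largest)  # polygon enclosing the edge points
    mask = np.zeros(gray.shape, dtype=np.uint8)
    cv2.fillPoly(mask, [hull], 255)
    # Black out the background so convolution focuses on the seed.
    return cv2.bitwise_and(image_bgr, image_bgr, mask=mask)
```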

    Automated detection of tumoural cells with graph neural networks

    The detection of tumoural cells from whole slide images is an essential task in medical diagnosis and research. In this thesis, we propose and analyse a novel approach that combines computer vision-based models with graph neural networks to improve the accuracy of automated tumoural cell detection. Our proposal leverages the inherent structure of, and relationships between, cells in the tissue. Experimental results on our own curated dataset show that several different metrics improve by up to 15% compared to using the computer vision approach alone. The method has been shown to work with H&E-stained lung tissue and HER2-stained breast tissue. We believe that our proposed method has the potential to improve the accuracy of automated tumoural cell detection, which can lead to accelerated diagnosis and research in the field by reducing the workload of histopathologists.
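    A minimal sketch of the graph half of such a pipeline, assuming PyTorch Geometric: per-cell feature vectors from a vision model become node features, nearby cells are joined by edges, and a small GCN produces per-cell tumour/non-tumour logits. The layer sizes and toy graph are invented for illustration.

```python
import torch
from torch_geometric.nn import GCNConv

class CellGraphNet(torch.nn.Module):
    """Two-layer GCN refining per-cell predictions from vision features."""
    def __init__(self, in_dim=256, hidden=64, classes=2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, classes)

    def forward(self, x, edge_index):
        # x: [num_cells, in_dim] vision embeddings;
        # edge_index: [2, num_edges] links between neighbouring cells.
        h = torch.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)

# Toy usage: 4 cells connected in a ring.
x = torch.randn(4, 256)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
logits = CellGraphNet()(x, edge_index)  # [4, 2] per-cell class logits
```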

    Big Data - Supply Chain Management Framework for Forecasting: Data Preprocessing and Machine Learning Techniques

    This article systematically identifies and comparatively analyzes state-of-the-art supply chain (SC) forecasting strategies and technologies. A novel framework is proposed that incorporates Big Data Analytics into SC Management (problem identification, data sources, exploratory data analysis, machine-learning model training, hyperparameter tuning, performance evaluation, and optimization), together with forecasting effects on the human workforce, inventory, and the overall SC. Initially, the need to collect data according to SC strategy, and how to collect them, is discussed. The article then discusses the need for different types of forecasting according to the time horizon or SC objective. SC KPIs and error-measurement systems are recommended for optimizing the top-performing model. The adverse effects of phantom inventory on forecasting are illustrated, as is the dependence of managerial decisions on the SC KPIs for determining model performance parameters and improving operations management, transparency, and planning efficiency. The cyclic connection within the framework introduces preprocessing optimization based on the post-process KPIs, optimizing the overall control process (inventory management, workforce determination, cost, production, and capacity planning). The contribution of this research lies in the proposed standard SC process framework, the recommended forecasting data analysis, the analysis of forecasting effects on SC performance, the machine-learning algorithm optimization followed, and in shedding light on future research.
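    As a small illustration of the train/tune/evaluate loop such a framework describes, the sketch below fits a gradient-boosted forecaster on synthetic weekly demand with lag features, tunes hyperparameters with time-series cross-validation, and reports MAPE on a hold-out period. The data, model choice, and grid are assumptions, not the article's benchmark.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic weekly demand with yearly seasonality and noise.
rng = np.random.default_rng(0)
weeks = np.arange(200)
demand = 50 + 10 * np.sin(2 * np.pi * weeks / 52) + rng.normal(0, 2, 200)

# Lag features: predict this week's demand from the previous four weeks.
X = np.column_stack([demand[i:-4 + i] for i in range(4)])
y = demand[4:]

search = GridSearchCV(
    GradientBoostingRegressor(),
    {"n_estimators": [100, 300], "max_depth": [2, 3]},
    cv=TimeSeriesSplit(n_splits=4),  # folds respect temporal order
    scoring="neg_root_mean_squared_error",
)
search.fit(X[:160], y[:160])         # hold out the last 36 weeks

pred = search.predict(X[160:])
print("MAPE on hold-out:", mean_absolute_percentage_error(y[160:], pred))
```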

    Behavior quantification as the missing link between fields: Tools for digital psychiatry and their role in the future of neurobiology

    The great behavioral heterogeneity observed between individuals with the same psychiatric disorder, and even within one individual over time, complicates both clinical practice and biomedical research. However, modern technologies present an exciting opportunity to improve behavioral characterization. Data from existing psychiatry methods that are qualitative or unscalable, such as patient surveys or clinical interviews, can now be collected at much greater capacity and analyzed to produce new quantitative measures. Furthermore, recent capabilities for continuous collection of passive sensor streams, such as phone GPS or smartwatch accelerometry, open avenues of novel questioning that were previously entirely unrealistic. Their temporally dense nature enables a cohesive study of real-time neural and behavioral signals. To develop comprehensive neurobiological models of psychiatric disease, it will be critical to first develop strong methods for behavioral quantification. There is huge potential in what can theoretically be captured by current technologies, but this in itself presents a large computational challenge, one that will necessitate new data processing tools, new machine learning techniques, and ultimately a shift in how interdisciplinary work is conducted. In my thesis, I detail research projects that take different perspectives on digital psychiatry, subsequently tying the ideas together with a concluding discussion on the future of the field. I also provide software infrastructure where relevant, with extensive documentation. Major contributions include scientific arguments and proof-of-concept results for daily free-form audio journals as an underappreciated psychiatry research datatype, as well as novel stability theorems and pilot empirical success for a proposed multi-area recurrent neural network architecture.
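    As a toy example of the behavioral quantification this abstract argues for, the sketch below reduces a synthetic day of phone GPS samples to a single quantitative feature, distance travelled. Real pipelines would additionally handle sampling gaps, noise, and privacy; nothing here is from the thesis.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between coordinate arrays (degrees)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * np.arcsin(np.sqrt(a))

# One synthetic day of GPS samples: a random walk drifting from a start point.
lats = 42.36 + np.cumsum(np.random.default_rng(1).normal(0, 1e-4, 100))
lons = -71.06 + np.cumsum(np.random.default_rng(2).normal(0, 1e-4, 100))

# Behavioral feature: total distance travelled over the day.
daily_km = haversine_km(lats[:-1], lons[:-1], lats[1:], lons[1:]).sum()
print(f"distance travelled: {daily_km:.2f} km")
```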

    Data simulation in deep learning-based human recognition

    Human recognition is an important part of perception systems, such as those used in autonomous vehicles or robots. These systems often use deep neural networks for this purpose, which rely on large amounts of data that ideally cover various situations, movements, visual appearances, and interactions. However, obtaining such data is typically complex and expensive. In addition to raw data, labels are required to create training data for supervised learning, so manual annotation of bounding boxes, keypoints, orientations, or performed actions is frequently necessary. This work addresses whether this laborious acquisition and creation of data can be simplified through targeted simulation. If data are generated in a simulation, information such as positions, dimensions, orientations, surfaces, and occlusions is already known, and appropriate labels can be generated automatically. A key question is whether deep neural networks trained with simulated data can be applied to real data. This work explores the use of simulated training data with examples from the field of pedestrian detection for autonomous vehicles. On the one hand, it shows how existing systems can be improved by targeted retraining with simulation data, for example to better recognize corner cases. On the other hand, it focuses on generating data that rarely or never occur in real standard datasets. It demonstrates how training data with finely graded action labels can be generated through the targeted acquisition and combination of motion data and 3D models, enabling the recognition of even complex pedestrian situations. Through the diverse annotation data that simulations provide, it becomes possible to train deep neural networks for a wide variety of tasks with a single dataset. In this work, such simulated data are used to train a novel deep multitask network that unifies diverse, previously mostly independently considered but related tasks, such as 2D and 3D human pose recognition and body orientation estimation.
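    A minimal sketch of why simulation makes label creation automatic: given assumed camera intrinsics and a simulated pedestrian's known 3D keypoints, a 2D bounding-box label falls out of a projection rather than manual annotation. All values below are illustrative.

```python
import numpy as np

# Assumed pinhole camera intrinsics (focal lengths and principal point).
K = np.array([[1000, 0, 640],
              [0, 1000, 360],
              [0, 0, 1.0]])

def project(points_3d):
    """Project Nx3 camera-space points to Nx2 pixel coordinates."""
    uvw = points_3d @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

# A simulated pedestrian's keypoints, 10 m in front of the camera.
keypoints = np.array([[0.0, -0.9, 10], [0.2, 0.0, 10],
                      [-0.2, 0.0, 10], [0.0, 0.9, 10]])

px = project(keypoints)
x_min, y_min = px.min(axis=0)
x_max, y_max = px.max(axis=0)
print("auto-generated bounding box:", (x_min, y_min, x_max, y_max))
```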