20 research outputs found

    Special Properties of Gradient Descent with Large Learning Rates

    Get PDF
    When training neural networks, it has been widely observed that a large step size is essential in stochastic gradient descent (SGD) for obtaining superior models. However, the effect of large step sizes on the success of SGD is not well understood theoretically. Several previous works have attributed this success to the stochastic noise present in SGD. However, we show through a novel set of experiments that the stochastic noise is not sufficient to explain good non-convex training, and that instead the effect of a large learning rate itself is essential for obtaining best performance.We demonstrate the same effects also in the noise-less case, i.e. for full-batch GD. We formally prove that GD with large step size -- on certain non-convex function classes -- follows a different trajectory than GD with a small step size, which can lead to convergence to a global minimum instead of a local one. Our settings provide a framework for future analysis which allows comparing algorithms based on behaviors that can not be observed in the traditional settings.Comment: A short version of this work appeared in ICML 22 ICML Workshop on Continuous Time Methods for Machine Learning under the title "The Gap Between Continuous and Discrete Gradient Descent

    Development and Testing of a Fractal Analysis Algorithm for Face Recognition

    Get PDF
    Following an earlier development for fingerprints by Deal (1) and Stoffa (2), it was suggested that this algorithm may work on faces (or more precisely, face images). First, this work transformed a 2-D electronic image file of a human face into a numeric system via a similar random walk process by Deal and Stoffa. Second, the numeric system was analyzed, and the numeric system may then be tested against a database of similarly converted images. The testing determined whether the subject of the image is part of the database. Finally, the efficiency, quickness, and accuracy of such an algorithm were tested and conclusions about the general effectiveness were made.;The algorithm employed a Random Walk analysis of digital photographs of human faces for a fixed number of binary images which were generated from the source photograph using a Boolean conversion scheme. The Random Walk generated a series of transition probabilities for a particular scale. In short, the numeric system used to describe the face will consisted of two dimensions of data---scale and binary image. The numeric system for a particular source photograph was tested against a database of similarly constructed systems to determine whether the subject of the source photograph was in the database.;For the purpose of this work, a database of 400 images was constructed from 167 individual subjects using the FERET database. The 400 images where then analyzed, and tested against the database to determine whether the algorithm could find the subjects in the database. The algorithm was able, in its best configuration, to identify correctly the subjects of 168 of the 400 photographs. However, the total time to run an image (after capture by a digital camera) to database comparison was only 62 seconds, which represents a substantial improvement over previous systems

    Tiny Machine Learning Environment: Enabling Intelligence on Constrained Devices

    Get PDF
    Running machine learning algorithms (ML) on constrained devices at the extreme edge of the network is problematic due to the computational overhead of ML algorithms, available resources on the embedded platform, and application budget (i.e., real-time requirements, power constraints, etc.). This required the development of specific solutions and development tools for what is now referred to as TinyML. In this dissertation, we focus on improving the deployment and performance of TinyML applications, taking into consideration the aforementioned challenges, especially memory requirements. This dissertation contributed to the construction of the Edge Learning Machine environment (ELM), a platform-independent open-source framework that provides three main TinyML services, namely shallow ML, self-supervised ML, and binary deep learning on constrained devices. In this context, this work includes the following steps, which are reflected in the thesis structure. First, we present the performance analysis of state-of-the-art shallow ML algorithms including dense neural networks, implemented on mainstream microcontrollers. The comprehensive analysis in terms of algorithms, hardware platforms, datasets, preprocessing techniques, and configurations shows similar performance results compared to a desktop machine and highlights the impact of these factors on overall performance. Second, despite the assumption that TinyML only permits models inference provided by the scarcity of resources, we have gone a step further and enabled self-supervised on-device training on microcontrollers and tiny IoT devices by developing the Autonomous Edge Pipeline (AEP) system. AEP achieves comparable accuracy compared to the typical TinyML paradigm, i.e., models trained on resource-abundant devices and then deployed on microcontrollers. Next, we present the development of a memory allocation strategy for convolutional neural networks (CNNs) layers, that optimizes memory requirements. This approach reduces the memory footprint without affecting accuracy nor latency. Moreover, e-skin systems share the main requirements of the TinyML fields: enabling intelligence with low memory, low power consumption, and low latency. Therefore, we designed an efficient Tiny CNN architecture for e-skin applications. The architecture leverages the memory allocation strategy presented earlier and provides better performance than existing solutions. A major contribution of the thesis is given by CBin-NN, a library of functions for implementing extremely efficient binary neural networks on constrained devices. The library outperforms state of the art NN deployment solutions by drastically reducing memory footprint and inference latency. All the solutions proposed in this thesis have been implemented on representative devices and tested in relevant applications, of which results are reported and discussed. The ELM framework is open source, and this work is clearly becoming a useful, versatile toolkit for the IoT and TinyML research and development community

    Provable security for lightweight message authentication and encryption

    Full text link
    The birthday bound often limits the security of a cryptographic scheme to half of the block size or internal state size. This implies that cryptographic schemes require a block size or internal state size that is twice the security level, resulting in larger and more resource-intensive designs. In this thesis, we introduce abstract constructions for message authentication codes and stream ciphers that we demonstrate to be secure beyond the birthday bound. Our message authentication codes were inspired by previous work, specifically the message authentication code EWCDM by Cogliati and Seurin, as well as the work by Mennink and Neves, which demonstrates easy proofs of security for the sum of permutations and an improved bound for EWCDM. We enhance the sum of permutations by incorporating a hash value and a nonce in our stateful design, and in our stateless design, we utilize two hash values. One advantage over EWCDM is that the permutation calls, or block cipher calls, can be parallelized, whereas in EWCDM they must be performed sequentially. We demonstrate that our constructions provide a security level of 2n/3 bits in the nonce-respecting setting. Subsequently, this bound was further improved to 3n/4 bits of security. Additionally, it was later discovered that security degrades gracefully with nonce repetitions, unlike EWCDM, where the security drops to the birthday bound with a single nonce repetition. Contemporary stream cipher designs aim to minimize the hardware module's resource requirements by incorporating an externally available resource, all while maintaining a high level of security. The security level is typically measured in relation to the size of the volatile internal state, i.e., the state cells within the cipher's hardware module. Several designs have been proposed that continuously access the externally available non-volatile secret key during keystream generation. However, there exists a generic distinguishing attack with birthday bound complexity. We propose schemes that continuously access the externally available non-volatile initial value. For all constructions, conventional or contemporary, we provide proofs of security against generic attacks in the random oracle model. Notably, stream ciphers that use the non-volatile initial value during keystream generation offer security beyond the birthday bound. Based on these findings, we propose a new stream cipher design called DRACO

    Genomic variation detection using dynamic programming methods

    Get PDF
    Thesis advisor: Gabor T. MarthBackground: Due to the rapid development and application of next generation sequencing (NGS) techniques, large amounts of NGS data have become available for genome-related biological research, such as population genetics, evolutionary research, and genome wide association studies. A crucial step of these genome-related studies is the detection of genomic variation between different species and individuals. Current approaches for the detection of genomic variation can be classified into alignment-based variation detection and assembly-based variation detection. Due to the limitation of current NGS read length, alignment-based variation detection remains the mainstream approach. The Smith-Waterman algorithm, which produces the optimal pairwise alignment between two sequences, is frequently used as a key component of fast heuristic read mapping and variation detection tools for next-generation sequencing data. Though various fast Smith-Waterman implementations are developed, they are either designed as monolithic protein database searching tools, which do not return detailed alignment, or they are embedded into other tools. These issues make reusing these efficient Smith-Waterman implementations impractical. After the alignment step in the traditional variation detection pipeline, the afterward variation detection using pileup data and the Bayesian model is also facing great challenges especially from low-complexity genomic regions. Sequencing errors and misalignment problems still influence variation detection (especially INDEL detection) a lot. The accuracy of genomic variation detection still needs to be improved, especially when we work on low- complexity genomic regions and low-quality sequencing data. Results: To facilitate easy integration of the fast Single-Instruction-Multiple-Data Smith-Waterman algorithm into third-party software, we wrote a C/C++ library, which extends Farrar's Striped Smith-Waterman (SSW) to return alignment information in addition to the optimal Smith-Waterman score. In this library we developed a new method to generate the full optimal alignment results and a suboptimal score in linear space at little cost of efficiency. This improvement makes the fast Single-Instruction-Multiple-Data Smith-Waterman become really useful in genomic applications. SSW is available both as a C/C++ software library, as well as a stand-alone alignment tool at: https://github.com/mengyao/Complete- Striped-Smith-Waterman-Library. The SSW library has been used in the primary read mapping tool MOSAIK, the split-read mapping program SCISSORS, the MEI detector TAN- GRAM, and the read-overlap graph generation program RZMBLR. The speeds of the mentioned software are improved significantly by replacing their ordinary Smith-Waterman or banded Smith-Waterman module with the SSW Library. To improve the accuracy of genomic variation detection, especially in low-complexity genomic regions and on low-quality sequencing data, we developed PHV, a genomic variation detection tool based on the profile hidden Markov model. PHV also demonstrates a novel PHMM application in the genomic research field. The banded PHMM algorithms used in PHV make it a very fast whole-genome variation detection tool based on the HMM method. The comparison of PHV to GATK, Samtools and Freebayes for detecting variation from both simulated data and real data shows PHV has good potential for dealing with sequencing errors and misalignments. PHV also successfully detects a 49 bp long deletion that is totally misaligned by the mapping tool, and neglected by GATK and Samtools. Conclusion: The efforts made in this thesis are very meaningful for methodology development in studies of genomic variation detection. The two novel algorithms stated here will also inspire future work in NGS data analysis.Thesis (PhD) — Boston College, 2014.Submitted to: Boston College. Graduate School of Arts and Sciences.Discipline: Biology

    Conformance checking and performance improvement in scheduled processes: A queueing-network perspective

    No full text
    Service processes, for example in transportation, telecommunications or the health sector, are the backbone of today's economies. Conceptual models of service processes enable operational analysis that supports, e.g., resource provisioning or delay prediction. In the presence of event logs containing recorded traces of process execution, such operational models can be mined automatically.In this work, we target the analysis of resource-driven, scheduled processes based on event logs. We focus on processes for which there exists a pre-defined assignment of activity instances to resources that execute activities. Specifically, we approach the questions of conformance checking (how to assess the conformance of the schedule and the actual process execution) and performance improvement (how to improve the operational process performance). The first question is addressed based on a queueing network for both the schedule and the actual process execution. Based on these models, we detect operational deviations and then apply statistical inference and similarity measures to validate the scheduling assumptions, thereby identifying root-causes for these deviations. These results are the starting point for our technique to improve the operational performance. It suggests adaptations of the scheduling policy of the service process to decrease the tardiness (non-punctuality) and lower the flow time. We demonstrate the value of our approach based on a real-world dataset comprising clinical pathways of an outpatient clinic that have been recorded by a real-time location system (RTLS). Our results indicate that the presented technique enables localization of operational bottlenecks along with their root-causes, while our improvement technique yields a decrease in median tardiness and flow time by more than 20%

    Online Learning for the Control of Human Standing via Spinal Cord Stimulation

    Get PDF
    Many applications in recommender systems or experimental design need to make decisions online. Each decision leads to a stochastic reward with initially unknown distribution, while new decisions are made based on the observations of previous rewards. To maximize the total reward, one needs to balance between exploring different strategies and exploiting currently optimal strategies within a given set of strategies. This is the underlying trade-off of a number of clinical neural engineering problems, including brain-computer interface, deep brain stimulation, and spinal cord injury therapy. In these systems, complex electronic and computational systems interact with the human central nervous system. A critical issue is how to control the agents to produce results which are optimal under some measure, for example, efficiently decoding the user's intention in a brain-computer interface or performs temporal and spatial specific stimulation in deep brain stimulation. This dissertation is motivated by electrical sipnal cord stimulation with high dimensional inputs(multi-electrode arrays). The stimulation is applied to promote the function and rehabilitation of the remaining neural circuitry below the spinal cord injury, and enable complex motor behaviors such as stepping and standing. To enable the careful tuning of these stimuli for each patient, the electrode arrays which deliver these stimuli have become increasingly more sophisticated, with a corresponding increase in the number of free parameters over which the stimuli need to be optimized. Since the number of stimuli is growing exponentially with the number of electrodes, algorithmic methods of selecting stimuli is necessary, particularly when the feedback is expensive to get. In many online learning settings, particularly those that involve human feedback, reliable feedback is often limited to pairwise preferences instead of real valued feedback. Examples include implicit or subjective feedback for information retrieval and recommender systems, such as clicks on search results, and subjective feedback on the quality of recommended care. Sometimes with real valued feedback, we require that the sampled function values exceed some prespecified ``safety'' threshold, a requirement that existing algorithms fail to meet. Examples include medical applications where the patients' comfort must be guaranteed; recommender systems aiming to avoid user dissatisfaction; and robotic control, where one seeks to avoid controls that cause physical harm to the platform. This dissertation provides online learning algorithms for several specific online decision-making problems. \selfsparring optimizes the cumulative reward with relative feedback. RankComparison deals with ranking feedback. \safeopt considers the optimization with real valued feedback and safety constraints. \cduel is designed for specific spinal cord injury therapy. A variant of \cduel was implemented in closed-loop human experiments, controlling which epidural stimulating electrodes are used in the spinal cord of SCI patients. The results obtained are compared with concurrent stimulus tuning carried out by human experimenter. These experiments show that this algorithm is at least as effective as the human experimenter, suggesting that this algorithm can be applied to the more challenging problems of enabling and optimizing complex, sensory-dependent behaviors, such as stepping and standing in SCI patients. In order to get reliable quantitative measurements besides comparisons, the standing behaviors of paralyzed patients under spinal cord stimulation are evaluated. The potential of quantifying the quality of bipedal standing in an automatic approach is also shown in this work.</p

    Alzheimer’s Dementia Recognition Through Spontaneous Speech

    Get PDF

    Tiny Machine Learning Environment: Enabling Intelligence on Constrained Devices

    Get PDF
    Running machine learning algorithms (ML) on constrained devices at the extreme edge of the network is problematic due to the computational overhead of ML algorithms, available resources on the embedded platform, and application budget (i.e., real-time requirements, power constraints, etc.). This required the development of specific solutions and development tools for what is now referred to as TinyML. In this dissertation, we focus on improving the deployment and performance of TinyML applications, taking into consideration the aforementioned challenges, especially memory requirements. This dissertation contributed to the construction of the Edge Learning Machine environment (ELM), a platform-independent open source framework that provides three main TinyML services, namely shallow ML, self-supervised ML, and binary deep learning on constrained devices. In this context, this work includes the following steps, which are reflected in the thesis structure. First, we present the performance analysis of state of the art shallow ML algorithms including dense neural networks, implemented on mainstream microcontrollers. The comprehensive analysis in terms of algorithms, hardware platforms, datasets, pre-processing techniques, and configurations shows similar performance results compared to a desktop machine and highlights the impact of these factors on overall performance. Second, despite the assumption that TinyML only permits models inference provided by the scarcity of resources, we have gone a step further and enabled self-supervised on-device training on microcontrollers and tiny IoT devices by developing the Autonomous Edge Pipeline (AEP) system. AEP achieves comparable accuracy compared to the typical TinyML paradigm, i.e., models trained on resource-abundant devices and then deployed on microcontrollers. Next, we present the development of a memory allocation strategy for convolutional neural networks (CNNs) layers, that optimizes memory requirements. This approach reduces the memory footprint without affecting accuracy nor latency. Moreover, e-skin systems share the main requirements of the TinyML fields: enabling intelligence with low memory, low power consumption, and low latency. Therefore, we designed an efficient Tiny CNN architecture for e-skin applications. The architecture leverages the memory allocation strategy presented earlier and provides better performance than existing solutions. A major contribution of the thesis is given by CBin-NN, a library of functions for implementing extremely efficient binary neural networks on constrained devices. The library outperforms state of the art NN deployment solutions by drastically reducing memory footprint and inference latency. All the solutions proposed in this thesis have been implemented on representative devices and tested in relevant applications, of which results are reported and discussed. The ELM framework is open source, and this work is clearly becoming a useful, versatile toolkit for the IoT and TinyML research and development community
    corecore