24 research outputs found

    2022 GREAT Day Program

    SUNY Geneseo’s Sixteenth Annual GREAT Day. https://knightscholar.geneseo.edu/program-2007/1016/thumbnail.jp

    Adaptive Automated Machine Learning

    The ever-growing demand for machine learning has led to the development of automated machine learning (AutoML) systems that can be used off the shelf by non-experts. Further, the demand for ML applications with high predictive performance exceeds the number of machine learning experts and makes the development of AutoML systems necessary. Automated Machine Learning tackles the problem of finding machine learning models with high predictive performance. Existing approaches incorporating deep learning techniques assume that all data is available at the beginning of the training process (offline learning). They configure and optimise a pipeline of preprocessing, feature engineering, and model selection by choosing suitable hyperparameters in each model pipeline step. Furthermore, they assume that the user is fully aware of the choice and, thus, the consequences of the underlying metric (such as precision, recall, or F1-measure). By variation of this metric, the search for suitable configurations and thus the adaptation of algorithms can be tailored to the user’s needs. With the creation of a vast amount of data from all kinds of sources every day, processing and understanding these data sets in a single batch is no longer viable. By training machine learning models incrementally (i.e., online learning), the flood of data can be processed sequentially within data streams. However, if one assumes an online learning scenario, where an AutoML instance executes on evolving data streams, the question of the best model and its configuration remains open. In this work, we address the adaptation of AutoML in an offline learning scenario toward a certain utility an end-user might pursue as well as the adaptation of AutoML towards evolving data streams in an online learning scenario with three main contributions: 1. We propose a system that allows the adaptation of AutoML and the search for neural architectures towards a particular utility an end-user might pursue. 2. We introduce an online deep learning framework that fosters the research of deep learning models under the online learning assumption and enables the automated search for neural architectures. 3. We introduce an online AutoML framework that allows the incremental adaptation of ML models. We evaluate the contributions individually, in accordance with predefined requirements and against state-of-the-art evaluation setups. The outcomes lead us to conclude that (i) AutoML, as well as systems for neural architecture search, can be steered towards individual utilities by learning a designated ranking model from pairwise preferences and using the latter as the target function for the offline learning scenario; (ii) architecturally small neural networks are in general suitable in an online learning scenario; (iii) the configuration of machine learning pipelines can be automatically adapted to ever-evolving data streams, leading to better performance.
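
The incremental (online) training that this abstract contrasts with offline batch learning can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the class `OnlineLogisticRegression`, the helper `synthetic_stream`, and the test-then-train ("prequential") evaluation loop are hypothetical stand-ins, not the framework the author describes.

```python
import math
import random


class OnlineLogisticRegression:
    """Hypothetical incremental learner: one SGD step per incoming example."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # Gradient of the log loss with respect to the linear score.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def synthetic_stream(n, seed=0):
    """Illustrative data stream: label is 1 when x0 + x1 > 0 (assumption)."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, (1 if x[0] + x[1] > 0 else 0)


def prequential_accuracy(model, stream):
    """Test-then-train: predict each example first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        pred = 1 if model.predict_proba(x) >= 0.5 else 0
        correct += int(pred == y)
        total += 1
        model.learn_one(x, y)
    return correct / total if total else 0.0


if __name__ == "__main__":
    acc = prequential_accuracy(OnlineLogisticRegression(2), synthetic_stream(2000))
    print(f"prequential accuracy: {acc:.3f}")
```

The model never sees the stream as a single batch: each example is used once for evaluation and once for an update, which is the property that distinguishes the online scenario from the offline one.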

    Differential evolution of non-coding DNA across eukaryotes and its close relationship with complex multicellularity on Earth

    Here, I elaborate on the hypothesis that complex multicellularity (CM, sensu Knoll) is a major evolutionary transition (sensu Szathmary), which has convergently evolved a few times in Eukarya only: within red and brown algae, plants, animals, and fungi. Paradoxically, CM seems to correlate with the expansion of non-coding DNA (ncDNA) in the genome rather than with genome size or the total number of genes. Thus, I investigated the correlation between genome and organismal complexities across 461 eukaryotes under a phylogenetically controlled framework. To that end, I introduce the first formal definitions and criteria to distinguish ‘unicellularity’, ‘simple’ (SM) and ‘complex’ multicellularity. Rather than using the limited available estimations of unique cell types, the 461 species were classified according to our criteria by reviewing their life cycle and body plan development from literature. Then, I investigated the evolutionary association between genome size and 35 genome-wide features (introns and exons from protein-coding genes, repeats and intergenic regions) describing the coding and ncDNA complexities of the 461 genomes. To that end, I developed ‘GenomeContent’, a program that systematically retrieves massive multidimensional datasets from gene annotations and calculates over 100 genome-wide statistics. R-scripts coupled to parallel computing were created to calculate >260,000 phylogenetically controlled pairwise correlations. As previously reported, both repetitive and non-repetitive DNA are found to be scaling strongly and positively with genome size across most eukaryotic lineages. In contrast to previous studies, I demonstrate that changes in the length and repeat composition of introns are only weakly or moderately associated with changes in genome size at the global phylogenetic scale, while changes in intron abundance (within and across genes) are either not or only very weakly associated with changes in genome size. Our evolutionary correlations are robust to: different phylogenetic regression methods, uncertainties in the tree of eukaryotes, variations in genome size estimates, and randomly reduced datasets. Then, I investigated the correlation between the 35 genome-wide features and the cellular complexity of the 461 eukaryotes with phylogenetic Principal Component Analyses. Our results endorse a genetic distinction between SM and CM in Archaeplastida and Metazoa, but not so clearly in Fungi. Remarkably, complex multicellular organisms and their closest ancestral relatives are characterized by high intron-richness, regardless of genome size. Finally, I argue why and how a vast expansion of non-coding RNA (ncRNA) regulators rather than of novel protein regulators can promote the emergence of CM in Eukarya. As a proof of concept, I co-developed a novel ‘ceRNA-motif pipeline’ for the prediction of “competing endogenous” ncRNAs (ceRNAs) that regulate microRNAs in plants. We identified three candidate ceRNA motifs: MIM166, MIM171 and MIM159/319, which were found to be conserved across land plants and to be potentially involved in diverse developmental processes and stress responses. Collectively, the findings of this dissertation support our hypothesis that CM on Earth is a major evolutionary transition promoted by the expansion of two major ncDNA classes, introns and regulatory ncRNAs, which might have boosted the irreversible commitment of cell types in certain lineages by canalizing the timing and kinetics of the eukaryotic transcriptome.

    Contents:
    Cover page; Abstract; Acknowledgements; Index
    1. The structure of this thesis: 1.1. Structure of this PhD dissertation; 1.2. Publications of this PhD dissertation; 1.3. Computational infrastructure and resources; 1.4. Disclosure of financial support and information use; 1.5. Acknowledgements; 1.6. Author contributions and use of impersonal and personal pronouns
    2. Biological background: 2.1. The complexity of the eukaryotic genome; 2.2. The problem of counting and defining “genes” in eukaryotes; 2.3. The “function” concept for genes and “dark matter”; 2.4. Increases of organismal complexity on Earth through multicellularity; 2.5. Multicellularity is a “fitness transition” in individuality; 2.6. The complexity of cell differentiation in multicellularity
    3. Technical background: 3.1. The Phylogenetic Comparative Method (PCM); 3.2. RNA secondary structure prediction; 3.3. Some standards for genome and gene annotation
    4. What is in a eukaryotic genome? GenomeContent provides a good answer: 4.1. Background; 4.2. Motivation: an interoperable tool for data retrieval of gene annotations; 4.3. Methods; 4.4. Results; 4.5. Discussion
    5. The evolutionary correlation between genome size and ncDNA: 5.1. Background; 5.2. Motivation: estimating the relationship between genome size and ncDNA; 5.3. Methods; 5.4. Results; 5.5. Discussion
    6. The relationship between non-coding DNA and Complex Multicellularity: 6.1. Background; 6.2. Motivation: How to define and measure complex multicellularity across eukaryotes?; 6.3. Methods; 6.4. Results; 6.5. Discussion
    7. The ceRNA motif pipeline: regulation of microRNAs by target mimics: 7.1. Background; 7.2. A revisited protocol for the computational analysis of Target Mimics; 7.3. Motivation: a novel pipeline for ceRNA motif discovery; 7.4. Methods; 7.5. Results; 7.6. Discussion
    8. Conclusions and outlook: 8.1. Contributions and lessons for the bioinformatics of large-scale comparative analyses; 8.2. Intron features are evolutionarily decoupled among themselves and from genome size throughout Eukarya; 8.3. “Complex multicellularity” is a major evolutionary transition; 8.4. Role of RNA throughout the evolution of life and complex multicellularity on Earth
    9. Supplementary Data
    Bibliography; Curriculum Scientiae; Selbständigkeitserklärung (declaration of authorship)

    Geo-Information Technology and Its Applications

    Geo-information technology has been playing an ever more important role in environmental monitoring, land resource quantification and mapping, geo-disaster damage and risk assessment, urban planning and smart city development. This book focuses on the fundamental and applied research in these domains, aiming to promote exchanges and communications, share the research outcomes of scientists worldwide and put these achievements to better social use. This Special Issue collects fourteen high-quality research papers and is expected to provide a useful reference and technical support for graduate students, scientists, civil engineers and government experts to valorize scientific research.

    Gas, Water and Solid Waste Treatment Technology

    This book introduces a variety of treatment technologies, such as physical, chemical, and biological methods for the treatment of gas emissions, wastewater, and solid waste. It provides a useful source of information for engineers and specialists, as well as for undergraduate and postgraduate students, in the areas of environmental science and engineering

    Data-Intensive Computing in Smart Microgrids

    Microgrids have recently emerged as the building block of a smart grid, combining distributed renewable energy sources, energy storage devices, and load management in order to improve power system reliability, enhance sustainable development, and reduce carbon emissions. At the same time, rapid advancements in sensor and metering technologies, wireless and network communication, as well as cloud and fog computing are leading to the collection and accumulation of large amounts of data (e.g., device status data, energy generation data, consumption data). The application of big data analysis techniques (e.g., forecasting, classification, clustering) on such data can optimize the power generation and operation in real time by accurately predicting electricity demands, discovering electricity consumption patterns, and developing dynamic pricing mechanisms. An efficient and intelligent analysis of the data will enable smart microgrids to detect and recover from failures quickly, respond to electricity demand swiftly, supply more reliable and economical energy, and enable customers to have more control over their energy use. Overall, data-intensive analytics can provide effective and efficient decision support for all of the producers, operators, customers, and regulators in smart microgrids, in order to achieve holistic smart energy management, including energy generation, transmission, distribution, and demand-side management. This book contains an assortment of relevant novel research contributions that provide real-world applications of data-intensive analytics in smart grids and contribute to the dissemination of new ideas in this area

    Algorithms for Fault Detection and Diagnosis

    Due to the increasing demand for security and reliability in manufacturing and mechatronic systems, early detection and diagnosis of faults are key points to reduce economic losses caused by unscheduled maintenance and downtimes, to increase safety, to prevent the endangerment of human beings involved in the process operations and to improve reliability and availability of autonomous systems. The development of algorithms for health monitoring and fault and anomaly detection, capable of the early detection, isolation, or even prediction of technical component malfunctioning, is becoming more and more crucial in this context. This Special Issue is devoted to new research efforts and results concerning recent advances and challenges in the application of “Algorithms for Fault Detection and Diagnosis”, articulated over a wide range of sectors. The aim is to provide a collection of some of the current state-of-the-art algorithms within this context, together with new advanced theoretical solutions

    Second International Conference on Sustainable Futures: Environmental, Technological, Social and Economic Matters (ICSF 2021). Kryvyi Rih, Ukraine, May 19-21, 2021

    Second International Conference on Sustainable Futures: Environmental, Technological, Social and Economic Matters (ICSF 2021). Kryvyi Rih, Ukraine, May 19-21, 2021.

    Prediksi Inflasi Indonesia Berdasarkan Fuzzy ANN Menggunakan Algoritma Genetika (Predicting Indonesian Inflation Based on Fuzzy ANN Using a Genetic Algorithm)

    Monetary policymakers fear inflation because it can drive up the poverty rate and cause budget spending to surge. A high inflation rate will lead to the collapse of a country's economy. Monetary policy decisions therefore need to be studied in depth to prevent this. One measure that can be taken is to predict the inflation that will occur. Inflation-rate data over time are the basis for predicting the inflation rate in the future. A good prediction has a small error. In prediction using a fuzzy artificial neural network (Fuzzy ANN) with the backpropagation method, the error can be reduced by optimizing the resulting weights. In this study, the Fuzzy ANN weights are optimized using a genetic algorithm. The resulting prediction model is then evaluated using MAPE to determine the accuracy of the prediction. The results show that prediction using a backpropagation neural network optimized with a genetic algorithm (10.33%) is better than prediction using a backpropagation neural network alone (11.67%). After establishing that both models give reasonably good predictions, the accuracy of the two models was compared using an independent-samples t-test based on the resulting errors. The results show that, at the 95% confidence level, prediction using the Fuzzy ANN optimized with a genetic algorithm (M = 0.69, SD = 0.0421) is significantly better than the Fuzzy ANN alone (M = 0.97, SD = 0.04634), t(22) = 1.71714, p = 0.013.
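
The abstract above evaluates its models with MAPE (mean absolute percentage error). As a minimal sketch of that metric, assuming the standard definition (the mean of |actual - predicted| / |actual|, expressed in percent), it could be computed as follows. The function name `mape` and the sample numbers are illustrative only and are not the study's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper.
    observed = [100.0, 200.0]
    forecast = [110.0, 180.0]
    print(f"MAPE = {mape(observed, forecast):.2f}%")
```

A lower MAPE means the forecasts sit closer to the observed values in relative terms, which is the sense in which the abstract compares 10.33% with 11.67%.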

    Knowledge Capturing in Design Briefing Process for Requirement Elicitation and Validation

    Knowledge capturing and reusing are major processes of knowledge management that deal with the elicitation of valuable knowledge via techniques and methods for use in current and future studies, projects, services, or products. The construction industry also adopts and uses some of these concepts to improve various construction processes and stages. From pre-design to building delivery, knowledge management principles and briefing frameworks have been implemented across project stakeholders: client, design teams, construction teams, consultants, and facility management teams. At the pre-design and design stages, understanding the client’s needs and the users’ knowledge is crucial for identifying and articulating the expected requirements and objectives. Due to underperforming results and missed goals and objectives, many projects finish with highly dissatisfied clients and loss of contracts for some organizations. Through its principles and methods, knowledge capturing has beneficial effects on requirement elicitation and validation at the briefing stage between user, client and designer. This paper presents the importance and usage of knowledge capturing and reusing in the briefing process at the pre-design and design stages, especially the involvement of client and user, and explores the techniques and technologies that are usable in the briefing process for requirement elicitation.