220 research outputs found

    WebSpecmine: a website for metabolomics data analysis and mining

    Get PDF
    Metabolomics data analysis is an important task in biomedical research. The available tools do not provide a wide variety of methods and data types, nor ways to store and share data and results generated. Thus, we have developed WebSpecmine to overcome the aforementioned limitations. WebSpecmine is a web-based application designed to perform the analysis of metabolomics data based on spectroscopic and chromatographic techniques (NMR, Infrared, UV-visible, and Raman, and LC/GC-MS) and compound concentrations. Users, even those not possessing programming skills, can access several analysis methods including univariate, unsupervised and supervised multivariate statistical analysis, as well as metabolite identification and pathway analysis, also being able to create accounts to store their data and results, either privately or publicly. The tool’s implementation is based in the R project, including its shiny web-based framework. Webspecmine is freely available, supporting all major browsers. We provide abundant documentation, including tutorials and a user guide with case studies.This study was supported by the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2019 unit and BioTecNorte operation (NORTE-01-0145-FEDER-000004) funded by the European Regional Development Fund (ERDF) under the scope of Norte2020—Programa Operacional Regional do Norte. The authors also acknowledge the funding of the project 22231/01/SAICT/2016, “Biodata.pt – Infraestrutura Portuguesa de Dados Biológicos”, funded by FCT and Lisboa 2020/Portugal2020 Partnership Agreement, through the ERDF.info:eu-repo/semantics/publishedVersio

    A Machine Learning Approach to Reduce Dimensional Space in Large Datasets

    Get PDF
    Large datasets computing is a research problem as well as a huge challenge due to massive amounts of data that are mined and crunched in order to successfully analyze these massive datasets because they constitute a valuable source of information over different and cross-folded domains, and therefore it represents an irreplaceable opportunity. Hence, the increasing number of environments that use data-intensive computations need more complex calculations than the ones applied to grid-based infrastructures. In this way, this paper analyzes the most commonly used algorithms regarding to this complex problem of handling large datasets whose part of research efforts are focused on reducing dimensional space. Consequently, we present a novel machine learning method that reduces dimensional space in large datasets. This approach is carried out by developing different phases: merging all datasets as a huge one, performing the Extract, Transform and Load (ETL) process, applying the Principal Component Analysis (PCA) algorithm to machine learning techniques, and finally displaying the data results by means of dashboards. The major contribution in this paper is the development of a novel architecture divided into five phases that presents an hybrid method of machine learning for reducing dimensional space in large datasets. In order to verify the correctness of our proposal, we have presented a case study with a complex dataset, specifically an epileptic seizure recognition database. The experiments carried out are very promising since they present very encouraging results to be applied to a great number of different domains.This work was partially funded by Grant RTI2018-094283-B-C32, ECLIPSE-UA (Spanish Ministry of Education and Science), and in part by the Lucentia AGI Grant. This work was partially funded by GENDER-NET Plus Joint Call on Gender an UN Sustainable Development Goals (European Commission - Grant Agreement 741874), funded in Spain by “La Caixa” Foundation (ID 100010434) with code LCF/PR/DE18/52010001 to MTH

    Development of computational tools for the analysis of 2D-nuclear magnetic resonance data

    Get PDF
    Dissertação de mestrado em BioinformaticsMetabolomics is one of the omics’ sciences that has been gaining a lot of interest due to its potential on correlating an organism’s biochemical activity and its phenotype. The applications of metabolomics are being extended as new techniques reveal new information on metabolic profiles and molecules, thus elucidating biological, chemical and functional knowledge. The main techniques that collect data are based on mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy. The last one has the advantage of analyzing a sample in vivo without damaging it and while its sensitivity is pointed out as a disadvantage, multidimensional NMR delivers a solution to this issue. It adds layers of information, generating new data that requires advanced bioinformatics methods in order to extract biological meaning. Since multidimensional NMR has different approaches within itself, the need to estab lish an integrated framework that allows a researcher to load its data and extract relevant knowledge has become more imperative over the years. Also, establishing common data analysis pipelines on one-dimensional and multidimensional NMR remains a challenge in current scientific research hindering reproducibility across research groups. In recent work from the host group, specmine, an R package for metabolomics and spectral data analysis/mining, has been developed to wrap and deliver key metabolomic methods that allow a researcher to perform a complete analysis. In this dissertation, tools integrated in specmine were developed to read, visualize and analyze two-dimensional (2D) NMR. A new specmine structure was created for this type of data, easing interpretation and data visualization. In terms of visualization a novel approach towards three-dimensional environments enables users to interact with their data allowing peak hovering or identification of rich resonance regions. The selection of which samples to plot, when the user does not specify an input, is based on a signal-to-noise ratio scale which plots samples with opposite signal-to-noise ratios. A method to perform peak detection on 2D NMR based on local maximum search was implemented to obtain a data structure that best benefits from specmine’s functionalities. These include preprocessing, univariate and multivariate analysis as well as machine learning and feature selection methods. The 2D NMR functions were validated using experimental data from two scientific papers, available on metabolomic databases and applying the necessary preprocessing steps to compare spectra and results. These data originated two case studies from different NMR sources, Bruker and Varian, which reinforces specmine’s flexibility. The case studies were carried out using mainly specmine and other packages for specific processing steps, such as, probabilistic quotient normalization. A pipeline to analyze 2D NMR was added to specmine, in a form of a vignette, to provide a guideline for the newly developed functionalities.A metabolómica é uma das ciências ómicas que tem vindo a ganhar muito interesse devido ao seu potencial para correlacionar a atividade bioquímica de um organismo com o seu fenótipo. As aplicações da metabolómica estão em constante crescimento à medida que novas técnicas revelam nova informação sobre perfis metabólicos e moleculares, elucidando conhecimento biológico, químico e funcional. As principais técnicas para recolher este tipo de dados são baseadas em espectrometria de massa e em ressonância magnética nuclear (RMN). Esta última tem a vantagem de analisar uma amostra in vivo sem a danificar e enquanto a sensibilidade da mesma tem sido apontada como uma desvantagem, surge a abordagem de RMN multidimensional melhorando a versão tradicional. Através da medição de outros núcleos adiciona camadas de informação, gerando um novo tipo de dados que requere métodos bioinformáticos avançados para se extrair significado biológico. A existência de várias abordagens para realizar RMN multidimensional leva à crescente necessidade da existência de uma ferramenta que integre este tipo de dados, de forma a permitir ao investigador executar a sua análise de forma eficaz. Adicionalmente, a consolidação de pipelines comuns para analisar dados de RMN uni- e multidimensional permanece um desafio a investigação científica, dificultando a reprodutibilidade de resultados por diferentes grupos de investigação. Em trabalhos recentes do grupo de acolhimento foi desenvolvido um package para o programa R focado na metabolómica e na análise/mineração de dados. Este package, specmine, tem sido melhorado desde o seu desenvolvimento funcionando como uma ferramenta que engloba diferentes métodos permitindo uma análise total a um determinado conjunto de dados. Baseado neste package, mais recentemente foi desenvolvida uma plataforma web integrada, WebSpecmine, com o mesmo propósito que providencia ao utilizador uma interface de utilizador mais fácil e amigável. Nesta dissertação, ferramentas que permitem a leitura, visualização e análise de NMR bidimensional (2D) foram desenvolvidas tendo em conta a sua integração no specmine. Uma nova estrutura foi adicionada ao package, facilitando a interpretação e esquemetazição dos dados. Quanto a visualização, uma abordagem inovadora para ambientes tridimensionais permite ao utilizador interagir com os seus dados através da identificação de regiões espectrais de interesse ou reconhecimento de picos. A visualização de espectros 2D, sem especificação por parte do utilizador, tem por base uma escala de relação sinal/ruído que permite numa primeira instância visualizar as amostras com uma maior e menor diferença entre sinal e ruído. Foi também implementado um método para realizar a deteção de picos em RMN 2D baseado na procura por valores máximos locais. Esta operação tem por objectivo obter uma estrutura de dados simplificada que melhor beneficia das funcionalidades do specmine. Estas incluem operações de pré-processamento, análises uni- e multivariada, métodos de seleção de variáveis e aprendizagem máquina. As funções desenvolvidas para RMN 2D foram validadas com dados experimentais recolhidos de dois artigos científicos, disponíveis em bases de dados de metabolómica e sobre os quais foram aplicados os passos de pré-processamento que permitissem a comparação de resultados. Estes dados originaram dois casos de estudos que abordavam diferentes instrumentos utilizados em RMN, Bruker e Varian, reforçando desta forma a flexibilidade do specmine relativamente as tipologias de dados capazes de serem lidas. Estes casos foram realizados utilizando principalmente o specmine, no entanto, a utilização de packages externos foi necessária para passos de processamento específicos, como por exemplo, a normalização por quociente probabilístico. Uma pipeline para analise de dados RMN 2D foi adicionada ao specmine, sob a forma de vignette, um formato de documentação longa adequado a packages implementados no programa R. Desta forma e proporcionado ao utilizador um conjunto de procedimentos, orientados a utilização correta das funcionalidades implementadas

    Recent Advances in Social Data and Artificial Intelligence 2019

    Get PDF
    The importance and usefulness of subjects and topics involving social data and artificial intelligence are becoming widely recognized. This book contains invited review, expository, and original research articles dealing with, and presenting state-of-the-art accounts pf, the recent advances in the subjects of social data and artificial intelligence, and potentially their links to Cyberspace

    Biometric Systems

    Get PDF
    Because of the accelerating progress in biometrics research and the latest nation-state threats to security, this book's publication is not only timely but also much needed. This volume contains seventeen peer-reviewed chapters reporting the state of the art in biometrics research: security issues, signature verification, fingerprint identification, wrist vascular biometrics, ear detection, face detection and identification (including a new survey of face recognition), person re-identification, electrocardiogram (ECT) recognition, and several multi-modal systems. This book will be a valuable resource for graduate students, engineers, and researchers interested in understanding and investigating this important field of study

    Computational Methods for Medical and Cyber Security

    Get PDF
    Over the past decade, computational methods, including machine learning (ML) and deep learning (DL), have been exponentially growing in their development of solutions in various domains, especially medicine, cybersecurity, finance, and education. While these applications of machine learning algorithms have been proven beneficial in various fields, many shortcomings have also been highlighted, such as the lack of benchmark datasets, the inability to learn from small datasets, the cost of architecture, adversarial attacks, and imbalanced datasets. On the other hand, new and emerging algorithms, such as deep learning, one-shot learning, continuous learning, and generative adversarial networks, have successfully solved various tasks in these fields. Therefore, applying these new methods to life-critical missions is crucial, as is measuring these less-traditional algorithms' success when used in these fields

    Aprendizado de variedades para a síntese de áudio espacial

    Get PDF
    Orientadores: Luiz César Martini, Bruno Sanches MasieroTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: O objetivo do áudio espacial gerado com a técnica binaural é simular uma fonte sonora em localizações espaciais arbitrarias através das Funções de Transferência Relativas à Cabeça (HRTFs) ou também chamadas de Funções de Transferência Anatômicas. As HRTFs modelam a interação entre uma fonte sonora e a antropometria de uma pessoa (e.g., cabeça, torso e orelhas). Se filtrarmos uma fonte de áudio através de um par de HRTFs (uma para cada orelha), o som virtual resultante parece originar-se de uma localização espacial específica. Inspirados em nossos resultados bem sucedidos construindo uma aplicação prática de reconhecimento facial voltada para pessoas com deficiência visual que usa uma interface de usuário baseada em áudio espacial, neste trabalho aprofundamos nossa pesquisa para abordar vários aspectos científicos do áudio espacial. Neste contexto, esta tese analisa como incorporar conhecimentos prévios do áudio espacial usando uma nova representação não-linear das HRTFs baseada no aprendizado de variedades para enfrentar vários desafios de amplo interesse na comunidade do áudio espacial, como a personalização de HRTFs, a interpolação de HRTFs e a melhoria da localização de fontes sonoras. O uso do aprendizado de variedades para áudio espacial baseia-se no pressuposto de que os dados (i.e., as HRTFs) situam-se em uma variedade de baixa dimensão. Esta suposição também tem sido de grande interesse entre pesquisadores em neurociência computacional, que argumentam que as variedades são cruciais para entender as relações não lineares subjacentes à percepção no cérebro. Para todas as nossas contribuições usando o aprendizado de variedades, a construção de uma única variedade entre os sujeitos através de um grafo Inter-sujeito (Inter-subject graph, ISG) revelou-se como uma poderosa representação das HRTFs capaz de incorporar conhecimento prévio destas e capturar seus fatores subjacentes. Além disso, a vantagem de construir uma única variedade usando o nosso ISG e o uso de informações de outros indivíduos para melhorar o desempenho geral das técnicas aqui propostas. Os resultados mostram que nossas técnicas baseadas no ISG superam outros métodos lineares e não-lineares nos desafios de áudio espacial abordados por esta teseAbstract: The objective of binaurally rendered spatial audio is to simulate a sound source in arbitrary spatial locations through the Head-Related Transfer Functions (HRTFs). HRTFs model the direction-dependent influence of ears, head, and torso on the incident sound field. When an audio source is filtered through a pair of HRTFs (one for each ear), a listener is capable of perceiving a sound as though it were reproduced at a specific location in space. Inspired by our successful results building a practical face recognition application aimed at visually impaired people that uses a spatial audio user interface, in this work we have deepened our research to address several scientific aspects of spatial audio. In this context, this thesis explores the incorporation of spatial audio prior knowledge using a novel nonlinear HRTF representation based on manifold learning, which tackles three major challenges of broad interest among the spatial audio community: HRTF personalization, HRTF interpolation, and human sound localization improvement. Exploring manifold learning for spatial audio is based on the assumption that the data (i.e. the HRTFs) lies on a low-dimensional manifold. This assumption has also been of interest among researchers in computational neuroscience, who argue that manifolds are crucial for understanding the underlying nonlinear relationships of perception in the brain. For all of our contributions using manifold learning, the construction of a single manifold across subjects through an Inter-subject Graph (ISG) has proven to lead to a powerful HRTF representation capable of incorporating prior knowledge of HRTFs and capturing the underlying factors of spatial hearing. Moreover, the use of our ISG to construct a single manifold offers the advantage of employing information from other individuals to improve the overall performance of the techniques herein proposed. The results show that our ISG-based techniques outperform other linear and nonlinear methods in tackling the spatial audio challenges addressed by this thesisDoutoradoEngenharia de ComputaçãoDoutor em Engenharia Elétrica2014/14630-9FAPESPCAPE

    Systems Analytics and Integration of Big Omics Data

    Get PDF
    A “genotype"" is essentially an organism's full hereditary information which is obtained from its parents. A ""phenotype"" is an organism's actual observed physical and behavioral properties. These may include traits such as morphology, size, height, eye color, metabolism, etc. One of the pressing challenges in computational and systems biology is genotype-to-phenotype prediction. This is challenging given the amount of data generated by modern Omics technologies. This “Big Data” is so large and complex that traditional data processing applications are not up to the task. Challenges arise in collection, analysis, mining, sharing, transfer, visualization, archiving, and integration of these data. In this Special Issue, there is a focus on the systems-level analysis of Omics data, recent developments in gene ontology annotation, and advances in biological pathways and network biology. The integration of Omics data with clinical and biomedical data using machine learning is explored. This Special Issue covers new methodologies in the context of gene–environment interactions, tissue-specific gene expression, and how external factors or host genetics impact the microbiome

    Physical Diagnosis and Rehabilitation Technologies

    Get PDF
    The book focuses on the diagnosis, evaluation, and assistance of gait disorders; all the papers have been contributed by research groups related to assistive robotics, instrumentations, and augmentative devices

    SYNERGY OF BUILDING CYBERSECURITY SYSTEMS

    Get PDF
    The development of the modern world community is closely related to advances in computing resources and cyberspace. The formation and expansion of the range of services is based on the achievements of mankind in the field of high technologies. However, the rapid growth of computing resources, the emergence of a full-scale quantum computer tightens the requirements for security systems not only for information and communication systems, but also for cyber-physical systems and technologies. The methodological foundations of building security systems for critical infrastructure facilities based on modeling the processes of behavior of antagonistic agents in security systems are discussed in the first chapter. The concept of information security in social networks, based on mathematical models of data protection, taking into account the influence of specific parameters of the social network, the effects on the network are proposed in second chapter. The nonlinear relationships of the parameters of the defense system, attacks, social networks, as well as the influence of individual characteristics of users and the nature of the relationships between them, takes into account. In the third section, practical aspects of the methodology for constructing post-quantum algorithms for asymmetric McEliece and Niederreiter cryptosystems on algebraic codes (elliptic and modified elliptic codes), their mathematical models and practical algorithms are considered. Hybrid crypto-code constructions of McEliece and Niederreiter on defective codes are proposed. They can significantly reduce the energy costs for implementation, while ensuring the required level of cryptographic strength of the system as a whole. The concept of security of corporate information and educational systems based on the construction of an adaptive information security system is proposed. ISBN 978-617-7319-31-2 (on-line)ISBN 978-617-7319-32-9 (print) ------------------------------------------------------------------------------------------------------------------ How to Cite: Yevseiev, S., Ponomarenko, V., Laptiev, O., Milov, O., Korol, O., Milevskyi, S. et. al.; Yevseiev, S., Ponomarenko, V., Laptiev, O., Milov, O. (Eds.) (2021). Synergy of building cybersecurity systems. Kharkiv: РС ТЕСHNOLOGY СЕNTЕR, 188. doi: http://doi.org/10.15587/978-617-7319-31-2 ------------------------------------------------------------------------------------------------------------------ Indexing:                    Розвиток сучасної світової спільноти тісно пов’язаний з досягненнями в області обчислювальних ресурсів і кіберпростору. Формування та розширення асортименту послуг базується на досягненнях людства у галузі високих технологій. Однак стрімке зростання обчислювальних ресурсів, поява повномасштабного квантового комп’ютера посилює вимоги до систем безпеки не тільки інформаційно-комунікаційних, але і до кіберфізичних систем і технологій. У першому розділі обговорюються методологічні основи побудови систем безпеки для об'єктів критичної інфраструктури на основі моделювання процесів поведінки антагоністичних агентів у систем безпеки. У другому розділі пропонується концепція інформаційної безпеки в соціальних мережах, яка заснована на математичних моделях захисту даних, з урахуванням впливу конкретних параметрів соціальної мережі та наслідків для неї. Враховуються нелінійні взаємозв'язки параметрів системи захисту, атак, соціальних мереж, а також вплив індивідуальних характеристик користувачів і характеру взаємовідносин між ними. У третьому розділі розглядаються практичні аспекти методології побудови постквантових алгоритмів для асиметричних криптосистем Мак-Еліса та Нідеррейтера на алгебраїчних кодах (еліптичних та модифікованих еліптичних кодах), їх математичні моделі та практичні алгоритми. Запропоновано гібридні конструкції криптокоду Мак-Еліса та Нідеррейтера на дефектних кодах. Вони дозволяють істотно знизити енергетичні витрати на реалізацію, забезпечуючи при цьому необхідний рівень криптографічної стійкості системи в цілому. Запропоновано концепцію безпеки корпоративних інформаційних та освітніх систем, які засновані на побудові адаптивної системи захисту інформації. ISBN 978-617-7319-31-2 (on-line)ISBN 978-617-7319-32-9 (print) ------------------------------------------------------------------------------------------------------------------ Як цитувати: Yevseiev, S., Ponomarenko, V., Laptiev, O., Milov, O., Korol, O., Milevskyi, S. et. al.; Yevseiev, S., Ponomarenko, V., Laptiev, O., Milov, O. (Eds.) (2021). Synergy of building cybersecurity systems. Kharkiv: РС ТЕСHNOLOGY СЕNTЕR, 188. doi: http://doi.org/10.15587/978-617-7319-31-2 ------------------------------------------------------------------------------------------------------------------ Індексація:                 &nbsp
    corecore