
    Neurocognitive Informatics Manifesto.

    Informatics studies all aspects of the structure of natural and artificial information systems. Theoretical and abstract approaches to information have made great advances, but human information processing is still unmatched in many areas, including information management, representation and understanding. Neurocognitive informatics is a new, emerging field that should help to improve the matching of artificial and natural systems, and inspire better computational algorithms to solve problems that are still beyond the reach of machines. This position paper gives examples of neurocognitive inspirations and promising directions in this area.

    The Five Factor Model of personality and evaluation of drug consumption risk

    The problem of evaluating an individual's risk of drug consumption and misuse is highly important. An online survey methodology was employed to collect data including Big Five personality traits (NEO-FFI-R), impulsivity (BIS-11), sensation seeking (ImpSS), and demographic information. The data set contained information on the consumption of 18 central nervous system psychoactive drugs. Correlation analysis demonstrated the existence of groups of drugs with strongly correlated consumption patterns. Three correlation pleiades were identified, each named after its central drug: the ecstasy, heroin, and benzodiazepines pleiades. An exhaustive search was performed to select the most effective subset of input features and data mining methods to classify users and non-users for each drug and pleiad. A number of classification methods were employed (decision tree, random forest, k-nearest neighbors, linear discriminant analysis, Gaussian mixture, probability density function estimation, logistic regression and naïve Bayes) and the most effective classifier was selected for each drug. The quality of classification was surprisingly high, with sensitivity and specificity (evaluated by leave-one-out cross-validation) being greater than 70% for almost all classification tasks. The best results, with sensitivity and specificity being greater than 75%, were achieved for cannabis, crack, ecstasy, legal highs, LSD, and volatile substance abuse (VSA). Comment: Significantly extended report with 67 pages, 27 tables, 21 figures.
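The evaluation protocol named in this abstract (sensitivity and specificity under leave-one-out cross-validation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy data, the function names, and the choice of a 1-nearest-neighbor classifier as a stand-in. It is not the authors' feature set, tuning, or code.

```python
from typing import List, Sequence, Tuple


def nearest_neighbor_predict(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[int],
    query: Sequence[float],
) -> int:
    """Return the label of the training point closest to `query` (k-NN with k=1)."""
    best_label, best_dist = train_y[0], float("inf")
    for x, y in zip(train_x, train_y):
        # Squared Euclidean distance is enough for ranking neighbors.
        dist = sum((a - b) ** 2 for a, b in zip(x, query))
        if dist < best_dist:
            best_label, best_dist = y, dist
    return best_label


def loocv_sensitivity_specificity(
    xs: List[Tuple[float, ...]], ys: List[int]
) -> Tuple[float, float]:
    """Leave-one-out CV: hold out each sample once, train on the rest.

    Labels are 1 (user) and 0 (non-user). Assumes both classes are present.
    """
    tp = tn = fp = fn = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        pred = nearest_neighbor_predict(train_x, train_y, xs[i])
        if ys[i] == 1:
            if pred == 1:
                tp += 1
            else:
                fn += 1
        else:
            if pred == 0:
                tn += 1
            else:
                fp += 1
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical toy data: two numeric features per respondent.
toy_x = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6), (4, 4)]
toy_y = [0, 0, 0, 1, 1, 1, 0]
print(loocv_sensitivity_specificity(toy_x, toy_y))
```

In this toy set the non-user at (4, 4) sits closest to a user, so it is misclassified when held out, which lowers specificity while sensitivity stays at its maximum.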

    Disease diagnosis in smart healthcare: Innovation, technologies and applications

    To promote sustainable development, the smart city implies a global vision that merges artificial intelligence, big data, decision making, information and communication technology (ICT), and the internet-of-things (IoT). Population ageing is an issue to which researchers, companies and governments should devote effort by developing innovative smart healthcare technologies and applications. In this paper, the topic of disease diagnosis in smart healthcare is reviewed. Typical emerging optimization algorithms and machine learning algorithms are summarized. Evolutionary optimization, stochastic optimization and combinatorial optimization are covered. Because healthcare has so many applications, four applications in the field of disease diagnosis are considered, all of which are also listed among the top 10 causes of global death in 2015: cardiovascular diseases, diabetes mellitus, Alzheimer’s disease and other forms of dementia, and tuberculosis. In addition, challenges in the deployment of disease diagnosis in healthcare are discussed.

    The multiple pheromone ant clustering algorithm and its application to real world domains

    The Multiple Pheromone Ant Clustering Algorithm (MPACA) models the collective behaviour of ants to find clusters in data and to assign objects to the most appropriate class. It is an ant colony optimisation approach that uses pheromones to mark paths linking objects that are similar and potentially members of the same cluster or class. Its novelty is in the way it uses separate pheromones for each descriptive attribute of the object rather than a single pheromone representing the whole object. Ants that encounter other ants frequently enough can combine the attribute values they are detecting, which enables the MPACA to learn influential variable interactions. This paper applies the model to real-world data from two domains. One is logistics, focusing on resource allocation rather than the more traditional vehicle-routing problem. The other is mental-health risk assessment. The task for the MPACA in each domain was to predict class membership, where the classes for the logistics domain were the levels of demand on haulage company resources and the mental-health classes were levels of suicide risk. Results on these noisy real-world data were promising, demonstrating the ability of the MPACA to find patterns in the data with accuracy comparable to that of more traditional linear regression models.

    Performance Evaluation of Smart Decision Support Systems on Healthcare

    Medical practice requires responsibility grounded not only in clinical knowledge and skill but also in the management of an enormous amount of information related to patient care. It is through proper treatment of information that experts can consistently build a sound wellness policy. The primary objective in developing decision support systems (DSSs) is to provide information to specialists when and where it is needed. These systems provide information, models, and data manipulation tools to help experts make better decisions in a variety of situations. Most of the challenges that smart DSSs face come from the great difficulty of dealing with large volumes of information, which is continuously generated by the most diverse types of devices and equipment and requires substantial computational resources. This makes such systems prone to failing to retrieve information quickly enough for decision making. As a result, information quality and the provision of an infrastructure capable of promoting integration and articulation among different health information systems (HIS) have become promising research topics in the field of electronic health (e-health) and, for that reason, are addressed in this research. The work described in this thesis is motivated by the need to propose novel approaches to deal with problems inherent to the acquisition, cleaning, integration, and aggregation of data obtained from different sources in e-health environments, as well as their analysis. To ensure the success of data integration and analysis in e-health environments, it is essential that machine-learning (ML) algorithms ensure system reliability. However, in this type of environment, it is not possible to guarantee a fully reliable scenario. This makes smart DSSs susceptible to prediction failures, which severely compromise overall system performance.
On the other hand, systems can have their performance compromised by the information overload they must handle. To address some of these problems, this thesis presents several proposals and studies on the impact of ML algorithms on the monitoring and management of hypertensive disorders in high-risk pregnancy. The primary goal of the proposals presented in this thesis is to improve the overall performance of health information systems. In particular, ML-based methods are exploited to improve prediction accuracy and optimize the use of monitoring device resources. It was demonstrated that this type of strategy and methodology contributes to a significant increase in the performance of smart DSSs, not only in terms of precision but also through a reduction in the computational cost of the classification process. The observed results seek to contribute to advancing the state of the art in AI-based methods and strategies that aim to overcome some of the challenges arising from the integration and performance of smart DSSs. With AI-based algorithms, it is possible to quickly and automatically analyze a larger volume of complex data and focus on more accurate results, providing high-value predictions for better decision making in real time and without human intervention.

    Agrupamiento, predicción y clasificación ordinal para series temporales utilizando técnicas de machine learning: aplicaciones

    In recent years, a growing number of fields have improved their standard processes by using machine learning (ML) techniques. The main reason is that the vast amount of data generated by these processes is difficult for humans to process. The development of automatic methods to process these data and extract relevant information from them is therefore of great necessity, given that these approaches could increase the economic benefit of enterprises or reduce the workload of some current jobs. Specifically, in this Thesis, ML approaches are applied to problems concerning time series data. A time series is a special kind of data in which data points are collected chronologically. Time series are present in a wide variety of fields, such as atmospheric events or engineering applications. Depending on the main objective to be satisfied, different tasks applied to time series appear in the literature. This Thesis focuses mainly on some of them: clustering, classification, prediction and, in general, analysis. Generally, the amount of data to be processed is huge, giving rise to the need for methods able to reduce the dimensionality of time series without decreasing the amount of information. In this sense, applying time series segmentation procedures that divide the time series into different subsequences is a good option, given that each segment defines a specific behaviour. Once the different segments are obtained, using statistical features to characterise them is an excellent way to maximise the information of the time series while considerably reducing their dimensionality. In the case of time series clustering, the objective is to find groups of similar time series with the idea of discovering interesting patterns in time series datasets. In this Thesis, we have developed a novel time series clustering technique.
The aim of this proposal is twofold: to reduce the dimensionality as much as possible and to develop a time series clustering approach able to outperform current state-of-the-art techniques. For the first objective, the time series are segmented in order to divide them into segments that identify different behaviours. Then, these segments are projected into a vector of statistical features, aiming to reduce the dimensionality of the time series. Once this preprocessing step is done, the clustering of the time series is carried out with a significantly lower computational load. This novel approach has been tested on all the time series datasets available in the University of East Anglia and University of California Riverside (UEA/UCR) time series classification (TSC) repository. Regarding time series classification, two main paths can be differentiated. The first is nominal TSC, a well-known field involving a wide variety of proposals and transformations applied to time series. Specifically, one of the most popular transformations is the shapelet transform (ST), which has been widely used in this field. The original method extracts shapelets from the original time series and uses them for classification purposes. Nevertheless, the full enumeration of all possible shapelets is very time consuming. Therefore, in this Thesis, we have developed a hybrid method that starts with the best shapelets extracted by the original approach under a time constraint and then tunes these shapelets by using a convolutional neural network (CNN) model. The second is time series ordinal classification (TSOC), an unexplored field beginning with this Thesis. Here, we have adapted the original ST to the ordinal classification (OC) paradigm by proposing several shapelet quality measures that take advantage of the ordinal information of the time series. This methodology leads to better results than the state-of-the-art TSC techniques for those ordinal time series datasets.
All these proposals have been tested on all the time series datasets available in the UEA/UCR TSC repository. Time series prediction consists of estimating the next value or values of the time series by considering the previous ones. In this Thesis, several different approaches have been considered depending on the problem to be solved. Firstly, the prediction of low-visibility events produced by fog conditions is carried out by means of hybrid autoregressive models (ARs) that combine fixed-size and dynamic windows and adapt to the dynamics of the time series. Secondly, the prediction of convective cloud formation (a highly imbalanced problem, given that the number of convective cloud events is much lower than that of non-convective situations) is performed in two completely different ways: 1) tackling the problem as a multi-objective classification task by the use of multi-objective evolutionary artificial neural networks (MOEANNs), in which the two conflicting objectives are the accuracy of the minority class and the global accuracy, and 2) tackling the problem from the OC point of view, in which, in order to reduce the degree of imbalance, an oversampling approach is proposed along with the use of OC techniques. Thirdly, the prediction of solar radiation is carried out by means of evolutionary artificial neural networks (EANNs) with different combinations of basis functions in the hidden and output layers. Finally, the last challenging problem is the prediction of energy flux from waves and tides. For this, a multitask EANN has been proposed, aiming to predict the energy flux at several prediction time horizons (from 6h to 48h). All these proposals and techniques have been corroborated and discussed according to physical and atmospheric models. The work developed in this Thesis is supported by 11 JCR-indexed papers in international journals (7 Q1, 3 Q2, 1 Q3), 11 papers in international conferences, and 4 papers in national conferences.
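The dimensionality-reduction idea described in this abstract (segment each time series, then summarise each segment with statistical features before clustering) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the segmentation procedure or the feature set, so fixed-length windows and the (mean, variance) pair are stand-ins, and `segment_features` is a hypothetical name.

```python
from statistics import fmean, pvariance
from typing import List, Sequence


def segment_features(series: Sequence[float], segment_len: int) -> List[float]:
    """Map a time series to a shorter vector of per-segment statistics.

    Assumption: segments are consecutive fixed-length windows; a shorter
    trailing window is kept as its own segment.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    features: List[float] = []
    for start in range(0, len(series), segment_len):
        segment = [float(v) for v in series[start:start + segment_len]]
        # Two features per segment: its level (mean) and its spread (variance).
        features.extend([fmean(segment), pvariance(segment)])
    return features


# Hypothetical series: a rising stretch followed by a flat stretch.
example = [1, 2, 3, 4, 10, 10, 10, 10]
print(segment_features(example, segment_len=4))
```

Here eight points become four features, two per segment. Any standard clustering method could then be run on these shorter vectors, which is where the lower computational load mentioned in the abstract would come from.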