10 research outputs found

    Forschungsnachrichten

    Predicting the Critical Number of Layers for Hierarchical Support Vector Regression

    Hierarchical support vector regression (HSVR) models a function from data as a linear combination of SVR models at a range of scales, starting at a coarse scale and moving to finer scales as the hierarchy continues. In the original formulation of HSVR, there were no rules for choosing the depth of the model. In this paper, we observe in a number of models a phase transition in the training error: the error remains relatively constant as layers are added until a critical scale is passed, at which point the training error drops close to zero and remains nearly constant as further layers are added. We introduce a method to predict this critical scale a priori, with the prediction based on the support of either a Fourier transform of the data or the Dynamic Mode Decomposition (DMD) spectrum. This allows us to determine the required number of layers prior to training any models. Comment: 18 pages, 9 figures
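
    As a rough illustration of reading the required depth off the spectrum, the Python sketch below estimates the finest scale present in a signal from the support of its FFT and converts it into a layer count. It assumes, purely for illustration, that each HSVR layer halves the kernel length scale; the function and parameter names are hypothetical and not the authors' implementation.

        import numpy as np

        def predict_num_layers(y, sample_spacing=1.0, initial_scale=None, energy_frac=0.99):
            # Estimate the finest scale in the data from the support of its Fourier
            # spectrum (illustrative sketch, not the paper's exact procedure).
            y = np.asarray(y, dtype=float)
            n = len(y)
            freqs = np.fft.rfftfreq(n, d=sample_spacing)
            power = np.abs(np.fft.rfft(y - y.mean())) ** 2
            cumulative = np.cumsum(power) / power.sum()
            f_max = freqs[np.searchsorted(cumulative, energy_frac)]
            # Guard against a zero frequency if almost all energy sits in the lowest bin.
            critical_scale = 1.0 / max(f_max, 1.0 / (n * sample_spacing))
            if initial_scale is None:
                initial_scale = n * sample_spacing  # coarsest scale: the data extent
            # Assume each layer halves the kernel length scale (a modelling assumption).
            return int(np.ceil(np.log2(initial_scale / critical_scale))) + 1

        # Example: a signal with fine-scale content needs more layers than a slowly varying one.
        t = np.linspace(0.0, 1.0, 400)
        print(predict_num_layers(np.sin(2 * np.pi * t), sample_spacing=t[1] - t[0]))
        print(predict_num_layers(np.sin(2 * np.pi * t) + 0.3 * np.sin(40 * np.pi * t),
                                 sample_spacing=t[1] - t[0]))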

    Scenario Approach for Parametric Markov Models

    In this paper, we propose an approximating framework for analyzing parametric Markov models. Instead of computing complex rational functions that encode the reachability probability and the reward values of the parametric model, we exploit the scenario approach to synthesize a relatively simple polynomial approximation. The approximation is probably approximately correct (PAC), meaning that, with high confidence, the approximating function is close to the actual function up to an allowable error. With the PAC approximations, one can check properties of the parametric Markov models. We show that the scenario approach can also be used to check PRCTL properties directly, without synthesizing the polynomial beforehand. We have implemented our algorithm in a prototype tool and conducted thorough experiments. The experimental results demonstrate that our tool is able to compute polynomials for more benchmarks than state-of-the-art tools such as PRISM and Storm, confirming the efficacy of our PAC-based synthesis.
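
    A minimal sketch of the flavour of such a scenario-based synthesis is given below, assuming a one-dimensional parameter space and using an ordinary least-squares polynomial fit as a stand-in for the convex scenario program. Here evaluate_reachability is a hypothetical placeholder for a call to a probabilistic model checker, and the sample-size bound is a standard Calafiore/Campi-style estimate rather than the paper's exact construction.

        import numpy as np

        def scenario_pac_polynomial(evaluate_reachability, degree=4, eps=0.05, beta=1e-3, seed=0):
            # `evaluate_reachability(p)` stands in for querying a model checker for the
            # reachability probability at parameter value p in [0, 1].
            d = degree + 1                                   # number of polynomial coefficients
            # A standard scenario-style bound on the number of sampled parameter values.
            n_samples = int(np.ceil((2.0 / eps) * (np.log(1.0 / beta) + d)))
            rng = np.random.default_rng(seed)
            params = rng.uniform(0.0, 1.0, size=n_samples)
            values = np.array([evaluate_reachability(p) for p in params])
            # A least-squares fit stands in for the convex scenario program here.
            coeffs = np.polyfit(params, values, deg=degree)
            approx = np.poly1d(coeffs)
            margin = np.max(np.abs(approx(params) - values))  # worst error on the sampled scenarios
            return approx, margin, n_samples

        # Toy "model checker": an illustrative rational reachability function.
        poly, margin, n = scenario_pac_polynomial(lambda p: p * p / (p * p + (1.0 - p) ** 2))
        print(n, margin)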

    Computational Complexity in Tile Self-Assembly

    One of the most fundamental and well-studied problems in Tile Self-Assembly is the Unique Assembly Verification (UAV) problem. This algorithmic problem asks whether a given tile system uniquely assembles a specific assembly. The complexity of this problem in the 2-Handed Assembly Model (2HAM) at a constant temperature has been a long-standing open problem since the model was introduced. Previously, only membership in the class coNP was known, along with the fact that the problem is in P if the temperature is one (τ = 1). The problem is known to be hard for many generalizations of the model, such as allowing one step into the third dimension or allowing the temperature of the system to be a variable, but the most fundamental version had remained open. In this thesis I cover verification problems in different models of self-assembly, leading to a proof that the UAV problem in the 2HAM is hard even at a small constant temperature (τ = 2), finally settling the complexity of this problem (open since 2013). Further, this result proves that UAV in the staged self-assembly model is coNP-complete with a single bin and stage (open since 2007), and that UAV in the q-tile model is also coNP-complete (open since 2004). We reduce from Monotone Planar 3-SAT with Neighboring Variable Pairs, a special case of 3-SAT recently proven to be NP-hard.

    Data mining applied to neurorehabilitation data

    Integrated master's thesis, Biomedical Engineering and Biophysics (Clinical Engineering and Medical Instrumentation), Universidade de Lisboa, Faculdade de Ciências, 2017. Although they are not the leading cause of death in the world, brain injuries are perhaps the main reason why so many people see their daily lives affected. This is due to severe cognitive difficulties that may result from a car accident, a fall, the presence of a tumor, a stroke, exposure to toxic substances or any other situation involving an injury to the brain. Among such injuries are those caused by trauma from external forces, the so-called traumatic brain injuries. This study focuses precisely on people who suffered an injury of this kind and who, after the injury, underwent neurorehabilitation treatment. This treatment, based on tasks specially designed to stimulate the reorganization of neuronal connections, gives patients the possibility of performing everyday tasks again with as little difficulty as possible. The purpose of these tasks is to stimulate brain plasticity, the capacity responsible for the development of synaptic connections from birth, which allows the brain to re-establish its normal functioning after an injury. Naturally, the degree to which a person is affected depends on the type of injury and strongly influences not only the time of physical and mental recovery but also the final state. The study documented in this internship report is a step towards a goal shared with other research in this area: that neurorehabilitation treatments can be personalized for each patient so that recovery is optimized. The idea is that, knowing some of a patient's personal data, considering information about their initial state and using the results of the tests performed, it becomes possible to associate the patient with a specific dysfunctional profile with well-defined characteristics, so that the therapist can adapt the treatment. The Institut Guttmann, in Barcelona, was the first Spanish hospital to care for patients with spinal cord injuries. Today, one of its many projects, GNPT (Guttmann NeuroPersonalTrainer), brings into patients' homes a platform that allows them to carry out the tasks defined by their therapists as part of their neurorehabilitation treatment. Data from these patients, including demographic information and the results of tests performed before and after treatment, were provided by the Institut Guttmann to the Biomedical and Telemedicine Group (GBT) in the form of databases. By analyzing them with Data Mining tools it was possible to obtain general profiles of cognitive dysfunction and to describe the evolution of those profiles, the main objective of this dissertation. Finding patterns in large volumes of data is, in very general terms, the main function of a Data Mining process; this is the concept invoked whenever the extraction of knowledge from large amounts of data is discussed.
There are several techniques that make this possible; they use algorithms based on statistical functions and neural networks and have been improved over the years, ever since the need to deal with large sets of elements first arose. The purpose is always the same: the analysis performed with these techniques should convert the information hidden in the data into information that can then be used to characterize populations, to make decisions or to validate results. In this case, Clustering algorithms were used, a Data Mining method that produces groups of mutually similar elements, the clusters, based on the characteristics of each element. Data from 698 patients who had suffered a traumatic brain injury, and whose information in the databases provided by the Institut Guttmann satisfied all the conditions required for inclusion in the study, were integrated into a Data Warehouse, a repository for data storage, and then structured. Using functions written in SQL, the main language for querying and organizing relational databases, the scores of the tests performed by the patients before the start of treatment and after its end were obtained. These tests assessed, using five score levels corresponding to the degrees of affectation (0 for no affectation, 1 for mild, 2 for moderate, 3 for severe and 4 for acute affectation), three functions closely related to the cognitive level: attention, memory and some executive functions. The score obtained for each function is a weighted average of the scores of its subfunctions (divided attention, selective attention, working memory, among others), each computed from at least one of the 24 assessment items to which every person was subjected. The initial and final groups were then determined with SPSS, a very useful tool for finding correlations in large data sets. The initial clusters were obtained with the K-means clustering algorithm and the final clusters with the TwoStep algorithm. The main characteristic of this descriptive Data Mining technique is the use of distance as the measure of proximity between two elements of a cluster; the algorithms differ in the type of data they handle and in the way they compute the groupings. For each cluster, and for each function, the distribution of the scores was inspected with bar charts, and the two sets of clusters were compared with each other in order to interpret the relationship between them. The clusters, which in this context correspond to profiles of cognitive affectation, were validated, and it was concluded that they describe the population under study well. On the one hand, the six initial clusters represent faithfully, and in a clinically meaningful way, groups of people with characteristics sufficiently well defined to distinguish them from one another. The three final clusters, used to portray the population at the end of treatment and to analyse the patients' evolution, represent rather opposite profiles, which made it somewhat easier to interpret for which patients the effect of neurorehabilitation was more or less positive.
Some studies cited in the state of the art showed that certain variables are likely to influence a patient's final state. Taking advantage of the availability of sufficient data, it was examined whether, given the final clusters, any inference could be made about the effect of some of these variables, including age, level of education, the interval between the injury and the start of treatment and the treatment's duration, on each of them. Finally, considering only the test scores for each function before and after treatment, the developments and the overall evolution of each patient were analysed and interpreted with the help of charts. The possible developments considered were improvement, worsening and the cases in which patients maintained their state. Using the information about how the patients evolved, it was possible to check whether, using only the test scores, it could be confirmed that other variables might affect the determination of a patient's final state. The charts obtained showed only very subtle differences for some of the variables, mainly between the patients who improved and the patients whose condition worsened. It was concluded that, because the clusters group people with different types of evolution, the effect of the other variables appeared very dispersed. The research suggested for future work includes: (i) studying the other candidate profiles produced by the software used (SPSS); (ii) considering the different aspects of the evaluated functions at a more detailed level; (iii) taking into account other variables with possible effects on a patient's final state. Although they are not the leading cause of death in the world, brain injuries are perhaps the main reason why there are so many cases of people who see their daily lives affected. This is due to the major cognitive difficulties that appear after a brain lesion. Brain injuries include those that are derived from trauma due to external forces, the traumatic brain injuries. This study focuses on people who, after these injuries, were subjected to a neurorehabilitation treatment. The treatment, based on tasks specially designed to stimulate the reorganization of neural connections, allows patients to regain the ability to perform their everyday tasks with the least possible difficulty. These tasks aim to stimulate the brain's plasticity, responsible for the development of synaptic connections, which allows the brain to re-establish its normal functioning after an injury. The study documented in this internship report constitutes another step towards a major goal, common to other studies in this area: that neurorehabilitation treatments can be personalized for each patient, so that their recovery is optimized. Knowing some of a patient's personal data, considering information about their initial state and the results of the tests performed, it is possible to assign the person to a certain dysfunctional profile with specific characteristics, so that the therapist can adapt the treatment. One of the many projects of the Institut Guttmann (IG) is GNPT (Guttmann NeuroPersonalTrainer), which brings into its patients' homes a platform that allows them to perform the tasks set by the therapists in the context of their neurorehabilitation treatments.
Data from these patients, including clinical information and the results of tests performed before and after the treatment, were provided by the IG to the Biomedical and Telemedicine Group (GBT) as databases. Through their analysis, and using Data Mining techniques, it was possible to obtain general profiles of cognitive dysfunction and to characterize the evolution of these profiles, the objective of this work. Finding patterns and extracting knowledge from large volumes of data are the main functions of a Data Mining process. An analysis performed using these techniques enables the conversion of information hidden in data into information that can later be used to make decisions or to validate results. In this case, Clustering algorithms, which build groups of elements with similar characteristics called clusters, were used. Data from 698 patients who suffered brain trauma and whose information available in the databases provided by the IG satisfied all the necessary conditions were integrated into a Data Warehouse and then structured. The scores corresponding to the tests performed before and after the treatment were calculated for each patient. These tests aimed to evaluate, using five different score levels corresponding to each degree of affectation, three functions strictly related to the cognitive level: attention, memory and some executive functions (cognitive processes necessary for the cognitive control of behavior). The initial and final clusters, representing patients' profiles, were determined using the SPSS software. The distribution of the scores over the clusters was inspected through bar graphs, and both groups of clusters were compared to interpret the relationship between them. The clusters, which in this context correspond to profiles of cognitive affectation, were validated, and it was concluded that they represent the state of the patients under study well. As some variables, such as age and study level, are likely to influence the final state of a patient, it was examined whether, given the final clusters, some inference could be made about the effect of those variables; no strong conclusions could be drawn from this part. Also, considering the test scores, each patient's evolution was classified as an improvement, an aggravation or a case in which the condition was maintained. Using that information, conclusions were drawn regarding the population and the effect of the variables. The plots obtained allowed us to describe the patients' evolution and to see whether the variables considered were good descriptors of that evolution. A simple interpretation of the results allows us to conclude that the calculated clusters are good general, though not perfect, descriptors of the population. The research suggested for future work includes: (i) the study of the other candidate profiles produced by the Data Mining software; (ii) considering the different aspects of the evaluated functions at a more detailed level; (iii) taking into account other variables with possible effects on describing the final state of a patient.
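
    A minimal sketch of the pre-treatment profiling step might look as follows, assuming each patient is reduced to three scores (attention, memory, executive functions) on the 0-4 affectation scale; the toy data, the choice of k and the use of scikit-learn's KMeans are illustrative and not taken from the thesis.

        import numpy as np
        from sklearn.cluster import KMeans

        # Each row: one patient's attention, memory and executive-function scores
        # (0 = no affectation ... 4 = acute affectation). Illustrative values only.
        scores_pre = np.array([
            [0, 1, 0],
            [3, 4, 3],
            [2, 2, 1],
            [4, 3, 4],
            [1, 0, 1],
            [2, 3, 2],
        ])

        kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(scores_pre)
        print(kmeans.labels_)           # cluster (profile) assigned to each patient
        print(kmeans.cluster_centers_)  # average affectation profile of each cluster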

    Hardware-conscious query processing for the many-core era

    Making optimal use of modern hardware to accelerate database queries is no trivial task. Many DBMS as well as DSMS of the last decades are based on assumptions that hardly hold any more. One example is today's server systems, whose main-memory capacity can lie in the range of several terabytes and which have thus paved the way for main-memory databases. One of the bigger recent hardware trends is towards processors with a high number of cores, the so-called many-core CPUs. These allow programs a high degree of parallelism through multithreading as well as vectorization (SIMD), but considerably increase the demands on memory bandwidth. So-called high-bandwidth memory (HBM) tries to close this gap, but, like many-core CPUs, it can negate any performance advantage if used carelessly. This work presents the many-core CPU architecture together with HBM for accelerating database as well as data-stream queries. It is shown that a hardware-conscious cost model combined with a calibration approach can reliably predict the performance of various query operators. This enables both an adaptive partitioning and merging strategy for the parallelization of data-stream queries and an ideal configuration of join operations in a DBMS. Nevertheless, not every operation and application is suited to the use of a many-core CPU and HBM. Data-stream queries are often also bound to low latency and fast response times, which can hardly profit from higher memory bandwidth. In addition, the high core count of such CPUs usually comes with lower clock rates, and shared data structures bring disadvantages such as the cost of establishing cache coherence and of synchronizing parallel thread accesses. From the results of this work it can be derived which parallel data structures are particularly suited to the use of HBM. Furthermore, various techniques for parallelizing and synchronizing data structures are presented, whose efficiency is demonstrated using a multiway data-stream join. Exploiting the opportunities given by modern hardware for accelerating query processing is no trivial task. Many DBMS and also DSMS from past decades are based on fundamentals that have changed over time; for example, today's servers with terabytes of main-memory capacity allow spilling data to disk to be avoided entirely, which some time ago prepared the ground for main-memory databases. One of the recent trends in hardware is many-core processors with hundreds of logical cores on a single CPU, providing an intense degree of parallelism through multithreading as well as vectorized instructions (SIMD). Their demand for memory bandwidth has led to the further development of high-bandwidth memory (HBM) to overcome the memory wall. However, many-core CPUs as well as HBM have many pitfalls that can nullify any performance gain with ease. In this work, we explore the many-core architecture along with HBM for database and data-stream query processing. We demonstrate that a hardware-conscious cost model with a calibration approach allows reliable performance prediction of various query operations.
Based on that information, we can therefore arrive at an adaptive partitioning and merging strategy for stream query parallelization, as well as find an ideal parameter configuration for one of the most common tasks in the history of DBMS, join processing. However, not all operations and applications can exploit a many-core processor or HBM. Stream queries optimized for low latency and quick individual responses usually do not benefit much from more bandwidth and also suffer from penalties such as the low clock frequencies of many-core CPUs. Shared data structures between cores additionally lead to problems with cache coherence as well as high contention. Based on our insights, we give a rule of thumb for which data structures are suitable for parallelization with a focus on HBM usage. In addition, different parallelization schemes and synchronization techniques are evaluated, using a multiway stream join operation as the example.
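
    The following sketch illustrates the general shape of a calibration-based cost model for choosing a partition count, under the deliberately simplified assumption that latency is the sum of a per-tuple processing term and a per-partition merge term; the stand-in operators and the linear model are hypothetical placeholders, not the operators or cost model of the dissertation.

        import time
        import numpy as np

        def per_tuple_cost(run_operator, sample_sizes=(10_000, 100_000, 1_000_000)):
            # Calibration: time the operator on a few input sizes and fit a line;
            # the slope is the estimated cost per tuple on this machine.
            sizes, times = [], []
            for n in sample_sizes:
                start = time.perf_counter()
                run_operator(n)
                sizes.append(n)
                times.append(time.perf_counter() - start)
            return np.polyfit(sizes, times, deg=1)[0]

        def per_partition_overhead(run_merge_one, repeats=20):
            # Calibration: average fixed cost of merging one partition's result.
            start = time.perf_counter()
            for _ in range(repeats):
                run_merge_one()
            return (time.perf_counter() - start) / repeats

        def choose_partitions(n_tuples, c_tuple, c_merge, max_threads=64):
            # Pick the partition count minimising the predicted latency
            # n/p * c_tuple + p * c_merge (a deliberately simplified model).
            candidates = np.arange(1, max_threads + 1)
            predicted = n_tuples / candidates * c_tuple + candidates * c_merge
            return int(candidates[np.argmin(predicted)])

        # Toy stand-ins for a partition worker and the per-partition merge step.
        c_tuple = per_tuple_cost(lambda n: sum(i * i for i in range(n)))
        c_merge = per_partition_overhead(lambda: sorted(range(10_000)))
        print(choose_partitions(10_000_000, c_tuple, c_merge))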

    Propuesta metodológica para el cálculo de las penalidades por giro en modelos de accesibilidad

    This master's thesis seeks to develop a methodology for calculating the turn penalties to be used in accessibility models and, in general, in transport models, given the use of shortest-path algorithms that include turn penalties and restrictions in the calculation of travel times on the road network; among these models is the global mean accessibility, used in topics such as urban and transport planning in Manizales (Colombia) and in different cities around the world. In Manizales, the turn penalties and restrictions have been determined subjectively, so no value calculated with a scientific method is available. Therefore, the turn penalties and restrictions for the city of Manizales will be calculated by quantifying vehicle turning times at several road intersections, chosen through a prioritization analysis, recording a video at each one. With these data, the average left- and right-turn times, that is, the turn penalties for Manizales, can be obtained for use in the accessibility models calculated in the city or in transport models in general. The penalties calculated with this methodology will be compared with the penalties used in previous research through the savings gradient, which allows us to quantify the differences generated by this input and its importance in transport models, including accessibility. Abstract: This master's degree thesis seeks to develop a methodology for the calculation of the turn penalties to be used in accessibility models and, in general, in transport models, given the recent use of shortest-path algorithms, including turn penalties and restrictions, for the calculation of travel times in the road network; among them the global mean accessibility, used in issues such as urban and transport planning in Manizales (Colombia) and different cities around the world. In Manizales, the turn penalties and restrictions used in accessibility models have been determined in a subjective way, so they are not calculated from a scientific method. Therefore, the turn penalties and restrictions for Manizales will be calculated by quantifying the turning times of vehicles at different road intersections, chosen from a prioritization analysis and recording a video at each one. With these data we can obtain the average time to turn left and right, that is, the turn penalties for Manizales to be used in the accessibility models calculated for the city or, in general, in transport models. The penalties calculated using this methodology will be compared with the penalties used in previous investigations through the savings gradient, which allows us to quantify the differences generated by this input and its importance in transport models, including accessibility.
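
    A small sketch of how such measured penalties enter a shortest-path computation is shown below: Dijkstra's algorithm is run on an edge-expanded graph so that a penalty can be charged for the turn between consecutive edges. The graph, the penalty values and the function names are illustrative placeholders rather than the thesis' data.

        import heapq
        from collections import defaultdict

        def shortest_path_with_turn_penalties(edges, turn_penalty, source, target):
            # States are directed edges (u, v), so the cost of turning from (u, v)
            # onto (v, w) can be added. `edges` maps (u, v) -> travel time;
            # `turn_penalty(u, v, w)` returns the measured turn penalty.
            out_edges = defaultdict(list)
            for (u, v), t in edges.items():
                out_edges[u].append((v, t))
            # Seed the queue with every edge leaving the source node.
            heap = [(t, source, v) for v, t in out_edges[source]]
            heapq.heapify(heap)
            best = {}
            while heap:
                cost, u, v = heapq.heappop(heap)
                if (u, v) in best:
                    continue
                best[(u, v)] = cost
                if v == target:
                    return cost
                for w, t in out_edges[v]:
                    if (v, w) not in best:
                        heapq.heappush(heap, (cost + turn_penalty(u, v, w) + t, v, w))
            return float("inf")

        # Illustrative use: 8 s for one left turn, 3 s otherwise (placeholder values,
        # not the thesis' measured averages).
        edges = {("A", "B"): 20, ("B", "C"): 15, ("B", "D"): 15, ("D", "C"): 10}
        penalty = lambda u, v, w: 8 if (u, v, w) == ("A", "B", "C") else 3
        print(shortest_path_with_turn_penalties(edges, penalty, "A", "C"))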

    Visual-auditory visualisation of dynamic multi-scale heterogeneous objects.

    The analysis of multi-scale phenomena is an area of active research that connects simulations with experiments to gain correct insight into compound dynamic structures. Visualisation is a challenging task due to the large amount of data and the wide range of complex data representations involved. The analysis of dynamic multi-scale phenomena requires a combination of geometric modelling and rendering techniques for analysing changes in the internal structure when the data come from different sources of various natures. Moreover, the area often runs into the limitations of purely visual data representation and considers the introduction of other sensory stimuli as a well-known tool to enhance visual analysis. However, there is a lack of software tools that allow advanced real-time analysis of the properties of heterogeneous phenomena. Hardware-accelerated volume rendering gives insight into the internal structure of complex multi-scale phenomena; the technique is convenient for detailed visual analysis, highlights features of interest in complex structures, and is itself an area of active research. However, conventional volume visualisation is limited to transfer functions that operate on homogeneous material and, as a result, does not provide the flexibility in geometry and material-distribution modelling that is crucial for the analysis of heterogeneous objects. Moreover, the extension to visual-auditory analysis makes it necessary to review the entire conventional volume visualisation pipeline. Multi-sensory feedback depends heavily on modern hardware and software advances for real-time modelling and evaluation. In this work, we explore the design of visual-auditory pipelines for the analysis of dynamic multi-scale properties of heterogeneous objects, which can overcome well-known problems of purely visual analysis of complex representations. We consider the similarities between light and sound propagation as a solution to the problem. The approach benefits from a combination of GPU-accelerated ray-casting with the modelling of geometry and of optical and auditory properties. We discuss how applying modern GPU techniques in those areas allows us to introduce a unified approach to the visual-auditory analysis of dynamic multi-scale heterogeneous objects. Similarly to the conventional volume rendering technique based on light propagation, we model auditory feedback as the result of an initial impulse propagating through 3D space, with its digital representation as a sampled sound wave obtained with the ray-casting procedure. The auditory stimuli can complement visual ones in the analysis of dynamic multi-scale heterogeneous objects. We propose a framework that facilitates the design of visual-auditory pipelines for dynamic multi-scale heterogeneous objects and discuss its application in two case studies. The first studies molecular phenomena resulting from molecular dynamics simulation and quantum simulation. The second explores microstructures in digital fabrication with arbitrary irregular lattice structures. For the case studies considered, the visual-auditory techniques facilitate the interactive analysis of both the spatial structure and the internal multi-scale volumetric properties of complex heterogeneous objects. A GPU-accelerated framework for the visual-auditory analysis of heterogeneous objects can be applied and extended beyond this research.
Thus, to specify the main direction of such an extension from the point of view of potential users, to strengthen the value of this research and to evaluate the envisaged application of the techniques described above, we carry out a preliminary evaluation. The user study aims to compare our expectations of the visual-auditory approach with the views of the potential users of the system if it were implemented as a software product. The preliminary evaluation study was carried out under the limitations imposed by the 2020/2021 restrictions. Nevertheless, it confirms that the main direction for the visual-auditory analysis of heterogeneous objects has been identified correctly and that visual and auditory stimuli can complement each other in the analysis of both the volume and spatial-distribution properties of heterogeneous phenomena. The user reviews also highlight the enhancements that should be introduced to the approach, namely the design of more complex user interfaces and the consideration of additional application cases. To give a more detailed picture of the evaluation results and the recommendations made, we also identify the key factors that shape the users' vision of how the approach could be further enhanced and where it could be applied, such as their experience in the analysis of complex physical phenomena or in the multi-sensory area. The aspects of the heterogeneous-object analysis task discussed in this work, together with the theoretical and practical solutions, allow the results to be applied, further developed and enhanced in the multidisciplinary areas of GPU-accelerated high-performance visualisation pipeline design and multi-sensory analysis.
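
    As a rough CPU-side illustration of deriving a sound clip from a ray-casting pass, the following numpy sketch marches a ray through a scalar volume and treats the attenuated densities along the ray as a sampled waveform; the toy volume, the attenuation model and the sampling rate are illustrative assumptions, not the GPU framework described in the thesis.

        import numpy as np

        def raycast_waveform(volume, origin, direction, n_steps=256, sample_rate=8_000):
            # March a ray through the volume and treat the sequence of sampled
            # densities as an impulse response, i.e. a short sound clip, so that
            # internal structure along the ray becomes audible (conceptual sketch).
            direction = np.asarray(direction, float)
            direction /= np.linalg.norm(direction)
            positions = np.asarray(origin, float) + np.outer(np.arange(n_steps), direction)
            idx = np.clip(positions.round().astype(int), 0, np.array(volume.shape) - 1)
            densities = volume[idx[:, 0], idx[:, 1], idx[:, 2]]
            # Simple attenuation: deeper samples contribute less, like light absorption.
            wave = densities * np.exp(-0.01 * np.arange(n_steps))
            wave = wave - wave.mean()
            peak = np.max(np.abs(wave))
            return (wave / peak if peak > 0 else wave), sample_rate

        # Toy heterogeneous volume: two nested materials of different density.
        vol = np.zeros((64, 64, 64))
        vol[16:48, 16:48, 16:48] = 0.5
        vol[28:36, 28:36, 28:36] = 1.0
        wave, rate = raycast_waveform(vol, origin=(0, 32, 32), direction=(1, 0, 0))
        print(len(wave), rate)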

    Ausgezeichnete Informatikdissertationen 2017
