
    Análise e previsão de acidentes rodoviários usando data mining (Analysis and prediction of road traffic crashes using data mining)

    Road traffic crashes are a major problem in today's society, causing significant loss of life and property. With worldwide urbanization and population growth, the number of crashes is also increasing. Predicting a crash's severity and cost is an important step toward understanding which causative variables have the most influence and, therefore, toward implementing prevention measures that can reduce the number of crashes. Road traffic crash prediction is a complex problem due to the high number of independent causative variables that contribute to each event. The dataset used contains crashes that occurred in the State of Iowa in recent years. Feature selection and data cleaning techniques are applied to improve data quality and enhance the learning process. Previous research in the road safety field applied approaches that led to unsatisfactory results; recent studies based on more complex approaches, such as neural networks, achieved better results. The work in this document is based on deep learning, studying how deep neural networks can improve previous results on road traffic crash prediction, taking causative variables as input. Various models are built using different optimization and activation functions, and the evaluation is based on a comparison of these models. Road traffic crashes are one of the greatest problems of today's society, with a large social and economic impact. Beyond the enormous number of people injured and killed in such events (road crashes are considered one of the leading causes of death worldwide, and the leading one among young adults), the prevention and resulting costs of road crashes also account for a considerable share of state budgets. A set of variables is involved in these events that makes them possible to predict and avoid, such as the presence of alcohol, the lighting at the location, and the condition of the road.
Understanding the impact of these variables makes it possible to establish logical relations between their values and the severity and costs of a crash, enabling the implementation of more efficient prevention measures. However, given the high number of variables to consider, this is a complex problem. Although the problem is global, this document focuses on a more specific context: the state of Iowa in the United States of America. The dataset used was collected by the Iowa Department of Transportation and contains the environmental variables, severity, and cost of the road crashes that occurred in recent years. The number of records is high, which allows for a wide variety of scenarios. However, the data contain some gaps (values that were not collected) and, in some scenarios, are not balanced. Several data preprocessing techniques, such as cleaning and transformation, are applied to overcome this problem. The data analysis also makes it possible to identify which fields are of no interest in the context of this problem; these are removed, reducing the size of the dataset. The area of road crash prevention and prediction using data mining techniques has been explored before. The application of more classical models (such as probabilistic and search-based models) did not achieve fully satisfactory results. In more recent studies, where techniques with greater computational power were applied (optimization-based methods), the results were better. Accordingly, and taking into account the conclusions of the studies referenced in the literature, this document addresses how the use of deep learning, a technique based on deep neural networks with high computational power, can improve previously obtained results.
To that end, several models are implemented to predict the severity and cost of a crash using neural networks. The configuration of the models varies, using different cost and activation functions, in order to explore which approaches work best for these problems. To streamline the development process, a deep learning framework, TensorFlow, is also used. Besides its flexibility and its capacity to implement varied architectures, this framework provides a high level of abstraction over the neural network training process. Its adoption was also motivated by its open-source community, which guarantees the maintenance and optimization of the framework in the future. Results on the use of such frameworks for training neural networks in the context of road crashes are not yet conclusive, which is a factor to take into account during the development of the project. The developed models are then compared using metrics such as Accuracy and AUC (Area Under the Curve), with holdout validation used to check whether the obtained results are valid. Two datasets, one for training and one for testing, are used to evaluate the solution.
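A miniature version of the setup described above can be sketched as follows. This is an illustrative assumption (NumPy instead of TensorFlow, synthetic stand-in data instead of the Iowa dataset), not the thesis's implementation: a one-hidden-layer network with a ReLU hidden layer and sigmoid output, trained by cross-entropy gradient descent and evaluated with holdout validation.

```python
import numpy as np

# Synthetic stand-ins for causative variables and a binary severity label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))
w_true = rng.normal(size=6)
y = (X @ w_true + 0.1 * rng.normal(size=1000) > 0).astype(float)

split = 800                                  # holdout: 800 train / 200 test
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(scale=0.1, size=(6, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=16);      b2 = 0.0

lr = 0.1
for _ in range(300):
    h = np.maximum(X_tr @ W1 + b1, 0.0)     # ReLU activation
    p = sigmoid(h @ W2 + b2)                # predicted severity probability
    g = (p - y_tr) / len(y_tr)              # cross-entropy output gradient
    gh = np.outer(g, W2) * (h > 0)          # backprop through ReLU
    W2 -= lr * (h.T @ g);      b2 -= lr * g.sum()
    W1 -= lr * (X_tr.T @ gh);  b1 -= lr * gh.sum(axis=0)

h_te = np.maximum(X_te @ W1 + b1, 0.0)
acc = float(np.mean((sigmoid(h_te @ W2 + b2) > 0.5) == y_te))
print(f"holdout accuracy: {acc:.2f}")
```

Swapping the activation (e.g. sigmoid for ReLU) or the update rule is exactly the kind of configuration variation the abstract describes comparing.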

    Exploration and Coverage with Swarms of Settling Agents

    We consider several algorithms for exploring and filling an unknown, connected region with simple, airborne agents. The agents are assumed to be identical, autonomous, and anonymous, and to have a finite amount of memory. The region is modeled as a connected subset of a regular grid composed of square cells. The algorithms described herein are suited to Micro Air Vehicles (MAVs), since these air vehicles have unobstructed views of the ground below and can move freely in space at various heights. The agents explore the region by applying various action rules based on locally acquired information. Some of them may settle in unoccupied cells as the exploration progresses. Settled agents become virtual pheromones for the exploration and coverage process: beacons that subsequently aid the remaining, still-exploring mobile agents. We introduce a backward-propagating information diffusion process as a way to implement a deterministic indicator of process termination and to guide the mobile agents. For the proposed algorithms, complete coverage of the region in finite time is guaranteed when the size of the region is fixed. Bounds on the coverage times are also derived. Extensive simulation results exhibit good agreement with the theoretical predictions.
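The coverage guarantee can be made concrete with a toy model. The sketch below is a simplified illustration of coverage-by-settling on a grid region (a plain BFS sweep, not the paper's MAV action rules): each newly reached cell is occupied by a settling agent until the whole connected region is covered.

```python
from collections import deque

# region is a set of (x, y) grid cells; cells are settled in the order reached.
def cover(region, start):
    settled, frontier, order = {start}, deque([start]), [start]
    while frontier:
        x, y = frontier.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in region and nxt not in settled:
                settled.add(nxt)    # a mobile agent settles here and becomes
                order.append(nxt)   # a beacon guiding the remaining agents
                frontier.append(nxt)
    return order

# A 3x3 region with an obstacle in the middle is fully covered in finite time.
region = {(x, y) for x in range(3) for y in range(3)} - {(1, 1)}
assert set(cover(region, (0, 0))) == region
```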

    A Shark Conservationist's Toolbox: Current DNA Methods and Techniques Aiding in the Conservation of Sharks

    Elasmobranchs are important members of their communities. Many sharks are apex predators that help maintain the health of their ecosystems. However, shark populations are declining globally. This is partly because sharks are highly targeted for their fins, meat, liver oil, teeth, and skin, but they are also killed by anthropogenic effects such as habitat destruction and pollution. Most shark species have life history characteristics that make them especially vulnerable to overfishing. Sharks are also difficult to study due to their elusive nature and identification issues, which is why molecular tools are becoming increasingly important for studying them. This paper discusses four types of molecular tools: mitochondrial and nuclear DNA, environmental DNA, sequence-based tools, and PCR-based tools. All of these techniques are currently being used to help study and conserve sharks, and they can obtain important ecological information for a given species. The majority of the research has been conducted on species identification. Specifically, these tools can be used to identify a particular species of importance, to classify the global fin trade, or even to identify species in highly processed samples. Species identification is not the only useful information that can be obtained, however. Molecular tools can also help us better understand the species composition, stock structure, mating system, or population size of a given area. Molecular tools are still a growing area of research; in the future these techniques will continue to improve, and the information we can learn from them will continue to grow. One of the biggest hurdles for this type of research is a lack of communication between geneticists, fishery managers, and policy makers. Molecular tools have the potential to inform current and future policy and management.
That is why it is important for anyone interested in the conservation of elasmobranchs to have a better understanding of molecular techniques.
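As a toy illustration of the sequence-based identification the paper surveys, consider matching a query fragment against reference barcodes by per-base similarity. The species names are real, but the sequences and the matching rule are invented placeholders, far simpler than real barcoding pipelines (e.g. BLAST searches against COI reference databases).

```python
# Made-up reference "barcodes" (not real shark sequences).
references = {
    "Carcharodon carcharias": "ACGTTGCAACGGTACT",
    "Sphyrna lewini":         "ACGATGCTACGGTTCT",
}

def identify(query):
    """Return the reference species with the highest per-base similarity."""
    def similarity(a, b):
        return sum(x == y for x, y in zip(a, b)) / len(a)
    return max(references, key=lambda sp: similarity(references[sp], query))

# A degraded sample (e.g. from the fin trade) with one read error still
# matches the closest reference.
assert identify("ACGTTGCAACGGTACG") == "Carcharodon carcharias"
```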

    Monitoring and analysis system for performance troubleshooting in data centers

    It was not long ago. On Christmas Eve 2012, a war of troubleshooting began in Amazon's data centers. It started at 12:24 PM with a mistaken deletion of the state data of the Amazon Elastic Load Balancing service (ELB for short), which was not noticed at the time. The mistake first led to a local issue in which a small number of ELB service APIs were affected. In about six minutes, it evolved into a critical one in which EC2 customers were significantly affected. For example, Netflix, which was using hundreds of Amazon ELB instances, experienced an extensive streaming service outage in which many customers could not watch TV shows or movies on Christmas Eve. It took Amazon engineers 5 hours and 42 minutes to find the root cause, the mistaken deletion, and another 15 hours and 32 minutes to fully recover the ELB service. The war ended at 8:15 AM the next day and brought performance troubleshooting in data centers to the world's attention. As the Amazon ELB case shows, troubleshooting runtime performance issues is crucial in time-sensitive multi-tier cloud services because of their stringent end-to-end timing requirements, but it is also notoriously difficult and time consuming. To address the troubleshooting challenge, this dissertation proposes VScope, a flexible monitoring and analysis system for online troubleshooting in data centers. VScope provides primitive operations that data center operators can use to troubleshoot various performance issues. Each operation is essentially a series of monitoring and analysis functions executed on an overlay network. We design a novel software architecture for VScope so that the overlay networks can be generated, executed, and terminated automatically, on demand. On the troubleshooting side, we design novel anomaly detection algorithms and implement them in VScope; by running them, data center operators are notified when performance anomalies happen.
We also design a graph-based guidance approach, called VFocus, which tracks the interactions among hardware and software components in data centers. VFocus provides primitive operations by which operators can analyze these interactions to find out which components are relevant to a performance issue. VScope's capabilities and performance are evaluated on a testbed with over 1000 virtual machines (VMs). Experimental results show that the VScope runtime negligibly perturbs system and application performance, and requires mere seconds to deploy monitoring and analytics functions on over 1000 nodes. This demonstrates VScope's ability to support fast operation and online queries against a comprehensive set of application- to system/platform-level metrics, and a variety of representative analytics functions. When supporting algorithms with high computational complexity, VScope serves as a 'thin layer' that accounts for no more than 5% of their total latency. Further, by using VFocus, VScope can locate problematic VMs that cannot be found via application-level monitoring alone, and in one of the use cases explored in the dissertation it operates with over 400% less perturbation than brute-force and most sampling-based approaches. We also validate VFocus with real-world data center traces; the experimental results show that VFocus has a troubleshooting accuracy of 83% on average.
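As a minimal sketch of the kind of online anomaly detection such a system might run over a metric stream (an assumption for illustration, not VScope's actual algorithms): flag a sample when it deviates from the mean of the preceding window by more than k standard deviations.

```python
import statistics

# Sliding-window z-score detector over a metric stream.
def detect_anomalies(stream, window=10, k=3.0):
    anomalies = []
    for i in range(window, len(stream)):
        hist = stream[i - window:i]
        mu = statistics.fmean(hist)
        sd = statistics.pstdev(hist) or 1e-9   # guard against a flat window
        if abs(stream[i] - mu) > k * sd:
            anomalies.append(i)
    return anomalies

# A latency spike at index 11 is the only flagged sample.
latency_ms = [10, 11, 9, 10, 12, 10, 11, 10, 9, 11, 10, 95, 10, 11]
assert detect_anomalies(latency_ms) == [11]
```

In a VScope-style deployment, a function like this would be one of the monitoring/analysis stages executed on the overlay network, with the alert routed back to the operator.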

    Byzantine fault-tolerant agreement protocols for wireless ad hoc networks

    Doctoral thesis, Informática (Ciências da Computação), Universidade de Lisboa, Faculdade de Ciências, 2010. This thesis investigates the problem of fault- and intrusion-tolerant consensus in resource-constrained wireless ad hoc networks. This is a fundamental problem in distributed computing because it abstracts the need to coordinate activities among various nodes, and it has been shown to be a building block for several other important distributed computing problems, such as state-machine replication and atomic broadcast. The thesis begins with a thorough performance assessment of existing intrusion-tolerant consensus protocols, which shows that the performance bottlenecks of current solutions are in part related to their system modeling assumptions. Based on these results, the communication failure model is identified as a model that simultaneously captures the reality of wireless ad hoc networks and allows the design of efficient protocols. Unfortunately, the model is subject to an impossibility result stating that there is no deterministic algorithm that allows n nodes to reach agreement if more than n − 2 transmission omission failures can occur in a communication step. This result holds even under strict timing assumptions (i.e., a synchronous system). The thesis applies randomization techniques in increasingly weaker variants of this model until an efficient intrusion-tolerant consensus protocol is achieved. The first variant simplifies the problem by restricting the number of nodes that may be at the source of a transmission failure in each communication step. An algorithm is designed that tolerates f dynamic nodes at the source of faulty transmissions in a system with a total of n ≥ 3f + 1 nodes. The second variant imposes no restrictions on the pattern of transmission failures. The proposed algorithm effectively circumvents the Santoro-Widmayer impossibility result for the first time: it allows k of n nodes to decide despite up to ⌈n/2⌉(n − k) + k − 2 omission failures per communication step. This algorithm also has the interesting property of guaranteeing safety during arbitrary periods of unrestricted message loss. The final variant shares the properties of the previous one, but relaxes the model in the sense that the system is asynchronous and a static subset of nodes may be malicious. The resulting algorithm, called Turquois, admits f < n/3 malicious nodes and ensures progress in communication steps where the number of omission failures does not exceed ⌈(n − f)/2⌉(n − k − f) + k − 2. The algorithm is subject to a comparative performance evaluation against other intrusion-tolerant protocols. The results show that, as the system scales, Turquois outperforms the other protocols by more than an order of magnitude.
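The faults-per-step bounds quoted in this abstract can be sanity-checked numerically. The formulas as written below are my reading of the abstract's (partly garbled) notation, so treat them as an assumption rather than a verified statement of the thesis's results.

```python
import math

def omission_bound(n, k):
    """Max omission faults per step while k of n nodes can still decide."""
    return math.ceil(n / 2) * (n - k) + k - 2

# Requiring all n nodes to decide (k = n) collapses the bound to n - 2:
# exactly the Santoro-Widmayer impossibility threshold.
assert omission_bound(10, 10) == 8
# Relaxing to k = 6 deciders out of 10 tolerates many more faults per step.
assert omission_bound(10, 6) == 24
```

The consistency at k = n is a useful cross-check: the k-decider bound degenerates to the classical deterministic limit when everyone must decide.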

    Estimating the trauma-death interval: a histological investigation of fracture healing

    The accurate, reliable estimation of the 'age' of a fracture, or the time elapsed since trauma was sustained, has important implications in a variety of forensic contexts. Such information could greatly aid the forensic diagnosis of child abuse, aid the reconstruction of events during a violent incident such as a homicide or a road traffic accident, and assist in the identification of unknown remains. Forensic fracture dating has largely relied on radiographic and histological evidence, but has lacked precision and consistency. The research presented here aims to test the hypothesis that correlations exist between the histologically and immunohistochemically observable phenomena at a fracture site and the known trauma-death interval of an individual. This was achieved by comparing the known trauma-death interval (TDI) to the extent of healing visible on histological slides prepared from formalin-fixed, paraffin-embedded, decalcified blocks of bone excised from the fracture sites of 52 rib, skull, and femur fractures from 29 individual forensic cases submitted to the Medico-Legal Centre, Sheffield, between 1992 and 2002. The slides were stained with haematoxylin and eosin to stain nuclei and cytoplasm, Perls' Prussian blue for haemosiderin granules, monoclonal anti-CD68 antibody for osteoclasts, and anti-bone sialoprotein antibody as an osteoblast and osteocyte marker. Quantifiable parameters such as the percentage cover of red blood cells and of living and necrotic compact bone, and the size, abundance, and dispersal of immuno-positive and inflammatory cells, were examined and compared to the TDI using human observers and Scion Image histomorphometry software. Statistically significant correlations were found between the TDI and the presence of haemosiderin granules later than three days post-trauma, the dispersal and location of CD68-positive cells, and the estimated percentage cover of fibroblasts and red blood cells at the fracture site.
Other trends and correlations were found, which contribute to the understanding of bone's immediate responses to trauma. It is hoped that this research may aid the prediction of the time elapsed since trauma in a forensic context and broaden the scope of trauma analysis in forensic anthropology.

    Multi-stage stochastic optimization and reinforcement learning for forestry epidemic and COVID-19 control planning

    This dissertation focuses on developing new modeling and solution approaches based on multi-stage stochastic programming and reinforcement learning for tackling biological invasions in forests and human populations. The Emerald Ash Borer (EAB) is the nemesis of ash trees. This research introduces a multi-stage stochastic mixed-integer programming model to assist forest agencies in managing emerald ash borer infestations throughout the U.S. and to maximize the public benefits of preserving healthy ash trees. This work is then extended to present the first risk-averse multi-stage stochastic mixed-integer program in the invasive species management literature, to account for extreme events. Significant computational gains are obtained using a scenario dominance decomposition and cutting plane algorithm. The results of this work provide crucial insights and decision strategies for optimal resource allocation among surveillance, treatment, and removal of ash trees, leading to a better and healthier environment for future generations. The dissertation also addresses the computational difficulty of solving one of the hardest classes of combinatorial optimization problems, the Multi-Dimensional Knapsack Problem (MKP). A novel 2-Dimensional (2D) deep reinforcement learning (DRL) framework is developed to represent and solve combinatorial optimization problems, focusing on the MKP. The DRL framework trains different agents to make sequential decisions and find the optimal solution while still satisfying the resource constraints of the problem. To our knowledge, this is the first DRL model of its kind in which a 2D environment is formulated and an element of the DRL solution matrix represents an item of the MKP. Our DRL framework solves medium-sized and large-sized instances at least 45 and 10 times faster in CPU solution time, respectively, with a maximum solution gap of 0.28% compared to the solution performance of CPLEX.
This methodology is then applied to another recent epidemic problem: COVID-19. This research investigates a reinforcement learning approach, coupled with an agent-based simulation model of disease growth, to optimize decision-making during an epidemic. The framework is validated using COVID-19 data from the Centers for Disease Control and Prevention (CDC). The results provide important insights into government responses to COVID-19 and vaccination strategies.
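To make the MKP itself concrete, here is a brute-force reference solver for a toy instance (not the dissertation's DRL framework): each item has a value and consumes capacity in several resource dimensions, and we seek the value-maximizing feasible subset.

```python
from itertools import product

# Toy MKP instance: 4 items, 2 resource dimensions.
values   = [6, 10, 12, 7]
weights  = [[1, 2, 3, 2],    # resource 0 consumed by each item
            [3, 1, 2, 2]]    # resource 1 consumed by each item
capacity = [5, 5]

best_value, best_pick = 0, ()
for pick in product([0, 1], repeat=len(values)):   # enumerate all subsets
    feasible = all(
        sum(w[i] * pick[i] for i in range(len(values))) <= c
        for w, c in zip(weights, capacity)
    )
    if feasible:
        v = sum(values[i] * pick[i] for i in range(len(values)))
        if v > best_value:
            best_value, best_pick = v, pick
```

Enumeration is exponential in the number of items, which is precisely why the dissertation turns to a learned policy for medium and large instances; a DRL agent instead builds the pick vector one item at a time while respecting the same capacity constraints.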

    Impacts of invasive Opuntia cacti on wild mammals in Kenya

    In this thesis, I explored the impacts of invasive plants on animal behaviour, using the invasion of Opuntia cacti in Laikipia County, Kenya, as a specific case study. In the opening chapter, I introduced the topic of biological invasions, addressing essential background material and identifying key knowledge gaps. In the second chapter, I focused on the impacts of invasive plants on animal behaviour, an important – yet neglected – topic. I synthesised the disparate literature on invasive plants’ behavioural impacts within a novel mechanistic framework, revealing that invasive plants can cause profound behavioural changes in native animals, with ecological consequences at multiple scales. I also found that environmental context played an important role in moderating how an invader’s modes of impact translate into behavioural changes in native species, and how these behavioural changes then generate ecological impacts. Finally, I identified priority research questions relating to the behavioural impacts of invasive plants. Invasive plants’ behavioural impacts can manifest as changes to the occurrence patterns of native animals. In Chapter 3, I used simulations to explore model selection in occupancy models, which are a powerful tool for studying the patterns and drivers of occurrence. Specifically, I investigated the consequences of collider bias – a type of confounding that can arise when adding explanatory variables to a model – for model selection using the Akaike Information Criterion (AIC) and Schwarz Criterion (or Bayesian Information Criterion, BIC). I found that the effect of collider bias, and consequently the inferential and predictive accuracy of the AIC/BIC-best model, depended on whether the collider bias was present in the occupancy or detection data-generating process. 
My findings illustrate the importance of distinguishing between inference and prediction in ecological modelling and have more general implications for the use of information criteria in all linear modelling approaches. In Chapter 4, I applied the mechanistic framework from Chapter 2 and the modelling conclusions from Chapter 3 to the problem of understanding Opuntia's behavioural impacts in Laikipia County. Specifically, I used camera traps to explore the effects of Opuntia on occupancy and activity for eight key mammal species. I found that the effects of Opuntia varied among mammal species and depended on the spatial scale of the Opuntia cover covariate. These findings have important implications for the conservation of endangered mammal species in the region, the future spread of Opuntia through seed dispersal, and interactions between wildlife and local communities. In Chapter 5, I addressed key knowledge gaps pertaining to Opuntia's biotic interactions with native animals. First, I quantified the relationship between height and fruiting in O. engelmannii and O. stricta, finding that height was positively related to fruiting for both species and that the relationship was stronger for O. engelmannii than for O. stricta. I also found that local habitat variables were related to height and/or fruiting in both Opuntia species. Second, I documented the interactions between animals and Opuntia using camera traps. In so doing, I confirmed the role of interactions previously thought to be important, while also highlighting interactions that have received little attention in the published scientific literature.
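As a generic illustration of information-criterion model selection of the sort Chapter 3 examines (standard AIC for Gaussian linear regression on simulated data, not the thesis's occupancy-model simulations): AIC = 2k − 2·log-likelihood, so an extra parameter must buy enough fit to be worth its penalty.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
y = 2.0 * x1 + rng.normal(size=n)   # the true model uses x1 only

def aic(covariates, y):
    """AIC = 2k - 2 max log-likelihood for an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(covariates))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / len(y)
    k = X.shape[1] + 1              # regression coefficients + error variance
    loglik = -0.5 * len(y) * (math.log(2 * math.pi * sigma2) + 1)
    return 2 * k - 2 * loglik

# A covariate with a real effect lowers AIC despite its parameter penalty.
assert aic([x1], y) < aic([], y)
```

The collider-bias point in Chapter 3 is about what happens when an added covariate improves apparent fit while distorting the causal estimate; AIC/BIC rank models by fit and parsimony alone, so the "best" model by this criterion need not be the best model for inference.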