Hoeffding Tree Algorithms for Anomaly Detection in Streaming Datasets: A Survey
This survey aims to deliver an extensive and well-structured overview of machine learning for detecting anomalies in streaming datasets. The objective is to assess the effectiveness of Hoeffding Trees as a machine learning solution for detecting anomalies in streaming cyber datasets. We categorize the existing research on Hoeffding Trees that is feasible for this type of study into three groups: distributed Hoeffding Trees, ensembles of Hoeffding Trees, and existing techniques that use Hoeffding Trees for anomaly detection. These categories are referred to as compositions within this paper and were selected based on their relation to streaming data and the flexibility of their techniques for use within different domains of streaming data. We discuss how combining the techniques proposed within these compositions can address the anomaly detection problem in streaming cyber datasets. The goal is to show how a combination of techniques from different compositions can solve a prominent problem, anomaly detection.
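The abstract does not spell out how a Hoeffding Tree decides when to split, so the following is only a minimal sketch of the standard Hoeffding bound on which such trees are based. The function names, parameter names, and example values are illustrative assumptions and do not come from the survey.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound: with probability 1 - delta, the true mean of a
    random variable with the given range lies within this distance of the mean
    observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Hypothetical helper: split a leaf once the observed gap between the two
    best split candidates exceeds the Hoeffding bound for n examples."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Assumed example values: gain range 1.0, delta 1e-7, 1000 examples at the leaf.
    print(hoeffding_bound(1.0, 1e-7, 1000))
    print(should_split(0.30, 0.15, 1.0, 1e-7, 1000))
```

Because the bound shrinks as more examples reach a leaf, a streaming learner of this kind can commit to a split after seeing only a bounded sample, which is what makes the approach attractive for streaming data.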
Vector-borne pathogens found in carnivores in wild Namibia
Integrated Master's Dissertation in Veterinary Medicine. This dissertation aimed to identify and molecularly characterize vector-borne pathogens from several parasite families, all possessing stages found in peripheral blood, from a wide variety of free-ranging carnivores living in Namibia, in the southern part of Africa.
Blood samples collected from 9 bat-eared foxes (Otocyon megalotis), 17 brown hyenas (Parahyaena brunnea), 19 spotted hyenas (Crocuta crocuta) and 85 cheetahs (Acinonyx jubatus) were screened by Polymerase Chain Reactions (PCRs) and tested for pathogens of the Onchocercidae family, the order Piroplasmida, bacteria from the Anaplasmataceae and the Rickettsiaceae families and, lastly, the Hepatozoidae family. The PCRs targeted both the ITS-2 and 12S, 18S, 16S, 18S and 18S rRNA genes respectively and were followed by nucleotide sequencing.
Across all sampled animals, 43.1% were positive for Onchocercidae, 67.7% for Piroplasmida, 60% for Anaplasmataceae, 10% for Rickettsiaceae and 47.7% for Hepatozoidae.
The filaroid sequences obtained showed high homology with both Acanthocheilonema reconditum and Acanthocheilonema dracunculoides, and further phylogenetic analyses, including the construction of a phylogenetic tree, were performed for both brown and spotted hyenas. Piroplasmida results were not studied any further. For Anaplasmataceae, subsequent sequencing results indicated high similarity with both Anaplasma phagocytophilum and Anaplasma platys, and varied PCR protocols were conducted in order to differentiate between these organisms, but no conclusions were reached. The Rickettsiaceae found displayed high homology with Rickettsia raoultii. Finally, the Hepatozoidae infection proved to be mixed, involving both Hepatozoon canis and Hepatozoon felis.
These results are important not only on a conservation level for the infected host species, but are also relevant for domestic animals coexisting in the surrounding areas, as well as humans, especially since a few of the parasites found may have zoonotic potential. Future studies should focus on understanding vectors, transmission routes, infection dynamics and host specificity in order to better evaluate the possible danger these infections may pose.
A Survey on Big Data for Network Traffic Monitoring and Analysis
Network Traffic Monitoring and Analysis (NTMA) represents a key component for network management, especially to guarantee the correct operation of large-scale networks such as the Internet. As the complexity of Internet services and the volume of traffic continue to increase, it becomes difficult to design scalable NTMA applications. Applications such as traffic classification and policing require real-time and scalable approaches. Anomaly detection and security mechanisms must quickly identify and react to unpredictable events while processing millions of heterogeneous events. Finally, the system has to collect, store, and process massive sets of historical data for post-mortem analysis. Those are precisely the challenges faced by general big data approaches: Volume, Velocity, Variety, and Veracity. This survey brings together NTMA and big data. We catalog previous work on NTMA that adopts big data approaches to understand to what extent the potential of big data is being explored in NTMA. This survey mainly focuses on approaches and technologies to manage the big NTMA data, and additionally briefly discusses big data analytics (e.g., machine learning) for the sake of NTMA. Finally, we provide guidelines for future work, discussing lessons learned and research directions.
Towards Efficient Intrusion Detection using Hybrid Data Mining Techniques
The enormous development in the connectivity among different types of networks poses significant concerns in terms of privacy and security. Likewise, the exponential expansion in the deployment of cloud technology has produced a massive amount of data from a variety of applications, resources and platforms. In turn, the rapid rate and volume of high-dimensional data creation has begun to pose significant challenges for data management and security. Handling redundant and irrelevant features in high-dimensional space has been a long-standing challenge for network anomaly detection. Eliminating such features with spectral information not only speeds up the classification process, but also helps classifiers make accurate decisions during attack recognition time, especially when coping with large-scale and heterogeneous data such as network traffic data. Furthermore, the continued evolution of network attack patterns has resulted in the emergence of zero-day cyber attacks, which are now considered a major challenge in cyber security. In this threat environment, traditional security protections like firewalls, anti-virus software, and virtual private networks are not always sufficient. With this in mind, most of the current intrusion detection systems (IDSs) are either signature-based, which has been proven to be insufficient in identifying novel attacks, or developed based on absolute datasets. Hence, a robust mechanism for detecting intrusions in the big data setting, i.e. an anomaly-based IDS, has become a topic of importance. In this dissertation, an empirical study has been conducted at the initial stage to identify the challenges and limitations in the current IDSs, providing a systematic treatment of methodologies and techniques. Next, a comprehensive IDS framework has been proposed to overcome the aforementioned shortcomings.
First, a novel hybrid dimensionality reduction technique is proposed combining information gain (IG) and principal component analysis (PCA) methods with an ensemble classifier based on three different classification techniques, named IG-PCA-Ensemble. Experimental results show that the proposed dimensionality reduction method contributes more critical features and reduces the detection time significantly. The results show that the proposed IG-PCA-Ensemble approach also exhibits better performance than the majority of the existing state-of-the-art approaches.
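To make the information gain (IG) step concrete, here is a minimal sketch of ranking discrete features by their information gain with respect to the class label. It covers only the IG ranking; the PCA projection and the ensemble classifier are not shown. The function names and the toy data are assumptions for illustration and are not taken from the dissertation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by partitioning on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return feature indices ordered from highest to lowest information gain."""
    n_features = len(rows[0])
    gains = [information_gain([row[j] for row in rows], labels)
             for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: gains[j], reverse=True)


if __name__ == "__main__":
    # Toy data (assumed): feature 0 separates the classes, feature 1 does not.
    rows = [(1, 0), (1, 1), (0, 0), (0, 1)]
    labels = ["attack", "attack", "normal", "normal"]
    print(rank_features(rows, labels))
```

In a pipeline of the kind the abstract describes, the top-ranked features would presumably be passed on to PCA before classification; how many features are retained is a tuning choice that this sketch leaves open.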
Ontogenetic Investigation of Underwater Hearing Capabilities in Loggerhead Sea Turtles (Caretta caretta) Using a Dual Testing Approach
Sea turtles reside in different acoustic environments with each life history stage and may have different hearing capacity throughout ontogeny. For this study, two independent yet complementary techniques for hearing assessment, i.e. behavioral and electrophysiological audiometry, were employed to (1) measure hearing in post-hatchling and juvenile loggerhead sea turtles Caretta caretta (19-62 cm straight carapace length) to determine whether these migratory turtles exhibit an ontogenetic shift in underwater auditory detection and (2) evaluate whether hearing frequency range and threshold sensitivity are consistent in behavioral and electrophysiological tests. Behavioral trials first required training turtles to respond to known frequencies, a multi-stage, time-intensive process, and then recording their behavior when they were presented with sound stimuli from an underwater speaker using a two-response forced-choice paradigm. Electrophysiological experiments involved submerging restrained, fully conscious turtles just below the air-water interface and recording auditory evoked potentials (AEPs) when sound stimuli were presented using an underwater speaker. No significant differences in behavior-derived auditory thresholds or AEP-derived auditory thresholds were detected between post-hatchling and juvenile sea turtles. While hearing frequency range (50-1000/1100 Hz) and highest sensitivity (100-400 Hz) were consistent in audiograms pooled by size class for both behavior and AEP experiments, both post-hatchlings and juveniles had significantly higher AEP-derived than behavior-derived auditory thresholds, indicating that behavioral assessment is a more sensitive testing approach. The results from this study suggest that post-hatchling and juvenile loggerhead sea turtles are low-frequency specialists, exhibiting little difference in threshold sensitivity and frequency bandwidth despite residence in acoustically distinct environments throughout ontogeny.
Towards an Integrated Decision Tool for Managing Wildlife with Visitor Restrictions in Glacier Bay National Park
The National Park Service has a dual mission of providing public access to exceptional natural resources, but in a manner such that these resources are left “unimpaired for the enjoyment of future generations.” Human activities in parks undoubtedly affect wildlife, but the degree to which such activities cause impairment is often unclear and difficult to assess. It is the task of park administrators to take actions and impose restrictions to prevent impairment based on park values and the information provided through research and monitoring programs. Finding an appropriate balance between wildlife protection and visitor access is difficult because decision makers must consider numerous interrelated factors, many of which are not known with certainty. In light of these challenges, scientific approaches that allow decision makers to incorporate uncertainty and evaluate trade-offs between human access and resource protection are greatly needed. Glacier Bay National Park (the “Park” hereafter) contends with the challenge of managing visitors in an area containing many species of conservation concern. Therefore, the Park seeks a systematic and data-driven process for evaluating the tradeoffs that current and potential restrictions represent, in terms of protecting sensitive resources versus enabling full access to the public. The goal of my dissertation was to assist administrators and biologists at the Park with the development of an integrated decision tool for the Park through a structured decision making process.
This task entailed first identifying and structuring objectives, then coordinating with subject-matter experts on the development of biological sub-models for informing the future decision tool. Park Service administrators and staff drew on fundamental purposes of the Park to define measurable attributes that characterize the Park’s values and inform management decisions. This process also identified focal species whose conservation status was viewed as a priority and had motivated management actions in the past. Focal species included Steller sea lions (Eumetopias jubatus), harbor seals (Phoca vitulina richardsi), humpback whales (Megaptera novaeangliae), and several species of ground-nesting coastal waterbirds. Much of the work described here involved collaborating with subject-matter experts to develop biological models. These models served three main purposes: (1) characterize the state of focal species by incorporating available research and on-going monitoring; (2) respond interactively to changes in the value of population parameters (e.g., population size, distribution), whose influence decision makers would want to assess; and (3) generate estimates that would serve as valuable inputs in subsequent models of visitor-wildlife encounters.
Biological models provide data-driven descriptions of the state of populations. The structured decision-making process places emphasis on models that are as explicit as possible. To this end, I formulated biological sub-models in a manner that would permit estimation of actual population parameters for focal species rather than raw counts or indices. Survey data were modeled as a function of these key parameters, but also as filtered through an imperfect detection process affected by survey effort and uncontrollable variables, such as weather conditions. The Steller sea lion sub-model estimated abundance, spatial distribution, and the proportion of time spent on land (attendance probability) using counts at terrestrial sites and sightings-at-sea. I used a similar approach to model abundance for a sub-population of harbor seals, but with modifications meant to account for the excessive number of zero counts in the data set. The sub-model describing the condition of ground-nesting coastal waterbirds estimated probabilities of survey sites being occupied, of the species being abundant at the site, and of the nesting status for nine different species across 20 key concentration sites that are surveyed in the Park. Finally, the humpback whale sub-model used sightings of whales from active surveys and observers onboard cruise ships to estimate whale abundance and, for the first time, fine-scale spatial distribution in the Park.
Structuring objectives and developing biological sub-models was a key step in an ongoing process of decision tool development. The Park is now in the position to move forward with combining biological sub-models with information on visitor usage. I describe pathways for accomplishing this task, and assess the capacity of each biological sub-model for generating the measurable attributes that decision makers care about. Although decision tool development is ongoing, the work herein is a valuable contribution to the fields of ecology and resource management for several reasons. At the level of individual studies, population parameter estimates from sub-models contribute to conservation efforts for those species, and the novel modeling techniques described are readily generalizable to other systems. The broader contribution of this body of work, however, is in illustrating the value of adopting a structured decision-making approach to resource management in parks. Specifically, this work shows that the process of connecting fundamental objectives to monitoring information can be used to identify information gaps and reveal creative ways of using available information to inform management.