546 research outputs found

    Knowledge visualization: From theory to practice

    Visualizations are known to be efficient tools that help users analyze complex data. However, understanding the displayed data and finding the underlying knowledge remains difficult. In this work, a new approach is proposed based on understanding the definition of knowledge. Although many definitions are used in different areas, this work focuses on representing knowledge as part of a visualization and showing the benefit of adopting knowledge representation. Specifically, this work begins by examining interaction and reasoning in visual analytics systems; then a new definition of knowledge visualization and its underlying knowledge conversion processes are proposed. Knowledge is differentiated as either explicit or tacit. Instead of directly representing data, the value of the explicit knowledge associated with the data is determined through a cost/benefit analysis. In accordance with its importance, the knowledge is displayed to help the user understand the complex data through visual analytical reasoning and discovery.
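    The cost/benefit selection described above can be sketched in a few lines. This is a minimal illustration, not the paper's method: the scoring formula (net value = benefit − cost), the item names, and the display budget are all assumptions made for the example.

    ```python
    # Hypothetical sketch: rank explicit-knowledge items by a simple
    # cost/benefit score and display only the most valuable ones.
    # All labels, scores, and the scoring formula are illustrative assumptions.

    def knowledge_value(benefit, cost):
        """Net value of displaying a knowledge item (assumed: benefit - cost)."""
        return benefit - cost

    def select_for_display(items, budget=2):
        """Pick the top `budget` items by net value for overlay on the view."""
        ranked = sorted(items,
                        key=lambda it: knowledge_value(it["benefit"], it["cost"]),
                        reverse=True)
        return [it["label"] for it in ranked[:budget]]

    items = [
        {"label": "cluster outlier",  "benefit": 0.9, "cost": 0.2},  # value 0.7
        {"label": "axis correlation", "benefit": 0.6, "cost": 0.5},  # value 0.1
        {"label": "raw row count",    "benefit": 0.3, "cost": 0.1},  # value 0.2
    ]
    print(select_for_display(items))  # → ['cluster outlier', 'raw row count']
    ```

    The point of the sketch is the ordering step: only knowledge whose estimated benefit exceeds its display cost is surfaced to the user.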

    A comparison of statistical machine learning methods in heartbeat detection and classification

    In health care, patients with heart problems require quick responsiveness in a clinical setting or in the operating theatre. Towards that end, automated classification of heartbeats is vital, as some heartbeat irregularities are time-consuming to detect. Therefore, analysis of electrocardiogram (ECG) signals is an active area of research. The methods proposed in the literature depend on the structure of a heartbeat cycle. In this paper, we use interval- and amplitude-based features together with a few samples from the ECG signal as a feature vector. We studied a variety of classification algorithms, focusing especially on a type of arrhythmia known as ventricular ectopic beats (VEB). We compare the performance of the classifiers against algorithms proposed in the literature and make recommendations regarding features, sampling rate, and choice of classifier to apply in a real-time clinical setting. The extensive study is based on the MIT-BIH arrhythmia database. Our main contributions are the evaluation of existing classifiers over a range of sampling rates, the recommendation of a detection methodology to employ in a practical setting, and the extension of the notion of a mixture of experts to a larger class of algorithms.
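    The feature-vector idea above (interval and amplitude features plus a few raw ECG samples) can be sketched on synthetic data. This is not the paper's pipeline: the data, the class separations, and the nearest-centroid classifier are stand-ins chosen so the example stays self-contained.

    ```python
    import numpy as np

    # Illustrative sketch (synthetic data, not MIT-BIH): build a feature vector
    # from an RR interval, an R-peak amplitude, and a few raw samples, then run
    # a simple nearest-centroid classifier. All parameters are assumptions.

    rng = np.random.default_rng(0)

    def make_beats(n, veb):
        # Normal beats: RR interval ~0.80 s, amplitude ~1.0 (arbitrary units).
        # VEB-like beats are modeled as premature (short RR) and larger.
        rr  = rng.normal(0.55 if veb else 0.80, 0.03, n)        # RR interval (s)
        amp = rng.normal(1.40 if veb else 1.00, 0.05, n)        # R-peak amplitude
        samples = rng.normal(0.2 if veb else 0.0, 0.05, (n, 4)) # a few raw samples
        return np.column_stack([rr, amp, samples])

    X = np.vstack([make_beats(50, veb=False), make_beats(50, veb=True)])
    y = np.array([0] * 50 + [1] * 50)

    # Nearest centroid: predict the class whose mean feature vector is closest.
    centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])
    pred = np.argmin(((X[:, None, :] - centroids) ** 2).sum(axis=2), axis=1)
    accuracy = (pred == y).mean()
    print(f"nearest-centroid accuracy: {accuracy:.2f}")
    ```

    With well-separated synthetic classes even this trivial classifier succeeds; the paper's contribution lies in comparing realistic classifiers on real ECG data across sampling rates.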

    Modeling Deception for Cyber Security

    In the era of software-intensive, smart and connected systems, the growing power and sophistication of cyber attacks poses increasing challenges to software security. The reactive posture of traditional security mechanisms, such as anti-virus and intrusion detection systems, has not been sufficient to combat the wide range of advanced persistent threats that currently jeopardize systems operation. To mitigate these threats, more active defensive approaches are necessary. Such approaches rely on the concept of actively hindering and deceiving attackers. Deceptive techniques allow for additional defense by thwarting attackers' advances through the manipulation of their perceptions. Manipulation is achieved through the use of deceitful responses, feints, misdirection, and other falsehoods in a system. Of course, such deception mechanisms may result in side effects that must be handled. Current methods for planning deception chiefly attempt to carry military deception over to cyber deception, providing only high-level instructions that largely ignore deception as part of the software security development life cycle. Consequently, little practical guidance is provided on how to engineer deception-based techniques for defense. This PhD thesis contributes a systematic approach to specify and design cyber deception requirements, tactics, and strategies. The approach consists of (i) multi-paradigm modeling for representing deception requirements, tactics, and strategies, (ii) a reference architecture to support the integration of deception strategies into system operation, and (iii) a method to guide engineers in deception modeling. A tool prototype, a case study, and an experimental evaluation show encouraging results for the application of the approach in practice. Finally, a conceptual coverage mapping was developed to assess the expressivity of the deception modeling language created.
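    One of the deceptive techniques named above, the deceitful response, can be illustrated with a toy handler. This is not the thesis's reference architecture or modeling language: the probe-detection rule, the decoy banner, and all names are invented for the example.

    ```python
    # Hypothetical sketch of a deceitful-response tactic: a request handler that
    # answers suspected scanner probes with plausible but false information,
    # misdirecting the attacker instead of issuing a plain denial.
    # The detection rule and decoy content are illustrative assumptions.

    DECOY_BANNER = "Apache/2.2.3 (CentOS)"  # deliberately fake, outdated identity

    def looks_like_probe(path):
        """Assumed rule: paths commonly requested by vulnerability scanners."""
        return path in {"/admin", "/.env", "/phpmyadmin"}

    def handle(path):
        """Return (status, server_header, body) for a request path."""
        if looks_like_probe(path):
            # Feint: serve a decoy page under a false server identity, wasting
            # the attacker's effort and revealing their interest to defenders.
            return 200, DECOY_BANNER, "<html>login page (decoy)</html>"
        return 404, "nginx", ""
    ```

    A side effect the thesis warns about is visible even here: legitimate users who mistype a path could be shown the decoy, so deceptive responses need careful scoping.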

    Limit order books in statistical arbitrage and anomaly detection

    This thesis proposes methods exploiting the vast informational content of limit order books (LOBs). The first part of this thesis uncovers LOB inefficiencies that are sources of statistical arbitrage for high-frequency traders. Chapter 1 develops new theoretical relationships between cross-listed stocks so that their prices are arbitrage-free. Price deviations are captured by a novel strategy that is then evaluated in a new backtesting environment enabling the study of latency and its importance for high-frequency traders. Chapter 2 empirically demonstrates the existence of lead-lag arbitrage at high frequency. Lead-lag relationships have been well documented in the past, but no study has shown their true economic potential. An original econometric model is proposed to forecast returns on the lagging asset, and does so accurately out of sample, resulting in short-lived arbitrage opportunities. In both chapters, the discovered LOB inefficiencies are shown to be profitable, thus providing a better understanding of high-frequency traders' activities. The second part of this thesis investigates anomalous patterns in LOBs. Chapter 3 studies the performance of machine learning methods in the detection of fraudulent orders. Because of the large amount of LOB data generated daily, trade frauds are challenging to catch, and very few cases are available to fit detection models. A novel unsupervised deep learning framework is proposed to discern abnormal LOB behavior in this difficult context. It is asset-independent and can evolve alongside markets, providing better fraud detection capabilities to market regulators.
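    The lead-lag idea in Chapter 2 can be illustrated with simulated returns. This toy sketch is not the thesis's econometric model: the lag structure, coefficients, and noise levels are invented so that the leader's past return carries genuine out-of-sample information about the lagging asset.

    ```python
    import numpy as np

    # Toy lead-lag illustration (synthetic data, not the thesis's model):
    # the lagging asset partially mirrors the leader's previous-tick return,
    # so a simple OLS fit on the lagged series predicts out of sample.
    # All parameters below are assumptions for the example.

    rng = np.random.default_rng(1)
    n = 2000
    r_lead = rng.normal(0.0, 1e-3, n)                 # leader's tick returns
    noise  = rng.normal(0.0, 2e-4, n)
    r_lag  = 0.6 * np.roll(r_lead, 1) + noise         # follows with a 1-tick lag
    r_lag[0] = 0.0                                    # discard the wrapped value

    # Align: predict the lagger's return at t from the leader's return at t-1.
    X, y = r_lead[:-1], r_lag[1:]
    train = n // 2
    beta = np.polyfit(X[:train], y[:train], 1)        # OLS on the first half
    pred = np.polyval(beta, X[train:])                # forecast the second half
    corr = np.corrcoef(pred, y[train:])[0, 1]
    print(f"out-of-sample correlation: {corr:.2f}")
    ```

    In real LOB data the predictable window is tiny and closes fast, which is why the thesis pairs the forecasting model with a latency-aware backtesting environment.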