6 research outputs found

    An Ontology based Enhanced Framework for Instant Messages Filtering for Detection of Cyber Crimes

    Instant messaging is a very appealing and relatively new class of social interaction. Instant Messengers (IMs) and Social Networking Sites (SNS) may carry harmful messages that go untraced, obstructing network communication and undermining cyber security. User ignorance about communication services such as Instant Messengers, email, websites, and social networks creates favourable conditions for cyber threat activity. Users need to be made technically aware, and a suspicious-message detection application is needed that generates alerts so that suspicious messages are not ignored. Very little research has addressed the detection of suspicious cyber threat activity in IMs. A context-based, dynamic, and intelligent methodology is proposed to analyse and detect cyber threat activity in instant messages. It works with reference to a domain ontology (OBIE) and uses association rule mining to generate rules and alert victims. The results show an improvement over existing methods, with higher precision and recall. DOI: 10.17762/ijritcc2321-8169.15056

    Normalisation of imprecise temporal expressions extracted from text

    Advisor: Prof. Dr. Marcos Didonet Del Fabro. Co-advisor: Prof. Dr. Angus Roberts. Thesis (doctorate), Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defence: Curitiba, 5 April 2016. Includes references: leaves 95-105.
    Abstract: Information extraction systems and techniques are able to deal with the increasing amount of unstructured data available nowadays. Time is among the different kinds of information that may be extracted from such unstructured data sources, including text documents. Time describes changes that happen through the occurrence of events, and provides a way to record, order, and measure the duration of such occurrences. The inability to identify and extract temporal information from text makes it difficult to understand how events are organised in chronological order. Moreover, in many situations the meaning of temporal expressions is imprecise and cannot be accurately described, leading to interpretation errors. Existing solutions provide alternative ways of representing imprecise temporal expressions, though they are specific and hard to generalise. Furthermore, the analysis of temporal data may be particularly inefficient in the presence of spelling errors. Existing approaches use string similarity methods to search for valid words within a text. However, they are not rich enough to process misspellings efficiently. In this thesis, we present a methodology to analyse and normalise imprecise temporal expressions: after collecting and pre-processing data on how people interpret vague descriptions of time in text, we compare different techniques in order to create and select the most appropriate normalisation model for different kinds of imprecise expressions. We also compare how a rule-based system and a machine learning approach perform when identifying temporal expressions in text, and we analyse the process of producing gold standards, identifying possible sources of issues and giving some recommendations for future manual annotation efforts. Finally, we propose a phonetic map and evaluate how encoding phonetic information could assist similarity search methods and improve information extraction quality.

    Ontology Guided Information Extraction from Unstructured Text

    No full text

    Des spécifications en langage naturel aux spécifications formelles via une ontologie comme modèle pivot (From natural-language specifications to formal specifications via an ontology as a pivot model)

    The main objective of system development is to address requirements. Its success therefore depends heavily on the requirement specification phase, which aims to describe precisely and unambiguously all the characteristics of the system to be developed. To arrive at a set of requirements, a user needs analysis is carried out, involving different parties (stakeholders). The system requirements are generally written in natural language (NL) to guarantee a wider understanding. However, since NL texts can contain semantic ambiguities, implicit information, or other inconsistencies, this can lead to diverse interpretations. Hence, it is not easy to specify a complete and consistent set of requirements, and the specified requirements must therefore be formally checked. Specifications written in NL are not considered formal and do not allow for a direct application of formal verification methods. NL requirements must therefore be transformed into formal specifications. The work presented in this thesis was carried out in this context. The main difficulty of such a transformation is the size of the gap between NL requirements and formal specifications. The objective of this work is to propose an approach for the automatic verification of user requirements that are written in natural language and describe a system's expected behaviour. Our approach uses the potential offered by a representation model based on a logical formalism. Our contribution has three main aspects: 1) an OWL-DL ontology based on description logics, used as a pivot representation model that serves as a link between NL requirements and formal specifications; 2) an approach for the instantiation of the pivot ontology, based on an analysis driven by the semantics of the ontology, which allows an automatic transformation of NL requirements into their conceptual representations; and 3) an approach exploiting the logical formalism of the ontology in order to automatically translate the pivot model into a formal specification language called Maude.
    PARIS11-SCD-Bib. électronique (914719901) / Sudoc, France