22 research outputs found

    A Semantic-Based Framework for Summarization and Page Segmentation in Web Mining

    This chapter addresses two crucial issues that arise when one applies Web-mining techniques for extracting relevant information. The first is the acquisition of useful knowledge from textual data; the second stems from the fact that a web page often contains a considerable amount of 'noise' with respect to the sections that are truly informative for the user's purposes. The novel contribution of this work lies in a framework that can tackle both tasks at the same time, supporting text summarization and page segmentation. The approach achieves this goal by exploiting semantic networks to map natural language into an abstract representation, which in turn supports the identification of the topics addressed in a text source. A heuristic algorithm uses the abstract representation to highlight the relevant segments of text in the original document. The approach was verified on a publicly available benchmark, the DUC 2002 dataset, and satisfactory results confirmed the method's effectiveness.

    Investigation of Third Party Rights Service and Shibboleth Modification to Introduce the Service

    Shibboleth is an architecture to support inter-institutional sharing of electronic resources that are subject to access control. Codifying copyright in Shibboleth authorization policies is difficult because copyright exceptions can be highly subjective. The Third Party Rights Service is a high-level concept that has been suggested as a way to approximate the exceptions of copyright law. In this thesis, I investigate the components of the Third Party Rights Service. I design and analyze a modified Shibboleth architecture based on these components. The resulting architecture allows resources to be added in phases to make use of the Third Party Rights Service, while keeping the existing resources in Shibboleth.

    Avoin data ja semanttinen verkko - yhdessä kohti älykkäämpää internetiä (Open Data and the Semantic Web: Together Towards a Smarter Internet)

    The explosive growth in the amount of data brought about by the digital revolution has raised both challenges and opportunities for making use of data. At the same time, the rise of the open ideology and the development of technical methods aimed at exploiting data are changing our attitude towards data. Data is becoming the next resource of the internet. The W3 organisation, which works towards the standardisation of the internet, aims to support this development, and to that end it produces specifications intended to improve the quality of data. The most central of these are the specifications for describing data and those that enable semantic data and the semantic web. The open data ideology has led public institutions to open up their data, and in this context requirements are placed on data quality. Assessing the quality of public open data on the five-star scale introduced for this purpose, I conclude that the quality of this data does not meet the requirements of the semantic web. Keywords: open data, semantic data, semantic web
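The abstract above rates public open data on a five-star scale but does not spell the scale out. The sketch below assumes it is the widely used 5-star open data scheme (open licence, structured, open format, URIs, linked to other data), in which each star requires all earlier ones. The function name and its boolean parameters are hypothetical and chosen only for illustration; this is not the author's assessment procedure.

```python
def star_rating(open_licence: bool, structured: bool, open_format: bool,
                uses_uris: bool, linked: bool) -> int:
    """Return 0-5 stars under the assumed cumulative 5-star open data scheme.

    Assumed criteria, in order:
      1. published on the web under an open licence
      2. available as structured, machine-readable data
      3. in a non-proprietary open format
      4. uses URIs to identify things
      5. links to other data for context
    """
    criteria = [open_licence, structured, open_format, uses_uris, linked]
    stars = 0
    for met in criteria:
        if not met:
            break  # the scale is cumulative: a missed level stops the count
        stars += 1
    return stars


# A dataset published as an openly licensed CSV file, without URIs or links.
csv_dataset_stars = star_rating(True, True, True, False, False)
```

Under this assumed scheme, a dataset below four stars has no URIs to link against, which is one way to read the abstract's conclusion that the assessed data falls short of semantic web requirements.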

    Incident Prioritisation for Intrusion Response Systems

    The landscape of security threats continues to evolve, with attacks becoming more serious and the number of vulnerabilities rising. To manage these threats, many security studies have been undertaken in recent years, mainly focusing on improving detection, prevention and response efficiency. Although there are security tools such as antivirus software and firewalls available to counter them, Intrusion Detection Systems and similar tools such as Intrusion Prevention Systems are still among the most popular approaches. There are hundreds of published works related to intrusion detection that aim to increase the efficiency and reliability of detection, prevention and response systems. Whilst intrusion detection system technologies have advanced, there are still areas available to explore, particularly with respect to the process of selecting appropriate responses. Supporting a variety of response options, such as proactive, reactive and passive responses, enables security analysts to select the most appropriate response in different contexts. In view of that, a methodical approach that identifies important incidents as opposed to trivial ones is first needed. However, with thousands of incidents identified every day, relying upon manual processes to identify their importance and urgency is complicated, difficult, error-prone and time-consuming, and so prioritising them automatically would help security analysts to focus only on the most critical ones. The existing approaches to incident prioritisation provide various ways to prioritise incidents, but less attention has been given to adopting them into an automated response system. Although some studies have recognised the advantages of prioritisation, their authors published no follow-up work investigating the effectiveness of the process.
This study concerns enhancing the incident prioritisation scheme to identify critical incidents based upon their criticality and urgency, in order to facilitate an autonomous mode for the response selection process in Intrusion Response Systems. To achieve this aim, this study proposes a novel framework which combines models and strategies identified from a comprehensive literature review. A model to estimate the level of risk of incidents is established, named the Risk Index Model (RIM). With different levels of risk, the Response Strategy Model (RSM) dynamically maps incidents into different types of response, with serious incidents being mapped to active responses in order to minimise their impact, while incidents with less impact have passive responses. The combination of these models provides a seamless way to map incidents automatically; however, it needs to be evaluated in terms of its effectiveness and performance. To demonstrate the results, an evaluation study with four stages was undertaken; these stages were a feasibility study of the RIM, comparison studies with industrial standards such as the Common Vulnerability Scoring System (CVSS) and Snort, an examination of the effect of different strategies in the rating and ranking process, and a test of the effectiveness and performance of the Response Strategy Model (RSM). With promising results being gathered, a proof-of-concept study was conducted to demonstrate the framework using a live traffic network simulation with online assessment mode via the Security Incident Prioritisation Module (SIPM); this study was used to investigate its effectiveness and practicality. Through the results gathered, this study has demonstrated that the prioritisation process can feasibly be used to facilitate the response selection process in Intrusion Response Systems.
The main contribution of this study is to have proposed, designed, evaluated and simulated a framework to support the incident prioritisation process for Intrusion Response Systems. Ministry of Higher Education in Malaysia and University of Malay
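The abstract above describes ranking incidents by risk and mapping serious ones to active responses and the rest to passive ones. A minimal sketch of that idea follows. The abstract gives no formulas, so the risk scale in [0, 1], the 0.7 threshold, the incident names and the function names are all hypothetical assumptions, not the thesis's Risk Index Model or Response Strategy Model.

```python
from enum import Enum


class Response(Enum):
    ACTIVE = "active"    # intervene to minimise the incident's impact
    PASSIVE = "passive"  # e.g. log or notify only


def select_response(risk_index: float, threshold: float = 0.7) -> Response:
    """Map a risk index to a response type (the threshold is an assumption)."""
    return Response.ACTIVE if risk_index >= threshold else Response.PASSIVE


def prioritise(incidents: dict[str, float]) -> list[tuple[str, float, Response]]:
    """Rank incidents from highest to lowest risk and attach a response type."""
    ranked = sorted(incidents.items(), key=lambda item: item[1], reverse=True)
    return [(name, risk, select_response(risk)) for name, risk in ranked]


# Hypothetical incidents with made-up risk indices in [0, 1].
plan = prioritise({"port-scan": 0.2, "sql-injection": 0.9, "ssh-bruteforce": 0.7})
```

Keeping the ranking and the response mapping in separate functions mirrors the abstract's split between estimating risk and selecting a response, so either part can be replaced without touching the other.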

    Systematics of Clematis in Nepal, the evolution of tribe Anemoneae DC. (Ranunculaceae) and phylogeography and the dynamics of speciation in the Himalaya

    The genus Clematis L. (Ranunculaceae) was used as a new model group to assess the role of the Himalayan orogeny in the generation of biodiversity through investigations of its phylogeny, phylogeography and taxonomy. Although existing checklists include 28 species of Clematis from Nepal, a comprehensive taxonomic revision of available material in herbaria and additional sampling from fieldwork during this study have led to the recognition of 21 species of Clematis in Nepal, including one species (C. kilungensis) not previously recorded from Nepal. Existing phylogenetic and taxonomic concepts were tested with the addition of new samples from Nepal. The results highlight the shortcomings of the previous studies, which were poorly resolved, and indicate the need for a thorough revision of the sectional classification. Despite the increased sampling, the results are still equivocal due to poor statistical support along the backbone of the phylogeny. Groups of species in well supported terminal clades are broadly comparable with results from previous studies, although there are fewer clearly recognisable and well supported clades. The published dates for the evolution of Clematis were tested and the methodology of the previous study critically reappraised. The results indicate that the genus Clematis is approximately twice as old as previously reported and evolved in the middle Miocene. The phylogeny also demonstrates that, even allowing for poor support for the relationships between groups of species within Clematis, the extant Nepalese species must have multiple independent origins from at least 6 different colonisations. With their occurrence in the Pliocene and Pleistocene, these events are relatively recent in relation to the Himalayan orogeny, and may be linked more to the dispersal ability of Clematis than to the direct effects of the orogeny. Additional Nepalese samples of Koenigia and Meconopsis were added to existing datasets and these were reanalysed.
The results from Clematis, Koenigia and Meconopsis were appraised in light of the geoscientific literature and previously published phylogeographic studies to create an overview of the drivers behind speciation in the Himalaya.

    Exploring attributes, sequences, and time in Recommender Systems: From classical to Point-of-Interest recommendation

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defense: 08-07-2021. Since the emergence of the Internet and the spread of digital communications throughout the world, the amount of data stored on the Web has been growing exponentially. In this new digital era, a large number of companies have emerged with the purpose of filtering the information available on the web and providing users with interesting items. The algorithms and models used to recommend these items are called Recommender Systems. These systems are applied to a large number of domains, from music, books, or movies to dating or Point-of-Interest (POI), which is an increasingly popular domain where users receive recommendations of different places when they arrive in a city. In this thesis, we focus on exploiting the use of contextual information, especially temporal and sequential data, and apply it in novel ways in both traditional and Point-of-Interest recommendation. We believe that this type of information can be used not only for creating new recommendation models but also for developing new metrics for analyzing the quality of these recommendations. In one of our first contributions we propose different metrics, some of them derived from previously existing frameworks, using this contextual information. We also propose an intuitive algorithm that is able to provide recommendations to a target user by exploiting the last common interactions with other similar users of the system. At the same time, we conduct a comprehensive review of the algorithms that have been proposed in the area of POI recommendation between 2011 and 2019, identifying the common characteristics and methodologies used. Once this classification of the algorithms proposed to date is completed, we design a mechanism to recommend complete routes (not only independent POIs) to users, making use of reranking techniques.
In addition, due to the great difficulty of making recommendations in the POI domain, we propose the use of data aggregation techniques to use information from different cities to generate POI recommendations in a given target city. In the experimental work we present our approaches on different datasets belonging to both classical and POI recommendation. The results obtained in these experiments confirm the usefulness of our recommendation proposals, in terms of ranking accuracy and other dimensions like novelty, diversity, and coverage, and the appropriateness of our metrics for analyzing temporal information and biases in the recommendations produced.

    Plant Virus Emergence

    This compilation of articles elaborates on plant virus diseases that are among the most recent epidemiological concerns. The chapters explore several paradigms in plant virus epidemiology, including outbreaks, epidemics, and pandemics that parallel those of zoonotic viruses and that can be consequential to global food security. There is evidence that the local, regional, national, and global trade of agricultural products has aided the global dispersal of plant virus diseases. Expanding farmlands into pristine natural areas has created opportunities for viruses in native landscapes to invade crops, while the movement of food and food products disseminates viruses, creating epidemics or pandemics. Moreover, plant virus outbreaks not only directly impact food supply, but also incidentally affect human health.

    Automated Realistic Test Input Generation and Cost Reduction in Service-centric System Testing

    Service-centric System Testing (ScST) is more challenging than testing traditional software due to the complexity of service technologies and the limitations that are imposed by the SOA environment. One of the most important problems in ScST is the problem of realistic test data generation. Realistic test data is often generated manually or using an existing source, thus it is hard to automate and laborious to generate. One of the limitations that makes ScST challenging is the cost associated with invoking services during the testing process. This thesis aims to provide solutions to the aforementioned problems: automated realistic input generation and cost reduction in ScST. To address automation in realistic test data generation, the concept of Service-centric Test Data Generation (ScTDG) is presented, in which existing services are used as realistic data sources. ScTDG minimises the need for tester input and dependence on existing data sources by automatically generating service compositions that can generate the required test data. In experimental analysis, our approach achieved between 93% and 100% success rates in generating realistic data while state-of-the-art automated test data generation achieved only between 2% and 34%. The thesis addresses cost concerns at the test data generation level by enabling data source selection in ScTDG. Source selection in ScTDG has many dimensions such as cost, reliability and availability. This thesis formulates this problem as an optimisation problem and presents a multi-objective characterisation of service selection in ScTDG, aiming to reduce the cost of test data generation. A cost-aware Pareto optimal test suite minimisation approach addressing testing cost concerns during test execution is also presented. The approach adapts traditional multi-objective minimisation approaches to the ScST domain by formulating ScST concerns, such as invocation cost and test case reliability.
In experimental analysis, the approach achieved reductions between 69% and 98.6% in monetary cost of service invocations during testing.
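The abstract above frames test suite minimisation as a multi-objective problem with concerns such as invocation cost and reliability. The sketch below illustrates only the underlying notion of Pareto optimality over two such objectives: a candidate is kept unless another candidate is at least as good on both and strictly better on one. The `Candidate` class, its fields and the sample values are hypothetical; this is not the thesis's minimisation algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cost: float         # hypothetical monetary cost of service invocations (lower is better)
    reliability: float  # hypothetical reliability score in [0, 1] (higher is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.cost <= b.cost and a.reliability >= b.reliability
    strictly_better = a.cost < b.cost or a.reliability > b.reliability
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep the candidates that no other candidate dominates (input order preserved)."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up candidate test suites.
suites = [
    Candidate("A", cost=10.0, reliability=0.90),
    Candidate("B", cost=5.0, reliability=0.80),
    Candidate("C", cost=12.0, reliability=0.85),  # dominated by A
    Candidate("D", cost=5.0, reliability=0.70),   # dominated by B
]
front = pareto_front(suites)
```

The pairwise check is quadratic in the number of candidates, which is fine for a small illustration; the multi-objective approaches the abstract refers to would normally search the space of suites rather than enumerate it.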