2,457 research outputs found

    Investigating sentence weighting components for automatic summarisation

    No full text
    The work described here initially formed part of a triangulation exercise to establish the effectiveness of the Query Term Order (QTO) algorithm. The methodology that was subsequently produced proved to be a reliable indicator of quality for summarising English web documents. We utilised the human summaries from the Document Understanding Conference data and generated queries automatically for testing the QTO algorithm. Six sentence weighting schemes that made use of Query Term Frequency and QTO were constructed to produce system summaries, and this paper explains the process of combining and balancing the weighting components. We also examined the five automatically generated query terms in their different permutations to check whether the automatic generation of query terms introduced bias. The summaries produced were evaluated with the ROUGE-1 metric, and the results showed that using QTO in a weighting combination resulted in the best performance. We also found that combining more weighting components always produced better performance than any single weighting component.
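
    As a rough illustration of what "combining and balancing the weighting components" can look like (the paper's actual six schemes are not reproduced here), the sketch below scores each sentence with a weighted sum of a query-term-frequency component and a query-term-order component; the function names and the balancing parameter alpha are assumptions, not the paper's formulation.

```python
# Minimal sketch of combining two sentence weighting components,
# assuming simplified QTF and QTO scores; alpha is hypothetical.

def qtf_score(sentence, query_terms):
    """Density of query terms in the sentence (a stand-in for QTF)."""
    words = sentence.lower().split()
    return sum(words.count(t) for t in query_terms) / max(len(words), 1)

def qto_score(sentence, query_terms):
    """Reward sentences that preserve the query's term order (a stand-in for QTO)."""
    words = sentence.lower().split()
    positions = [words.index(t) for t in query_terms if t in words]
    in_order = sum(1 for a, b in zip(positions, positions[1:]) if a < b)
    return in_order / max(len(query_terms) - 1, 1)

def sentence_weight(sentence, query_terms, alpha=0.5):
    """Linear combination; alpha balances the two components."""
    return (alpha * qtf_score(sentence, query_terms)
            + (1 - alpha) * qto_score(sentence, query_terms))

def summarise(sentences, query_terms, k=3):
    """Pick the k highest-weighted sentences as the system summary."""
    ranked = sorted(sentences, key=lambda s: sentence_weight(s, query_terms),
                    reverse=True)
    return ranked[:k]
```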

    Distributed detection of anomalous internet sessions

    Get PDF
    Financial service providers are moving many services online, reducing their costs and facilitating customers' interaction. Unfortunately, criminals have quickly found several ways to avoid most security measures applied to browsers and banking sites. The use of highly dangerous malware has become the most significant threat, and traditional signature-detection methods are nowadays easily circumvented due to the number of new samples and the use of sophisticated evasion techniques. Antivirus vendors and malware experts are pushed to seek new methodologies to improve the identification and understanding of malicious application behavior and its targets. Financial institutions are now playing an important role by deploying their own detection tools against malware that specifically affects their customers. However, most detection approaches tend to be based on byte sequences in order to create new signatures. This thesis is instead based on new sources of information: the web logs generated from each banking session, normal browser execution, and customer mobile phone behavior.

    The thesis can be divided into four parts. The first part introduces the thesis along with the problems addressed and the methodology used to perform the experimentation. The second part describes our contributions to the research, which cover two areas:

    *Server side: web log analysis. We first focus on the real-time detection of anomalies through the analysis of web logs and the challenges introduced by the amount of information generated daily. We propose different techniques to detect multiple threats by deploying per-user and global models in a graph-based environment, which increases performance on sets of highly related data.

    *Customer side: browser analysis. We deal with the detection of malicious behavior from the other side of a banking session: the browser. Malware samples must interact with the browser in order to retrieve or add information, and this interaction interferes with the normal behavior of the browser. We propose to develop models capable of detecting unusual patterns of function calls in order to determine whether a given sample is targeting a specific financial entity.

    In the third part, we adapt our approaches to mobile phones and critical infrastructure environments. The latest online banking attack techniques circumvent protection schemes such as password verification codes sent via SMS. Man-in-the-Mobile attacks are capable of compromising mobile devices and gaining access to SMS traffic; once the Transaction Authentication Number is obtained, criminals are free to make fraudulent transfers. We propose to model the behavior of applications related to messaging services in order to automatically detect suspicious actions. Real-time detection of unwanted SMS forwarding can improve the effectiveness of second-channel authentication and build on the detection techniques applied to browsers and web servers. Finally, we describe possible adaptations of our techniques to an area outside the scope of online banking: critical infrastructures, an environment with similar features since the applications involved can also be profiled. Just like financial entities, critical infrastructures are experiencing an increase in the number of cyber attacks, and the sophistication of the malware samples involved forces the development of new detection approaches. The aim of this last proposal is to demonstrate the validity of our approach in different scenarios.

    In the fourth part, we conclude with a summary of our findings and directions for future work
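
    To make the server-side idea more tangible, here is a minimal sketch of a per-user model: a frequency model over the pages a customer visits, with a new session scored by the average negative log-likelihood of its pages. The thesis's actual graph-based per-user and global models are considerably richer; the class and method names below are hypothetical.

```python
# Toy per-user anomaly model over banking web logs (illustrative only;
# not the thesis's graph-based system).
import math
from collections import Counter

class PerUserModel:
    """Frequency model of the banking pages one customer usually visits."""

    def __init__(self):
        self.page_counts = Counter()
        self.total = 0

    def update(self, session_pages):
        """Train on the pages of a known-good session."""
        self.page_counts.update(session_pages)
        self.total += len(session_pages)

    def anomaly_score(self, session_pages, smoothing=1e-3):
        """Average negative log-likelihood; higher = more anomalous."""
        score = 0.0
        for page in session_pages:
            p = (self.page_counts[page] + smoothing) / (self.total + smoothing)
            score += -math.log(p)
        return score / max(len(session_pages), 1)

model = PerUserModel()
model.update(["login", "balance", "logout"])
print(model.anomaly_score(["login", "wire_transfer", "wire_transfer"]))
```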

    State of the art of a multi-agent based recommender system for active software engineering ontology

    Get PDF
    Software engineering ontology was first developed to provide efficient collaboration and coordination among distributed teams working on related software development projects across sites. It helped to clarify software engineering concepts and project information, as well as enabling knowledge sharing. However, a major challenge for users of the software engineering ontology is that they need the competence to access and translate what they are looking for into the concepts and relations described in the ontology; otherwise, they may not be able to obtain the required information. In this paper, we propose a conceptual framework for a multi-agent based recommender system to provide active support for accessing and utilizing knowledge and project information in the software engineering ontology. A multi-agent system and a semantic-based recommendation approach will be integrated to create a collaborative working environment for accessing and manipulating data from the ontology, performing reasoning, and generating expert recommendation facilities for dispersed software teams.
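
    As a minimal, hypothetical sketch of the semantic-based recommendation idea (the multi-agent coordination the paper proposes is omitted), concepts whose labels match a user's free-text request can be returned together with their related concepts:

```python
# Toy concept recommendation over an ontology fragment. The ontology
# contents, concept names and scoring are illustrative assumptions.

ontology = {
    "Requirement": ["FunctionalRequirement", "NonFunctionalRequirement"],
    "FunctionalRequirement": ["UseCase"],
    "DesignPattern": ["Singleton", "Observer"],
}

def recommend_concepts(request, ontology, k=3):
    """Rank concepts by how many request tokens occur in their labels,
    returning each match together with its related concepts."""
    tokens = set(request.lower().split())
    def score(concept):
        return sum(1 for t in tokens if t in concept.lower())
    ranked = sorted(ontology, key=score, reverse=True)
    return [(c, ontology[c]) for c in ranked[:k] if score(c) > 0]

print(recommend_concepts("functional requirement for login", ontology))
```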

    A collaborative, semantic and context-aware search engine

    Get PDF
    Search engines help people find information in the largest public knowledge system in the world: the Web. Unfortunately, its size makes it very difficult to discover the right information. Users are faced with many useless results, forcing them to select the most suitable ones by hand. The new generation of search engines is evolving from keyword-based indexing and classification to more sophisticated techniques that consider the meaning, the context and the usage of information. We focus on three key aspects: collaboration, geo-referencing and semantics. Collaboration distributes storage, processing and trust over a world-wide network of nodes running on users’ computers, getting rid of bottlenecks and central points of failure. The geo-referencing of catalogued resources allows contextualisation based on the user's position. Semantic analysis increases the relevance of results. In this paper, we present the studies, the concepts and the solutions of a research project that introduces these three key features in a novel search engine architecture.
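
    As a small illustration of geo-referenced contextualisation (the project's own ranking is not described in this abstract), a result's textual relevance can be decayed by its distance from the user; the haversine distance and the exponential decay below are standard choices, not the project's:

```python
# Toy geo-contextual ranking: down-weight textual relevance by distance.
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def contextual_score(text_score, user_pos, resource_pos, decay_km=50.0):
    """Combine textual relevance with proximity to the user."""
    d = haversine_km(*user_pos, *resource_pos)
    return text_score * math.exp(-d / decay_km)

# A nearby resource outranks a distant one with equal textual relevance.
print(contextual_score(0.8, (45.07, 7.69), (45.07, 7.70)))
print(contextual_score(0.8, (45.07, 7.69), (52.52, 13.40)))
```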

    A software system for agent-assisted ontology building

    Get PDF
    This thesis investigates how one can design a team of intelligent software agents that helps its human partner develop a formal ontology from a relational database and enhance it with higher-level abstractions. The resulting efficiency of ontology development could facilitate the building of intelligent decision support systems that allow high-level semantic queries on legacy relational databases, autonomous implementation within a host organization, and incremental deployment without affecting the underlying database or its conventional use. We introduce a set of design principles, formulate the prototype system requirements and architecture, elaborate agent roles and interactions, develop suitable design techniques, and test the approach through practical implementation of selected features. We endow each agent with a model meta-ontology, which enables it to reason and communicate about the ontology, and a planning meta-ontology, which captures the role-specific know-how of the ontology building method. We also assess the maturity of development tools for a larger-scale implementation. The original print copy of this thesis may be available here: http://wizard.unbc.ca/record=b214471
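
    The core of the database-to-ontology step can be sketched with a straightforward mapping: tables become classes, plain columns become datatype properties, and foreign keys become object properties. The thesis's agents plan and negotiate around such mappings; the standalone function and input format below are assumptions for illustration.

```python
# Minimal sketch of deriving an ontology skeleton from a relational schema.

def db_schema_to_ontology(tables):
    """tables: {name: {"columns": [...], "foreign_keys": {col: table}}}"""
    classes, datatype_props, object_props = [], [], []
    for table, spec in tables.items():
        classes.append(table.capitalize())
        for col in spec["columns"]:
            if col in spec.get("foreign_keys", {}):
                # A foreign key becomes an object property to the target class.
                target = spec["foreign_keys"][col].capitalize()
                object_props.append((f"has{target}", table.capitalize(), target))
            else:
                # A plain column becomes a datatype property of the class.
                datatype_props.append((col, table.capitalize()))
    return {"classes": classes,
            "datatype_properties": datatype_props,
            "object_properties": object_props}

schema = {
    "employee": {"columns": ["name", "dept_id"],
                 "foreign_keys": {"dept_id": "department"}},
    "department": {"columns": ["title"]},
}
print(db_schema_to_ontology(schema))
```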

    Model driven design and data integration in semantic web information systems

    Get PDF
    The Web is quickly evolving in many ways. It has evolved from a Web of documents into a Web of applications, in which a growing number of designers offer new and interactive Web applications to people all over the world. However, application design and implementation remain complex, error-prone and laborious. In parallel, there is also an evolution from a Web of documents into a Web of 'knowledge', as a growing number of data owners share their data sources with a growing audience. This opens up potential new applications for these data sources, including scenarios in which datasets are reused and integrated with other existing and new data sources. However, the heterogeneity of these data sources in syntax, semantics and structure represents a great challenge for application designers. The Semantic Web is a collection of standards and technologies that offer solutions for at least the syntactic and some of the structural issues. It offers semantic freedom and flexibility, but this leaves the issue of semantic interoperability.

    In this thesis we present Hera-S, an evolution of the Model Driven Web Engineering (MDWE) method Hera. MDWE methods allow designers to create data-centric applications using models instead of programming. Hera-S especially targets Semantic Web sources and provides a flexible method for designing personalized adaptive Web applications. Hera-S defines several models that together define the target Web application. Moreover, we implemented a framework called Hydragen, which is able to execute the Hera-S models to run the desired Web application. Hera-S' core is the Application Model (AM), in which the main logic of the application is defined, i.e. the groups of data elements that form logical units or subunits, the personalization conditions, and the relationships between the units. Hera-S also uses a so-called Domain Model (DM) that describes the content and its structure. However, this DM is not Hera-S specific; any Semantic Web source representation can serve as the DM, as long as its content can be queried with the standardized Semantic Web query language SPARQL. The same holds for the User Model (UM). The UM can be used for personalization conditions, but also as a source of user-related content if necessary. In fact, the difference between DM and UM is conceptual, as their implementation within Hydragen is the same. Hera-S also defines a Presentation Model (PM), which defines presentation details of elements such as order and style.

    In order to help designers build their Web applications we have introduced a toolset, Hera Studio, which allows the different models to be built graphically. Hera Studio also provides additional functionality such as model checking and deployment of the models in Hydragen. Both Hera-S and its implementation Hydragen are designed to be flexible regarding the use of models. To achieve this, Hydragen is a stateless engine that queries the models for relevant information at every page request. This allows the models and data to be changed in the datastore during runtime. We show that one way to exploit this flexibility is by applying aspect-orientation to the AM. Aspect-orientation allows us to dynamically inject functionality that pervades the entire application. Another way to exploit Hera-S' flexibility is by reusing specialized components, e.g. for presentation generation. We present a configuration of Hydragen in which we replace our native presentation generation functionality with the AMACONT engine. AMACONT provides more extensive multi-level presentation generation and adaptation capabilities, as well as aspect-orientation and a form of semantics-based adaptation.

    Hera-S was designed to allow the (re-)use of any (Semantic) Web data source. It even opens up the possibility of data integration at the back end, by using an extensible storage layer in our database of choice, Sesame. However, even though this is theoretically possible, much of the actual data integration issue remains. As this is a recurring issue in many domains, and a broader challenge than Hera-S design alone, we decided to look at it in isolation. We present a framework called Relco, which provides a language to express data transformation operations as well as a collection of techniques that can be used to (semi-)automatically find relationships between concepts in different ontologies. This is done with a combination of syntactic, semantic and collaboration techniques, which together provide strong clues as to which concepts are most likely related.

    To prove the applicability of Relco we explore five application scenarios in different domains for which data integration is a central aspect. The first is a cultural heritage portal, Explorer, for which data from several data sources was integrated and made available via a map view, a timeline and a graph view; Explorer also allows users to provide metadata for objects via a tagging mechanism. Another application is SenSee, an electronic TV guide and recommender: TV guide data was integrated and enriched with semantically structured data from several sources, and recommendations are computed by exploiting the underlying semantic structure. ViTa was a project in which several techniques for tagging and searching educational videos were evaluated, including scenarios in which user tags are related to an ontology, or to other tags, using the Relco framework. The MobiLife project targeted the facilitation of a new generation of mobile applications that would use context-based personalization; this can be done using a context-based user profiling platform that can also be used for user model data exchange between mobile applications using technologies like Relco. The final application scenario is from the GRAPPLE project, which targeted the integration of adaptive technology into current learning management systems. A large part of this integration is achieved by using a user modeling component framework in which any application can store user model information, and which can also be used for the exchange of user model data
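
    Since any source queryable via SPARQL can serve as a DM, the kind of retrieval an AM unit performs can be sketched with a small example using the rdflib Python library; the vocabulary URIs, data and the notion of a "unit" as a query are invented for illustration and are not Hera-S' actual syntax.

```python
# Sketch of an AM-style unit issuing a SPARQL query against a tiny DM.
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:item1 ex:title "First painting" ; ex:artist "A. Painter" .
ex:item2 ex:title "Second painting" ; ex:artist "B. Sculptor" .
""", format="turtle")

# A "unit" grouping title and artist for every item in the Domain Model.
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?title ?artist WHERE { ?item ex:title ?title ; ex:artist ?artist . }
""")
for title, artist in results:
    print(title, "-", artist)
```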

    DART: the distributed agent based retrieval toolkit

    Get PDF
    The technology of search engines is evolving from indexing and classification of web resources based on keywords to more sophisticated techniques which take into account the meaning, the context and the usage of textual information. In reply to a query, commercial search engines confront the user with a large number of results, mostly useless or only partially related to the request; the subsequent refinement, performed by downloading and examining as many pages as possible and simply ignoring whatever lies beyond the first few pages, is left up to the user. Furthermore, architectures based on centralized indexes allow commercial search engines to control the advertisement of online information, in contrast to P2P architectures, which focus attention on user requirements and involve the end user in search engine maintenance and operation. To address these needs, new search engines should focus on three key aspects: semantics, geo-referencing, and collaboration/distribution. Semantic analysis increases the relevance of results. The geo-referencing of catalogued resources allows contextualisation based on the user's position. Collaboration distributes storage, processing, and trust over a world-wide network of nodes running on users’ computers, getting rid of bottlenecks and central points of failure. In this paper, we describe the studies, the concepts and the solutions developed in the DART project to introduce these three key features in a novel search engine architecture
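
    As a toy illustration of the collaboration/distribution aspect (not DART's actual overlay), keywords can be hashed to the user nodes responsible for indexing them, so that publishing and lookup need no central index; the node names and scheme below are illustrative only:

```python
# Toy distributed index: each keyword is owned by the node it hashes to.
# Real P2P engines use far more robust overlays such as DHTs.
import hashlib

NODES = ["node-a", "node-b", "node-c"]

def node_for(keyword):
    """Deterministically map a keyword to the node that indexes it."""
    h = int(hashlib.sha1(keyword.encode()).hexdigest(), 16)
    return NODES[h % len(NODES)]

def publish(index, keyword, url):
    """Store the posting on the node responsible for the keyword."""
    index.setdefault(node_for(keyword), {}).setdefault(keyword, []).append(url)

def lookup(index, keyword):
    """Route the query straight to the responsible node."""
    return index.get(node_for(keyword), {}).get(keyword, [])

index = {}
publish(index, "semantics", "http://example.org/page")
print(node_for("semantics"), lookup(index, "semantics"))
```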

    Internet based molecular collaborative and publishing tools

    No full text
    The scientific electronic publishing model has hitherto been an Internet-based delivery of electronic articles that are essentially replicas of their paper counterparts. They contain little in the way of added semantics that might better expose the science, assist the peer review process and facilitate follow-on collaborations, even though the enabling technologies have been around for some time and are mature. This thesis examines the evolution of chemical electronic publishing over the past 15 years. It illustrates, with the help of two frameworks, how publishers should be exploiting technologies to improve the semantics of chemical journal articles, namely their value-added features and relationships with other chemical resources on the Web. The first framework is an early exemplar of structured and scalable electronic publishing in which a Web content management system and a molecular database are integrated. It employs a test bed of articles from several RSC journals together with supporting molecular coordinate and connectivity information. The value of converting 3D molecular expressions in chemical file formats, such as the MOL file, into more generic 3D graphics formats, such as Web3D, is assessed. This exemplar highlights the use of metadata management for bidirectional hyperlink maintenance in electronic publishing. The second framework repurposes this metadata management concept into a Semantic Web application called SemanticEye. SemanticEye demonstrates how relationships between chemical electronic articles and other chemical resources can be established. It adapts the successful semantic model used for digital music metadata management by popular applications such as iTunes. Globally unique identifiers enable relationships to be established between articles and other resources on the Web, and SemanticEye implements two: the Digital Object Identifier (DOI) for articles and the IUPAC International Chemical Identifier (InChI) for molecules. SemanticEye’s potential as a framework for seeding collaborations between researchers who have hitherto never met is explored using FOAF, the friend-of-a-friend Semantic Web standard for social networks
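
    A hedged sketch of the identifier-based linking that SemanticEye enables: articles are keyed by DOI, molecules by InChI, and a "mentions" relation connects them, so articles sharing a molecule become mutually discoverable. The specific DOIs, InChIs and helper names below are made up for illustration.

```python
# Toy article-molecule link store using DOI and InChI identifiers.

mentions = [
    ("doi:10.1000/example.1", "InChI=1S/H2O/h1H2"),   # article -> molecule
    ("doi:10.1000/example.2", "InChI=1S/H2O/h1H2"),
    ("doi:10.1000/example.2", "InChI=1S/CH4/h1H4"),
]

def articles_about(inchi):
    """All articles that mention a molecule with this InChI."""
    return [doi for doi, m in mentions if m == inchi]

def related_articles(doi):
    """Articles sharing at least one molecule with the given article."""
    mols = {m for d, m in mentions if d == doi}
    return {d for d, m in mentions if m in mols and d != doi}

print(articles_about("InChI=1S/H2O/h1H2"))
print(related_articles("doi:10.1000/example.1"))
```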

    Creating Network Attack Priority Lists by Analyzing Email Traffic Using Predefined Profiles

    Get PDF
    Networks can be vast and complicated entities consisting of both servers and workstations that contain information sought by attackers. Searching for specific data in a large network can be a time-consuming process. Vast amounts of data either pass through or are stored by various servers on the network; however, intermediate work products are often kept solely on workstations. Potential high-value targets can be passively identified by comparing user email traffic against predefined profiles. This method offers a potentially smaller footprint on target systems, less human interaction, and increased efficiency for attackers. Collecting user email traffic and comparing each word in an email to a predefined profile, a list of key words of interest to the attacker, can provide a prioritized list of systems containing the most relevant information. This research uses two experiments. The functionality experiment uses randomly generated emails and profiles, demonstrating the ability of MAPS (Merritt's Adaptive Profiling System) to accurately identify matches. The utility experiment uses an email corpus and meaningful profiles, further demonstrating MAPS's ability to accurately identify matches with non-random input. A meaningful profile is a list of words bearing a semantic relationship to a topic of interest to the attacker. Results for the functionality experiment show that MAPS can parse randomly generated emails and identify matches with an accuracy of 99 percent or above. The utility experiment, using an email corpus with meaningful profiles, shows slightly lower accuracies of 95 percent or above. Based upon the match results, network attack priority lists are generated. A network attack priority list is an ordered list of systems in which the potentially highest-value systems exhibit the greatest fit to the profile. An attacker then uses the list when searching for target information on the network to prioritize the systems most likely to contain useful data
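
    The profile-matching step lends itself to a short sketch: count occurrences of profile words in each system's captured email traffic and sort systems by that fit to obtain the attack priority list. MAPS's real scoring and data layout are not reproduced here; everything below is an assumption.

```python
# Toy profile matching and priority-list generation.
from collections import Counter

def profile_score(emails, profile):
    """Total occurrences of profile words across a system's emails."""
    words = Counter(w.lower() for e in emails for w in e.split())
    return sum(words[p.lower()] for p in profile)

def attack_priority_list(traffic, profile):
    """traffic: {system: [email bodies]} -> systems sorted by profile fit."""
    scores = {sys: profile_score(emails, profile)
              for sys, emails in traffic.items()}
    return sorted(scores, key=scores.get, reverse=True)

traffic = {"ws-01": ["quarterly merger draft attached"],
           "ws-02": ["lunch on friday?"]}
print(attack_priority_list(traffic, ["merger", "draft"]))  # ws-01 first
```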

    A Reinforcement Learning Quality of Service Negotiation Framework For IoT Middleware

    Get PDF
    The Internet of Things (IoT) ecosystem is characterised by heterogeneous devices dynamically interacting with each other to perform a specific task, often without human intervention. This interaction typically occurs in a service-oriented manner and is facilitated by an IoT middleware. The service provision paradigm enables the functionalities of IoT devices to be provided as IoT services that perform actuation tasks in safety-critical systems such as autonomous connected vehicle systems and industrial control systems. As IoT systems are increasingly deployed into environments characterised by continuous change and uncertainty, there have been growing concerns about how to resolve Quality of Service (QoS) contentions between heterogeneous devices with conflicting preferences so as to guarantee the execution of mission-critical actuation tasks. As IoT devices with different QoS constraints, acting as IoT service providers, spontaneously interact with IoT service consumers with varied QoS requirements, it becomes essential to find the best way to establish and manage the QoS agreement in the middleware, since a compromise in the QoS could lead to negative consequences. This thesis presents a QoS negotiation framework, IoTQoSystem, for IoT service-oriented middleware. The QoS framework is underpinned by a negotiation process that is modelled as a Markov Decision Process (MDP). A model-based Reinforcement Learning negotiation strategy is proposed for generating an acceptable QoS solution in dynamic, multilateral and multi-parameter scenarios. A microservice-oriented negotiation architecture is developed that combines negotiation, monitoring and forecasting to provide a self-managing mechanism for ensuring the successful execution of actuation tasks in an IoT environment. Using a case study, the developed QoS negotiation framework was evaluated on real-world data sets with different negotiation scenarios to illustrate its scalability, reliability and performance
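
    The negotiation-as-MDP framing can be made concrete with a toy example: states are negotiation rounds, actions are concession steps, and the reward is the utility of an accepted offer. The thesis proposes a model-based strategy, whereas the sketch below uses plain tabular Q-learning as a simpler stand-in; all numbers and the opponent model are invented.

```python
# Toy MDP formulation of QoS negotiation, trained with tabular Q-learning.
import random
from collections import defaultdict

ACTIONS = [0.0, 0.1, 0.2]             # how much QoS to concede this round
Q = defaultdict(float)                 # values keyed by (round, action index)
alpha, gamma, eps = 0.1, 0.9, 0.2      # learning rate, discount, exploration

def opponent_accepts(offer):
    """Toy consumer: accepts once the provider's demand is modest enough."""
    return offer <= 0.6

for episode in range(2000):
    offer = 1.0                        # provider starts from its ideal QoS level
    for state in range(5):             # at most five negotiation rounds
        if random.random() < eps:      # epsilon-greedy action selection
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q[(state, i)])
        offer -= ACTIONS[a]
        accepted = opponent_accepts(offer)
        reward = offer if accepted else 0.0   # utility of an accepted offer
        done = accepted or state == 4
        nxt = 0.0 if done else max(Q[(state + 1, i)] for i in range(len(ACTIONS)))
        Q[(state, a)] += alpha * (reward + gamma * nxt - Q[(state, a)])
        if done:
            break
```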