48 research outputs found

    Configuring and Assembling Information Retrieval based Solutions for Software Engineering Tasks.

    Get PDF
    Information Retrieval (IR) approaches are used to leverage textual or unstructured data generated during the software development process to support various software engineering (SE) tasks (e.g., concept location, traceability link recovery, change impact analysis, etc.). Two of the most important steps for applying IR techniques to support SE tasks are preprocessing the corpus and configuring the IR technique, and these steps can significantly influence the outcome and the amount of effort developers have to spend for these maintenance tasks. We present the use of Genetic Algorithms (GAs) to automatically configure and assemble an IR process to support SE tasks. The approach named IR-GA determines the (near) optimal solution to be used for each step of the IR process without requiring any training. We applied IR-GA on three different SE tasks and the results of the study indicate that IR-GA outperforms approaches previously used in the literature, and that it does not significantly differ from an ideal upper bound that could be achieved by a supervised approach and a combinatorial approach

    Content Recommendation Through Linked Data

    Get PDF
    Nowadays, people can easily obtain a huge amount of information from the Web, but often they have no criteria to discern it. This issue is known as information overload. Recommender systems are software tools to suggest interesting items to users and can help them to deal with a vast amount of information. Linked Data is a set of best practices to publish data on the Web, and it is the basis of the Web of Data, an interconnected global dataspace. This thesis discusses how to discover information useful for the user from the vast amount of structured data, and notably Linked Data available on the Web. The work addresses this issue by considering three research questions: how to exploit existing relationships between resources published on the Web to provide recommendations to users; how to represent the user and his context to generate better recommendations for the current situation; and how to effectively visualize the recommended resources and their relationships. To address the first question, the thesis proposes a new algorithm based on Linked Data which exploits existing relationships between resources to recommend related resources. The algorithm was integrated into a framework to deploy and evaluate Linked Data based recommendation algorithms. In fact, a related problem is how to compare them and how to evaluate their performance when applied to a given dataset. The user evaluation showed that our algorithm improves the rate of new recommendations, while maintaining a satisfying prediction accuracy. To represent the user and their context, this thesis presents the Recommender System Context ontology, which is exploited in a new context-aware approach that can be used with existing recommendation algorithms. The evaluation showed that this method can significantly improve the prediction accuracy. As regards the problem of effectively visualizing the recommended resources and their relationships, this thesis proposes a visualization framework for DBpedia (the Linked Data version of Wikipedia) and mobile devices, which is designed to be extended to other datasets. In summary, this thesis shows how it is possible to exploit structured data available on the Web to recommend useful resources to users. Linked Data were successfully exploited in recommender systems. Various proposed approaches were implemented and applied to use cases of Telecom Italia

    Recommender system performance evaluation and prediction: information retrieval perspective

    Full text link
    Tesis doctoral inédita. Universidad Autónoma de Madrid, Escuela Politécnica Superior, octubre de 201

    Search-Based Software Maintenance and Testing

    Get PDF
    2012 - 2013In software engineering there are many expensive tasks that are performed during development and maintenance activities. Therefore, there has been a lot of e ort to try to automate these tasks in order to signi cantly reduce the development and maintenance cost of software, since the automation would require less human resources. One of the most used way to make such an automation is the Search-Based Software Engineering (SBSE), which reformulates traditional software engineering tasks as search problems. In SBSE the set of all candidate solutions to the problem de nes the search space while a tness function di erentiates between candidate solutions providing a guidance to the optimization process. After the reformulation of software engineering tasks as optimization problems, search algorithms are used to solve them. Several search algorithms have been used in literature, such as genetic algorithms, genetic programming, simulated annealing, hill climbing (gradient descent), greedy algorithms, particle swarm and ant colony. This thesis investigates and proposes the usage of search based approaches to reduce the e ort of software maintenance and software testing with particular attention to four main activities: (i) program comprehension; (ii) defect prediction; (iii) test data generation and (iv) test suite optimiza- tion for regression testing. For program comprehension and defect prediction, this thesis provided their rst formulations as optimization problems and then proposed the usage of genetic algorithms to solve them. More precisely, this thesis investigates the peculiarity of source code against textual documents written in natural language and proposes the usage of Genetic Algorithms (GAs) in order to calibrate and assemble IR-techniques for di erent software engineering tasks. This thesis also investigates and proposes the usage of Multi-Objective Genetic Algorithms (MOGAs) in or- der to build multi-objective defect prediction models that allows to identify defect-prone software components by taking into account multiple and practical software engineering criteria. Test data generation and test suite optimization have been extensively investigated as search- based problems in literature . However, despite the huge body of works on search algorithms applied to software testing, both (i) automatic test data generation and (ii) test suite optimization present several limitations and not always produce satisfying results. The success of evolutionary software testing techniques in general, and GAs in particular, depends on several factors. One of these factors is the level of diversity among the individuals in the population, which directly a ects the exploration ability of the search. For example, evolutionary test case generation techniques that employ GAs could be severely a ected by genetic drift, i.e., a loss of diversity between solutions, which lead to a premature convergence of GAs towards some local optima. For these reasons, this thesis investigate the role played by diversity preserving mechanisms on the performance of GAs and proposed a novel diversity mechanism based on Singular Value Decomposition and linear algebra. Then, this mechanism has been integrated within the standard GAs and evaluated for evolutionary test data generation. It has been also integrated within MOGAs and empirically evaluated for regression testing. [edited by author]XII n.s

    Recommender Systems based on Linked Data

    Get PDF
    Backgrounds: The increase in the amount of structured data published using the principles of Linked Data, means that now it is more likely to find resources in the Web of Data that describe real life concepts. However, discovering resources related to any given resource is still an open research area. This thesis studies Recommender Systems (RS) that use Linked Data as a source for generating recommendations exploiting the large amount of available resources and the relationships among them. Aims: The main objective of this study was to propose a recommendation tech- nique for resources considering semantic relationships between concepts from Linked Data. The specific objectives were: (i) Define semantic relationships derived from resources taking into account the knowledge found in Linked Data datasets. (ii) Determine semantic similarity measures based on the semantic relationships derived from resources. (iii) Propose an algorithm to dynami- cally generate automatic rankings of resources according to defined similarity measures. Methodology: It was based on the recommendations of the Project management Institute and the Integral Model for Engineering Professionals (Universidad del Cauca). The first one for managing the project, and the second one for developing the experimental prototype. Accordingly, the main phases were: (i) Conceptual base generation for identifying the main problems, objectives and the project scope. A Systematic Literature Review was conducted for this phase, which highlighted the relationships and similarity measures among resources in Linked Data, and the main issues, features, and types of RS based on Linked Data. (ii) Solution development is about designing and developing the experimental prototype for testing the algorithms studied in this thesis. Results: The main results obtained were: (i) The first Systematic Literature Re- view on RS based on Linked Data. (ii) A framework to execute and an- alyze recommendation algorithms based on Linked Data. (iii) A dynamic algorithm for resource recommendation based on on the knowledge of Linked Data relationships. (iv) A comparative study of algorithms for RS based on Linked Data. (v) Two implementations of the proposed framework. One with graph-based algorithms and other with machine learning algorithms. (vi) The application of the framework to various scenarios to demonstrate its feasibility within the context of real applications. Conclusions: (i) The proposed framework demonstrated to be useful for develop- ing and evaluating different configurations of algorithms to create novel RS based on Linked Data suitable to users’ requirements, applications, domains and contexts. (ii) The layered architecture of the proposed framework is also useful towards the reproducibility of the results for the research community. (iii) Linked data based RS are useful to present explanations of the recommen- dations, because of the graph structure of the datasets. (iv) Graph-based algo- rithms take advantage of intrinsic relationships among resources from Linked Data. Nevertheless, their execution time is still an open issue. Machine Learn- ing algorithms are also suitable, they provide functions useful to deal with large amounts of data, so they can help to improve the performance (execution time) of the RS. However most of them need a training phase that require to know a priory the application domain in order to obtain reliable results. (v) A log- ical evolution of RS based on Linked Data is the combination of graph-based with machine learning algorithms to obtain accurate results while keeping low execution times. However, research and experimentation is still needed to ex- plore more techniques from the vast amount of machine learning algorithms to determine the most suitable ones to deal with Linked Data

    24th International Conference on Information Modelling and Knowledge Bases

    Get PDF
    In the last three decades information modelling and knowledge bases have become essentially important subjects not only in academic communities related to information systems and computer science but also in the business area where information technology is applied. The series of European – Japanese Conference on Information Modelling and Knowledge Bases (EJC) originally started as a co-operation initiative between Japan and Finland in 1982. The practical operations were then organised by professor Ohsuga in Japan and professors Hannu Kangassalo and Hannu Jaakkola in Finland (Nordic countries). Geographical scope has expanded to cover Europe and also other countries. Workshop characteristic - discussion, enough time for presentations and limited number of participants (50) / papers (30) - is typical for the conference. Suggested topics include, but are not limited to: 1. Conceptual modelling: Modelling and specification languages; Domain-specific conceptual modelling; Concepts, concept theories and ontologies; Conceptual modelling of large and heterogeneous systems; Conceptual modelling of spatial, temporal and biological data; Methods for developing, validating and communicating conceptual models. 2. Knowledge and information modelling and discovery: Knowledge discovery, knowledge representation and knowledge management; Advanced data mining and analysis methods; Conceptions of knowledge and information; Modelling information requirements; Intelligent information systems; Information recognition and information modelling. 3. Linguistic modelling: Models of HCI; Information delivery to users; Intelligent informal querying; Linguistic foundation of information and knowledge; Fuzzy linguistic models; Philosophical and linguistic foundations of conceptual models. 4. Cross-cultural communication and social computing: Cross-cultural support systems; Integration, evolution and migration of systems; Collaborative societies; Multicultural web-based software systems; Intercultural collaboration and support systems; Social computing, behavioral modeling and prediction. 5. Environmental modelling and engineering: Environmental information systems (architecture); Spatial, temporal and observational information systems; Large-scale environmental systems; Collaborative knowledge base systems; Agent concepts and conceptualisation; Hazard prediction, prevention and steering systems. 6. Multimedia data modelling and systems: Modelling multimedia information and knowledge; Contentbased multimedia data management; Content-based multimedia retrieval; Privacy and context enhancing technologies; Semantics and pragmatics of multimedia data; Metadata for multimedia information systems. Overall we received 56 submissions. After careful evaluation, 16 papers have been selected as long paper, 17 papers as short papers, 5 papers as position papers, and 3 papers for presentation of perspective challenges. We thank all colleagues for their support of this issue of the EJC conference, especially the program committee, the organising committee, and the programme coordination team. The long and the short papers presented in the conference are revised after the conference and published in the Series of “Frontiers in Artificial Intelligence” by IOS Press (Amsterdam). The books “Information Modelling and Knowledge Bases” are edited by the Editing Committee of the conference. We believe that the conference will be productive and fruitful in the advance of research and application of information modelling and knowledge bases. Bernhard Thalheim Hannu Jaakkola Yasushi Kiyok

    Context-Aware Recommendation Systems in Mobile Environments

    Get PDF
    Nowadays, the huge amount of information available may easily overwhelm users when they need to take a decision that involves choosing among several options. As a solution to this problem, Recommendation Systems (RS) have emerged to offer relevant items to users. The main goal of these systems is to recommend certain items based on user preferences. Unfortunately, traditional recommendation systems do not consider the user’s context as an important dimension to ensure high-quality recommendations. Motivated by the need to incorporate contextual information during the recommendation process, Context-Aware Recommendation Systems (CARS) have emerged. However, these recent recommendation systems are not designed with mobile users in mind, where the context and the movements of the users and items may be important factors to consider when deciding which items should be recommended. Therefore, context-aware recommendation models should be able to effectively and efficiently exploit the dynamic context of the mobile user in order to offer her/him suitable recommendations and keep them up-to-date.The research area of this thesis belongs to the fields of context-aware recommendation systems and mobile computing. We focus on the following scientific problem: how could we facilitate the development of context-aware recommendation systems in mobile environments to provide users with relevant recommendations? This work is motivated by the lack of generic and flexible context-aware recommendation frameworks that consider aspects related to mobile users and mobile computing. In order to solve the identified problem, we pursue the following general goal: the design and implementation of a context-aware recommendation framework for mobile computing environments that facilitates the development of context-aware recommendation applications for mobile users. In the thesis, we contribute to bridge the gap not only between recommendation systems and context-aware computing, but also between CARS and mobile computing.<br /

    Trust models for mobile content-sharing applications

    Get PDF
    Using recent technologies such as Bluetooth, mobile users can share digital content (e.g., photos, videos) with other users in proximity. However, to reduce the cognitive load on mobile users, it is important that only appropriate content is stored and presented to them. This dissertation examines the feasibility of having mobile users filter out irrelevant content by running trust models. A trust model is a piece of software that keeps track of which devices are trusted (for sending quality content) and which are not. Unfortunately, existing trust models are not fit for purpose. Specifically, they lack the ability to: (1) reason about ratings other than binary ratings in a formal way; (2) rely on the trustworthiness of stored third-party recommendations; (3) aggregate recommendations to make accurate predictions of whom to trust; and (4) reason across categories without resorting to ontologies that are shared by all users in the system. We overcome these shortcomings by designing and evaluating algorithms and protocols with which portable devices are able automatically to maintain information about the reputability of sources of content and to learn from each other’s recommendations. More specifically, our contributions are: 1. An algorithm that formally reasons on generic (not necessarily binary) ratings using Bayes’ theorem. 2. A set of security protocols with which devices store ratings in (local) tamper-evident tables and are able to check the integrity of those tables through a gossiping protocol. 3. An algorithm that arranges recommendations in a “Web of Trust” and that makes predictions of trustworthiness that are more accurate than existing approaches by using graph-based learning. 4. An algorithm that learns the similarity between any two categories by extracting similarities between the two categories’ ratings rather than by requiring a universal ontology. It does so automatically by using Singular Value Decomposition. We combine these algorithms and protocols and, using real-world mobility and social network data, we evaluate the effectiveness of our proposal in allowing mobile users to select reputable sources of content. We further examine the feasibility of implementing our proposal on current mobile phones by examining the storage and computational overhead it entails. We conclude that our proposal is both feasible to implement and performs better across a range of parameters than a number of current alternatives
    corecore