
    INRISCO: INcident monitoRing in Smart COmmunities

    Major advances in information and communication technologies (ICTs) allow citizens to be considered sensors in motion. Carrying their mobile devices, moving in their connected vehicles or actively participating in social networks, citizens provide a wealth of information that, after proper processing, can support numerous applications for the benefit of the community. In the context of smart communities, the INRISCO [1] proposal aims at (i) the early detection of abnormal situations in cities (i.e., incidents), (ii) the analysis of whether, according to their impact, those incidents are really adverse for the community, and (iii) automatic actuation by disseminating appropriate information to citizens and authorities. Thus, INRISCO will identify and report incidents in traffic (e.g., jams, accidents) or public infrastructure (e.g., works, street closures), the occurrence of specific events that affect other citizens' lives (e.g., demonstrations, concerts), or environmental problems (e.g., pollution, bad weather). Of particular interest to this proposal is the identification of incidents with a social and economic impact that affect the quality of life of citizens. This work was supported in part by the Spanish Government through the projects INRISCO under Grant TEC2014-54335-C4-1-R, Grant TEC2014-54335-C4-2-R, Grant TEC2014-54335-C4-3-R, and Grant TEC2014-54335-C4-4-R, and MAGOS under Grant TEC2017-84197-C4-1-R, Grant TEC2017-84197-C4-2-R, and Grant TEC2017-84197-C4-3-R, in part by the European Regional Development Fund (ERDF), and in part by the Galician Regional Government under the agreement for funding the Atlantic Research Center for Information and Communication Technologies (AtlantTIC).

    Improving data preparation for the application of process mining

    Immersed in what is already known as the fourth industrial revolution, automation and data exchange are taking on a particularly relevant role in complex environments such as industrial manufacturing or logistics. This digitisation and transition to the Industry 4.0 paradigm is leading experts to analyse business processes from other perspectives. Consequently, where management and business intelligence used to dominate, process mining appears as a link, building a bridge between both disciplines to unite and improve them. This new perspective on process analysis helps to improve strategic decision making and competitive capabilities. Process mining brings together the data and process perspectives in a single discipline that covers the entire spectrum of process management. Through process mining, and based on observations of their actual operations, organisations can understand the state of their operations, detect deviations, and improve their performance based on what they observe. In this way, process mining has become an ally, occupying a large part of current academic and industrial research. However, although this discipline is receiving more and more attention, it presents severe problems when applied in real environments. The variety of input data in terms of form, content, semantics, and levels of abstraction makes the execution of process mining tasks in industry an iterative, tedious, and manual process, requiring multidisciplinary experts with extensive knowledge of the domain, process management, and data processing. Currently, although there are numerous academic proposals, there are no industrial solutions capable of automating these tasks.

    For this reason, in this thesis by compendium we address the problem of improving business processes in complex environments through a study of the state of the art and a set of proposals that improve relevant aspects of the process life cycle, from the creation and preparation of logs to process quality assessment and the improvement of business processes. First, a systematic literature review was carried out to gain in-depth knowledge of the state of the art in this field and of the different challenges faced by the discipline. This analysis allowed us to detect a number of challenges that have not been addressed or have received insufficient attention, three of which were selected as the objectives of this thesis.

    The first challenge concerns the assessment of the quality of the input data, known as event logs: deciding which techniques should be applied to improve an event log must be based on the quality of the initial data. This thesis therefore presents a methodology and a set of metrics that support the expert in selecting which technique to apply according to the quality estimated at each moment, another challenge identified in our analysis of the literature. Likewise, a set of metrics to evaluate the quality of the resulting process models is also proposed, with the aim of assessing whether improving the quality of the input data has a direct impact on the final results.

    The second challenge identified is the need to improve the input data used in the analysis of business processes. As in any data-driven discipline, the quality of the results strongly depends on the quality of the input data, so the second challenge addressed is the improvement of event log preparation. The contribution in this area is the application of natural language processing techniques to relabel activities from textual descriptions of process activities, as well as the application of clustering techniques to simplify the results, generating models that are more understandable from a human point of view. Finally, the third challenge detected relates to process optimisation: we contribute an approach for optimising the resources associated with business processes which, by including decision-making in the creation of flexible processes, enables significant cost reductions. Furthermore, all the proposals made in this thesis were designed and validated in collaboration with experts from different fields of industry and have been evaluated through real case studies in public and private projects in collaboration with the aeronautical industry and the logistics sector.
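
    As an illustration of the relabelling idea in the second challenge, the following minimal sketch groups noisy activity labels by the textual similarity of their descriptions and assigns each group a single canonical label. The example data, the TF-IDF representation, and the agglomerative clustering with its distance threshold are assumptions for illustration only, not the techniques actually used in the thesis (scikit-learn >= 1.2 is assumed for the `metric` parameter).

        # Sketch only: cluster activity labels by the similarity of their textual
        # descriptions, then relabel each activity with its cluster's canonical name.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.cluster import AgglomerativeClustering

        descriptions = {                      # hypothetical activity descriptions
            "reg_ord_01": "register customer order in ERP",
            "order_reg":  "customer order registered in the ERP system",
            "ship_good":  "goods shipped to customer",
            "send_goods": "ship goods and notify customer",
        }

        labels = list(descriptions)
        vectors = TfidfVectorizer().fit_transform(descriptions.values()).toarray()

        # Cosine-distance agglomerative clustering; distance_threshold controls how
        # aggressively label variants are merged into one canonical activity.
        clustering = AgglomerativeClustering(
            n_clusters=None, distance_threshold=0.8, metric="cosine", linkage="average"
        ).fit(vectors)

        relabeled = {}
        for cluster_id in set(clustering.labels_):
            members = [l for l, c in zip(labels, clustering.labels_) if c == cluster_id]
            canonical = members[0]            # pick one member as the canonical label
            for m in members:
                relabeled[m] = canonical

        print(relabeled)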

    Models, Techniques, and Metrics for Managing Risk in Software Engineering

    The field of Software Engineering (SE) is the study of systematic and quantifiable approaches to software development, operation, and maintenance. This thesis presents a set of scalable and easily implemented techniques for quantifying and mitigating risks associated with the SE process. The thesis comprises six papers corresponding to SE knowledge areas such as software requirements, testing, and management. The techniques for risk management are drawn from stochastic modeling and operational research. The first two papers relate to software testing and maintenance. The first paper describes and validates a novel iterative-unfolding technique for filtering a set of execution traces relevant to a specific task. The second paper analyzes and validates the applicability of some entropy measures to the trace classification described in the first paper. The techniques in these two papers can speed up problem determination for defects encountered by customers, leading to an improved organizational response, increased customer satisfaction, and an easing of resource constraints. The third and fourth papers are applicable to maintenance, overall software quality, and SE management. The third paper uses Extreme Value Theory and Queuing Theory tools to derive and validate metrics based on defect rediscovery data. The metrics can aid the allocation of resources to service and maintenance teams, highlight gaps in quality assurance processes, and help assess the risk of using a given software product. The fourth paper characterizes and validates a technique for the automatic selection and prioritization of a minimal set of customers for profiling. The minimal set is obtained using Binary Integer Programming and prioritized using a greedy heuristic. Profiling the resulting customer set leads to an enhanced comprehension of user behaviour, which improves test specifications and clarifies quality assurance policies, hence reducing risks associated with unsatisfactory product quality. The fifth and sixth papers pertain to software requirements. The fifth paper both models the relation between requirements and their underlying assumptions and measures the risk associated with failure of those assumptions, using Boolean networks and stochastic modeling. The sixth paper models the risk associated with the injection of requirements late in the development cycle with the help of stochastic processes.
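
    To illustrate the coverage idea behind the fourth paper's customer selection, the sketch below uses a plain greedy set-cover heuristic over hypothetical usage profiles. The thesis formulates the minimal-set selection as Binary Integer Programming and uses a greedy heuristic only for prioritization; the data, names, and the purely greedy formulation here are illustrative assumptions.

        # Sketch only: pick a small set of customers whose usage profiles jointly
        # cover all features of interest (greedy stand-in for the BIP formulation).
        customer_features = {                 # hypothetical usage profiles
            "cust_a": {"login", "export", "search"},
            "cust_b": {"search", "report"},
            "cust_c": {"login", "report", "admin"},
            "cust_d": {"export"},
        }
        universe = set().union(*customer_features.values())

        selected, covered = [], set()
        while covered != universe:
            # pick the customer adding the most not-yet-covered features
            best = max(customer_features, key=lambda c: len(customer_features[c] - covered))
            if not customer_features[best] - covered:
                break                          # remaining features cannot be covered
            selected.append(best)
            covered |= customer_features[best]

        print(selected)                        # -> ['cust_a', 'cust_c'] for this toy data
        print(covered == universe)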

    SMS: A Framework for Service Discovery by Incorporating Social Media Information

    With the explosive growth of services, including Web services, cloud services, APIs and mashups, discovering the appropriate services for consumers has become an imperative issue. Traditional service discovery approaches face two main challenges: 1) the single source of description documents limits the effectiveness of discovery due to insufficient semantic information; 2) more factors need to be considered to meet the growing functional and non-functional requirements of consumers. In this paper, we propose a novel framework, called SMS, for effectively discovering appropriate services by incorporating social media information. Specifically, we present different methods to measure four social factors (semantic similarity, popularity, activity, and a decay factor) collected from Twitter. A Latent Semantic Indexing (LSI) model is applied to mine semantic information about services from the metadata of the Twitter Lists that contain them. In addition, we model the target query-service matching function as a linear combination of multiple social factors and design a weight-learning algorithm to learn an optimal combination of the measured social factors. Comprehensive experiments based on a real-world dataset crawled from Twitter demonstrate the effectiveness of the proposed SMS framework in comparison with several baseline approaches.
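
    As an illustration of the linear matching function described above, the sketch below ranks candidate services by a weighted sum of the four social factors. The candidate values and the fixed weights are hypothetical; in SMS the weights are learned by the proposed weight-learning algorithm rather than set by hand.

        # Sketch only: score services by a linear combination of social factors.
        from dataclasses import dataclass

        @dataclass
        class Candidate:
            name: str
            semantic_sim: float   # e.g. LSI similarity between query and List metadata
            popularity: float
            activity: float
            decay: float          # freshness factor

        def score(c: Candidate, w=(0.5, 0.2, 0.2, 0.1)) -> float:
            # hypothetical fixed weights; SMS learns these from training data
            return (w[0] * c.semantic_sim + w[1] * c.popularity
                    + w[2] * c.activity + w[3] * c.decay)

        candidates = [
            Candidate("weather-api", 0.82, 0.40, 0.55, 0.90),
            Candidate("geo-lookup",  0.75, 0.70, 0.30, 0.60),
        ]
        ranking = sorted(candidates, key=score, reverse=True)
        print([c.name for c in ranking])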

    Unsupervised representation learning with Minimax distance measures

    We investigate the use of Minimax distances to extract, in a nonparametric way, the features that capture the unknown underlying patterns and structures in the data. We develop a general-purpose and computationally efficient framework to employ Minimax distances with many machine learning methods that operate on numerical data. We study both computing the pairwise Minimax distances for all pairs of objects and computing the Minimax distances of all objects to/from a fixed (test) object. We first efficiently compute the pairwise Minimax distances between the objects, using the equivalence of Minimax distances over a graph and over a minimum spanning tree constructed on it. Then, we perform an embedding of the pairwise Minimax distances into a new vector space such that their squared Euclidean distances in the new space equal the pairwise Minimax distances in the original space. We also study the case of having multiple pairwise Minimax matrices, instead of a single one; for this setting, we propose an embedding that first sums up the centered matrices and then performs an eigenvalue decomposition to obtain the relevant features. Next, we study computing Minimax distances from a fixed (test) object, which can be used for instance in K-nearest neighbor search. Similar to the case of all-pair Minimax distances, we develop an efficient and general-purpose algorithm that is applicable with any arbitrary base distance measure. Moreover, we investigate in detail the edges selected by the Minimax distances and thereby explore the ability of Minimax distances to detect outlier objects. Finally, for each setting, we perform several experiments to demonstrate the effectiveness of our framework.
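
    A minimal sketch of the MST-based computation described above: the Minimax distance between two objects equals the largest edge weight on the path joining them in a minimum spanning tree built over the base distances. The toy points and the Euclidean base distance are assumptions for illustration.

        # Sketch only: pairwise Minimax distances via a minimum spanning tree.
        import numpy as np
        from scipy.spatial.distance import squareform, pdist
        from scipy.sparse.csgraph import minimum_spanning_tree

        points = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.1, 0.9]])
        base = squareform(pdist(points))               # pairwise Euclidean base distances
        mst = minimum_spanning_tree(base).toarray()
        mst = np.maximum(mst, mst.T)                   # symmetrize the MST adjacency

        n = len(points)
        minimax = np.zeros((n, n))
        for s in range(n):
            # walk the tree from s, carrying along the largest edge seen so far
            stack, seen = [(s, 0.0)], {s}
            while stack:
                node, max_edge = stack.pop()
                minimax[s, node] = max_edge
                for nxt in range(n):
                    if mst[node, nxt] > 0 and nxt not in seen:
                        seen.add(nxt)
                        stack.append((nxt, max(max_edge, mst[node, nxt])))

        print(np.round(minimax, 3))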

    Recommending places based on the wisdom-of-the-crowd

    The collective opinion of a great number of users, popularly known as the wisdom of the crowd, has proven to be a powerful tool for solving problems. As suggested by Surowiecki in his book [134], large groups of people are now considered smarter than an elite few, regardless of how brilliant the latter are at solving problems or making wise decisions. This phenomenon, together with the availability of a huge amount of data on the Web, has fostered the development of solutions that employ the wisdom of the crowd to solve a variety of problems in different domains, such as recommender systems [128], social networks [100] and combinatorial problems [152, 151]. The vast majority of data on the Web has been generated in the last few years by billions of users around the globe using their mobile devices and web applications, mainly on social networks. This information carries astonishing details of daily activities, ranging from urban mobility and tourism behavior to emotions and interests. The largest social network nowadays is Facebook, which in December 2015 had 1.31 billion mobile active users and 4.5 billion "likes" generated daily; in addition, every 60 seconds 510 comments are posted, 293,000 statuses are updated, and 136,000 photos are uploaded. This flood of data has brought great opportunities to discover individual and collective preferences and to use this information to offer services that meet people's needs, such as recommending relevant and interesting items (e.g. news, places, movies). Furthermore, it is now possible to exploit the experiences of groups of people as a collective behavior so as to augment the experience of others. The latter illustrates the important scenario in which the discovery of collective behavioral patterns, the wisdom of the crowd, may enrich the experience of individual users. In this light, this thesis has the objective of taking advantage of the wisdom of the crowd in order to better understand human mobility behavior, with the final purpose of supporting users with intelligent and effective recommendations. We accomplish this objective by following three main lines of investigation, discussed below.

    In the first line of investigation we conduct a study of human mobility using the wisdom of the crowd, culminating in an analytical framework that offers a methodology to understand how the points of interest (PoIs) in a city are related to each other on the basis of the displacement of people. We evaluated our methodology by using the PoI network topology to identify new classes of points of interest based on visiting patterns, spatial displacement from one PoI to another, and the popularity of the PoIs. Important relationships between PoIs are mined by discovering communities (groups) of PoIs that are closely related to each other based on user movements, and different analytical metrics are proposed to better understand this perspective.

    The second line of investigation exploits the wisdom of the crowd collected through user-generated content to recommend itineraries in tourist cities. To this end, we propose an unsupervised framework, called TripBuilder, that leverages large collections of Flickr photos, as the wisdom of the crowd, and points of interest from Wikipedia in order to support tourists in planning their visits to the cities. We extensively evaluated our framework using real data, demonstrating the effectiveness and efficiency of the proposal. Based on this framework, we designed and developed a platform encompassing the main features required to create personalized sightseeing tours. The platform has received significant interest within the research community, since it is recognized as crucial to understand the needs of tourists when they are planning a visit to a new city, and it has consequently led to outstanding scientific results.

    In the third line of investigation, we exploit the wisdom of the crowd to recommend groups of people (e.g. friends) who can enjoy an item (e.g. a restaurant) together. We propose GroupFinder to address the novel user-item group formation problem, aimed at recommending the best group of friends for a given user-item pair. The proposal combines user-item relevance information with the user's social network (ego network), while trying to balance the satisfaction of all the members of the group with the item against the intra-group relationships. Algorithmic solutions are proposed and evaluated in the location-based recommendation domain using four publicly available Location-Based Social Network (LBSN) datasets, showing that our solution is effective and outperforms strong baselines.
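
    As a rough illustration of the first line of investigation, the sketch below builds a directed PoI network from consecutive visits in hypothetical user check-in sequences and groups PoIs into communities by modularity. The data, the edge-weighting scheme, and the greedy modularity algorithm are stand-ins, not the thesis's actual pipeline.

        # Sketch only: PoI network from consecutive check-ins + community detection.
        import networkx as nx
        from networkx.algorithms.community import greedy_modularity_communities

        # each list is one user's chronological sequence of visited PoIs (hypothetical)
        checkin_sequences = [
            ["museum", "cafe", "park", "cafe"],
            ["museum", "park", "stadium"],
            ["stadium", "arena", "stadium"],
        ]

        G = nx.DiGraph()
        for seq in checkin_sequences:
            for src, dst in zip(seq, seq[1:]):          # consecutive displacements
                w = G[src][dst]["weight"] + 1 if G.has_edge(src, dst) else 1
                G.add_edge(src, dst, weight=w)

        communities = greedy_modularity_communities(G.to_undirected(), weight="weight")
        print([sorted(c) for c in communities])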

    Information retrieval models for recommender systems

    Programa Oficial de Doutoramento en Computación. 5009V01

    Information retrieval addresses the information needs of users by delivering relevant pieces of information, but it requires users to convey their information needs explicitly. In contrast, recommender systems offer personalized suggestions of items automatically. Ultimately, both fields help users cope with information overload by providing them with relevant items of information. This thesis aims to explore the connections between information retrieval and recommender systems. Our objective is to devise recommendation models inspired by information retrieval techniques. We begin by borrowing ideas from the information retrieval evaluation literature to analyze evaluation metrics in recommender systems. Second, we study the applicability of pseudo-relevance feedback models to different recommendation tasks: we investigate the conventional top-N recommendation task, but we also explore the recently formulated user-item group formation problem and propose a novel task based on the liquidation of long-tail items. Third, we exploit ad hoc retrieval models to compute neighborhoods in a collaborative filtering scenario. Fourth, we explore the opposite direction by adapting an effective recommendation framework to pseudo-relevance feedback. Finally, we discuss the results and present our conclusions. In summary, this doctoral thesis adapts a series of information retrieval models to recommender systems. Our investigation shows that many retrieval models can be accommodated to deal with different recommendation tasks. Moreover, we find that taking the opposite path is also possible. Exhaustive experimentation confirms that the proposed models are competitive. Finally, we also perform a theoretical analysis of some models to explain their effectiveness.
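
    As a rough illustration of using a retrieval model to compute neighborhoods in collaborative filtering, the sketch below treats each user's interaction history as a document of item identifiers, scores other users with a plain TF-IDF cosine retrieval function, and recommends unseen items from the top neighbor. The toy data and the TF-IDF function are illustrative assumptions, not the specific retrieval models studied in the thesis.

        # Sketch only: neighborhood-based recommendation with a retrieval-style scorer.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        histories = {                      # user -> items interacted with (hypothetical)
            "u1": ["i1", "i2", "i3"],
            "u2": ["i2", "i3", "i4"],
            "u3": ["i7", "i8"],
        }
        users = list(histories)
        docs = [" ".join(items) for items in histories.values()]
        tfidf = TfidfVectorizer().fit_transform(docs)
        sims = cosine_similarity(tfidf)

        target = "u1"
        t = users.index(target)
        neighbours = sorted((u for u in users if u != target),
                            key=lambda u: sims[t, users.index(u)], reverse=True)[:1]

        seen = set(histories[target])
        recs = [i for n in neighbours for i in histories[n] if i not in seen]
        print(neighbours, recs)            # expected: ['u2'] ['i4']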