8 research outputs found

    Building Representative Composite Items

    The problem of summarizing a large collection of homogeneous items has been addressed extensively, particularly for geo-tagged datasets (e.g., Flickr photos and tags). In our work, we study the problem of summarizing large collections of heterogeneous items. For example, a user planning to spend an extended period in a given city would be interested in seeing a map of that city with item summaries in different geographic areas, each containing a theater, a gym, a bakery, a few restaurants and a subway station. We propose to solve that problem by building representative Composite Items (CIs). To the best of our knowledge, this is the first work that addresses the problem of finding representative CIs for heterogeneous items. Our problem naturally arises when summarizing geo-tagged datasets but also in other settings such as movie or music summarization. We formalize building representative CIs as an optimization problem and propose KFC, an extended fuzzy clustering algorithm, to solve it. We show that KFC converges and run extensive experiments on a variety of real datasets that validate its effectiveness.
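
    The abstract describes KFC only as an extension of fuzzy clustering and gives no further detail. For orientation, here is a minimal sketch of plain fuzzy c-means, the standard baseline that such extensions usually start from. It is not KFC: the function name `fuzzy_c_means`, the fuzzifier `m`, and the stopping rule are illustrative assumptions.

```python
import math
import random


def fuzzy_c_means(points, k, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (NOT the KFC algorithm from the abstract).

    points: list of equal-length numeric tuples; k: number of clusters;
    m > 1: fuzzifier (assumed default 2.0).
    Returns (centers, u) where u[i][j] is the membership of point i in cluster j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership^m.
        centers = []
        for j in range(k):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))
        # Membership update: inversely related to distance from each center.
        shift = 0.0
        for i in range(n):
            dists = [math.dist(points[i], c) for c in centers]
            if any(d == 0.0 for d in dists):
                # Point sits exactly on a center: give it full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            row = [v / total for v in row]
            shift = max(shift, max(abs(a - b) for a, b in zip(row, u[i])))
            u[i] = row
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u
```

    Each point receives a degree of membership in every cluster rather than a single hard label, which is the property that makes fuzzy clustering a natural starting point when one item may belong to several summaries.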

    Personalized Travel Itineraries with Multi-access Edge Computing Touristic Services

    5G networks enable new touristic services with challenging communication requirements, such as augmented reality (AR) applications, and allow visitors to enjoy a touristic experience that involves both the physical and the virtual space. Here, we propose a novel multiuser travel itinerary planning framework based on an optimization problem formulation that considers both individual trip itinerary constraints (e.g., tourist's preferences, time or cost) and touristic service constraints (e.g., nearby edge cloud resources and application requirements). The main idea is to maximize the itinerary score of individual visitors while also optimizing the resource allocation at the edge. We consider two services, video streaming and AR, and evaluate our framework using data from Flickr. Results demonstrate gains of up to 100% in resource allocation and user experience in comparison with a state-of-the-art solution adapted to this scenario.

    Trajectory data mining: A review of methods and applications

    The increasing use of location-aware devices has led to an increasing availability of trajectory data. As a result, researchers have devoted their efforts to developing analysis methods, including different data mining methods for trajectories. However, the research in this direction has so far produced mostly isolated studies, and we still lack an integrated view of the problems in applications of trajectory mining that were solved, the methods used to solve them, and the applications using the obtained solutions. In this paper, we first discuss generic methods of trajectory mining and the relationships between them. Then, we discuss and classify application problems that were solved using trajectory data and relate them to the generic mining methods that were used and the real-world applications based on them. We classify trajectory-mining application problems under major problem groups based on how they are related. This classification of problems can guide researchers in identifying new application problems. The relationships between the methods, together with the association between the application problems and mining methods, can help researchers identify gaps between methods and inspire them to develop new methods. This paper can also guide analysts in choosing a suitable method for a specific problem. The main contribution of this paper is to provide an integrated view relating applications of mining trajectory data and the methods used.

    ToPI - uma abordagem online para identificar locais de interesse utilizando fotografias geo-referenciadas

    Due to the growing use of social networks, people no longer just consume data; they also produce and share it. Geo-tagged information, i.e., data with a geographical location, has been used in many attempts to identify popular places and help tourists who will visit unfamiliar cities. This master's thesis presents an online strategy that uses geo-tagged photos and their metadata to identify places of interest inside a given geographical area and retrieve relevant related information. The whole process runs automatically in real time, returning updated information about places. The proposed strategy takes into account the inherent dynamism of social media and is thus robust to inconsistencies and/or outdated information, a common issue in solutions that rely on previously stored data. The analysis of the results showed that our approach is very promising, returning places that present high agreement with those from a popular travel website. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior. Dissertação (Mestrado) [master's dissertation].

    Recommending Structured Objects: Paths and Sets

    Recommender systems have been widely adopted in industry to help people find the most appropriate items to purchase or consume from the increasingly large collection of available resources (e.g., books, songs and movies). Conventional recommendation techniques follow the approach of "ranking all possible options and picking the top", which can work effectively for single-item recommendation but falls short when the item in question has internal structure, for example, a travel trajectory with a sequence of points-of-interest or a music playlist with a set of songs. Such structured objects pose critical challenges to recommender systems due to the intractability of ranking all possible candidates. This thesis studies the problem of recommending structured objects, in particular the recommendation of paths (sequences of unique elements) and sets (collections of distinct elements). We study the problem of recommending travel trajectories in a city, which is a typical instance of path recommendation. We propose methods that combine learning to rank and route planning techniques for efficient trajectory recommendation. Another contribution of this thesis is to develop the structured recommendation approach for path recommendation by substantially modifying the loss function and the learning and inference procedures of structured support vector machines. A novel application of path decoding techniques helps us achieve efficient learning and recommendation. Additionally, we investigate the problem of recommending a set of songs to form a playlist as an example of the set recommendation problem. We propose to jointly learn user representations by employing the multi-task learning paradigm, and a key result of equivalence between bipartite ranking and binary classification enables efficient learning of our set recommendation method. Extensive evaluations on real-world datasets demonstrate the effectiveness of our proposed approaches for path and set recommendation.

    Searching and mining in enriched geo-spatial data

    The emergence of new data collection mechanisms in geo-spatial applications paired with a heightened tendency of users to volunteer information provides an ever-increasing flow of data of high volume, complex nature, and often associated with inherent uncertainty. Such mechanisms include crowdsourcing, automated knowledge inference, tracking, and social media data repositories. Such data bearing additional information from multiple sources like probability distributions, text or numerical attributes, social context, or multimedia content can be called multi-enriched. Searching and mining this abundance of information holds many challenges, if all of the data's potential is to be released. This thesis addresses several major issues arising in that field, namely path queries using multi-enriched data, trend mining in social media data, and handling uncertainty in geo-spatial data. In all cases, the developed methods have made significant contributions and have appeared in or were accepted into various renowned international peer-reviewed venues. A common use of geo-spatial data is path queries in road networks where traditional methods optimise results based on absolute and ofttimes singular metrics, i.e., finding the shortest paths based on distance or the best trade-off between distance and travel time. Integrating additional aspects like qualitative or social data by enriching the data model with knowledge derived from sources as mentioned above allows for queries that can be issued to fit a broader scope of needs or preferences. This thesis presents two implementations of incorporating multi-enriched data into road networks. In one case, a range of qualitative data sources is evaluated to gain knowledge about user preferences which is subsequently matched with locations represented in a road network and integrated into its components. Several methods are presented for highly customisable path queries that incorporate a wide spectrum of data. 
In a second case, a framework is described for resource distribution with reappearance in road networks to serve one or more clients, resulting in paths that provide maximum gain based on a probabilistic evaluation of available resources. Applications for this include finding parking spots. Social media trends are an emerging research area giving insight into user sentiment and important topics. Such trends consist of bursts of messages concerning a certain topic within a time frame, significantly deviating from the average appearance frequency of the same topic. By investigating the dissemination of such trends in space and time, this thesis presents methods to classify trend archetypes in order to predict the future dissemination of a trend. Processing and querying uncertain data is particularly demanding given the additional knowledge required to yield results with probabilistic guarantees. Since such knowledge is not always available and queries are not easily scaled to larger datasets due to the #P-complete nature of the problem, many existing approaches reduce the data to a deterministic representation of its underlying model to eliminate uncertainty. However, data uncertainty can also provide valuable insight into the nature of the data that cannot be represented in a deterministic manner. This thesis presents techniques for clustering uncertain data as well as for query processing that take the additional information from uncertainty models into account while preserving scalability through a sampling-based approach, whereas previous approaches could provide only one of the two.
The given solutions enable the application of various existing clustering techniques or query types to a framework that manages the uncertainty.
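
    The abstract above characterizes a social media trend as a burst of messages on a topic that deviates significantly from the topic's average frequency. The thesis's own detection method is not given here, so the following is only a rough sketch of one common way to operationalize that definition: flag time windows whose message count lies more than `z` standard deviations above the mean. The function name `find_bursts` and the threshold `z` are assumptions made for illustration.

```python
from statistics import mean, pstdev


def find_bursts(counts, z=2.0):
    """Return indices of time windows whose count is a burst.

    counts: messages about one topic per time window (e.g., per hour).
    z: assumed threshold, in standard deviations above the mean.
    """
    mu = mean(counts)
    sigma = pstdev(counts)  # population standard deviation
    if sigma == 0:
        return []  # perfectly flat series: nothing deviates
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > z]


# Hypothetical hourly counts for one topic; window 6 is the spike.
hourly = [3, 4, 2, 3, 5, 4, 40, 3, 2, 4]
print(find_bursts(hourly))  # → [6]
```

    A global mean is the simplest possible baseline. A sliding-window baseline would likely be more suitable for topics whose normal volume drifts over time.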