8 research outputs found

    Social, Structured and Semantic Search

    International audience. Social content such as blogs, tweets, news etc. is a rich source of interconnected information. We identify a set of requirements for the meaningful exploitation of such rich content, and present a new data model, called S3, which is the first to satisfy them. S3 captures social relationships between users, and between users and content, but also the structure present in rich social content, as well as its semantics. We provide the first top-k keyword search algorithm taking into account the social, structured, and semantic dimensions and formally establish its termination and correctness. Experiments on real social networks demonstrate the efficiency and qualitative advantage of our algorithm through the joint exploitation of the social, structured, and semantic dimensions of S3.

    Mixed-instance querying: a lightweight integration architecture for data journalism

    International audience. As the world's affairs become increasingly digital, the timely production and consumption of news requires exploiting heterogeneous data sources quickly and efficiently. Discussions with journalists revealed that the content management tools currently at their disposal fall far short of expectations. We demonstrate TATOOINE, a lightweight data integration prototype that allows users to quickly set up integration queries across (very) heterogeneous data sources, capitalizing on the many data links (joins) available in this application domain. Our demonstration is based on scenarios we study in collaboration with Le Monde, France's major newspaper.

    Top-k search over rich web content

    No full text
    Social networks are increasingly present in our everyday life and are fast becoming our primary means of information and communication. As they contain more and more data about our surroundings and ourselves, it becomes vital to access and analyze this data. Currently, the primary means to query this data is through top-k keyword search: you enter a few words and the social network service sends you back a fixed number of relevant documents. In current top-k searches in a social context, the relevance of a document is evaluated based on two factors: the overlap of the query keywords with the words of the document and the social proximity between the document and the user making the query. We argue that this is limited and propose to take into account the complex interactions between the users linked to the document, its structure, and the meaning of the words it contains instead of their phrasing. To this end we highlight the requirements for a model fully integrating structured, semantic and social data and propose a new model, called S3, satisfying these requirements. We introduce querying capabilities to S3 and develop an algorithm, S3k, for customizable top-k keyword search on S3. We prove the correctness of our algorithm and propose an implementation for it. We compare this implementation with another top-k keyword search in a social context, using datasets created from real-world data, and show their differences and the benefits of our approach.

    Recherche top-k pour le contenu du Web

    No full text
    Social networks are increasingly present in our everyday life and are fast becoming our primary means of information and communication. As they contain more and more data about our surroundings and ourselves, it becomes vital to access and analyze this data. Currently, the primary means to query this data is through top-k keyword search: you enter a few words and the social network service sends you back a fixed number of relevant documents. In current top-k searches in a social context, the relevance of a document is evaluated based on two factors: the overlap of the query keywords with the words of the document and the social proximity between the document and the user making the query. We argue that this is limited and propose to take into account the complex interactions between the users linked to the document, its structure, and the meaning of the words it contains instead of their phrasing. To this end we highlight the requirements for a model fully integrating structured, semantic and social data and propose a new model, called S3, satisfying these requirements. We introduce querying capabilities to S3 and develop an algorithm, S3k, for customizable top-k keyword search on S3. We prove the correctness of our algorithm and propose an implementation for it. We compare this implementation with another top-k keyword search in a social context, using datasets created from real-world data, and show their differences and the benefits of our approach.

    Recherche sur du contenu structuré, social et sémantique

    Social content such as blogs, tweets, news etc. is a rich source of interconnected information. We identify a set of requirements for the meaningful exploitation of such rich content, and present a new data model, called S3, which is the first to satisfy them. S3 captures social relationships between users, and between users and content, but also the structure present in rich social content, as well as its semantics. We provide the first top-k keyword search algorithm taking into account the social, structured, and semantic dimensions and formally establish its termination and correctness. Experiments on real social networks demonstrate the efficiency and qualitative advantage of our algorithm through the joint exploitation of the social, structured, and semantic dimensions of S3.

    Toward Social, Structured and Semantic Search

    International audience. Social content such as social network posts, tweets, news articles and, more generally, web page fragments is often structured. Such social content is also frequently enriched with annotations, most of which carry semantics, produced either by collaborative effort or by automatic tools. Searching for relevant information in this context is both a basic feature for users and a challenging task. We present a data model and a preliminary approach for answering queries over such structured, social and semantic-rich content, taking into account all dimensions of the data in order to return the most meaningful results.

    Recherche Sociale, Structurée et Sémantique

    National audience. Social content such as blogs, tweets, news etc. is a rich source of interconnected information. We identify a set of requirements for the meaningful exploitation of such rich content, and present a new data model, called S3, which is the first to satisfy them. S3 captures social relationships between users, and between users and content, but also the structure present in rich social content, as well as its semantics. We show how S3 instances are derived from content and relationships present in today's social media, and provide the first top-k keyword search algorithm taking into account the social, structured, and semantic dimensions and formally establish its termination and correctness. Experiments on real social networks demonstrate the efficiency and qualitative advantage of our algorithm through the joint exploitation of the social, structured, and semantic dimensions of S3.