138 research outputs found

    Semantic disambiguation and contextualisation of social tags

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-28509-7_18. This manuscript is an extended version of the paper ‘cTag: Semantic Contextualisation of Social Tags’, presented at the 6th International Workshop on Semantic Adaptive Social Web (SASWeb 2011). We present an algorithmic framework to accurately and efficiently identify the semantic meanings and contexts of social tags within a particular folksonomy. The framework is used for building contextualised tag-based user and item profiles. We also present its implementation in a system called cTag, with which we conduct a preliminary analysis of the semantic meanings and contexts of tags belonging to the Delicious and MovieLens folksonomies. The analysis includes a comparison between the semantic similarities obtained for pairs of tags in the Delicious folksonomy and their semantic distances in the whole Web, according to co-occurrence-based metrics computed with the results of a Web search engine. This work was supported by the Spanish Ministry of Science and Innovation (TIN2008-06566-C04-02), and Universidad Autónoma de Madrid (CCG10-UAM/TIC-5877)
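To make the idea of a co-occurrence-based Web distance concrete, the following minimal sketch computes the Normalized Google Distance from search-engine hit counts. The abstract does not say which metric the authors used, so the choice of this metric, the function name and all hit counts are assumptions made purely for illustration.

```python
import math


def normalized_web_distance(hits_x, hits_y, hits_xy, total_pages):
    """Normalized Google Distance between two terms.

    hits_x, hits_y -- number of pages containing each term (hypothetical counts)
    hits_xy        -- number of pages containing both terms
    total_pages    -- assumed size of the search engine's index
    """
    # Terms that never occur, or never co-occur, are treated as maximally distant.
    if min(hits_x, hits_y, hits_xy) <= 0:
        return math.inf
    log_x, log_y, log_xy = math.log(hits_x), math.log(hits_y), math.log(hits_xy)
    numerator = max(log_x, log_y) - log_xy
    denominator = math.log(total_pages) - min(log_x, log_y)
    return numerator / denominator


# Hypothetical counts for two tags; real values would come from a search API.
distance = normalized_web_distance(1000, 100, 10, 1_000_000)
```

A value near 0 means the two tags almost always appear together on the Web, while larger values indicate weaker association.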

    Semantic contextualisation of social tag-based profiles and item recommendations

    Proceedings of the 12th International Conference, EC-Web 2011, Toulouse, France, August 30 - September 1, 2011. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-23014-1_9. We present an approach that efficiently identifies the semantic meanings and contexts of social tags within a particular folksonomy, and exploits them to build contextualised tag-based user and item profiles. We apply our approach to a dataset obtained from the Delicious social bookmarking system, and evaluate it through two experiments: a user study consisting of manual judgements of tag disambiguation and contextualisation cases, and an offline study measuring the performance of several tag-powered item recommendation algorithms using contextualised profiles. The results show that our approach accurately determines the actual semantic meanings and contexts of tag annotations, and allows item recommenders to achieve better precision and recall in their predictions. This work was supported by the Spanish Ministry of Science and Innovation (TIN2008-06566-C04-02), and the Community of Madrid (CCG10-UAM/TIC-5877)
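For readers unfamiliar with the offline metrics mentioned in this abstract, the sketch below shows one common way to compute precision and recall for a top-k recommendation list. The function name, the cut-off k and the sample items are hypothetical; the abstract does not describe the exact evaluation protocol the authors followed.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the first k recommended items.

    recommended -- ranked list of item identifiers produced by a recommender
    relevant    -- set of items the user actually bookmarked (ground truth)
    k           -- cut-off; precision is divided by k even if the list is shorter
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data only: four ranked recommendations, three relevant items.
precision, recall = precision_recall_at_k(["a", "b", "c", "d"], {"a", "c", "e"}, k=2)
```

In a comparison such as the one described, the same function would be applied to recommendation lists built from plain and from contextualised profiles.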

    cTag: Semantic Contextualisation of Social Tags

    Also published online by CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073), Proceedings of the Workshop on Semantic Adaptive Social Web 2011. In this paper, we present an algorithmic framework to identify the semantic meanings and contexts of social tags within a particular folksonomy, and exploit them for building contextualised tag-based user and item profiles. We also present its implementation in a system called cTag, with which we conduct a preliminary analysis of the semantic meanings and contexts of tags belonging to the Delicious and MovieLens folksonomies. This work was supported by the Spanish Ministry of Science and Innovation (TIN2008-06566-C04-02), and the Regional Government of Madrid (S2009TIC-1542).

    Using Data Mining for Facilitating User Contributions in the Social Semantic Web

    This thesis utilizes recommender systems to aid the user in contributing to the Social Semantic Web. In this work, we propose a framework that maps domain properties to recommendation technologies. Next, we develop novel recommendation algorithms for improving personalized tag recommendation and for recommendation of semantic relations. Finally, we introduce a framework to analyze different types of potential attacks against social tagging systems and evaluate their impact on those systems.

    An integrated approach to discover tag semantics

    Tag-based systems have become very common for online classification thanks to their intrinsic advantages such as self-organization and rapid evolution. However, they are still affected by some issues that limit their utility, mainly due to the inherent ambiguity in the semantics of tags. Synonyms, homonyms, and polysemous words, while not harmful for the casual user, strongly affect the quality of search results and the performance of tag-based recommendation systems. In this paper we rely on the concept of tag relatedness in order to study small groups of similar tags and detect relationships between them. This approach is grounded on a model that builds upon an edge-colored multigraph of users, tags, and resources. To put these ideas into practice, we present a modular and extensible analysis framework for discovering synonyms, homonyms and hierarchical relationships amongst sets of tags. Some initial results of its application to the delicious database are presented, showing that such an approach could be useful for solving some of the well-known problems of folksonomies.

    Complex adaptive systems based data integration : theory and applications

    Data Definition Languages (DDLs) have been created and used to represent data in programming languages and in database dictionaries. This representation includes descriptions in the form of data fields and relations in the form of a hierarchy, with the common exception of relational databases where relations are flat. Network computing created an environment that enables relatively easy and inexpensive exchange of data. What followed was the creation of new DDLs claiming better support for automatic data integration. It is uncertain from the literature if any real progress has been made toward achieving an ideal state or limit condition of automatic data integration. This research asserts that difficulties in accomplishing integration are indicative of socio-cultural systems in general and are caused by some measurable attributes common in DDLs. This research’s main contributions are: (1) a theory of data integration requirements to fully support automatic data integration from autonomous heterogeneous data sources; (2) the identification of measurable related abstract attributes (Variety, Tension, and Entropy); (3) the development of tools to measure them. The research uses a multi-theoretic lens to define and articulate these attributes and their measurements. The proposed theory is founded on the Law of Requisite Variety, Information Theory, Complex Adaptive Systems (CAS) theory, Sowa’s Meaning Preservation framework and Zipf distributions of words and meanings. Using the theory, the attributes, and their measures, this research proposes a framework for objectively evaluating the suitability of any data definition language with respect to degrees of automatic data integration. This research uses thirteen data structures constructed with various DDLs from the 1960s to date. No DDL examined (and therefore no DDL similar to those examined) is designed to satisfy the law of requisite variety. 
No DDL examined is designed to support CAS evolutionary processes that could result in fully automated integration of heterogeneous data sources. There is no significant difference in measures of Variety, Tension, and Entropy among DDLs investigated in this research. A direction to overcome the common limitations discovered in this research is suggested and tested by proposing GlossoMote, a theoretical, mathematically sound description language that satisfies the data integration theory requirements. The DDL, named GlossoMote, is not merely a new syntax; it is a drastic departure from existing DDL constructs. The feasibility of the approach is demonstrated with a small-scale experiment and evaluated using the proposed assessment framework and other means. The promising results require additional research to evaluate the commercial potential of GlossoMote’s approach.

    Tag relatedness in image folksonomies

    Folksonomies, networks of users, resources, and tags, allow users to easily retrieve, organize and browse web content. However, their advantages are still limited, mainly due to the noisiness of user-provided tags. To overcome this issue, we propose an approach for characterizing related tags in folksonomies: we use tag co-occurrence statistics and Laplacian score based feature selection to create an empirical co-occurrence probability distribution for each tag; then we identify related tags on the basis of the dissimilarity between their distributions. For this purpose, we introduce a variant of the Jensen-Shannon Divergence, which is more robust to statistical noise. We experimentally evaluate our approach using WordNet and compare it to a common tag-relatedness approach based on cosine similarity. The results show the effectiveness of our approach and its advantage over the competing method.
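As a rough illustration of comparing tags through their co-occurrence distributions, the sketch below normalises co-occurrence counts into probabilities and computes the standard Jensen-Shannon Divergence between two tags. It does not reproduce the noise-robust variant or the Laplacian score feature selection proposed in the paper; the function names and the sample counts are assumptions.

```python
import math


def to_distribution(cooccurrence_counts):
    """Turn raw co-occurrence counts {other_tag: count} into probabilities."""
    total = sum(cooccurrence_counts.values())
    return {tag: count / total for tag, count in cooccurrence_counts.items()}


def jensen_shannon_divergence(p, q):
    """Standard JSD (base 2) between two sparse distributions; range [0, 1]."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the union of both supports.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        # Kullback-Leibler divergence KL(a || m); zero-probability terms add 0.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical co-occurrence counts for two image tags.
beach = to_distribution({"sea": 6, "sand": 3, "sun": 1})
coast = to_distribution({"sea": 5, "sand": 4, "cliff": 1})
divergence = jensen_shannon_divergence(beach, coast)
```

A divergence close to 0 suggests the two tags are used in similar contexts, while a value of 1 means their co-occurrence profiles do not overlap at all.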

    Search-based automatic image annotation using geocoded community photos (Suchbasierte automatische Bildannotation anhand geokodierter Community-Fotos)

    In the Web 2.0 era, platforms for sharing and collaboratively annotating images with keywords, called tags, became very popular. Tags are a powerful means for organizing and retrieving photos. However, manual tagging is time-consuming. Recently, the sheer amount of user-tagged photos available on the Web encouraged researchers to explore new techniques for automatic image annotation. The idea is to annotate an unlabeled image by propagating the labels of community photos that are visually similar to it. Most recently, an ever-increasing amount of community photos is also associated with location information, i.e., geotagged. In this thesis, we aim at exploiting the location context and propose an approach for automatically annotating geotagged photos. Our objective is to address the main limitations of state-of-the-art approaches in terms of the quality of the produced tags and the speed of the complete annotation process. To achieve these goals, we first deal with the problem of collecting images with the associated metadata from online repositories. Accordingly, we introduce a strategy for data crawling that takes advantage of location information and the social relationships among the contributors of the photos. To improve the quality of the collected user tags, we present a method for resolving their ambiguity based on tag relatedness information. In this respect, we propose an approach for representing tags as probability distributions based on the algorithm of Laplacian score feature selection. Furthermore, we propose a new metric for calculating the distance between tag probability distributions by extending Jensen-Shannon Divergence to account for statistical fluctuations. To efficiently identify the visual neighbors, the thesis introduces two extensions to the state-of-the-art image matching algorithm, known as Speeded Up Robust Features (SURF). 
To speed up the matching, we present a solution for reducing the number of compared SURF descriptors based on classification techniques, while the accuracy of SURF is improved through an efficient method for iterative image matching. Furthermore, we propose a statistical model for ranking the mined annotations according to their relevance to the target image. This is achieved by combining multi-modal information in a statistical framework based on Bayes' rule. Finally, the effectiveness of each of the mentioned contributions, as well as that of the complete automatic annotation process, is evaluated experimentally.

    Study about the different use of explicit and implicit tags in social bookmarking

    This is the accepted version of the following article: Arolas, E. E., & Ladrón-de-Guevar, F. G. (2012). Uses of explicit and implicit tags in social bookmarking. Journal of the American Society for Information Science and Technology, 63(2), 313-322. doi:10.1002/asi.21663, which has been published in final form at http://dx.doi.org/10.1002/asi.21663. Although Web 2.0 contains many tools with different functionalities, they all share a common social nature. One tool in particular, social bookmarking systems (SBSs), allows users to store and share links to different types of resources, e.g., websites, videos, images. To identify and classify these resources so that they can be retrieved and shared, fragments of text are used. These fragments of text, usually words, are called tags. A tag that is found inside the text of a resource is referred to as an obvious or explicit tag. There are also nonobvious or implicit tags, which do not appear in the resource text. The purpose of this article is to describe the present situation of SBSs and then to determine the principal features of explicit tags and how they are used. Special consideration is given to which HTML tags most frequently contain explicit tags. Estelles Arolas, E.; González Ladrón De Guevara, FR. (2012). Study about the different use of explicit and implicit tags in social bookmarking. Journal of the American Society for Information Science and Technology. 63(2):313-322. doi:10.1002/asi.21663