Search CORE

2,572 research outputs found

Exploiting Social Media Sources for Search, Fusion and Evaluation

Author: Lee Chia-Jung
Publication venue: ScholarWorks@UMass Amherst
Publication date: 09/11/2015

The web contains heterogeneous information that is generated with different characteristics and is presented via different media. Social media, as one of the largest content carriers, has generated information from millions of users worldwide, creating material rapidly in many forms, such as comments, images, tags, videos, and ratings. In social applications, the formation of online communities contributes to conversations of substantially broader aspects, as well as unfiltered opinions about subjects that are rarely covered in public media. Information accrued on social platforms, therefore, presents a unique opportunity to augment web sources such as Wikipedia or news pages, which are usually characterized as being more formal. The goal of this dissertation is to investigate in depth how social data can be exploited and applied in the context of three fundamental information retrieval (IR) tasks: search, fusion, and evaluation. Improving search performance has consistently been a major focus in the IR community. Given the in-depth discussions and active interactions contained in social media, we present approaches to incorporating this type of data to improve search on general web corpora. In particular, we propose two graph-based frameworks, social anchor and information network, to associate related web and social content, where information sources of diverse characteristics can be used to complement each other in a unified manner. We investigate how the enriched representation can potentially reduce vocabulary mismatch and improve retrieval effectiveness. Presenting social media content to users is particularly valuable for queries about time-sensitive events or community opinions. Current major search engines commonly blend results from different search services (or verticals) into core web results. Motivated by this real-world need, we explore ways to merge results from different web and social services into a single ranked list.
We present an optimization framework for fusion, where the impact of documents, ranked lists, and verticals can be modeled simultaneously to maximize performance. Evaluating search system performance has largely relied on creating reusable test collections in IR. Traditional ways of creating evaluation sets can require substantial manual effort. To reduce such effort, we explore an approach to automating the process of collecting pairs of queries and relevance judgments, using a high-quality social medium, Community Question Answering (CQA). Our approach is based on the idea that CQA services provide platforms for users to raise questions and to share answers, thereby encoding the associations between real user information needs and real user assessments. To demonstrate the effectiveness of our approaches, we conduct extensive retrieval and fusion experiments and verify the reliability of the new CQA-based evaluation test sets.
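
To make the idea of merging results from several verticals into a single ranked list concrete, here is a minimal sketch. It does not reproduce the dissertation's optimization framework; it swaps in a plain weighted CombSUM fusion with min-max score normalization. The function names, the vertical names ("web", "social"), the document IDs, and the hand-set weights are all assumptions made for illustration.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the [0, 1] range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_combsum(ranked_lists, weights):
    """Fuse per-vertical score lists into one ranking (weighted CombSUM).

    ranked_lists: {vertical_name: {doc_id: raw_score}}
    weights: {vertical_name: weight}; hypothetical hand-set values,
             not learned as an optimization framework would do.
    """
    fused = defaultdict(float)
    for vertical, scores in ranked_lists.items():
        w = weights.get(vertical, 1.0)
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += w * s
    # Sort by fused score (descending), breaking ties by document ID.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Toy input: raw scores from two verticals on incomparable scales.
results = {
    "web": {"d1": 12.0, "d2": 9.0, "d3": 3.0},
    "social": {"d2": 0.9, "d4": 0.5, "d1": 0.1},
}
fused = weighted_combsum(results, {"web": 1.0, "social": 0.5})
fused_order = [doc for doc, _ in fused]
```

In this toy input, a document retrieved by both verticals can overtake the top web result once its normalized scores are summed, which is the basic effect any fusion method has to control through its weights.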

ScholarWorks@UMass Amherst

Dynamic Collective Entity Representations for Entity Ranking

Author: Balog K.
Demartini G.
Lee C.-J.
Mohan A.
Westerveld T.
Zaragoza H.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Entity ranking, i.e., successfully positioning a relevant entity at the top of the ranking for a given query, is inherently difficult due to the potential mismatch between the entity's description in a knowledge base and the way people refer to the entity when searching for it. To counter this issue, we propose a method for constructing dynamic collective entity representations. We collect entity descriptions from a variety of sources and combine them into a single entity representation by learning to weight the content from the different sources associated with an entity for optimal retrieval effectiveness. Our method is able to add new descriptions in real time and learn the best representation as time evolves, so as to capture the dynamics of how people search for entities. Incorporating dynamic description sources into dynamic collective entity representations improves retrieval effectiveness by 7% over a state-of-the-art learning-to-rank baseline. Periodic retraining of the ranker enables higher ranking effectiveness for dynamic collective entity representations.
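
The combination of several description sources into one weighted entity representation can be illustrated with a small sketch. This is not the authors' method: the weights below are hand-set rather than learned, the scoring is a simple term-match count, and the source names ("knowledge_base", "queries", "tweets"), the example entity, and both function names are hypothetical.

```python
def score_entity(query, fields, weights):
    """Weighted count of query-term matches across description sources.

    fields: {source_name: description_text}
    weights: {source_name: weight}; hand-set here, whereas the abstract
             describes learning them for retrieval effectiveness.
    """
    terms = query.lower().split()
    score = 0.0
    for source, text in fields.items():
        tokens = text.lower().split()
        matches = sum(tokens.count(t) for t in terms)
        score += weights.get(source, 0.0) * matches
    return score


def add_description(fields, source, text):
    """Append a newly observed description to one source, in place."""
    fields[source] = (fields.get(source, "") + " " + text).strip()


# The knowledge-base text never uses the colloquial name, so only the
# user-generated sources match the query "big apple".
entity = {
    "knowledge_base": "new york city is the most populous city in the united states",
    "queries": "big apple nyc",
    "tweets": "visiting the big apple",
}
weights = {"knowledge_base": 1.0, "queries": 0.5, "tweets": 0.25}

before = score_entity("big apple", entity, weights)
add_description(entity, "tweets", "big apple marathon")
after = score_entity("big apple", entity, weights)
```

The score rises after the new tweet is added, which mirrors the vocabulary-mismatch argument: descriptions written by searchers themselves can match queries that the formal knowledge-base text does not.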

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Entities of interest: Discovery in digital traces

Author: Graus D.P.
Publication date: 01/01/2017

International Migration, Integration and Social Cohesion online publications

HyTEXPROS: a hypermedia information retrieval system

Author: Shen Hong
Publication venue: Digital Commons @ NJIT
Publication date: 31/01/2000

A hypermedia information retrieval system combines the specific capabilities of hypermedia systems with information retrieval operations and provides a new kind of information management tool. It combines both hypermedia and information retrieval to offer end-users the possibility of navigating, browsing and searching a large collection of documents to satisfy an information need. TEXPROS is an intelligent document processing and retrieval system that supports storing, extracting, classifying, categorizing, retrieving and browsing enterprise information. TEXPROS is an ideal candidate for applying hypermedia information retrieval techniques. In this dissertation, we extend TEXPROS to a hypermedia information retrieval system called HyTEXPROS by adding hypertext functionalities, such as nodes, typed and weighted links, anchors, guided tours, network overview, bookmarks, annotations and comments, and an external linkbase. HyTEXPROS describes the whole information base, including the metadata and the original documents, as network nodes connected by links. Through hypertext functionalities, a user can dynamically construct an information path by browsing through pieces of the information base. HyTEXPROS changes the working domain from a personal document processing domain to a personal library domain, accompanied by citation techniques to process original documents. A four-level conceptual architecture is presented as the system architecture of HyTEXPROS. This architecture is also referred to as the reference model of HyTEXPROS. A detailed description of HyTEXPROS, using the First Order Logic Calculus, is also proposed. An early version of a prototype is briefly described.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Transition in Monitoring and Network Offloading - Handling Dynamic Mobile Applications and Environments

Author: Richerzhagen Nils
Publication date: 01/01/2019

Communication demands have increased significantly in recent years, as evidenced in studies by Cisco and Ericsson. Users demand connectivity anytime and anywhere, while new application domains such as the Internet of Things and vehicular networking amplify the heterogeneity and dynamics of the resource-constrained environment of mobile networks. These developments pose major challenges to the efficient utilization of existing communication infrastructure. To reduce the burden on the communication infrastructure, mechanisms for network offloading can be utilized. However, to deal with the dynamics of new application scenarios, these mechanisms need to be highly adaptive. Gathering information about the current status of the network is a fundamental requirement for meaningful adaptation. This requires network monitoring mechanisms that are able to operate under the same highly dynamic environmental conditions and changing requirements. In this thesis, we design and realize a concept for transitions within network offloading to handle these challenges, which constitutes our first contribution. We enable adaptive offloading by introducing a methodology for the identification and encapsulation of gateway selection and clustering mechanisms in the transition-enabled service AssignMe.KOM. To handle the dynamics of environmental conditions, we allow for centralized and decentralized offloading. We generalize our concept of transitions within offloading and show its significant impact in various heterogeneous application domains such as vehicular networking or publish/subscribe. We extend the methodology of identification and encapsulation to the domain of network monitoring in our second contribution. Our concept of a transition-enabled monitoring service AdaptMon.KOM enables adaptive network state observation by executing transitions between monitoring mechanisms. We introduce extensive transition coordination concepts for reconfiguration in both of our contributions.
To prevent data loss during complex transition plans that cover multiple coexisting transition-enabled mechanisms, we develop the methodology of inter-proxy state transfer. We target the coexistence of our contributions for the use case of collaborative location retrieval, using location-based services as an example. Based on our prototypes of AssignMe.KOM and AdaptMon.KOM, we conduct an extensive evaluation of our contributions in the Simonstrator.KOM platform. We show that our proposed inter-proxy state transfer prevents information loss, enabling seamless execution of complex transition plans that cover multiple coexisting transition-enabled mechanisms. Additionally, we demonstrate the influence of transition coordination and spreading on the success of the network adaptation. We establish a cost-efficient and reliable methodology for location retrieval by combining our transition-enabled contributions. We show that our contributions allow for adaptation to dynamic environmental conditions and requirements in network offloading and monitoring.

TUbiblio

tuprints

Augmenting applications with hyper media, functionality and meta-information

Author: Galnares Roberto
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2001

The Dynamic Hypermedia Engine (DHE) enhances analytical applications by adding relationships, semantics and other metadata to the application's output and user interface. DHE also provides additional hypermedia navigational, structural and annotation functionality. These features allow application developers and users to add guided tours, personal links and sharable annotations, among other features, into applications. DHE runs as middleware between the application user interface and its business logic and processes, in an n-tier architecture, supporting the extra functionalities without altering the original systems by means of application wrappers. DHE automatically generates links at run-time for each of those elements having relationships and metadata. Such elements are previously identified using a Relation Navigation Analysis. DHE also constructs more sophisticated navigation techniques, not often found on the Web, on top of these links. The metadata, links, navigation and annotation features supplement the application's primary functionality. This research identifies element types, or classes, in the application displays. A mapping rule encodes each relationship found between two elements of interest at the class level. When the user selects a particular element, DHE instantiates the commands included in the rules with the actual instance selected and sends them to the appropriate destination system, which then dynamically generates the resulting virtual (i.e. not previously stored) page. DHE executes concurrently with these applications, providing automated link generation and other hypermedia functionality. DHE uses the Extensible Markup Language (XML) and related World Wide Web Consortium (W3C) sets of XML recommendations, like XLink, XML Schema, and RDF, to encode the semantic information required for the operation of the extra hypermedia features, and for the transmission of messages between the engine modules and applications.
DHE is the only approach we know of that provides automated linking and metadata services in a generic manner, based on the application semantics, without altering the applications. DHE will also work with non-Web systems. The results of this work could also be extended to other research areas, such as link ranking and filtering, automatic link generation as the result of a search query, metadata collection and support, virtual document management, hypermedia functionality on the Web, adaptive and collaborative hypermedia, web engineering, and the semantic Web.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Tweet-biased summarization

Author: Huspi S
Sanderson M
Yulianti E
Publication venue: John Wiley and Sons Inc. (United States)
Publication date: 01/01/2016

We examined whether the microblog comments given by people after reading a web document could be exploited to improve the accuracy of a web document summarization system. We examined the effect of social information (i.e., tweets) on the accuracy of the generated summaries by comparing the user preference for TBS (tweet-biased summary) with that for GS (generic summary). The result of a crowdsourcing-based evaluation shows that the user preference for TBS was significantly higher than for GS. We also took random samples of the documents to assess the performance of the summaries in a traditional evaluation using ROUGE, in which TBS was, in general, also shown to be better than GS. We further analyzed the influence of the number of tweets pointing to a web document on summarization accuracy, finding a moderate positive correlation between the number of tweets pointing to a web document and the performance of the generated TBS as measured by user preference. The results show that incorporating social information into the summary generation process can improve the accuracy of the summary. The reasons people gave for choosing one summary over another in the crowdsourcing-based evaluation are also presented in this article.

RMIT Research Repository

Universiti Teknologi Malaysia Institutional Repository

Applying Wikipedia to Interactive Information Retrieval

Author: Milne David N.
Publication venue: 'University of Waikato'
Publication date: 15/09/2010

There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases—e.g. controlled vocabularies, classification schemes, thesauri and ontologies—to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday webscale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. 
Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval.

Research Commons@Waikato

Metagraph-based learning on heterogeneous graphs

Author: CHANG Kevin
FANG Yuan
LI Xiao-Li
LIN Wenqing
SHI Jiaqi
WU Min
ZHENG Vincent W.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2019

Institutional Knowledge at Singapore Management University