628 research outputs found

    Concept-based Interactive Query Expansion Support Tool (CIQUEST)

    Get PDF
    This report describes a three-year project (2000-03) undertaken in the Information Studies Department at The University of Sheffield and funded by Resource, The Council for Museums, Archives and Libraries. The overall aim of the research was to provide user support for query formulation and reformulation in searching large-scale textual resources including those of the World Wide Web. More specifically the objectives were: to investigate and evaluate methods for the automatic generation and organisation of concepts derived from retrieved document sets, based on statistical methods for term weighting; and to conduct user-based evaluations on the understanding, presentation and retrieval effectiveness of concept structures in selecting candidate terms for interactive query expansion. The TREC test collection formed the basis for the seven evaluative experiments conducted in the course of the project. These formed four distinct phases in the project plan. In the first phase, a series of experiments was conducted to investigate further techniques for concept derivation and hierarchical organisation and structure. The second phase was concerned with user-based validation of the concept structures. Results of phases 1 and 2 informed on the design of the test system and the user interface was developed in phase 3. The final phase entailed a user-based summative evaluation of the CiQuest system. The main findings demonstrate that concept hierarchies can effectively be generated from sets of retrieved documents and displayed to searchers in a meaningful way. The approach provides the searcher with an overview of the contents of the retrieved documents, which in turn facilitates the viewing of documents and selection of the most relevant ones. Concept hierarchies are a good source of terms for query expansion and can improve precision. The extraction of descriptive phrases as an alternative source of terms was also effective. With respect to presentation, cascading menus were easy to browse for selecting terms and for viewing documents. In conclusion the project dissemination programme and future work are outlined

    Adaptive Representations for Tracking Breaking News on Twitter

    Full text link
    Twitter is often the most up-to-date source for finding and tracking breaking news stories. Therefore, there is considerable interest in developing filters for tweet streams in order to track and summarize stories. This is a non-trivial text analytics task as tweets are short, and standard retrieval methods often fail as stories evolve over time. In this paper we examine the effectiveness of adaptive mechanisms for tracking and summarizing breaking news stories. We evaluate the effectiveness of these mechanisms on a number of recent news events for which manually curated timelines are available. Assessments based on ROUGE metrics indicate that an adaptive approaches are best suited for tracking evolving stories on Twitter.Comment: 8 Pag

    Continuous Space Models for CLIR

    Full text link
    [EN] We present and evaluate a novel technique for learning cross-lingual continuous space models to aid cross-language information retrieval (CLIR). Our model, which is referred to as external-data composition neural network (XCNN), is based on a composition function that is implemented on top of a deep neural network that provides a distributed learning framework. Different from most existing models, which rely only on available parallel data for training, our learning framework provides a natural way to exploit monolingual data and its associated relevance metadata for learning continuous space representations of language. Cross-language extensions of the obtained models can then be trained by using a small set of parallel data. This property is very helpful for resource-poor languages, therefore, we carry out experiments on the English-Hindi language pair. On the conducted comparative evaluation, the proposed model is shown to outperform state-of-the-art continuous space models with statistically significant margin on two different tasks: parallel sentence retrieval and ad-hoc retrieval.We thank German Sanchis Trilles for helping in conducting experiments with machine translation. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GeForce Titan GPU used for this research. The research of the first author was supported by FPI grant of UPV. The research of the third author is supported by the SomEMBED TIN2015-71147-C2-1-P MINECO research project and by the Generalitat Valenciana under the grant ALMAMATER (PrometeolI/2014/030).Gupta, P.; Banchs, R.; Rosso, P. (2017). Continuous Space Models for CLIR. Information Processing & Management. 53(2):359-370. https://doi.org/10.1016/j.ipm.2016.11.002S35937053

    Sustainability intervention mechanisms for managing road freight transport externalities: A systematic literature review

    Get PDF
    With road freight transport continuing to dominate global freight transport operations, there is increasing pressure on the freight transport industry and its stakeholders to address concerns over its sustainability. This paper adopts a systematic review to examine the academic literature on road freight transport sustainability between 2001 and 2018. Using content and thematic analysis, the paper identifies and categorises sustainability intervention mechanisms providing useful insights on key research applications areas and continental distribution of sustainable road freight transport (SRFT) research. In addition to the six-overarching sustainability intervention mechanism themes identified: decoupling, Information and Communications Technology (ICT), modality, operations, policy, and other, future research can explore the effectiveness of different interventions mechanisms identified in this study to improve sustainable practices across different continents

    Machine Learning in Automated Text Categorization

    Full text link
    The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation.Comment: Accepted for publication on ACM Computing Survey

    On the Bayes-optimality of F-measure maximizers

    Get PDF
    The F-measure, which has originally been introduced in information retrieval, is nowadays routinely used as a performance metric for problems such as binary classification, multi-label classification, and structured output prediction. Optimizing this measure is a statistically and computationally challenging problem, since no closed-form solution exists. Adopting a decision-theoretic perspective, this article provides a formal and experimental analysis of different approaches for maximizing the F-measure. We start with a Bayes-risk analysis of related loss functions, such as Hamming loss and subset zero-one loss, showing that optimizing such losses as a surrogate of the F-measure leads to a high worst-case regret. Subsequently, we perform a similar type of analysis for F-measure maximizing algorithms, showing that such algorithms are approximate, while relying on additional assumptions regarding the statistical distribution of the binary response variables. Furthermore, we present a new algorithm which is not only computationally efficient but also Bayes-optimal, regardless of the underlying distribution. To this end, the algorithm requires only a quadratic (with respect to the number of binary responses) number of parameters of the joint distribution. We illustrate the practical performance of all analyzed methods by means of experiments with multi-label classification problems

    Deep Multi-View Learning for Visual Understanding

    Get PDF
    PhD ThesisMulti-view data is the result of an entity being perceived or represented from multiple perspectives. Plenty of applications in visual understanding contain multi-view data. For example, the face images for training a recognition system are usually captured by different devices from multiple angles. This thesis focuses on the cross-view visual recognition problems, e.g., identifying the face images of the same person across different cameras. Several representative multi-view settings, from the supervised multi-view learning to the more challenging unsupervised domain adaptive (UDA) multi-view learning, are investigated. Novel multi-view learning algorithms are proposed correspondingly. To be more specific, the proposed methods are based on the advanced deep neural network (DNN) architectures for better handling visual data. However, directly combining the multi-view learning objectives with DNN can result in different issues, e.g., on scalability, and limit the application scenarios and model performance. Corresponding novelties in DNN methods are thus required to solve them. This thesis is organised into three parts. Each chapter focuses on a multi-view learning setting with novel solutions and is detailed as follows: Chapter 3 A supervised multi-view learning setting with two different views are studied. To recognise the data samples across views, one strategy is aligning them in a common feature space via correlation maximisation. It is also known as canonical correlation analysis (CCA). Deep CCA has been proposed for better performance with the non-linear projection via deep neural networks. Existing deep CCA models typically decorrelate the deep feature dimensions of each view before their Euclidean distances are minimised in the common space. This feature decorrelation is achieved by enforcing an exact decorrelation constraint which is computationally expensive due to the matrix inversion or SVD operations. Therefore, existing deep CCA models are inefficient and have scalability issues. Furthermore, the exact decorrelation is incompatible with the gradient based deep model training and results in sub-optimal solution. To overcome these aforementioned issues, a novel deep CCA model Soft CCA is introduced in this thesis. Specifically, the exact decorrelation is replaced by soft decorrelation via a mini-batch based Stochastic Decorrelation Loss (SDL). It can be jointly optimised with the other training objectives. In addition, our SDL loss can be applied to other deep models beyond multi-view learning. Chapter 4 The supervised multi-view learning setting, whereby more than two views exist, are studied in this chapter. Recently developed deep multi-view learning algorithms either learn a latent visual representation based on a single semantic level and/or require laborious human annotation of these factors as attributes. A novel deep neural network architecture, called Multi- Level Factorisation Net (MLFN), is proposed to automatically factorise the visual appearance into latent discriminative factors at multiple semantic levels without manual annotation. The main purpose is forcing different views share the same latent factors so that they are can be aligned at all layers. Specifically, MLFN is composed of multiple stacked blocks. Each block contains multiple factor modules to model latent factors at a specific level, and factor selection modules that dynamically select the factor modules to interpret the content of each input image. The outputs of the factor selection modules also provide a compact latent factor descriptor that is complementary to the conventional deeply learned feature, and they can be fused efficiently. The effectiveness of the proposed MLFN is demonstrated by not only the large-scale cross-view recognition problems but also the general object categorisation tasks. Chapter 5 The last problem is a special unsupervised domain adaptation setting called unsupervised domain adaptive (UDA) multi-view learning. It contains a fully annotated dataset as the source domain and another unsupervised dataset with relevant tasks as the target domain. The main purpose is to improve the performance of the unlabelled dataset with the annotated data from the other dataset. More importantly, this setting further requires both the source and target domains are multi-view datasets with relevant tasks. Therefore, the assumption of the aligned label space across domains is inappropriate in the UDA multi-view learning. For example, the person re-identification (Re-ID) datasets built on different surveillance scenarios are with images of different people captured and should be given disjoint person identity labels. Existing methods for UDA multi-view learning problems are aligning different domains either in the raw image space or a feature embedding space for domain alignment. In this thesis, a different framework, multi-task learning, is adopted with the domain specific objectives for a common space learning. Specifically, such common space is proposed to enable the knowledge transfer. The conventional supervised losses can be used for the labelled source data while the unsupervised objectives for the target domain play the key roles in domain adaptation. Two novel unsupervised objectives are introduced for UDA multi-view learning and result in two models as below. The first model, termed common factorised space model (CFSM), is built on the assumptions that the semantic latent attributes are shared between the source and target domains since they are relevant multi-view learning tasks. Different from the existing methods that based on domain alignment, CFSM emphasizes on transferring the information across domains via discovering discriminative latent factors in the proposed common space. However, the multi-view data from target domain is without labels. Therefore, an unsupervised factorisation loss is derived and applied on the common space for latent factors discovery across domains. The second model still learns a shared embedding space with multi-view data from both domains but with a different assumption. It attempts to discover the latent correspondence of multi-view data in the unsupervised target data. The target data’s contribution comes from a clustering process. Each cluster thus reveals the underlying cross-view correspondences across multiple views in target domain. To this end, a novel Stochastic Inference for Deep Clustering (SIDC) method is proposed. It reduces self-reinforcing errors that lead to premature convergence to a sub-optimal solution by changing the conventional deterministic cluster assignment to a stochastic one
    corecore