17,126 research outputs found

    Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples

    Full text link
    Machine Learning has been a big success story during the AI resurgence. One particular stand out success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition for utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex, (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques.Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). arXiv admin note: substantial text overlap with arXiv:1610.0770

    SuperIdentity: fusion of identity across real and cyber domains

    No full text
    Under both benign and malign circumstances, people now manage a spectrum of identities across both real-world and cyber domains. Our belief, however, is that all these instances ultimately track back for an individual to reflect a single ‘SuperIdentity’. This paper outlines the assumptions underpinning the SuperIdentity Project, describing the innovative use of data fusion to incorporate novel real-world and cyber cues into a rich framework appropriate for modern identity. The proposed combinatorial model will support a robust identification or authentication decision, with confidence indexed both by the level of trust in data provenance, and the diagnosticity of the identity factors being used. Additionally, the exploration of correlations between factors may underpin the more intelligent use of identity information so that known information may be used to predict previously hidden information. With modern living supporting the ‘distribution of identity’ across real and cyber domains, and with criminal elements operating in increasingly sophisticated ways in the hinterland between the two, this approach is suggested as a way forwards, and is discussed in terms of its impact on privacy, security, and the detection of threa

    Clustering as an example of optimizing arbitrarily chosen objective functions

    Get PDF
    This paper is a reflection upon a common practice of solving various types of learning problems by optimizing arbitrarily chosen criteria in the hope that they are well correlated with the criterion actually used for assessment of the results. This issue has been investigated using clustering as an example, hence a unified view of clustering as an optimization problem is first proposed, stemming from the belief that typical design choices in clustering, like the number of clusters or similarity measure can be, and often are suboptimal, also from the point of view of clustering quality measures later used for algorithm comparison and ranking. In order to illustrate our point we propose a generalized clustering framework and provide a proof-of-concept using standard benchmark datasets and two popular clustering methods for comparison
    • …
    corecore