Human-Centric Cyber Social Computing Model for Hot-Event Detection and Propagation
The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. Microblogging networks have gained popularity in recent years as a platform for expressing human emotions, through which users can conveniently produce content on public events, breaking news, and/or products. Consequently, microblogging networks generate massive amounts of data that carry opinions and mass sentiment on various topics. Herein, microblogging is regarded as a useful platform for detecting and propagating new hot events. It is also a useful channel for identifying high-quality posts, popular topics, key interests, and high-influence users. The noisy data in traditional social media streams makes a focus on human-centric computing necessary. This paper proposes a human-centric social computing (HCSC) model for hot-event detection and propagation in microblogging networks. In the proposed HCSC model, all posts and users are preprocessed through hypertext induced topic search (HITS) to determine high-quality subsets of the users, topics, and posts. Then, a latent Dirichlet allocation (LDA)-based multiprototype user topic detection method is used to identify users with high influence in the network. Furthermore, influence maximization is used for the final determination of influential users based on the user subsets. Finally, the users mined by the influence maximization process are output as the influential user sets for specific topics. Experimental results demonstrate the superiority of our HCSC model over similar models of hot-event detection and information propagation.
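As a rough illustration of the HITS preprocessing step mentioned in this abstract, the sketch below runs the generic hub/authority iteration on a small link graph. The graph layout (users pointing to posts), the function name `hits`, and the fixed iteration count are assumptions made for illustration; the abstract does not describe how the authors build their graph or select the high-quality subsets.

```python
import math

def hits(links, iterations=50):
    """Generic HITS iteration.

    links: dict mapping a source node to the list of nodes it points to
           (hypothetical layout: user -> posts the user interacted with).
    Returns (hub_scores, authority_scores), each L2-normalised.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum of hub scores of nodes pointing at each node.
        new_auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for t in targets:
                new_auth[t] += hub[src]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}
        # Hub update: sum of authority scores of the nodes each node points to.
        new_hub = {n: sum(auth[t] for t in links.get(n, ())) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return hub, auth

# Hypothetical toy graph: three users, two posts.
example_links = {"u1": ["p1", "p2"], "u2": ["p1"], "u3": ["p1"]}
hub_scores, auth_scores = hits(example_links)
```

In this toy graph, `p1` receives more links than `p2` and so ends up with the higher authority score; a threshold on such scores is one possible way to pick a high-quality subset.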
Hot Routes: Developing a New Technique for the Spatial Analysis of Crime
The use of hotspot mapping techniques such as KDE to represent the geographical spread of linear events can be problematic. Network-constrained data (for example transport-related crime) require a different approach to visualize concentration. We propose a methodology called Hot Routes, which measures the risk distribution of crime along a linear network by calculating the rate of crimes per section of road. This method has been designed for everyday crime analysts and requires only a Geographical Information System (GIS) and suitable data. A demonstration is provided using crime data collected from London bus routes.
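The rate calculation described in this abstract can be sketched as follows. This is a minimal illustration, assuming that each road section has a known length and that the rate is the crime count divided by that length; the function name `hot_route_rates`, the section identifiers, and the choice of kilometres as the unit are hypothetical and not taken from the abstract.

```python
def hot_route_rates(section_lengths_km, crime_counts):
    """Crimes per kilometre for each road section.

    section_lengths_km: dict of section id -> length in km (assumed unit).
    crime_counts: dict of section id -> number of crimes snapped to that section.
    Sections with no recorded crime get a rate of 0.0; zero-length sections are skipped.
    """
    return {
        section: crime_counts.get(section, 0) / length
        for section, length in section_lengths_km.items()
        if length > 0
    }

# Hypothetical sections and counts, for illustration only.
rates = hot_route_rates(
    {"A": 2.0, "B": 0.5, "C": 1.0},
    {"A": 4, "B": 3},
)
hottest = max(rates, key=rates.get)
```

Normalising by length keeps a short section with a few crimes from being hidden behind a long section with a slightly higher raw count.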
Learning and Transferring IDs Representation in E-commerce
Many machine intelligence techniques are developed in E-commerce and one of
the most essential components is the representation of IDs, including user ID,
item ID, product ID, store ID, brand ID, category ID etc. The classical
encoding based methods (like one-hot encoding) are inefficient in that they
suffer from sparsity problems due to their high dimension, and they cannot reflect the
relationships among IDs, either homogeneous or heterogeneous ones. In this
paper, we propose an embedding based framework to learn and transfer the
representation of IDs. As implicit feedback from users, a tremendous amount
of item ID sequences can be easily collected from the interactive sessions. By
jointly using these informative sequences and the structural connections among
IDs, all types of IDs can be embedded into one low-dimensional semantic space.
Subsequently, the learned representations are utilized and transferred in four
scenarios: (i) measuring the similarity between items, (ii) transferring from
seen items to unseen items, (iii) transferring across different domains, (iv)
transferring across different tasks. We deploy and evaluate the proposed
approach in Hema App and the results validate its effectiveness. Comment: KDD'18, 9 pages
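Scenario (i) above, measuring the similarity between items, can be illustrated with a short sketch once IDs live in a shared low-dimensional space. Cosine similarity is assumed here as the similarity measure because it is a common choice for embeddings; the abstract does not state which measure the authors use, and the item names and vectors below are made up for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

def most_similar(query_id, embeddings):
    """Return the other ID whose embedding is closest to query_id's embedding."""
    query = embeddings[query_id]
    candidates = (i for i in embeddings if i != query_id)
    return max(candidates, key=lambda i: cosine_similarity(query, embeddings[i]))

# Hypothetical 3-dimensional item embeddings.
item_embeddings = {
    "item_a": [1.0, 0.0, 0.0],
    "item_b": [0.9, 0.1, 0.0],
    "item_c": [0.0, 1.0, 0.0],
}
nearest = most_similar("item_a", item_embeddings)
```

Because all ID types are embedded into one space, the same function could in principle compare an item vector with a brand or store vector, which one-hot encodings cannot do meaningfully.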
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research over the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of the building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimation, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large-scale labeled data to train and validate the model, (2) lack of model transferability, which limits the use of a model trained on one data-rich building in another building with limited data, (3) lack of strong justification of the costs and benefits of deploying machine learning, and (4) performance that might not be reliable and robust for the stated goals, as a method might work for some buildings but not generalize to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science.
Task-specific Word Identification from Short Texts Using a Convolutional Neural Network
Task-specific word identification aims to choose the task-related words that
best describe a short text. Existing approaches require well-defined seed words
or lexical dictionaries (e.g., WordNet), which are often unavailable for many
applications such as social discrimination detection and fake review detection.
However, we often have a set of labeled short texts where each short text has a
task-related class label, e.g., discriminatory or non-discriminatory, specified
by users or learned by classification algorithms. In this paper, we focus on
identifying task-specific words and phrases from short texts by exploiting
their class labels rather than using seed words or lexical dictionaries. We
consider the task-specific word and phrase identification as feature learning.
We train a convolutional neural network over a set of labeled texts and use
score vectors to localize the task-specific words and phrases. Experimental
results on sentiment word identification show that our approach significantly
outperforms existing methods. We further conduct two case studies to show the
effectiveness of our approach. One case study on a crawled tweets dataset
demonstrates that our approach can successfully capture the
discrimination-related words/phrases. The other case study on fake review
detection shows that our approach can identify the fake-review words/phrases. Comment: accepted by Intelligent Data Analysis, an International Journal