437 research outputs found

    Detecting objects using Rolling Convolution and Recurrent Neural Network

    Abstract: Most existing object detection algorithms use region proposals to search for targets in an image. The most effective region proposal methods usually require thousands of candidate regions to achieve a high recall rate, which lowers detection efficiency. Although recent region proposal network approaches have yielded good results with only hundreds of proposals, they still struggle with small objects and precise localization, mainly because they rely on coarse features. We therefore propose a new method that extracts more effective global and multi-scale features to improve detection performance. Because feature maps under successive convolutions lose the resolution required to detect small objects as they gain deeper semantic information, we use rolling convolution (RC) to maintain the high resolution of low-level feature maps and explore objects in greater detail, even without a structure dedicated to combining the features of multiple convolutional layers. Furthermore, we place a recurrent neural network of multiple gated recurrent units (GRUs) on top of the convolutional layers to highlight useful global context locations that assist object detection. On benchmark datasets, the proposed method achieves 78.2% mAP on PASCAL VOC 2007 and 72.3% mAP on PASCAL VOC 2012. Extensive experiments verify that the method reaches a more advanced level of detection performance.

    Multi-frame Image Super-resolution Reconstruction Using Multi-grained Cascade Forest

    Super-resolution (SR) image reconstruction algorithms fall into two classes: single-frame reconstruction and multi-frame reconstruction. Single-frame reconstruction generally degrades the image first and then reconstructs it, which leads to insufficient characterization. Multi-frame images provide additional information for reconstruction relative to single-frame images because of the slight differences between sequential frames. However, existing super-resolution algorithms for multi-frame images do not take advantage of this key factor, either because of their loose structure and complexity or because the individual frames are restored poorly. This paper proposes a new SR reconstruction algorithm for images using Multi-grained Cascade Forest. Multi-frame image reconstruction is processed sequentially. First, an image registration algorithm uses a convolutional neural network to register the low-resolution image sequence; the registered images are then reconstructed by the Multi-grained Cascade Forest reconstruction algorithm. Finally, the reconstructed images are fused. The optimal algorithm is selected for each step to get the most out of the details and to tightly connect the internal logic of the sequential steps. The novelty of the approach is that the depth of the cascade forest is generated procedurally for the recovered images rather than being a constant. After each layer is trained, the recovered image is automatically evaluated, and new layers are constructed and trained until an optimal restored image is obtained. Experiments show that this method improves the quality of image reconstruction while preserving the details of the image.
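The adaptive-depth idea in the abstract above (train a layer, evaluate the recovered image, add another layer only while quality improves) can be sketched as a simple loop. This is a minimal illustration, not the paper's algorithm: `train_layer`, `evaluate`, `max_layers`, and `min_gain` are hypothetical names, and the stopping rule (stop when the score gain falls below a threshold) is an assumption, since the abstract does not state the criterion.

```python
def grow_cascade(train_layer, evaluate, max_layers=10, min_gain=1e-4):
    """Add cascade layers one at a time until the evaluation score stops improving.

    train_layer(depth, layers) -> a new trained layer (hypothetical callable)
    evaluate(layers) -> quality score of the recovered image, higher is better
    """
    layers = []
    best_score = float("-inf")
    for depth in range(max_layers):
        candidate = train_layer(depth, layers)
        score = evaluate(layers + [candidate])
        # Assumed stopping rule: reject the layer if it does not help enough.
        if score - best_score < min_gain:
            break
        layers.append(candidate)
        best_score = score
    return layers, best_score
```

In practice `evaluate` might return a metric such as PSNR on a validation image; that choice is also an assumption here.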

    You Only Transfer What You Share: Intersection-Induced Graph Transfer Learning for Link Prediction

    Link prediction is central to many real-world applications, but its performance may be hampered when the graph of interest is sparse. To alleviate issues caused by sparsity, we investigate a previously overlooked phenomenon: in many cases, a densely connected, complementary graph can be found for the original graph. The denser graph may share nodes with the original graph, which offers a natural bridge for transferring selective, meaningful knowledge. We identify this setting as Graph Intersection-induced Transfer Learning (GITL), which is motivated by practical applications in e-commerce or academic co-authorship predictions. We develop a framework to effectively leverage the structural prior in this setting. We first create an intersection subgraph using the shared nodes between the two graphs, then transfer knowledge from the source-enriched intersection subgraph to the full target graph. In the second step, we consider two approaches: a modified label propagation, and a multi-layer perceptron (MLP) model in a teacher-student regime. Experimental results on proprietary e-commerce datasets and open-source citation graphs show that the proposed workflow outperforms existing transfer learning baselines that do not explicitly utilize the intersection structure. Comment: Accepted in TMLR (https://openreview.net/forum?id=Nn71AdKyYH
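The first step described above, building an intersection subgraph from the nodes shared by the two graphs, can be sketched as follows. This is only an illustration under stated assumptions: graphs are given as undirected edge lists, and "source-enriched" is read here as taking the union of source and target edges among the shared nodes. The function name `intersection_subgraph` is hypothetical and does not come from the paper.

```python
def intersection_subgraph(source_edges, target_edges):
    """Return (shared_nodes, edges) for the subgraph induced by shared nodes.

    Edges are undirected and stored as frozensets so (u, v) == (v, u).
    """
    source_nodes = {n for edge in source_edges for n in edge}
    target_nodes = {n for edge in target_edges for n in edge}
    shared = source_nodes & target_nodes

    def induced(edges):
        # Keep only edges whose endpoints are both shared nodes.
        return {frozenset((u, v)) for u, v in edges if u in shared and v in shared}

    # Assumption: the intersection subgraph is enriched with source edges.
    return shared, induced(source_edges) | induced(target_edges)
```

For example, with a dense source graph on nodes a, b, c, x and a sparse target graph on a, b, c, d, the shared nodes are a, b, c, and the subgraph gains the source edges among them.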

    Capacity Constrained Influence Maximization in Social Networks

    Influence maximization (IM) aims to identify a small number of influential individuals to maximize the information spread and finds applications in various fields. It was first introduced in the context of viral marketing, where a company pays a few influencers to promote the product. However, apart from the cost factor, the capacity of individuals to consume content poses challenges for implementing IM in real-world scenarios. For example, players on online gaming platforms can only interact with a limited number of friends. In addition, we observe that in these scenarios, (i) the initial adopters of promotion are likely to be the friends of influencers rather than the influencers themselves, and (ii) existing IM solutions produce sub-par results with high computational demands. Motivated by these observations, we propose a new IM variant called capacity constrained influence maximization (CIM), which aims to select a limited number of influential friends for each initial adopter such that the promotion can reach more users. To solve CIM effectively, we design two greedy algorithms, MG-Greedy and RR-Greedy, ensuring the 1/2-approximation ratio. To improve the efficiency, we devise the scalable implementation named RR-OPIM+ with (1/2 - ϵ)-approximation and near-linear running time. We extensively evaluate the performance of 9 approaches on 6 real-world networks, and our solutions outperform all competitors in terms of result quality and running time. Additionally, we deploy RR-OPIM+ to online game scenarios, which improves the baseline considerably. Comment: The technical report of the paper entitled 'Capacity Constrained Influence Maximization in Social Networks' in SIGKDD'2
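To make the capacity constraint concrete, here is a minimal greedy sketch that picks at most `capacity` friends per initial adopter so as to cover as many users as possible. It is not the paper's MG-Greedy or RR-Greedy: it swaps in a simplified deterministic objective in which each friend has a fixed, known reach set (`reach` is hypothetical input data), whereas the paper works with influence spread. All names are illustrative.

```python
def greedy_capacity_selection(friends, reach, capacity):
    """Greedily assign up to `capacity` friends to each adopter to maximize coverage.

    friends: dict adopter -> list of candidate friends
    reach:   dict friend -> set of users that friend would reach (assumed known)
    """
    chosen = {adopter: [] for adopter in friends}
    covered = set()
    while True:
        best, best_gain = None, 0
        for adopter in sorted(friends):
            if len(chosen[adopter]) >= capacity:
                continue  # this adopter's capacity is exhausted
            for friend in sorted(friends[adopter]):
                if friend in chosen[adopter]:
                    continue
                gain = len(reach.get(friend, set()) - covered)
                if gain > best_gain:
                    best, best_gain = (adopter, friend), gain
        if best is None:
            break  # no feasible pair adds new coverage
        adopter, friend = best
        chosen[adopter].append(friend)
        covered |= reach.get(friend, set())
    return chosen, covered
```

Coverage is monotone and submodular, and the per-adopter limit forms a partition matroid, which is the classical setting where a greedy rule of this kind attains a 1/2-approximation.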

    Search Behavior Prediction: A Hypergraph Perspective

    Although bipartite shopping graphs are a straightforward way to model search behavior, they suffer from two challenges: 1) The majority of items are sporadically searched and hence have noisy/sparse query associations, leading to a long-tail distribution. 2) Infrequent queries are more likely to link to popular items, leading to another hurdle known as disassortative mixing. To address these two challenges, we go beyond the bipartite graph to take a hypergraph perspective, introducing a new paradigm that leverages auxiliary information from anonymized customer engagement sessions to assist the main task of query-item link prediction. This auxiliary information is available at web scale in the form of search logs. We treat all items appearing in the same customer session as a single hyperedge. The hypothesis is that items in a customer session are unified by a common shopping interest. With these hyperedges, we augment the original bipartite graph into a new hypergraph. We develop a Dual-Channel Attention-Based Hypergraph Neural Network (DCAH), which synergizes information from two potentially noisy sources (original query-item edges and item-item hyperedges). In this way, items on the tail are better connected due to the extra hyperedges, thereby enhancing their link prediction performance. We further integrate DCAH with self-supervised graph pre-training and/or DropEdge training, both of which effectively alleviate disassortative mixing. Extensive experiments on three proprietary E-Commerce datasets show that DCAH yields significant improvements of up to 24.6% in mean reciprocal rank (MRR) and 48.3% in recall compared to GNN-based baselines. Our source code is available at https://github.com/amazon-science/dual-channel-hypergraph-neural-network. Comment: WSDM 202
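The hyperedge construction described above (all items in one customer session form a single hyperedge) can be sketched in a few lines. This is an illustrative assumption-laden sketch, not the DCAH code: the function name `build_hyperedges` is hypothetical, and dropping single-item sessions and duplicate item sets is a choice made here for clarity, not something the abstract specifies.

```python
from collections import defaultdict


def build_hyperedges(sessions):
    """Turn customer sessions into hyperedges plus an item -> hyperedge index.

    sessions: iterable of item-id lists, one list per session
    """
    hyperedges = []
    seen = set()
    for items in sessions:
        edge = frozenset(items)
        # Assumption: skip sessions with fewer than two distinct items
        # and sessions whose item set was already recorded.
        if len(edge) < 2 or edge in seen:
            continue
        seen.add(edge)
        hyperedges.append(edge)

    incidence = defaultdict(list)
    for idx, edge in enumerate(hyperedges):
        for item in sorted(edge):
            incidence[item].append(idx)
    return hyperedges, dict(incidence)
```

The resulting incidence map shows how a tail item gains connections: any item that co-occurs in a session is linked to every other item in that session through the shared hyperedge.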