Exploring Browsing Behavior of Product Information in an M-commerce Application: a Transaction Log Analysis
This research aims to describe the information browsing and merchandise purchasing behaviors of users in an M-commerce application. Data used in this research come from the transaction logs of 290 heavy users in March 2015. We established the mapping between the request parameters in the log and user information behavior to further analyze the patterns of user behavior. People are most concerned about the details of items, and actively share their favorite items and shops with others. The number of views follows a power-law distribution. We also find that the items which are viewed 9 times and are included in the submitted order are most likely to be bought. There is a positive correlation between the purchase of items and the numbers of browsing and sharing behaviors.
Characterization of severe fever with thrombocytopenia syndrome in rural regions of Zhejiang, China.
Severe fever with thrombocytopenia syndrome virus (SFTSV) infections have recently been found in rural regions of Zhejiang. A severe fever with thrombocytopenia syndrome (SFTS) surveillance and sero-epidemiological investigation was conducted in the districts with outbreaks. During the study period of 2011-2014, a total of 51 SFTSV infection cases were identified and the case fatality rate was 12% (6/51). Ninety-two percent of the patients (47/51) were over 50 years of age, and 63% (32/51) of laboratory-confirmed cases occurred from May to July. Nine percent (11/120) of the serum samples from local healthy people without symptoms were found to be positive for antibodies to the SFTS virus. SFTSV strains were isolated by culture using Vero cells, and the whole genomic sequences of two SFTSV strains (01 and Zhao) were sequenced and submitted to GenBank. Homology analysis showed that the similarity of the target nucleocapsid gene from the SFTSV strains from different geographic areas was 94.2-100%. From the constructed phylogenetic tree, it was found that all the SFTSV strains diverged into two main clusters. Only the SFTSV strains from the Zhejiang (Daishan) region of China and the Yamaguchi and Miyazaki regions of Japan were clustered into lineage II, consistent with both of these regions being isolated areas with similar geographic features. Two out of eight predicted linear B cell epitopes from the nucleocapsid protein showed mutations between the SFTSV strains of different clusters, but did not contribute to the binding ability of the specific SFTSV antibodies. This study confirmed that SFTSV has been circulating naturally and can cause a seasonal prevalence in Daishan, China. The results also suggest that the molecular characteristics of SFTSV are associated with the geographic region and all SFTSV strains can be divided into two genotypes.
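The percentages quoted in this abstract can be checked against the raw counts it reports. The short sketch below only recomputes those ratios; the dictionary labels are paraphrases chosen here for illustration, not terms from the study.

```python
# Recompute the percentages quoted in the abstract from the counts it gives.
# The labels are illustrative paraphrases, not the study's own terminology.
counts = {
    "case fatality rate": (6, 51),
    "patients over 50 years of age": (47, 51),
    "confirmed cases from May to July": (32, 51),
    "seropositive healthy residents": (11, 120),
}


def percent(numerator, denominator):
    """Return the ratio as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


rounded = {label: percent(n, d) for label, (n, d) in counts.items()}
for label, value in rounded.items():
    print(f"{label}: {value}%")
```

All four rounded values agree with the figures stated in the abstract (12%, 92%, 63% and 9%).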
Ada-NETS: Face Clustering via Adaptive Neighbour Discovery in the Structure Space
Face clustering has attracted rising research interest recently to take
advantage of massive amounts of face images on the web. State-of-the-art
performance has been achieved by Graph Convolutional Networks (GCN) due to
their powerful representation capacity. However, existing GCN-based methods
build face graphs mainly according to kNN relations in the feature space, which
may lead to a lot of noise edges connecting two faces of different classes. The
face features will be polluted when messages pass along these noise edges, thus
degrading the performance of GCNs. In this paper, a novel algorithm named
Ada-NETS is proposed to cluster faces by constructing clean graphs for GCNs. In
Ada-NETS, each face is transformed to a new structure space, obtaining robust
features by considering face features of the neighbour images. Then, an
adaptive neighbour discovery strategy is proposed to determine a proper number
of edges connecting to each face image. It significantly reduces the noise
edges while maintaining the good ones to build a graph with clean yet rich
edges for GCNs to cluster faces. Experiments on multiple public clustering
datasets show that Ada-NETS significantly outperforms current state-of-the-art
methods, proving its superiority and generalization. Code is available at
https://github.com/damo-cv/Ada-NETS
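The noise-edge problem described above can be illustrated with a minimal sketch. This is not the Ada-NETS algorithm: it only builds a plain kNN graph with a fixed k over hypothetical toy features and measures how many edges join faces of different identities. The function names, the toy data and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_edges(features, k):
    """Link every sample to its k most similar other samples (fixed k)."""
    edges = []
    for i, feat in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: cosine(feat, features[j]), reverse=True)
        edges.extend((i, j) for j in others[:k])
    return edges


def noise_edge_fraction(edges, labels):
    """Fraction of edges whose two endpoints have different identity labels."""
    return sum(labels[i] != labels[j] for i, j in edges) / len(edges)


# Hypothetical toy data: 3 identities, 5 faces each, 8-dimensional features.
random.seed(0)
centres = [[random.gauss(0, 1) for _ in range(8)] for _ in range(3)]
labels = [c for c in range(3) for _ in range(5)]
features = [[x + random.gauss(0, 0.5) for x in centres[c]] for c in labels]

edges_small_k = knn_edges(features, k=2)
edges_large_k = knn_edges(features, k=8)
print("noise fraction, k=2:", noise_edge_fraction(edges_small_k, labels))
print("noise fraction, k=8:", noise_edge_fraction(edges_large_k, labels))
```

With k=8 each face has only four same-identity candidates, so at least half of its edges must cross identities regardless of feature quality. That is the kind of mismatch a per-face adaptive neighbour count is meant to avoid.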
DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network
The rapid advances in Vision Transformer (ViT) have refreshed the state-of-the-art
performance in various vision tasks, overshadowing conventional CNN-based
models. This has ignited several recent strike-back studies in the CNN world
showing that pure CNN models can achieve as good performance as ViT models when
carefully tuned. While encouraging, designing such high-performance CNN models
is challenging, requiring non-trivial prior knowledge of network design. To
this end, a novel framework termed Mathematical Architecture Design for Deep
CNN (DeepMAD) is proposed to design high-performance CNN models in a principled
way. In DeepMAD, a CNN network is modeled as an information processing system
whose expressiveness and effectiveness can be analytically formulated by their
structural parameters. Then a constrained mathematical programming (MP) problem
is proposed to optimize these structural parameters. The MP problem can be
easily solved by off-the-shelf MP solvers on CPUs with a small memory
footprint. In addition, DeepMAD is a pure mathematical framework: no GPU or
training data is required during network design. The superiority of DeepMAD is
validated on multiple large-scale computer vision benchmark datasets. Notably,
on ImageNet-1k, using only conventional convolutional layers, DeepMAD achieves
0.7% and 1.5% higher top-1 accuracy than ConvNeXt and Swin at the Tiny level, and
0.8% and 0.9% higher at the Small level. Comment: Accepted by CVPR 202
Weighted Gene Co-Expression Network Analysis Identifies an Immunogenic Cell Death Signature to Predict Therapeutic Responses and Prognosis of Glioblastoma
Background: Induction of immunogenic cell death (ICD) breaks down the immunosuppressive tumor microenvironment (TME) and controls tumor progression, but the correlation between glioblastoma (GBM) and ICD is unclear. Therefore, this study aims to investigate the potential prognostic value of ICD-associated genes in GBM. Methods: We collected 34 ICD-related genes from various sources. Utilizing public databases, we extracted relevant GBM data and delineated prognosis-related ICD gene modules using weighted gene co-expression network analysis (WGCNA). Least absolute shrinkage and selection operator (LASSO) algorithm was employed to develop a risk model, whose accuracy was confirmed by including an independent Gene Expression Omnibus (GEO) dataset. The biological functions and pathways associated with these signals were analyzed by performing enrichment analysis, and the tumor immune infiltration capacity was evaluated. The R package oncoPredict was used to infer the drug sensitivity of patients in different risk groups using data from the Genomics of Drug Sensitivity in Cancer 2 (GDSC2) database with expression profiling. Results: Thirty-four ICD-associated genes were differentially expressed in GBM samples and two gene modules significantly associated with prognosis were identified. Based on these gene modules, vitamin D receptor (VDR) and cell death-inducing DFF45-like effector B (CIDEB) were identified as two signature genes for the prognostic prediction of GBM. Subsequently, multivariate Cox analysis confirmed the validity of this signature as an independent factor for evaluating overall survival in GBM. Receiver operating characteristic (ROC) curves also supported an effective prediction of the signature (1-year area under the ROC curve (AUC): 0.667; 3-year AUC: 0.727; 5-year AUC: 0.762). We observed that the high-risk group had higher immune cell infiltration and sensitivity to some drugs. 
Conclusions: This work developed a novel ICD-related prognostic model for GBM patients. Our findings highlight the potential of using ICD as a promising prognosis indicator in GBM, contributing to the current understanding of the intricate interplay between ICD and the tumor microenvironment.
Deconfounding Causal Inference for Zero-shot Action Recognition
Zero-shot action recognition (ZSAR) aims to recognize unseen action categories in the test set without corresponding training examples. Most existing zero-shot methods follow the feature generation framework to transfer knowledge from seen action categories to model the feature distribution of unseen categories. However, due to the complexity and diversity of actions, it remains challenging to generate unseen feature distribution, especially for the cross-dataset scenario when there is potentially larger domain shift. This paper proposes a Deconfounding Causal GAN (DeCalGAN) for generating unseen action video features with the following technical contributions: 1) Our model unifies compositional ZSAR with traditional visual-semantic models to incorporate local object information with global semantic information for feature generation. 2) A GAN-based architecture is proposed for causal inference and unseen distribution discovery. 3) A deconfounding module is proposed to refine representations of local object and global semantic information confounder in the training data. Action descriptions and random object feature after causal inference are then used to discover unseen distributions of novel actions in different datasets. Our extensive experiments on Cross-Dataset Zero-Shot Action Recognition (CD-ZSAR) demonstrate substantial improvement over the UCF101 and HMDB51 standard benchmarks for this problem.
Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation
Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL
task. However, the absence of a systematic benchmark inhibits the development
of effective, efficient and economic LLM-based Text-to-SQL solutions.
To address this challenge, in this paper, we first conduct a systematic and
extensive comparison over existing prompt engineering methods, including
question representation, example selection and example organization, and with
these experimental results, we elaborate their pros and cons. Based on these
findings, we propose a new integrated solution, named DAIL-SQL, which refreshes
the Spider leaderboard with 86.6% execution accuracy and sets a new bar. To
explore the potential of open-source LLMs, we investigate them in various
scenarios, and further enhance their performance with supervised fine-tuning.
Our explorations highlight open-source LLMs' potential in Text-to-SQL, as well
as the advantages and disadvantages of the supervised fine-tuning.
Additionally, towards an efficient and economic LLM-based Text-to-SQL solution,
we emphasize the token efficiency in prompt engineering and compare the prior
studies under this metric. We hope that our work provides a deeper
understanding of Text-to-SQL with LLMs, and inspires further investigations and
broad applications. Comment: We have released code on https://github.com/BeachWang/DAIL-SQ
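The three prompt engineering components named in this abstract (question representation, example selection and example organization) can be sketched in a few lines. This is an illustrative toy, not DAIL-SQL: the function names, the word-overlap (Jaccard) selection rule, the comment-style schema rendering and the sample schema and example pool are all assumptions introduced here.

```python
def represent_question(schema, question):
    """Question representation: schema as SQL comments, the question, then a SQL cue."""
    tables = "\n".join(
        f"-- {name}({', '.join(columns)})" for name, columns in schema.items()
    )
    return f"{tables}\n-- Question: {question}\nSELECT"


def select_examples(question, pool, n):
    """Example selection: rank pool entries by Jaccard word overlap with the question."""
    target = set(question.lower().split())

    def overlap(example):
        words = set(example["question"].lower().split())
        return len(target & words) / len(target | words)

    return sorted(pool, key=overlap, reverse=True)[:n]


def organize_examples(examples):
    """Example organization: keep only question/SQL pairs to limit prompt length."""
    return "\n".join(f"-- Question: {e['question']}\n{e['sql']}" for e in examples)


def build_prompt(schema, question, pool, n=1):
    """Assemble the selected examples followed by the target question."""
    shots = organize_examples(select_examples(question, pool, n))
    return f"{shots}\n\n{represent_question(schema, question)}"


# Hypothetical schema and example pool, for illustration only.
schema = {"concert": ["concert_id", "name", "year"]}
pool = [
    {"question": "How many singers are there", "sql": "SELECT count(*) FROM singer"},
    {"question": "List the names of all stadiums", "sql": "SELECT name FROM stadium"},
]
question = "How many concerts are there"

prompt = build_prompt(schema, question, pool, n=1)
print(prompt)
print("approximate prompt length in words:", len(prompt.split()))
```

The word count printed at the end is only a crude stand-in for the token efficiency the abstract emphasizes; a real comparison would count tokens with the target model's tokenizer.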
Vibration characteristics of a compression ignition engine fuelled with different biodiesel-diesel blends
Biodiesel has wide application prospects due to its good power performance, fuel economy and emission reduction. Experimental studies have found that the measured engine vibration presents an N-shaped nonlinear trend with the increase of the biodiesel proportion in blends, which cannot be explained solely based on the combustion characteristics of blended fuels. To study the mechanisms for this nonlinear trend of engine vibration, a two-degree-of-freedom nonlinear model of the piston–cylinder system was established and verified to analyse the correspondence between in-cylinder combustion behaviour and engine dynamic responses. By correlating simulation results with measured signals, it is found that the root cause of the nonlinear vibration trend is the coupling effect of in-cylinder pressure and piston inertial force. The time integral of piston lateral force in the interval from combustion top dead centre (TDC) to the subsequent piston slap ultimately determines the trend of liner vibrations. These key findings lay the foundation for the vibration analysis of engines fuelled with other alternative fuels, which is important for improving engine operating performance, including reliability assessment and NVH control.
UniDM: A Unified Framework for Data Manipulation with Large Language Models
Designing effective data manipulation methods is a long-standing problem in
data lakes. Traditional methods, which rely on rules or machine learning
models, require extensive human effort for training data collection and model
tuning. Recent methods apply Large Language Models (LLMs) to resolve multiple
data manipulation tasks. They show clear benefits in terms of performance
but still require customized designs to fit each specific task. This is very
costly and cannot keep up with the requirements of big data lake platforms.
In this paper, inspired by the cross-task generality of LLMs on NLP tasks, we
take the first step toward designing an automatic and general solution to tackle
data manipulation tasks. We propose UniDM, a unified framework which
establishes a new paradigm to process data manipulation tasks using LLMs. UniDM
formalizes a number of data manipulation tasks in a unified form and abstracts
three main general steps to solve each task. We develop an automatic context
retrieval to allow the LLMs to retrieve data from data lakes, potentially
containing evidence and factual information. For each step, we design effective
prompts to guide LLMs to produce high-quality results. In our comprehensive
evaluation on a variety of benchmarks, UniDM exhibits great generality and
state-of-the-art performance on a wide variety of data manipulation tasks. Comment: MLSys2
