13 research outputs found

Concept Design and Analysis of a Novel Steamer-Filling Robot

Authors: Bin Li, Enhong Xing, Xinhua Zhao
Publication venue: Hindawi Limited
Publication date: 01/01/2017

Steamer-filling is a crucially important step in the liquor-making process, directly related to liquor yield and liquor quality. So far, however, this step is still dominated by manual operation. In view of the working environment and labor shortages in this industry, a novel dedicated steamer-filling robot is proposed in this paper. First, the steamer-filling operation process is described, and the structure and functions of the robot are introduced. Second, the kinematics of the robot, in terms of position analysis and workspace, are analyzed in detail. Third, experimental analyses are carried out to demonstrate the validity and efficiency of the robot system. Finally, conclusions and future development directions are presented.

Crossref

Directory of Open Access Journals

Transcribing Content from Structural Images with Spotlight Mechanism

Authors: Chen Enhong, Hu Guoping, Huang Zhenya, Liu Qi, Xie Xing, Yin Yu, Zhang Fuzheng
Publication venue: Association for Computing Machinery (ACM)
Publication date: 26/05/2019

Transcribing content from structural images, e.g., writing notes from music scores, is a challenging task, as not only must the content objects be recognized, but the internal structure must also be preserved. Existing image recognition methods mainly work on images with simple content (e.g., text lines with characters), but are not capable of identifying ones with more complex content (e.g., structured symbols), which often follow a fine-grained grammar. To this end, in this paper, we propose a hierarchical Spotlight Transcribing Network (STN) framework followed by a two-stage "where-to-what" solution. Specifically, we first decide "where-to-look" through a novel spotlight mechanism to focus on different areas of the original image following its structure. Then, we decide "what-to-write" by developing a GRU-based network with the spotlight areas for transcribing the content accordingly. Moreover, we propose two implementations on the basis of STN, i.e., STNM and STNR, where the spotlight movement follows the Markov property and Recurrent modeling, respectively. We also design a reinforcement method to refine the framework by self-improving the spotlight mechanism. We conduct extensive experiments on many structural image datasets, where the results clearly demonstrate the effectiveness of the STN framework.
Comment: Accepted by KDD2018 Research Track. In proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'18

arXiv.org e-Print Archive

Crossref

Poly[aqua(μ5-2-oxido-4-sulfonatobenzoato)lanthanum(III)]

Authors: Cao, Cheng-Feng Zhu, Enhong Sheng, Sheldrick, Wang, Xing Li
Publication venue: International Union of Crystallography
Publication date: 01/04/2009

The title compound, [La(C7H3O6S)(H2O)]n, forms a three-dimensional framework in which the asymmetric unit contains one La(III) atom, one 5-sulfosalicylate (2-oxido-4-sulfonatobenzoate) ligand and one coordinated water molecule. The La(III) atom is coordinated by nine O atoms from three carboxylate, three sulfonate and two hydroxyl groups, and one water molecule, forming a distorted trigonal-prismatic square-face tricapped geometry.

Crossref

Directory of Open Access Journals

PubMed Central

Cooperative Retriever and Ranker in Deep Recommenders

Authors: Chen Enhong, Chen Jin, Huang Xu, Lian Defu, Liu Zheng, Xie Xing
Publication venue: Association for Computing Machinery (ACM)
Publication date: 29/03/2023

Deep recommender systems (DRS) are intensively applied in modern web services. To deal with massive web content, a DRS employs a two-stage workflow, retrieval and ranking, to generate its recommendation results. The retriever aims to select a small set of relevant candidates from the entire item set with high efficiency, while the ranker, usually more precise but more time-consuming, is supposed to further refine the best items from the retrieved candidates. Traditionally, the two components are trained either independently or within a simple cascading pipeline, which is prone to poor collaboration. Though some recent works have suggested training the retriever and ranker jointly, many severe limitations remain: item distribution shift between training and inference, false negatives, and misalignment of ranking order. As such, effective collaboration between retriever and ranker remains to be explored.
Comment: 12 pages, 4 figures, WWW'2

arXiv.org e-Print Archive

A Survey on Multimodal Large Language Models

Authors: Chen Enhong, Fu Chaoyou, Li Ke, Sun Xing, Xu Tong, Yin Shukang, Zhao Sirui
Publication date: 23/06/2023

Multimodal Large Language Models (MLLMs) have recently become a rising research hotspot; they use powerful Large Language Models (LLMs) as a brain to perform multimodal tasks. The surprising emergent capabilities of MLLMs, such as writing stories based on images and OCR-free math reasoning, are rare in traditional methods, suggesting a potential path to artificial general intelligence. In this paper, we aim to trace and summarize the recent progress of MLLMs. First of all, we present the formulation of MLLM and delineate its related concepts. Then, we discuss the key techniques and applications, including Multimodal Instruction Tuning (M-IT), Multimodal In-Context Learning (M-ICL), Multimodal Chain of Thought (M-CoT), and LLM-Aided Visual Reasoning (LAVR). Finally, we discuss existing challenges and point out promising research directions. In light of the fact that the era of MLLM has only just begun, we will keep updating this survey and hope it can inspire more research. An associated GitHub link collecting the latest papers is available at https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models.
Comment: Project page: https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Model

arXiv.org e-Print Archive

Woodpecker: Hallucination Correction for Multimodal Large Language Models

Authors: Chen Enhong, Fu Chaoyou, Li Ke, Shen Yunhang, Sui Dianbo, Sun Xing, Wang Hao, Xu Tong, Yin Shukang, Zhao Sirui
Publication date: 24/10/2023

Hallucination is a big shadow hanging over the rapidly evolving Multimodal Large Language Models (MLLMs), referring to the phenomenon that the generated text is inconsistent with the image content. In order to mitigate hallucinations, existing studies mainly resort to an instruction-tuning manner that requires retraining the models with specific data. In this paper, we pave a different way, introducing a training-free method named Woodpecker. Like a woodpecker heals trees, it picks out and corrects hallucinations from the generated text. Concretely, Woodpecker consists of five stages: key concept extraction, question formulation, visual knowledge validation, visual claim generation, and hallucination correction. Implemented in a post-remedy manner, Woodpecker can easily serve different MLLMs, while being interpretable by accessing intermediate outputs of the five stages. We evaluate Woodpecker both quantitatively and qualitatively and show the huge potential of this new paradigm. On the POPE benchmark, our method obtains a 30.66%/24.33% improvement in accuracy over the baseline MiniGPT-4/mPLUG-Owl. The source code is released at https://github.com/BradyFU/Woodpecker.
Comment: 16 pages, 7 figures. Code Website: https://github.com/BradyFU/Woodpecke

arXiv.org e-Print Archive

Predicting the Spatio-Temporal Evolution of Chronic Diseases in Population with Human Mobility Data

Authors: Chen Enhong, Mascolo Cecilia, Noulas Anastasios, Wang Yingzi, Xie Xing, Zhou Xiao
Publication date: 05/09/2018

Chronic diseases like cancer and diabetes are major threats to human life. Understanding the distribution and progression of chronic diseases in a population is important in assisting the allocation of medical resources as well as the design of policies in preemptive healthcare. Traditional methods to obtain large-scale indicators of population health, e.g., surveys and statistical analysis, can be costly and time-consuming and often lead to a coarse spatio-temporal picture. In this paper, we leverage a dataset describing the human mobility patterns of citizens in a large metropolitan area. By viewing local human lifestyles, we predict the evolution rate of several chronic diseases at the level of a city neighborhood. We apply a combination of collaborative topic modeling (CTM) and a Gaussian mixture method (GMM) to tackle the data sparsity challenge and achieve robust predictions on health conditions simultaneously. Our method enables the analysis and prediction of disease rate evolution at fine spatio-temporal scales and demonstrates the potential of incorporating datasets from mobile web sources to improve population health monitoring. Evaluations using real-world check-in and chronic disease morbidity datasets in the city of London show that the proposed CTM+GMM model outperforms various baseline methods.

Apollo (Cambridge)

Effects of total saponins from Rhizoma Dioscoreae Nipponicae on expression of vascular endothelial growth factor and angiopoietin-2 and Tie-2 receptors in the synovium of rats with rheumatoid arthritis

Authors: Enhong Xing, Hongru Song, Wenjuan Dong, Xiujun Liang, Yachun Guo
Publication venue: Elsevier BV
Publication date: 01/05/2016

Background: This study aimed to determine the effects of total saponins from Rhizoma Dioscoreae Nipponicae (TS-RDN) on the expression of vascular endothelial growth factor (VEGF) and angiopoietin (Ang)-2 and Tie-2 (endothelial tyrosine kinase receptor) receptors in the synovium of rats with rheumatoid arthritis (RA) (collagen-induced arthritis; CIA), and to examine the mechanisms of TS-RDN in alleviating RA.
Methods: The CIA rat model was established and the animals were randomly divided into control, CIA model, TS-RDN, diosgenin, and tripterygium groups. Fluorescent polymerase chain reaction was performed to detect VEGF expression in the rat knee joint synovium. Additionally, immunohistochemical assay was used to detect protein expression of Ang-2 and Tie-2 in the rat knee joint synovium.
Results: Expression of VEGF, Ang-2, and Tie-2 in the model group was significantly higher than in the control group (p < 0.01). After TS-RDN, tripterygium and diosgenin treatment, VEGF and Ang-2 expression was lower than in the model group (p < 0.01). However, Tie-2 expression showed no significant difference. The effects of TS-RDN on VEGF expression were more marked than those of tripterygium and diosgenin (p < 0.01).
Conclusion: TS-RDN might reduce the expression of VEGF, Ang-2, and Tie-2 in the synovium, thus inhibiting synovial angiogenesis and playing a therapeutic role in RA.

Elsevier - Publisher Connector

Directory of Open Access Journals

A novelty-seeking based dining recommender system

Authors: Chen Enhong, Xie Xing, Yuan Nicholas Jing, Zhang Fuzheng, Zheng Kai, Zhou Xiaofang
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2015

The rapid growth of location-based services provides the potential to understand people's mobility patterns at an unprecedented level, which can also enable the food-service industry to accurately predict consumers' dining behavior. In this paper, by leveraging users' historical dining patterns, socio-demographic characteristics and restaurants' attributes, we aim at generating the top-K restaurants for a user's next dining. Compared to previous studies in location prediction, which mainly focus on regular mobility patterns, we present a novelty-seeking based dining recommender system, termed NDRS, in consideration of both exploration and exploitation. First, we apply a Conditional Random Field (CRF) with additional constraints to infer users' novelty-seeking statuses by considering both spatial-temporal-historical features and users' socio-demographic characteristics. On the one hand, when a user is predicted to be novelty-seeking, by incorporating the influence of restaurants' contextual factors such as price and service quality, we propose a context-aware collaborative filtering method to recommend restaurants she has never visited before. On the other hand, when a user is predicted to be not novelty-seeking, we present a Hidden Markov Model (HMM) considering the temporal regularity to recommend previously visited restaurants. To evaluate the performance of each component as well as the whole system, we conduct extensive experiments with a large dataset we have collected, covering the relevant dining-related check-ins, users' demographics, and restaurants' attributes. The results reveal that our system is effective for dining recommendation.

University of Queensland eSpace