273 research outputs found

    Sequential optimization for efficient high-quality object proposal generation

    We are motivated by the need for a generic object proposal generation algorithm that achieves a good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING++, which inherits the good computational efficiency of BING [1] but significantly improves its proposal localization quality. At a high level, we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which BING++ improves localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially. We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction. We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets. Empirically, BING++ runs at half the speed of BING on CPU but significantly improves localization quality, by 18.5 and 16.7 percent on the VOC2007 and Microsoft COCO datasets, respectively. Compared with other state-of-the-art approaches, BING++ achieves comparable performance but runs significantly faster.

    ATP: Adaptive Tensor Parallelism for Foundation Models

    Foundation models have impressive performance and generalization capabilities across a wide range of applications. The increasing size of the models introduces great challenges for training. Tensor parallelism is a critical technique that is currently used in almost all foundation model training and has a significant impact on overall training performance. However, current tensor parallelism in machine learning frameworks misses optimization opportunities in fitting various interconnection topologies. In this work, we present ATP, an adaptive tensor parallelism framework for foundation models, which can automatically select the optimal parallel strategy on different interconnections. We propose column- and row-first tensor parallelism based on 2D device meshes and construct a search space. Combined with the hierarchical communication matrix, ATP can identify the optimal strategy in the search space. We also propose chunk-based overlapping to reduce communication overhead. Our evaluations show ATP consistently outperforms the state-of-the-art approaches for various model sizes and interconnects, achieving end-to-end training performance improvements of up to 37-64% on specific interconnects. Based on our theoretical model, the communication overhead of ATP decreases with scaling, indicating a qualitative leap forward.
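    The search-space idea in the ATP abstract can be illustrated with a minimal sketch. It assumes the space is simply every 2D mesh factorization of the device count paired with a column-first or row-first layout. The function names and the caller-supplied `cost_fn` are hypothetical and stand in for the paper's hierarchical communication matrix, which the abstract does not specify.

```python
from itertools import product


def mesh_shapes(n_devices):
    """Return every 2D mesh (d1, d2) with d1 * d2 == n_devices."""
    return [(d1, n_devices // d1)
            for d1 in range(1, n_devices + 1)
            if n_devices % d1 == 0]


def search_space(n_devices):
    """Pair each mesh shape with a column-first or row-first layout.

    Assumption: the candidate set is the plain cross product; the real
    framework may prune or extend it.
    """
    orders = ("column-first", "row-first")
    return list(product(mesh_shapes(n_devices), orders))


def pick_strategy(n_devices, cost_fn):
    """Return the candidate with the lowest cost under cost_fn(shape, order).

    cost_fn is a hypothetical placeholder for a communication cost model.
    """
    return min(search_space(n_devices), key=lambda s: cost_fn(*s))


if __name__ == "__main__":
    # Toy cost that prefers near-square meshes; purely illustrative.
    toy = lambda shape, order: abs(shape[0] - shape[1]) + (order == "row-first")
    print(pick_strategy(8, toy))
```

    Keeping the cost model as a parameter separates enumerating candidates from scoring them, so a measured bandwidth table could be swapped in without touching the enumeration.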

    Tin Nanoparticles Encapsulated Carbon Nanoboxes as High-Performance Anode for Lithium-Ion Batteries

    One of the crucial challenges in applying Sn as an anode for lithium-ion batteries (LIBs) is the dramatic volume change during the lithiation/delithiation process, which causes rapid capacity fading and deteriorated battery performance. To address this issue, we report the design and fabrication of Sn-encapsulated carbon nanoboxes (denoted as Sn@C) with yolk@shell architectures. In this design, the carbon shell facilitates good transport kinetics, whereas the hollow space between the Sn and the carbon shell accommodates the volume variation during repeated charge/discharge. Accordingly, this composite electrode exhibits a high reversible capacity of 675 mAh g−1 at a current density of 0.8 A g−1 after 500 cycles and retains as much as 366 mAh g−1 at a higher current density of 3 A g−1 even after 930 cycles. The enhanced electrochemical performance can be ascribed to the crystal size reduction of the Sn cores and the formation of a polymeric gel-like layer outside the electrode surface after long-term cycling, resulting in improved capacity and enhanced rate performance.

    Moderating New Waves of Online Hate with Chain-of-Thought Reasoning in Large Language Models

    Online hate is an escalating problem that negatively impacts the lives of Internet users, and is also subject to rapid changes due to evolving events, resulting in new waves of online hate that pose a critical threat. Detecting and mitigating these new waves present two key challenges: it demands reasoning-based complex decision-making to determine the presence of hateful content, and the limited availability of training samples hinders updating the detection model. To address this critical issue, we present a novel framework called HATEGUARD for effectively moderating new waves of online hate. HATEGUARD employs a reasoning-based approach that leverages the recently introduced chain-of-thought (CoT) prompting technique, harnessing the capabilities of large language models (LLMs). HATEGUARD further achieves prompt-based zero-shot detection by automatically generating and updating detection prompts with new derogatory terms and targets in new wave samples to effectively address new waves of online hate. To demonstrate the effectiveness of our approach, we compile a new dataset consisting of tweets related to three recently witnessed new waves: the 2022 Russian invasion of Ukraine, the 2021 insurrection of the US Capitol, and the COVID-19 pandemic. Our studies reveal crucial longitudinal patterns in these new waves concerning the evolution of events and the pressing need for techniques to rapidly update existing moderation tools to counteract them. Comparative evaluations against state-of-the-art tools illustrate the superiority of our framework, showcasing a substantial 22.22% to 83.33% improvement in detecting the three new waves of online hate. Our work highlights the severe threat posed by the emergence of new waves of online hate and represents a paradigm shift in addressing this threat practically. Comment: To Appear in the 45th IEEE Symposium on Security and Privacy, May 20-23, 202

    Charging and discharging in thermal energy storage unit with fin-stone hybrid structure for enhancing heat transfer of phase change materials

    This work proposes a fin-stone hybrid structure integrating fins (popular thermal enhancers) and natural stones (widely used sensible heat storage media) to enhance the heat transfer of phase change materials for on-site thermal energy storage applications, with advantages of low cost, environmental friendliness, and easy accessibility. 3D numerical models of charging and discharging in shell-and-tube heat storage units with various configurations, including fins, the fin-stone hybrid structure, stones, and no heat transfer enhancement, were constructed, and the performance evaluation and comparison were carried out. Compared to fins, fin-stone hybrid structures with 20 mm-, 30 mm-, and 40 mm-sized stones shorten the charging time by 67%, 54%, and 56%, and the discharging time by 73%, 60%, and 46%, respectively. Small stones provide better heat transfer enhancement, which is attributed to their small volume, large surface area, and contact with the tube and fins. The advantage of the fin-stone hybrid structure over stones alone, i.e. the shortening of phase change time, is more significant in charging than in discharging, as both heat conduction and natural convection are enhanced. Moreover, the hybrid structure exhibits satisfactory temperature stability with a 48.9 °C temperature change in charging and 37.2 °C in discharging, each lower than that of the fins, which helps stabilise the heat transfer fluid outlet temperature. The yearly supplied energy of the hybrid structure with 20 mm-sized stones is 121% and 72% more than that of fins and stones, respectively.

    ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

    High-resolution Large Multimodal Models (LMMs) encounter the challenges of excessive visual tokens and quadratic visual complexity. Current high-resolution LMMs address the quadratic complexity while still generating excessive visual tokens. However, the redundancy in visual tokens is the key problem as it leads to more substantial compute. To mitigate this issue, we propose ConvLLaVA, which employs ConvNeXt, a hierarchical backbone, as the visual encoder of LMM to replace Vision Transformer (ViT). ConvLLaVA compresses high-resolution images into information-rich visual features, effectively preventing the generation of excessive visual tokens. To enhance the capabilities of ConvLLaVA, we propose two critical optimizations. Since the low-resolution pretrained ConvNeXt underperforms when directly applied on high resolution, we update it to bridge the gap. Moreover, since ConvNeXt's original compression ratio is inadequate for much higher resolution inputs, we train a successive stage to further compress the visual tokens, thereby reducing redundancy. These optimizations enable ConvLLaVA to support inputs of 1536x1536 resolution generating only 576 visual tokens, capable of handling images of arbitrary aspect ratios. Experimental results demonstrate that our method achieves competitive performance with state-of-the-art models on mainstream benchmarks. The ConvLLaVA model series are publicly available at https://github.com/alibaba/conv-llava. Comment: 17 pages
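    The token figures quoted in the ConvLLaVA abstract can be checked with a little arithmetic. The sketch below assumes a square input and one visual token per cell of the final feature map; the helper names are hypothetical, and the 32x stride used for comparison is an assumed value for a four-stage hierarchical backbone, not a number taken from the abstract.

```python
import math


def visual_tokens(resolution, stride):
    """Tokens produced when a square image is downsampled by `stride`."""
    side = resolution // stride
    return side * side


def implied_stride(resolution, tokens):
    """Overall downsampling factor implied by a resolution and token count.

    Assumes the tokens form a square grid that divides the resolution evenly.
    """
    side = math.isqrt(tokens)
    if side * side != tokens or resolution % side != 0:
        raise ValueError("tokens do not form an evenly dividing square grid")
    return resolution // side


if __name__ == "__main__":
    # Figures from the abstract: 1536x1536 input, 576 visual tokens.
    print(implied_stride(1536, 576))   # overall downsampling factor
    # Assumed 32x stride for comparison (not stated in the abstract).
    print(visual_tokens(1536, 32))
```

    Under these assumptions, 576 tokens correspond to a 24x24 grid, i.e. an overall 64x reduction per side, whereas a 32x stride at the same resolution would yield four times as many tokens.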

    Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL

    Large Language Models (LLMs) play a crucial role in capturing structured semantics to enhance language understanding, improve interpretability, and reduce bias. Nevertheless, an ongoing controversy exists over the extent to which LLMs can grasp structured semantics. To assess this, we propose using Semantic Role Labeling (SRL) as a fundamental task to explore LLMs' ability to extract structured semantics. In our assessment, we employ the prompting approach, which leads to the creation of our few-shot SRL parser, called PromptSRL. PromptSRL enables LLMs to map natural languages to explicit semantic structures, which provides an interpretable window into the properties of LLMs. We find interesting potential: LLMs can indeed capture semantic structures, and scaling-up doesn't always mirror potential. Additionally, limitations of LLMs are observed in C-arguments, etc. Lastly, we are surprised to discover that significant overlap in the errors is made by both LLMs and untrained humans, accounting for almost 30% of all errors. Comment: Accepted by ICIC 202

    Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

    The rapid advancement of large language models (LLMs) has paved the way for the development of highly capable autonomous agents. However, existing multi-agent frameworks often struggle with integrating diverse capable third-party agents due to reliance on agents defined within their own ecosystems. They also face challenges in simulating distributed environments, as most frameworks are limited to single-device setups. Furthermore, these frameworks often rely on hard-coded communication pipelines, limiting their adaptability to dynamic task requirements. Inspired by the concept of the Internet, we propose the Internet of Agents (IoA), a novel framework that addresses these limitations by providing a flexible and scalable platform for LLM-based multi-agent collaboration. IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control. Through extensive experiments on general assistant tasks, embodied AI tasks, and retrieval-augmented generation benchmarks, we demonstrate that IoA consistently outperforms state-of-the-art baselines, showcasing its ability to facilitate effective collaboration among heterogeneous agents. IoA represents a step towards linking diverse agents in an Internet-like environment, where agents can seamlessly collaborate to achieve greater intelligence and capabilities. Our codebase has been released at https://github.com/OpenBMB/IoA. Comment: work in progress

    BING: Binarized normed gradients for objectness estimation at 300fps

    Training a generic objectness measure to produce object proposals has recently become of significant interest. We observe that generic objects with well-defined closed boundaries can be detected by looking at the norm of gradients, with a suitable resizing of their corresponding image windows to a small fixed size. Based on this observation and computational reasons, we propose to resize the window to 8 × 8 and use the norm of the gradients as a simple 64D feature to describe it, for explicitly training a generic objectness measure. We further show how the binarized version of this feature, namely binarized normed gradients (BING), can be used for efficient objectness estimation, which requires only a few atomic operations (e.g., add, bitwise shift, etc.). To improve localization quality of the proposals while maintaining efficiency, we propose a novel fast segmentation method and demonstrate its effectiveness for improving BING’s localization performance, when used in multithresholding straddling expansion (MTSE) postprocessing. On the challenging PASCAL VOC2007 dataset, using 1000 proposals per image and an intersection-over-union threshold of 0.5, our proposal method achieves a 95.6% object detection rate and 78.6% mean average best overlap in less than 0.005 second per image.
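    The 64D feature described in the BING abstract can be sketched in a few lines. This is a simplified illustration rather than the authors' implementation: it assumes nearest-neighbour resizing, a clamped central-difference gradient, the norm min(|gx| + |gy|, 255), and a binarization that keeps only the top bits of each value. All function names and the `n_bits` parameter are hypothetical.

```python
def resize_nearest(window, size=8):
    """Resize a 2D grayscale window (list of rows) to size x size."""
    h, w = len(window), len(window[0])
    return [[window[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def normed_gradient(img):
    """Per-pixel gradient norm min(|gx| + |gy|, 255) with clamped borders."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            gx = img[r][min(c + 1, w - 1)] - img[r][max(c - 1, 0)]
            gy = img[min(r + 1, h - 1)][c] - img[max(r - 1, 0)][c]
            row.append(min(abs(gx) + abs(gy), 255))
        out.append(row)
    return out


def ng_feature(window, size=8):
    """Flatten the normed gradients of the resized window into 64 values."""
    ng = normed_gradient(resize_nearest(window, size))
    return [v for row in ng for v in row]


def binarize_top_bits(feature, n_bits=4):
    """Approximate each 8-bit value by its top n_bits (assumed scheme)."""
    shift = 8 - n_bits
    return [(v >> shift) << shift for v in feature]


if __name__ == "__main__":
    # 16x16 window with a vertical edge: left half dark, right half bright.
    window = [[0] * 8 + [200] * 8 for _ in range(16)]
    feature = ng_feature(window)
    print(len(feature), feature[:8])
    print(binarize_top_bits(feature)[:8])
```

    Working on an 8 × 8 grid keeps the feature at 64 values regardless of the original window size, and truncating to a few bits is what lets scoring reduce to cheap integer operations such as add and bitwise shift.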
