10 research outputs found

    CT-BERT: Learning Better Tabular Representations Through Cross-Table Pre-training

    Tabular data -- also known as structured data -- is one of the most common data forms in existence, thanks to the stable development and scaled deployment of database systems in the last few decades. At present, however, despite the boom brought by large pre-trained models such as ChatGPT or SAM in other domains, how to extract common knowledge across tables at a scale that may eventually lead to generalizable representations for tabular data remains an open question. There have been a few works on this topic, but most (if not all) of them are limited to a single table or a fixed schema. In this work, we first identify the crucial research challenges behind tabular data pre-training, particularly in the cross-table scenario. The contribution of this work is twofold: (i) we collect and curate nearly 2k high-quality tabular datasets, each of which is guaranteed to possess clear semantics, clean labels, and other necessary meta information; (ii) we propose a novel framework for cross-table pre-training, dubbed CT-BERT. Notably, as a pioneering effort in scaled cross-table training, CT-BERT is fully compatible with both supervised and self-supervised schemes, where the specific instantiation of CT-BERT depends on the downstream task. We further propose and implement contrastive-learning-based and masked table modeling (MTM) objectives in CT-BERT, inspired by the computer vision and natural language processing communities but carefully tailored to tables. Extensive empirical results on 15 datasets demonstrate CT-BERT's state-of-the-art performance, where both its supervised and self-supervised setups significantly outperform prior approaches.

    TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT

    Tables are prevalent in real-world databases, requiring significant time and effort for humans to analyze and manipulate. The advancements in large language models (LLMs) have made it possible to interact with tables using natural language input, bringing this capability closer to reality. In this paper, we present TableGPT, a unified fine-tuned framework that enables LLMs to understand and operate on tables using external functional commands. It introduces the capability to seamlessly interact with tables, enabling a wide range of functionalities such as question answering, data manipulation (e.g., insert, delete, query, and modify operations), data visualization, analysis report generation, and automated prediction. TableGPT aims to provide convenience and accessibility to users by empowering them to effortlessly leverage tabular data. At the core of TableGPT lies the novel concept of global tabular representations, which empowers LLMs to gain a comprehensive understanding of the entire table beyond meta-information. By jointly training LLMs on both table and text modalities, TableGPT achieves a deep understanding of tabular data and the ability to perform complex operations on tables through chain-of-command instructions. Importantly, TableGPT offers the advantage of being a self-contained system rather than relying on external API interfaces. Moreover, it supports efficient data process flow, query rejection (when appropriate) and private deployment, enabling faster domain data fine-tuning and ensuring data privacy, which enhances the framework's adaptability to specific use cases.

    Numerical Study on Seismic Performance of Buckling-Restrained Braced Double-Pier RC Bridge with Bolted Gusset Connections

    Buckling-restrained braces (BRBs) have been widely employed in buildings and bridges due to their excellent ductility and energy dissipation capabilities. However, under a severe earthquake, the traditional welded gusset connection introduces considerable frame action at the beam–column–brace joint. To reduce this negative frame action effect, three alternative bolted gusset connections were developed in this study. A buckling-restrained braced double-pier RC bridge (BRB-RCB) model was constructed in ABAQUS and calibrated against existing experimental tests. Parameter analyses were conducted to investigate the effects of various connection types on the seismic performance of the BRB-RCBs. It was found that the proposed bolted gusset connection effectively released the constraints at the gusset-to-frame interfaces, resulting in a low stress level at the panel zone. The BRB-RCB with the well-designed bolted connection exhibited excellent seismic performance even when subjected to a lateral drift of 3%.

    Gas-solid-liquid reactive CFD simulation of an industrial RFCC riser with investigation of feed injection

    The feed injection zone, which involves gas-solid-liquid mixing, vaporization and reaction, plays an important role in fluid catalytic cracking reactors. Our previous cold-model simulation indicated that a modification from the conventional upward feed injection scheme to a downward one improves the oil-catalyst matching. This work aims to investigate the effects of such downward feed injection on the reaction behaviors through three-phase reactive simulation of an industrial riser. Twelve-lump kinetics and vaporization of liquid oil were considered in a multiphase Eulerian model with the energy minimization multi-scale model. The predicted solids flux, temperature, and solids concentration were compared with experimental data. The simulation showed that downward injection at an angle of around 30 degrees and a suitable mounting position helps improve the yield of gasoline and LPG and mitigate adhesion of the coking layer to the wall. We expect such an approach to shed light on the optimization of three-phase chemical reactors. (c) 2021 Elsevier Ltd. All rights reserved.