67 research outputs found

    Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification

    Social media creates massive amounts of multimedia content with paired images and text every day, which makes it pressing to automate vision and language understanding for various multimodal classification tasks. Compared to the commonly researched visual-lingual data, social media posts tend to exhibit more implicit image-text relations. To better bridge the cross-modal semantics therein, we capture hinting features from user comments, which are retrieved by jointly leveraging visual and lingual similarity. The classification tasks are then explored via self-training in a teacher-student framework, motivated by the usually limited scale of labeled data in existing benchmarks. Substantial experiments are conducted on four multimodal social media benchmarks for image-text relation classification, sarcasm detection, sentiment classification, and hate speech detection. The results show that our method further advances the performance of previous state-of-the-art models, which do not employ comment modeling or self-training. Comment: accepted to EMNLP 202

    Simple algorithm for judging equivalence of differential-algebraic equation systems

    Mathematical formulas play a prominent role in science, technology, engineering, and mathematics (STEM) documents; understanding STEM documents usually requires knowing the difference between equation groups containing multiple equations. When two equation groups can be transformed into the same form, we call the equation groups equivalent. Existing tools cannot judge the equivalence of two equation groups; thus, we develop an algorithm to judge such an equivalence using a computer algebra system. The proposed algorithm first eliminates variables that appear in only one of the two equation groups. It then checks the equivalence of the equations one by one: equations with identical algebraic solutions for the same variable are judged equivalent. If each equation in one equation group is equivalent to an equation in the other, the equation groups are judged equivalent; otherwise, they are judged non-equivalent. We generated 50 pairs of equation groups for evaluation. The proposed method accurately judged the equivalence of all pairs. This method is expected to facilitate comprehension of a large amount of mathematical information in STEM documents. Furthermore, this is a necessary step for machines to understand equations, including process models
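
    The one-by-one check described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes linear equations only, uses exact fractions instead of a computer algebra system, and omits the variable-elimination step. The dictionary encoding, the `CONST` key, and all function names are hypothetical, and the group check is applied in both directions, which is a stricter choice made here.

```python
from fractions import Fraction

CONST = "1"  # hypothetical key that holds the constant term of an equation


def variables(equation):
    """Names of the variables with a non-zero coefficient."""
    return {name for name, coef in equation.items() if name != CONST and coef != 0}


def solve_for(equation, variable):
    """Solve a linear equation, read as sum(coef * name) = 0, for `variable`.

    The result maps every other name to its coefficient in the solution.
    """
    pivot = Fraction(equation[variable])
    return {
        name: -Fraction(coef) / pivot
        for name, coef in equation.items()
        if name != variable and coef != 0
    }


def equations_equivalent(eq_a, eq_b):
    """Two equations match if they give the same solution for a shared variable."""
    shared = variables(eq_a) & variables(eq_b)
    if not shared:
        return False
    variable = min(shared)  # any shared variable works; min() keeps it deterministic
    return solve_for(eq_a, variable) == solve_for(eq_b, variable)


def groups_equivalent(group_a, group_b):
    """Every equation in each group must match some equation in the other group."""
    def covered(source, target):
        return all(any(equations_equivalent(e, f) for f in target) for e in source)
    return covered(group_a, group_b) and covered(group_b, group_a)


# 2x + 4y - 6 = 0 and x + 2y - 3 = 0 describe the same relation.
first = {"x": 2, "y": 4, CONST: -6}
second = {"x": 1, "y": 2, CONST: -3}
print(equations_equivalent(first, second))
```

    Solving both equations for the same variable gives a canonical form, so scaled copies of an equation compare equal while equations with different coefficients or different variable sets do not.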

    Topic-Guided Self-Introduction Generation for Social Media Users

    Millions of users are active on social media. To allow users to better showcase themselves and network with others, we explore the auto-generation of social media self-introduction, a short sentence outlining a user's personal interests. While most prior work profiles users with tags (e.g., ages), we investigate sentence-level self-introductions to provide a more natural and engaging way for users to know each other. Here we exploit a user's tweeting history to generate their self-introduction. The task is non-trivial because the history content may be lengthy, noisy, and exhibit various personal interests. To address this challenge, we propose a novel unified topic-guided encoder-decoder (UTGED) framework; it models latent topics to reflect salient user interest, whose topic mixture then guides encoding a user's history and topic words control decoding their self-introduction. For experiments, we collect a large-scale Twitter dataset, and extensive results show the superiority of our UTGED to the advanced encoder-decoder models without topic modeling

    Causal relationship between gut microbiota and immune thrombocytopenia: a Mendelian randomization study of two samples

    Background: Some observational studies have shown that immune thrombocytopenia (ITP) is highly associated with alterations in the composition of the gut microbiota. However, the causal effect of gut microbiota on ITP has not yet been determined. Methods: Based on accessible summary statistics of the genome-wide union, the latent connection between ITP and gut microbiota was estimated using bi-directional Mendelian randomization (MR) and multivariable MR (MVMR) analyses. Inverse variance weighted (IVW), weighted median, and MR-Egger regression methods were used to examine the causal correlation between ITP and the gut microbiota. Several sensitivity analyses verified the MR results. The strength of causal relationships was evaluated using the MR-Steiger test. MVMR analysis was undertaken to test the independent causal effect. Reverse-direction MR analyses were performed to exclude the possibility of reverse correlations. Finally, GO enrichment analyses were carried out to explore the biological functions. Results: After FDR adjustment, two microbial taxa were identified as causally associated with ITP (PFDR < 0.10), namely Alcaligenaceae (PFDR = 7.31 × 10^-2) and Methanobacteriaceae (PFDR = 7.31 × 10^-2). In addition, eight microbial taxa were considered potentially causal features at nominal significance (P < 0.05): Actinobacteria, Lachnospiraceae, Methanobacteria, Bacillales, Methanobacteriales, Coprococcus2, Gordonibacter, and Veillonella. According to the reverse-direction MR findings, the gut microbiota was not significantly affected by ITP. There was no discernible horizontal pleiotropy or instrument heterogeneity. Finally, GO enrichment analyses showed how the identified microbial taxa participate in ITP through their underlying biological mechanisms. Conclusion: Several microbial taxa were found to be causally linked to ITP in this MR investigation. The findings improve our understanding of the role of the gut microbiome in the risk of ITP
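
    As a rough illustration of the inverse variance weighted (IVW) method named in this abstract, the sketch below computes the standard fixed-effect IVW estimate from per-variant summary statistics. It is not the study's pipeline: the function name and the input numbers are hypothetical, and the sketch assumes uncorrelated instruments and ignores uncertainty in the variant-exposure effects.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect and its standard error.

    Each variant contributes the ratio beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    std_error = math.sqrt(1.0 / total_weight)
    return estimate, std_error


# Hypothetical summary statistics for three genetic variants (not from the study).
beta_x = [0.10, 0.20, 0.40]   # variant-exposure effects
beta_y = [0.05, 0.10, 0.20]   # variant-outcome effects
se_y = [0.01, 0.02, 0.02]     # standard errors of the variant-outcome effects

estimate, std_error = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW estimate: {estimate:.3f} (SE {std_error:.3f})")
```

    Variants with strong exposure effects and precise outcome estimates receive the largest weights, which is why the method is sensitive to pleiotropic instruments and is usually reported alongside weighted median and MR-Egger results.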

    Microalgae-mediated tandem culture of shrimp and bivalve: an environmental and health co-benefits solution for phosphorus recovery and emission reduction

    Phosphorus (P) accumulation in aquaculture systems is damaging our environment beyond acceptable levels. Devising strategies to potentially recover P from aquaculture systems in a reusable bioresource form is paramount and aligns with circular economy policies. In this study, we constructed two culture models, monoculture (Mon) and tandem culture (Tan), using Exopalaemon carinicauda and Mercenaria mercenaria. By monitoring the performance of rearing organisms, P dynamic patterns, and pollutant emissions, we found that: i) Compared to the Mon system, the Tan system demonstrated no differences in the performance of E. carinicauda and M. mercenaria, suggesting that the Tan model was viable in terms of fishery yield; ii) P in the Tan system could be efficiently recovered and removed from water and sediment, as indicated by the lower phosphate concentration in water (0.01 mg L−1), and the decrease in labile P in surface sediment (from 0.04 to 0.02 mg L−1). A combination of assimilatory and dissimilatory processes, mediated by phototrophic (bait-microalgae) and heterotrophic organisms (bivalves), appeared to be the primary mechanism for P utilization and removal; iii) The Tan system reduced pollutant emissions four times lower than the Mon system due to its minimal tailwater discharge (10%, 230 L). The emissions of total P, phosphate, total organic carbon, ammonium, and chemical oxygen demand from the Tan systems were 19 mg m−2 d−1, 2 mg m−2 d−1, 2 g m−2 d−1, 38 mg m−2 d−1, and 11 g m−2 d−1, respectively, 1.3, 1.7, 1.4, 1.3, and 1.2 times lower than those from the Mon systems. The eco-friendly Tan culture model fully exploited the resources of pond culture, a solution with environmental and health co-benefits for P recovery and emission reduction

    CCKAR is a biomarker for prognosis and asynchronous brain metastasis of non-small cell lung cancer

    Background: Non-small cell lung cancer (NSCLC) is the most common histological type of lung cancer, and brain metastasis (BM) is the most lethal complication of NSCLC. The predictive biomarkers and risk factors of asynchronous BM are still unknown. Materials and methods: A total of 203 patients with NSCLC were enrolled in our cohort and followed up. Clinicopathological factors such as tumor size, T stage, lymphatic invasion, metastasis, and asynchronous BM were investigated. CCKAR expression in NSCLC and resected BM was assessed by IHC, and CCKAR mRNA levels in NSCLC and para-tumor tissues were estimated by qRT-PCR. The correlations between CCKAR expression, BM, and other clinicopathological factors were assessed by chi-square test, and the prognostic significance of CCKAR was estimated by univariate and multivariate analyses. Results: CCKAR was highly expressed in NSCLC tissues compared with para-tumor tissues. CCKAR expression in NSCLC was significantly associated with asynchronous BM. The BM percentages for NSCLC patients with low and high CCKAR were, surprisingly, 5.2% and 66.6%, respectively. CCKAR expression and BM were both factors predicting an unfavorable outcome of NSCLC. Moreover, CCKAR expression in NSCLC was an independent risk factor for asynchronous BM. Conclusions: CCKAR is a prognostic biomarker of NSCLC. CCKAR expression in NSCLC is positively associated with asynchronous BM, and is a risk factor for asynchronous BM from NSCLC

    Single well control area splitting method based on reservoir physical properties and gas well productivity differences

    Determining the control area of a single well is a prerequisite for evaluating the reserves of that well. Current methods for calculating the control area of a single well fall into three main groups: empirical formulas, the area balancing method, and physical models. To address the differing limitations of existing single-well control area splitting methods and their large errors in use, this paper proposes an algorithm that splits the single-well control area based on gas reservoir physical properties and differences in single-well productivity. The splitting results are combined with static reservoir parameters, and the volumetric method is used to calculate single-well reserves and evaluate the gas reservoir. This further clarifies the distribution of the original and remaining gas in the reservoir, in support of the subsequent reasonable development of the gas reservoir and to enhance oil recovery. Block S of the Sulige gas field is taken as an example: the geological reserves of the block are calculated as 354.75 × 10^8 m^3; compared with the basic proven reserves of Block S, 364.84 × 10^8 m^3, the error is 2.61% and the reliability is strong

    Distributed random load balancing

    Low latency is highly desirable for cloud services spanning thousands of servers. With the rapid development of the cloud market, the size of server farms is growing fast. Hence, stringent timing requirements are needed for task scheduling in a large-scale server farm. Conventionally, the Join-the-Shortest-Queue (JSQ) algorithm, which directs an arriving task to the least loaded server, is adopted in scheduling. Despite its excellent delay performance, JSQ is throughput-limited, and thus doesn't scale well with the number of servers. Two distributed algorithms have been proposed as "approximations" of the idealized JSQ. The first is the Power-of-d-choices (Pod) algorithm, which selects d servers at random and routes a task to the least loaded of those d servers. Despite its scalability, Pod suffers from long tail response times. The second is the distributed Join-the-Idle-Queue (JIQ) algorithm, which takes advantage of idle servers for task scheduling. In this thesis, we are interested in exploring Pod and JIQ further. First, a hybrid scheduling strategy called Pod-Helper is proposed. It consists of a Pod scheduler and a throughput-limited helper. Hybrid scheduling takes the best of both worlds, enjoying scalability and low tail response times. In particular, hybrid scheduling has a bounded maximum queue size in the large-system regime, in sharp contrast to Pod scheduling, whose maximum queue size is unbounded. Second, we conduct an in-depth analysis of distributed JIQ, a promising new approximation of an idealized task-scheduling algorithm. In particular, we derive semi-closed form expressions for the delay performance of distributed JIQ. Third, we propose a new variant of distributed JIQ that offers clear advantages over alternative algorithms for large systems. Applied Science, Faculty of; Engineering, School of (Okanagan); Graduate
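
    The two dispatch rules defined in this abstract, JSQ and Pod, can be sketched as follows. This is only an illustration, not the thesis's model: the function names, the discrete-time simulation loop, and its parameters are hypothetical, and each server is assumed to complete at most one task per step. JIQ and the Pod-Helper hybrid are not shown because the abstract does not specify them in enough detail.

```python
import random


def jsq(queue_lengths):
    """Join-the-Shortest-Queue: index of the least loaded server overall."""
    return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)


def power_of_d(queue_lengths, d, rng=random):
    """Power-of-d-choices: sample d distinct servers, pick the least loaded one."""
    candidates = rng.sample(range(len(queue_lengths)), d)
    return min(candidates, key=queue_lengths.__getitem__)


def simulate(choose_server, servers, steps, arrivals_per_step):
    """Toy discrete-time farm; returns the largest queue length observed."""
    queues = [0] * servers
    longest = 0
    for _ in range(steps):
        for _ in range(arrivals_per_step):
            queues[choose_server(queues)] += 1
        longest = max(longest, max(queues))
        # Assumed service model: every busy server finishes one task per step.
        queues = [max(0, q - 1) for q in queues]
    return longest


rng = random.Random(0)  # fixed seed so repeated runs behave the same
print("JSQ  :", simulate(jsq, servers=50, steps=200, arrivals_per_step=45))
print("Po2  :", simulate(lambda q: power_of_d(q, 2, rng), 50, 200, 45))
print("Po1  :", simulate(lambda q: power_of_d(q, 1, rng), 50, 200, 45))
```

    JSQ inspects every queue for each arrival, which is the scalability cost the abstract refers to, while Pod inspects only d queues. Setting d to the number of servers makes Pod choose a least loaded server, as JSQ does, and d = 1 reduces it to purely random routing.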