110 research outputs found

    Hierarchical clustering-based navigation of image search results

    Image search results usually contain multiple topics at the semantic level, and even semantically consistent images have diverse appearances at the visual level. Organizing the results into semantically and visually consistent clusters is therefore necessary to facilitate users' navigation. To address this, this paper presents HiCluster, an effective method for organizing image search results that employs both textual and visual analysis. First, we extract query-related key phrases to enumerate the specific semantics of the given query and group them into semantic clusters using a K-lines-based clustering algorithm. Second, the resulting images corresponding to each key phrase are clustered with the Bregman Bubble Clustering (BBC) algorithm, which partially groups images in the whole set while discarding some scattered noisy ones. Finally, a novel user interface (UI) is designed to provide users with diverse and helpful information based on the hierarchical clustering structure. Experiments on web images demonstrate the effectiveness and potential of the system.

    Designing Wireless Networks for Delay-Sensitive Internet of Things

    Internet of Things (IoT) applications have stringent requirements on wireless network delay, but have to share and compete for limited bandwidth with other wireless traffic. Traditional schemes adopt various QoS-aware traffic scheduling techniques, but fail when the amount of network traffic increases further. In addition, the CSMA with collision avoidance (CSMA/CA) mechanism enables the coexistence of multiple wireless links but avoids concurrent transmissions, yielding severe channel access delay for delay-sensitive traffic when the channel is busy. To address these limitations, we present two novel wireless side channel designs, which operate concurrently with the existing wireless network channel without occupying extra spectrum and are dedicated to real-time traffic. Our key insight for realizing such a side channel is to exploit the excessive SNR margin in the wireless network by encoding data as patterned interference. First, we design such patterned interference in the form of energy erasure over specific subcarriers in OFDM systems. Delay-sensitive messages can be delivered simultaneously along with other traffic from the same transmitter, which reduces the network queuing delay. Furthermore, we propose EasyPass, another side channel design that encodes data in the same OFDM scheme as the main channel, but uses weaker power and narrower frequency bands. By keeping the side channel's transmit power under the main channel's SNR margin, the simultaneous main channel transmission suffers little degradation. EasyPass reduces the channel access delay by providing extra transmission opportunities when the channel is occupied by other links. Last, we present a novel modulation design that transforms the choices of link rate adaptation from discrete to continuous. With minimal extra overhead, it improves the network throughput and therefore reduces the network delay.

    Development and validation of a web-based predictive model for preoperative diagnosis of localized colorectal cancer and colorectal adenoma

    Background: Localized colorectal cancer (LCC) has obscure clinical signs, which are difficult to distinguish from colorectal adenoma (CA). This study aimed to develop and validate a web-based predictive model for preoperative diagnosis of LCC and CA. Methods: We conducted a retrospective study that included data from 500 patients with LCC and 980 patients with CA who were admitted to Dongyang People’s Hospital between November 2012 and June 2022. Patients were randomly divided into the training (n=1036) and validation (n=444) cohorts. Univariate logistic regression, least absolute shrinkage and selection operator regression, and multivariate logistic regression were used to select the variables for predictive models. The area under the curve (AUC), calibration curve, decision curve analysis (DCA), and clinical impact curve (CIC) were used to evaluate the performance of the model. Results: The web-based predictive model was developed, including nine independent risk factors: age, sex, drinking history, white blood cell count, lymphocyte count, red blood cell distribution width, albumin, carcinoembryonic antigen, and fecal occult blood test. The AUC of the prediction model in the training and validation cohorts was 0.910 (0.892–0.929) and 0.894 (0.862–0.925), respectively. The calibration curve showed good consistency between the outcome predicted by the model and the actual diagnosis. DCA and CIC showed that the predictive model had a good clinical application value. Conclusion: This study first developed a web-based preoperative prediction model, which can discriminate LCC from CA and can be used to quantitatively assess the risks and benefits in clinical practice.

    Lithium niobate-enhanced laser photoacoustic spectroscopy

    In this paper, a photoacoustic spectroscopy technique based on lithium niobate crystals is reported for the first time, to our knowledge. A novel dual-cantilever tuning fork structure and new electrodes have been designed using Y-cut 128° blackened lithium niobate wafers. The tuning fork, with a resonant frequency of only 10.46 kHz and a prong gap of 1 mm, is engineered to achieve superior performance in photoacoustic spectroscopy. In the demonstration experiment, acetylene was detected using a 1.53 µm semiconductor laser, achieving a detection limit of about 9 ppb within a one-second integration time. Comment: 8 pages, 4 figures

    Not All Metrics Are Guilty: Improving NLG Evaluation with LLM Paraphrasing

    Most research on natural language generation (NLG) relies on evaluation benchmarks with limited references per sample, which may result in poor correlations with human judgements. The underlying reason is that one semantic meaning can be expressed in different forms, and evaluation with a single or few references may not accurately reflect the quality of the model's hypotheses. To address this issue, this paper presents a novel method, named Para-Ref, to enhance existing evaluation benchmarks by enriching the number of references. We leverage large language models (LLMs) to paraphrase a single reference into multiple high-quality ones in diverse expressions. Experimental results on representative NLG tasks of machine translation, text summarization, and image captioning demonstrate that our method can effectively improve the correlation with human evaluation for sixteen automatic evaluation metrics by +7.82% in ratio. We release the code and data at https://github.com/RUCAIBox/Para-Ref
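
    The single-reference problem described in this abstract can be illustrated with a minimal sketch. It is not the Para-Ref code: the unigram F1 metric, the max-over-references aggregation, and the hand-written paraphrases (standing in for LLM output) are all assumptions chosen only to show why adding references can raise the score of a valid hypothesis.

```python
from collections import Counter
from typing import Sequence


def token_f1(hypothesis: str, reference: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized strings (toy metric)."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not hyp or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def multi_reference_score(hypothesis: str, references: Sequence[str]) -> float:
    """Score against every reference and keep the best match (assumed aggregation)."""
    return max((token_f1(hypothesis, r) for r in references), default=0.0)


original = "the cat sat on the mat"
# Hypothetical paraphrases; in the paper these would come from an LLM.
paraphrases = ["a cat was sitting on the rug", "the cat is on the mat"]
hypothesis = "the cat is on the mat"

single = multi_reference_score(hypothesis, [original])
enriched = multi_reference_score(hypothesis, [original] + paraphrases)
print(f"single reference: {single:.3f}, enriched references: {enriched:.3f}")
```

    With max aggregation the enriched score can never fall below the single-reference score, so extra references only help hypotheses that match some valid phrasing.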

    The impact of water and technology on the stock market in China and Japan based on a new heuristic model

    Introduction: Water, as an essential strategic resource, is diminishing; this has been framed as a financial risk. We aim to quantitatively investigate the impact of water use and technology on the stock market and compare the differences between China and Japan, which represent emerging and mature markets, respectively. Method: We constructed three models using the difference generalized method of moments. The first and second models focused on how water use could influence stock market volatility and returns; the third model added technology as an interaction to explore its impact on the above mechanism. We used an ARIMA-EGARCH model to predict the trend of marginal stock market return with an increase in industrial water use in the next 5 years. Results and Discussion: The results show that 1) water use increases the stock market volatility in both countries, but Japan shows a greater increase than China; 2) water use has a negative impact on stock market returns in China and a positive impact in Japan; 3) technology plays a positive role in the second model, while the ARIMA-EGARCH results correspond to the first two conclusions, which verifies the reasonability of the models. We conclude that heterogeneity exists in the two different market types because of technology level.

    Toward a Team of AI-made Scientists for Scientific Discovery from Gene Expression Data

    Machine learning has emerged as a powerful tool for scientific discovery, enabling researchers to extract meaningful insights from complex datasets. For instance, it has facilitated the identification of disease-predictive genes from gene expression data, significantly advancing healthcare. However, the traditional process for analyzing such datasets demands substantial human effort and expertise for the data selection, processing, and analysis. To address this challenge, we introduce a novel framework, a Team of AI-made Scientists (TAIS), designed to streamline the scientific discovery pipeline. TAIS comprises simulated roles, including a project manager, data engineer, and domain expert, each represented by a Large Language Model (LLM). These roles collaborate to replicate the tasks typically performed by data scientists, with a specific focus on identifying disease-predictive genes. Furthermore, we have curated a benchmark dataset to assess TAIS's effectiveness in gene identification, demonstrating our system's potential to significantly enhance the efficiency and scope of scientific exploration. Our findings represent a solid step towards automating scientific discovery through large language models. Comment: 18 pages, 2 figures; added contac

    Engineering human ventricular heart muscles based on a highly efficient system for purification of human pluripotent stem cell-derived ventricular cardiomyocytes

    Background: Most infarctions occur in the left anterior descending coronary artery and cause myocardium damage of the left ventricle. Although current pluripotent stem cells (PSCs) and directed cardiac differentiation techniques are able to generate fetal-like human cardiomyocytes, isolation of pure ventricular cardiomyocytes has been challenging. For repairing ventricular damage, we aimed to establish a highly efficient purification system to obtain homogeneous ventricular cardiomyocytes and prepare engineered human ventricular heart muscles in a dish. Methods: The purification system used TALEN-mediated genomic editing techniques to insert the neomycin or EGFP selection marker directly after the myosin light chain 2 (MYL2) locus in human pluripotent stem cells. Purified early ventricular cardiomyocytes were estimated by immunofluorescence, fluorescence-activated cell sorting, quantitative PCR, microelectrode array, and patch clamp. In subsequent experiments, the mixture of mature MYL2-positive ventricular cardiomyocytes and mesenchymal cells was cocultured with decellularized natural heart matrix. Histological and electrophysiology analyses of the formed tissues were performed 2 weeks later. Results: Human ventricular cardiomyocytes were efficiently isolated based on the purification system using G418 or flow cytometry selection. When combined with the decellularized natural heart matrix as the scaffold, functional human ventricular heart muscles were prepared in a dish. Conclusions: These engineered human ventricular muscles can be great tools for regenerative therapy of human ventricular damage as well as drug screening and ventricular-specific disease modeling in the future. Electronic supplementary material: The online version of this article (doi:10.1186/s13287-017-0651-x) contains supplementary material, which is available to authorized users.

    Comprehensive Bibliometric Analysis of the Kynurenine Pathway in Mood Disorders: Focus on Gut Microbiota Research

    Background: Emerging evidence implicates the dysregulated kynurenine pathway (KP), an immune-inflammatory pathway, in the pathophysiology of mood disorders (MD), including depression and bipolar disorder characterized by a low-grade chronic pro-inflammatory state. The metabolites of the KP, an important part of the microbiota-gut-brain axis, serve as immune system modulators linking the gut microbiota (GM) with the host central nervous system. Aim: This bibliometric analysis aimed to provide a first glimpse into the KP in MD, with a focus on GM research in this field, to guide future research and promote the development of this field. Methods: Publications relating to the KP in MD between the years 2000 and 2020 were retrieved from the Scopus and Web of Science Core Collection (WoSCC), and analyzed in CiteSpace (5.7 R5W), biblioshiny (using R-Studio), and VOSviewer (1.6.16). Results: In total, 1,064 and 948 documents were extracted from the Scopus and WoSCC databases, respectively. The publications have shown rapid growth since 2006, partly owing to the largest research hotspot appearing since then, “quinolinic acid.” All the top five most relevant journals were in the neuropsychiatry field, such as Brain Behavior and Immunity. The United States and Innsbruck Medical University were the most influential country and institute, respectively. Journal co-citation analysis showed a strong tendency toward co-citation of research in the psychiatry field. Reference co-citation analysis revealed that the top four most important research focuses were “kynurenine pathway,” “psychoneuroimmunology,” “indoleamine 2,3-dioxygenase,” and “proinflammatory cytokines,” and the most recent focus was “gut-brain axis,” thus indicating the role of the KP in bridging the GM and the host immune system, and together reflecting the field’s research foundations. Overlap analysis between the thematic map of keywords and the keyword burst analysis revealed that the topics “Alzheimer’s disease,” “prefrontal cortex,” and “acid,” were research frontiers. Conclusion: This comprehensive bibliometric study provides an updated perspective on research associated with the KP in MD, with a focus on the current status of GM research in this field. This perspective may benefit researchers in choosing suitable journals and collaborators, and aid in the further understanding of the field’s hotspots and frontiers, thus facilitating future research.

    Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

    We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs). Unlike prior work that relies on seed examples or existing datasets to construct instruction tuning data, GLAN exclusively utilizes a pre-curated taxonomy of human knowledge and capabilities as input and generates large-scale synthetic instruction data across all disciplines. Specifically, inspired by the systematic structure of the human education system, we build the taxonomy by decomposing human knowledge and capabilities into various fields, sub-fields and, ultimately, distinct disciplines semi-automatically, facilitated by LLMs. Subsequently, we generate a comprehensive list of subjects for every discipline and proceed to design a syllabus tailored to each subject, again utilizing LLMs. With the fine-grained key concepts detailed in every class session of the syllabus, we are able to generate diverse instructions with broad coverage across the entire spectrum of human knowledge and skills. Extensive experiments on large language models (e.g., Mistral) demonstrate that GLAN excels in multiple dimensions, from mathematical reasoning, coding, academic exams, and logical reasoning to general instruction following, without using task-specific training data for these tasks. In addition, GLAN allows for easy customization, and new fields or skills can be added by simply incorporating a new node into our taxonomy. Comment: Work in progress