44 research outputs found

    Overcoming Overconfidence for Active Learning

    It is not an exaggeration to say that recent progress in artificial intelligence depends on large-scale, high-quality data. At the same time, the budget for data labeling is almost always constrained. Active learning is a prominent approach to this problem: a model selects the data most valuable to label, and the newly labeled data is used to update the model iteratively. However, because only a limited amount of data is available in each iteration, the model is vulnerable to bias and is therefore more likely to yield overconfident predictions. In this paper, we present two novel methods to address the overconfidence that arises in the active learning scenario. The first is an augmentation strategy named Cross-Mix-and-Mix (CMaM), which aims to calibrate the model by expanding the limited training distribution. The second is a selection strategy named Ranked Margin Sampling (RankedMS), which prevents choosing data that leads to overly confident predictions. Through various experiments and analyses, we demonstrate that our proposals facilitate efficient data selection by alleviating overconfidence, even though they are readily applicable.

    Comparison of genetic variations between high- and low-risk Listeria monocytogenes isolates using whole-genome de novo sequencing

    In this study, genetic variations and characteristics of Listeria monocytogenes isolates from enoki mushrooms (23), smoked ducks (7), and processed ground meat products (30) were examined with respect to hemolysis, virulence genes, growth patterns, and heat resistance. The isolates that showed the highest and the lowest pathogenicity were subjected to whole-genome sequencing, and the sequences were further analyzed to identify genetic variations in virulence, low-temperature growth-related, and heat resistance-related factors. All isolates exhibited β-hemolysis and carried the virulence genes actA, hlyA, inlA, inlB, and plcB. At low temperatures, isolates with high growth (L. monocytogenes strains SMFM 201803 SD 1-1, SMFM 201803 SD 4-2, and SMFM 201804 SD 5-3) and low growth (L. monocytogenes strains SMFM 2019-FV43, SMFM 2019-FV42, and SMFM 2020-BT30) were selected. Among them, L. monocytogenes SMFM 201804 SD 5-3 showed the highest resistance at 60°C and 70°C. The strains SMFM 201804 SD 5-3 (high-risk) and SMFM 2019-FV43 (low-risk) harbored 45 virulence genes; 41 single nucleotide variants (SNVs) were identified between these two isolates. A comparison of 26 genes related to low-temperature growth revealed 18 SNVs between these two isolates; a comparison of the 21 genes related to heat resistance revealed 16 SNVs. These results indicate that the differences in the pathogenicity of L. monocytogenes SMFM 201804 SD 5-3 and L. monocytogenes SMFM 2019-FV43 are associated with the SNVs identified in virulence genes, low-temperature growth-related genes, and heat resistance-related genes.

    iCSDB: an integrated database of CRISPR screens.

    High-throughput screening based on CRISPR-Cas9 libraries has become an attractive and powerful technique to identify target genes for functional studies. However, accessibility of public data is limited due to the lack of user-friendly utilities and up-to-date resources covering experiments from third parties. Here, we describe iCSDB, an integrated database of CRISPR screening experiments using human cell lines. We compiled two major sources of CRISPR-Cas9 screening data: the DepMap portal and BioGRID ORCS. The DepMap portal is itself an integrated database that includes three large-scale CRISPR screening projects. We additionally aggregated CRISPR screens from BioGRID ORCS, which is a collection of screening results from PubMed articles. Currently, iCSDB contains 1375 genome-wide screens across 976 human cell lines, covering 28 tissues and 70 cancer types. Importantly, the batch effects from different CRISPR libraries were removed and the screening scores were converted into a single metric to estimate the knockout efficiency. Clinical and molecular information was also integrated to help users readily select cell lines of interest. Furthermore, we have implemented various interactive tools and viewers to help users choose, examine, and compare the screen results at both the gene and guide RNA levels. iCSDB is available at https://www.kobic.re.kr/icsdb/

    Genome-scale CRISPR screening identifies cell cycle and protein ubiquitination processes as druggable targets for erlotinib-resistant lung cancer.

    Erlotinib is highly effective in lung cancer patients with epidermal growth factor receptor (EGFR) mutations. However, despite initial favorable responses, most patients rapidly develop resistance to erlotinib after the initial treatment. This study aims to identify new genes and pathways associated with erlotinib resistance mechanisms in order to develop novel therapeutic strategies. Here, we induced knockout (KO) mutations in erlotinib-resistant human lung cancer cells (NCI-H820) using a genome-scale CRISPR-Cas9 sgRNA library to screen for genes involved in erlotinib susceptibility. The spectrum of sgRNAs incorporated among erlotinib-treated cells was substantially different from that of the untreated cells. Gene set analyses showed a significant depletion of 'cell cycle process' and 'protein ubiquitination pathway' genes among erlotinib-treated cells. Chemical inhibitors targeting genes in these two pathways, such as nutlin-3 and carfilzomib, increased cancer cell death when combined with erlotinib in both in vitro cell line and in vivo patient-derived xenograft experiments. Therefore, we propose that targeting cell cycle processes or protein ubiquitination pathways is a promising treatment strategy for overcoming resistance to EGFR inhibitors in lung cancer.

    Monocular Depth Estimation from a Single Infrared Image

    Thermal infrared imaging is attracting much attention due to its robustness to illuminance variation. However, because of the spectral difference between thermal infrared images and RGB images, existing research on self-supervised monocular depth estimation has performance limitations. Therefore, in this study, we propose a novel Self-Guided Framework using a Pseudolabel predicted from RGB images. Our proposed framework, which solves the problem of appearance matching loss in the existing framework, transfers the high accuracy of the Pseudolabel to the thermal depth estimation network by comparing low- and high-level pixels. Furthermore, we propose a Patch-NetVLAD Loss, which strengthens local detail and global context information in the depth map from thermal infrared imaging by comparing locally global patch-level descriptors. Finally, we introduce an Image Matching Loss to estimate a more accurate depth map in a thermal depth network by enhancing the performance of the Pseudolabel. We demonstrate that the proposed framework shows significant performance improvement even when applied to various depth networks on the KAIST Multispectral Dataset.

    Analysis of the Functions of Backchannels and Fillers in Japanese and Korean: with Focus on Japanese "Hai" and Korean "Ne"

    This paper analyzes the functions of the Japanese "Hai" and the Korean "Ne" used both as backchannels and as fillers. The findings are as follows: (i) the "Hai" and the "Ne" used as backchannels function as markers expressing both the listener's "understanding/accepting" of the speaker's utterance and the listener's "agreeing" with the speaker's utterance, while the "Hai" and the "Ne" used as fillers share the functions that express "discoursive border", "listener's agreement with the speaker's utterance", "listener's understanding of the speaker's utterance" and "speaker's understanding of her own utterance"; (ii) the "Hai" used as a filler has the function of expressing the beginning of an utterance, but the "Ne" used as a filler does not have such a function. From the above, it is recognized that the "Hai" and the "Ne" used as both backchannels and fillers share the function of expressing an "agreement". This "agreement" is divided into two classes. When used as backchannels, they express the "listener's" agreement with the spoken content, while when used as fillers, they can also express the "speaker's" agreement with her own utterance. On the other hand, their function of expressing "discoursive border" does not relate to any of their functions as backchannels.

    How Japanese Speakers Express Disagreement (2) (日本人の不同意表明の仕方(2))

    1. Introduction / 2. Research Method / 3. Main Results of Part One / 4. Analysis / 5. Discussion / 6. Conclusion / 7. Future Work
    This study examines the act of disagreement in Japanese conversations based on Brown and Levinson's (1987) factors of power and rating. Fifteen instances of disagreement were collected from a corpus of formal conversation. The results show that in Japanese, disagreements are left unstated towards those of higher status and expressed overtly towards those of equal status, such as close friends. They also show that in Japanese, disagreements are expressed using various linguistic strategies in order to avoid direct disagreement. However, in formal instances, the Japanese express disagreement more politely than in informal instances.

    Instance-Aware Plant Disease Detection by Utilizing Saliency Map and Self-Supervised Pre-Training

    Plant disease detection is essential for optimizing agricultural productivity and crop quality. With the recent advent of deep learning and large-scale plant disease datasets, many studies have shown high performance of supervised learning-based plant disease detectors. However, these studies still have two limitations. First, labeling cost and class imbalance remain challenging for supervised learning-based methods. Second, plant disease datasets are either unstructured or weakly structured, and the shapes of leaves and of the diseased areas on them are variable, which makes plant disease detection even more challenging. To overcome these limitations, we propose an instance-aware unsupervised plant disease detector, which leverages normalizing flows, a visual saliency map, and positional encodings. The proposed model combines these methods explicitly in a novel way, with a focus on reducing background noise. In addition, to better fit the model to the plant disease detection domain and to enhance feature representation, a feature extractor is pre-trained in a self-supervised manner using only unlabeled data. Our extensive experiments show that the proposed approach achieves state-of-the-art performance on widely used datasets, such as BRACOL (weakly structured) and PlantVillage (unstructured), regardless of whether the dataset is weakly structured or unstructured.

    A Basic Study of Backchannels (Aizuchi) in Japanese and Korean Discourse Styles (日韓談話スタイルにおける「あいづち」の基礎的研究)

    0. Introduction / 1. Overview of Previous Research / 2. Observations / 3. Discussion / 4. Conclusion / References
    This paper aims to investigate the similarities and differences between the Japanese backchannel and the Korean backchannel, based on data from interview programs that each consist of one host, one hostess, and one guest. The results are as follows: ① With respect to frequency, both in Japanese and in Korean, the forms belonging to the Yes-system appear most frequently. ② With respect to the context where the backchannels in question appear, in Japanese the forms belonging to the Yes-system, A-system, and Un-system appear after the final particle Ne, while in Korean most of the forms appear after Sɔ. ③ The differences between the forms of the Yes-system, the A-system, and the Un-system which take "the floor" and the same forms which do not take "the floor" are as follows: in Japanese, the forms which take "the floor" appear frequently after the final particle Ne, while in Korean the forms which take "the floor" appear after Sɔ. With regard to the forms which do not take "the floor", both in Japanese and in Korean, they appear frequently after the interrogative particle Ka. This suggests that the same forms function differently depending on whether the floor is present or not.