
    Multi-dimensional data refining strategy for effective fine-tuning LLMs

    Data is a cornerstone for fine-tuning large language models, yet acquiring suitable data remains challenging. Challenges include data scarcity, linguistic diversity, and domain-specific content. This paper presents lessons learned while crawling and refining data tailored for fine-tuning Vietnamese language models. Crafting such a dataset, while accounting for linguistic intricacies and striking a balance between inclusivity and accuracy, demands meticulous planning. Our paper presents a multidimensional strategy that includes leveraging existing English-language datasets and developing customized data-crawling scripts with the assistance of generative AI tools. A fine-tuned LLM for the Vietnamese language, produced using the resulting datasets, demonstrated good performance when generating Vietnamese news articles from prompts. The study offers practical solutions and guidance for future fine-tuning of models in languages like Vietnamese.

    AI-assisted Learning for Electronic Engineering Courses in High Education

    This study evaluates the efficacy of ChatGPT as an AI teaching and learning support tool in an integrated circuit systems course at a higher education institution in an Asian country. ChatGPT was given various question types, and its responses were assessed to gain insights for further investigation. The objective is to assess ChatGPT's ability to provide insights, personalized support, and interactive learning experiences in engineering education. The study includes the evaluation and reflection of different stakeholders: students, lecturers, and engineers. The findings shed light on the benefits and limitations of ChatGPT as an AI tool, paving the way for innovative learning approaches in technical disciplines. Furthermore, the study contributes to our understanding of how digital transformation is likely to unfold in the education sector.

    PARAMETRIC INFORMATION BOTTLENECK TO OPTIMIZE STOCHASTIC NEURAL NETWORKS

    Department of Computer Science and Engineering. In this thesis, we present layer-wise learning of Stochastic Neural Networks (SNNs) from an information-theoretic perspective. In each layer of an SNN, the compression and the relevance are defined to quantify the amount of information that the layer contains about the input space and the target space, respectively. We jointly optimize the compression and the relevance of all parameters in an SNN to better exploit the neural network's representation. Previously, the Information Bottleneck (IB) framework [1] was used to extract relevant information for a target variable. Here, we propose the Parametric Information Bottleneck (PIB) for a neural network, which uses only the model parameters explicitly to approximate the compression and the relevance. We show that the PIB framework can be considered an extension of the Maximum Likelihood Estimate (MLE) principle to every layer. We also show that, compared to the MLE principle, PIB (i) improves the generalization of neural networks in classification tasks, (ii) generates better samples in multi-modal prediction, and (iii) exploits a neural network's representation more efficiently by pushing it closer to the optimal information-theoretic representation faster. Our PIB framework therefore shows great potential, from an information-theoretic perspective, for exploiting the representative power of neural networks that has not yet been fully utilized.

    A Cosine Similarity-based Method for Out-of-Distribution Detection

    The ability to detect out-of-distribution (OOD) data is a crucial aspect of practical machine learning applications. In this work, we show that cosine similarity between the test feature and the typical in-distribution (ID) feature is a good indicator of OOD data. We propose Class Typical Matching (CTM), a post hoc OOD detection algorithm that uses a cosine similarity scoring function. Extensive experiments on multiple benchmarks show that CTM outperforms existing post hoc OOD detection methods. Comment: Accepted paper at ICML 2023 Workshop on Spurious Correlations, Invariance, and Stability. 10 pages (4 main + appendix).
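    The scoring idea in this abstract can be illustrated with a minimal sketch. It is not the authors' CTM implementation: the abstract does not say how the "typical" ID feature is obtained, so the sketch assumes it is the per-class mean of training features, and the names `class_prototypes`, `id_score`, `is_ood`, and the `threshold` parameter are hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def class_prototypes(features, labels):
    """ASSUMPTION: a class's 'typical' feature is the mean of its training features."""
    grouped = defaultdict(list)
    for x, y in zip(features, labels):
        grouped[y].append(x)
    return {
        y: [sum(col) / len(xs) for col in zip(*xs)]
        for y, xs in grouped.items()
    }


def id_score(x, prototypes):
    """Highest cosine similarity to any class prototype; higher looks more ID."""
    return max(cosine(x, p) for p in prototypes.values())


def is_ood(x, prototypes, threshold):
    """Flag x as OOD when its best match falls below a user-chosen threshold."""
    return id_score(x, prototypes) < threshold
```

    The score is computed post hoc from features of an already trained network, so no retraining is needed; the threshold would normally be calibrated on held-out ID data.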

    Interdisciplinary education in the context of protection of water resources: A case study in Vietnam

    The incorporation of interdisciplinary education, a topic of significant global interest, is increasingly being recognized as a key aspect of educational innovation in Vietnam. This recognition extends to several fields, including STEM (Science, Technology, Engineering, and Mathematics) education. This research aims to design and implement a STEM situation associated with the context of water protection in Vietnam for 10th-grade students, in which students mobilize their knowledge of Physics (specific gravity, Archimedes' principle) and Mathematics (volume) to design a salinometer, a device that measures the salinity of water. The research methodology is based on the observed increase in saline levels in the coastal regions of Vietnam in recent years, which has had a substantial impact on agriculture and the livelihoods of millions of people. This methodology aims to provide realistic scenarios for students to address and resolve these problems. Forty 10th-grade students participated in a teaching situation conducted in five phases. The results showed that the situation helped students strengthen and connect their physics and mathematics knowledge, create a vibrant learning atmosphere, enhance communication, and develop problem-solving competency. Furthermore, the teaching situation needs to be revised regarding the measurement practices of Vietnamese students. The situation contributes to raising students' awareness of current events, the protection of Vietnamese water resources, and the importance of sustainable development. In addition, the same teaching process can be used to develop other STEM teaching situations.

    Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks

    We study the statistical theory of offline reinforcement learning (RL) with deep ReLU network function approximation. We analyze a variant of the fitted-Q iteration (FQI) algorithm under a new dynamic condition that we call Besov dynamic closure, which encompasses the conditions from prior analyses for deep neural network function approximation. Under Besov dynamic closure, we prove that the FQI-type algorithm enjoys a sample complexity of $\tilde{\mathcal{O}}\left( \kappa^{1 + d/\alpha} \cdot \epsilon^{-2 - 2d/\alpha} \right)$, where $\kappa$ is a distribution shift measure, $d$ is the dimensionality of the state-action space, $\alpha$ is the (possibly fractional) smoothness parameter of the underlying MDP, and $\epsilon$ is a user-specified precision. This is an improvement over the sample complexity of $\tilde{\mathcal{O}}\left( K \cdot \kappa^{2 + d/\alpha} \cdot \epsilon^{-2 - d/\alpha} \right)$ in the prior result [Yang et al., 2019], where $K$ is an algorithmic iteration number that is arbitrarily large in practice. Importantly, our sample complexity is obtained under the new general dynamic condition and a data-dependent structure, where the latter is either ignored in prior algorithms or improperly handled by prior analyses. This is the first comprehensive analysis for offline RL with deep ReLU network function approximation under a general setting. Comment: A short version published in the ICML Workshop on Reinforcement Learning Theory, 202
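    For readers unfamiliar with fitted-Q iteration, the following is a minimal sketch of the generic algorithm on a fixed offline dataset. It is not the variant analyzed in the paper: it assumes a tabular function class instead of deep ReLU networks (so the regression step reduces to a per-pair mean of targets), and the function name, the `(s, a, r, s_next, done)` transition format, and the default `gamma` and `iterations` are illustrative choices.

```python
from collections import defaultdict


def fitted_q_iteration(dataset, actions, gamma=0.9, iterations=50):
    """Tabular FQI on offline transitions (s, a, r, s_next, done)."""
    q = defaultdict(float)  # Q_0 = 0 everywhere
    for _ in range(iterations):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for s, a, r, s_next, done in dataset:
            # Bellman target computed from the previous iterate Q_k
            bootstrap = 0.0 if done else max(q[(s_next, b)] for b in actions)
            sums[(s, a)] += r + gamma * bootstrap
            counts[(s, a)] += 1
        # ASSUMPTION: tabular class, so the least-squares fit of Q_{k+1}
        # is the mean target for each observed (state, action) pair
        new_q = defaultdict(float)
        for key in sums:
            new_q[key] = sums[key] / counts[key]
        q = new_q
    return q
```

    State-action pairs that never appear in the dataset keep the default value 0 in this sketch, which is one simple way to see why coverage of the data distribution matters in the offline setting.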

    Predicting Agricultural Commodities Prices with Machine Learning: A Review of Current Research

    Agricultural price prediction is crucial for farmers, policymakers, and other stakeholders in the agricultural sector. However, it is a challenging task due to the complex and dynamic nature of agricultural markets. Machine learning algorithms have the potential to revolutionize agricultural price prediction by improving accuracy, real-time prediction, customization, and integration. This paper reviews recent research on machine learning algorithms for agricultural price prediction. We discuss the importance of agriculture in developing countries and the problems associated with falling crop prices. We then identify the challenges of predicting agricultural prices and highlight how machine learning algorithms can support better prediction. Next, we present a comprehensive analysis of recent research, discussing the strengths and weaknesses of various machine learning techniques. We conclude that machine learning has the potential to revolutionize agricultural price prediction, but further research is essential to address the limitations and challenges associated with this approach.

    Microwave-assisted flow synthesis of multicore iron oxide nanoparticles

    Coprecipitation is by far the most common synthesis method for iron oxide nanoparticles (IONPs). However, reproducibility and scalability remain major challenges, so innovative processes for scalable production of IONPs are highly sought after. Here, we explored the combination of microwave heating with a flow reactor producing IONPs through coprecipitation. The synthesis was initially studied in a well-characterised microwave-heated flow system, enabling the synthesis of multicore IONPs with control over both the single core size and the multicore hydrodynamic diameter. The effect of residence time and microwave power was investigated, enabling the synthesis of multicore nanostructures with hydrodynamic diameters between ∼35 and 70 nm and single core sizes of 3–5 nm. Compared to particles produced under conventional heating, similar single core sizes were observed, though with smaller hydrodynamic diameters. The process consisted of the initial IONP coprecipitation followed by the addition of the stabiliser (citric acid and dextran). The ability to precisely control the stabiliser addition time, which is distinctive of flow reactors, contributed to the synthesis reproducibility. Finally, scale-up by increasing the reactor length and using a different microwave cavity was demonstrated, producing particles of similar structure to those from the small-scale system, with a throughput of 3.3 g/h.