5,333 research outputs found

    Fast Machine Learning Method with Vector Embedding on Orthonormal Basis and Spectral Transform

    This paper presents a novel fast machine learning method that leverages two techniques: Vector Embedding on Orthonormal Basis (VEOB) and Spectral Transform (ST). VEOB converts the original data encoding into a vector embedding whose coordinates are projections onto orthonormal bases. Singular Value Decomposition (SVD) is used to calculate the basis vectors and projection coordinates, which improves distance measurement in the embedding space and enables data compression by keeping only the projection vectors associated with the largest singular values. ST, in turn, transforms a sequence of vector data into spectral space. By applying the Discrete Cosine Transform (DCT) and selecting the most significant components, it streamlines the handling of lengthy vector sequences. The paper provides examples of word embedding, text chunk embedding, and image embedding, implemented in the Julia language with a vector database. It also investigates unsupervised and supervised learning using this method, along with strategies for handling large data volumes. Comment: update 9. Strategies for managing large data volumes with 9.1. Using incremental SV
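
    The spectral-transform step described in this abstract can be sketched as follows. This is a minimal illustration in plain Python, not the paper's Julia implementation: the function names (`dct2`, `idct2`, `keep_largest`, `compress_sequence`) are hypothetical, and reading "most significant components" as "largest-magnitude coefficients" is an assumption.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a real-valued sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            s += c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
        out.append(s)
    return out


def keep_largest(c, m):
    """Zero all but the m coefficients of largest magnitude (assumed selection rule)."""
    order = sorted(range(len(c)), key=lambda k: abs(c[k]), reverse=True)
    keep = set(order[:m])
    return [c[k] if k in keep else 0.0 for k in range(len(c))]


def compress_sequence(vectors, m):
    """Transform each coordinate along the sequence axis and keep m coefficients."""
    dims = range(len(vectors[0]))
    return [keep_largest(dct2([v[d] for v in vectors]), m) for d in dims]
```

    Calling `idct2` on a truncated coefficient list gives an approximation of the original coordinate sequence; keeping every coefficient reproduces it up to floating-point error.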

    A Comparative Study on Deep Learning Models for Text Classification of Unstructured Medical Notes with Various Levels of Class Imbalance

    Background: Discharge medical notes written by physicians contain important information about the health condition of patients. Many deep learning algorithms have been successfully applied to extract important information from unstructured medical notes data that can entail subsequent actionable results in the medical domain. This study aims to explore the model performance of various deep learning algorithms in text classification tasks on medical notes with respect to different disease class imbalance scenarios. Methods: In this study, we employed seven artificial intelligence models: a CNN (Convolutional Neural Network), a Transformer encoder, a pre-trained BERT (Bidirectional Encoder Representations from Transformers), and four typical sequence neural network models, namely RNN (Recurrent Neural Network), GRU (Gated Recurrent Unit), LSTM (Long Short-Term Memory), and Bi-LSTM (Bi-directional Long Short-Term Memory), to classify the presence or absence of 16 disease conditions from patients’ discharge summary notes. We analyzed this question as a composition of 16 separate binary classification problems. The performance of the seven models on each of the 16 datasets, with various levels of imbalance between classes, was compared in terms of AUC-ROC (Area Under the Curve of the Receiver Operating Characteristic), AUC-PR (Area Under the Curve of Precision and Recall), F1 Score, and Balanced Accuracy, as well as training time. Model performance was also compared in combination with different word embedding approaches (GloVe, BioWordVec, and no pre-trained word embeddings). Results: The analyses of these 16 binary classification problems showed that the Transformer encoder model performs the best in nearly all scenarios. In addition, when the disease prevalence is close to or greater than 50%, the Convolutional Neural Network model achieved performance comparable to the Transformer encoder, and its training time was 17.6% shorter than the second fastest model, 91.3% shorter than the Transformer encoder, and 94.7% shorter than the pre-trained BERT-Base model. The BioWordVec embeddings slightly improved the performance of the Bi-LSTM model in most disease prevalence scenarios, while the CNN model performed better without pre-trained word embeddings. In addition, the training time was significantly reduced with the GloVe embeddings for all models. Conclusions: For classification tasks on medical notes, Transformer encoders are the best choice if computational resources are not a constraint. Otherwise, when the classes are relatively balanced, CNNs are a leading candidate because of their competitive performance and computational efficiency
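
    Two of the metrics named in this abstract, F1 Score and Balanced Accuracy, can be sketched in plain Python to show why plain accuracy is misleading under class imbalance. The function names and the toy labels below are hypothetical and are not taken from the study.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall; defined as 0.0 when undefined."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the positive class and recall on the negative class."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Toy case with 10% prevalence: a classifier that always predicts "absent".
y_true = [1] + [0] * 9
y_pred = [0] * 10
scores = (accuracy(y_true, y_pred),
          f1_score(y_true, y_pred),
          balanced_accuracy(y_true, y_pred))
```

    In this toy case the always-negative classifier reaches 0.9 accuracy but only 0.5 balanced accuracy and an F1 score of 0.0, which is why imbalance-aware metrics matter when disease prevalence is low.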

    Evaluation of Network Reliability for Computer Networks with Multiple Sources

    Evaluating the reliability of a network with multiple sources and multiple sinks is a critical issue from the perspective of quality management. Because previous literature defines the paths of network models unrealistically, existing models are not appropriate for real-world computer networks such as the Taiwan Advanced Research and Education Network (TWAREN). This paper proposes a modified stochastic-flow network model to evaluate the network reliability of a practical computer network with multiple sources, where data is transmitted through several light paths (LPs). Network reliability is defined as the probability of delivering a specified amount of data from the sources to the sink, and it is taken as a performance index to measure the service level of TWAREN. This paper studies the network reliability of the international portion of TWAREN from two sources (Taipei and Hsinchu) to one sink (New York), which runs over submarine and land surface cables between Taiwan and the United States
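
    The reliability definition in this abstract (the probability of delivering a specified amount of data) can be illustrated with a brute-force sketch. It is a simplification and not the paper's model: it assumes each light path independently connects a source to the sink and has a discrete capacity distribution, and the function name `reliability` and all numbers are hypothetical.

```python
from itertools import product


def reliability(light_paths, demand):
    """Probability that the total capacity of all light paths meets the demand.

    light_paths: one list per light path of (capacity, probability) pairs,
    with the probabilities of each list summing to 1. Light paths are assumed
    to be independent and to feed the same sink (simplifying assumptions).
    """
    total = 0.0
    # Enumerate every joint capacity state of the light paths.
    for state in product(*light_paths):
        capacity = sum(c for c, _ in state)
        prob = 1.0
        for _, p in state:
            prob *= p
        if capacity >= demand:
            total += prob
    return total


# Hypothetical example: two light paths, each either down (0) or up (1 unit).
lps = [[(0, 0.1), (1, 0.9)], [(0, 0.1), (1, 0.9)]]
r_one_unit = reliability(lps, 1)   # at least one path must be up
r_two_units = reliability(lps, 2)  # both paths must be up
```

    Full enumeration grows exponentially with the number of light paths, so it only suits very small examples like this one.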

    Effect of Radius on Load/Strain Distribution between Ulna and Radius: Experimental and Numerical Analyses

    Computational Infrastructure and Informatics Poster Session. It has been hypothesized that osteocytes are stimulated by the local strain distribution within bone subjected to mechanical loading. This collaborative research project between bone biologists and mechanical engineers is attempting to identify local strain fields around osteocytes that can account for their behavior in response to loading. Using CT images, we have built a finite element model of the mouse forearm and conducted an extensive study with it. Our model incorporates many components of forearm anatomy not previously included in these models, such as the radius and marrow cavities. The results of the current research will shed light on how bone perceives mechanical load and the pathway whereby a physical load is transduced into a biochemical signal that eventually results in new bone formation. The study will help in developing new treatments for bone diseases such as osteoporosis

    Conceptualizing a Knowledge Society in China: A Ubiquitous Network Perspective

    Developing Ubiquitous Network Societies (UNS) has been a subject of investigation in the last decade. Several policy and technological projects have been proposed and implemented at the global level to promote ubiquitous networks. This paper focuses on China’s preparation for UNS by analyzing and evaluating the prerequisite technological developments that enable the construction of UNS. The objective of this paper is to identify the notable features of UNS in the context of China. As this is a nascent area of study, our research approach takes a technological perspective