Ball: An R package for detecting distribution difference and association in metric spaces
The rapid development of modern technology has produced
a wealth of complex data that do not satisfy the axioms of
Euclidean geometry, while most statistical hypothesis tests are
available only in Euclidean or Hilbert spaces. To properly analyze data with more
complicated structures, efforts have been made to solve the fundamental testing
problems in more general spaces. In this paper, a publicly available R package
Ball is provided to implement Ball statistical test procedures for K-sample
distribution comparison and tests of mutual independence in metric spaces, which
extend the test procedures for two-sample distribution comparison and the test of
independence. Tailor-made algorithms as well as engineering techniques are
employed in the Ball package to speed up computation as far as
possible. Two real data analyses and several numerical studies have been
performed, and the results demonstrate the power of the Ball package in analyzing
complex data, e.g., spherical data and symmetric positive definite matrix data
Raman piezospectroscopic evaluation of intergrowth ferroelectric polycrystalline ceramic in biaxial bending configuration
The piezospectroscopic (PS) effect was studied in an intergrowth bismuth layer-structure ferroelectric ceramic, Bi₅TiNbWO₁₅, by micro-Raman spectroscopic evaluation. Using a ball-on-ring flexure configuration, a biaxial stress was generated in a Bi₅TiNbWO₁₅ plate-like specimen, and Raman spectra were collected in situ and analyzed under several loading conditions. As the observed spectral line contained signals arising from the whole illuminated in-depth region, the laser probe information was deconvoluted (by means of an in-depth probe response function obtained according to the defocusing method) in order to deduce biaxial PS coefficients for the three Raman bands of Bi₅TiNbWO₁₅ located at 763, 857, and 886 cm⁻¹, respectively. The biaxial PS coefficients of these bands were derived to be −1.74±0.16, −2.51±0.16, and −2.64±0.31 cm⁻¹/GPa, respectively, and should be referred to the c axis of the Bi₅TiNbWO₁₅ crystal
Nonparametric statistical inference via metric distribution function in metric spaces
The distribution function is essential in statistical inference and connected with samples to form a directed closed loop by the correspondence theorem in measure theory and the Glivenko-Cantelli and Donsker properties. This connection creates a paradigm for statistical inference. However, existing distribution functions are defined in Euclidean spaces and are no longer convenient to use in rapidly evolving data objects of complex nature. It is imperative to develop the concept of the distribution function in a more general space to meet emerging needs. Note that the linearity allows us to use hypercubes to define the distribution function in a Euclidean space. Still, without the linearity in a metric space, we must work with the metric to investigate the probability measure. We introduce a class of metric distribution functions through the metric only. We overcome this challenging step by proving the correspondence theorem and the Glivenko-Cantelli theorem for metric distribution functions in metric spaces, laying the foundation for conducting rational statistical inference for metric space-valued data. Then, we develop a homogeneity test and a mutual independence test for non-Euclidean random objects and present comprehensive empirical evidence to support the performance of our proposed methods. Supplementary materials for this article are available online
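The empirical counterpart of such a function can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: it assumes the metric distribution function at a pair (u, v) is the probability of the closed ball centered at u with radius d(u, v), and the names `empirical_mdf` and `great_circle` are hypothetical.

```python
import math


def empirical_mdf(sample, u, v, dist):
    """Fraction of sample points in the closed ball centered at u with radius dist(u, v)."""
    radius = dist(u, v)
    inside = sum(1 for x in sample if dist(u, x) <= radius)
    return inside / len(sample)


def great_circle(p, q):
    """Geodesic distance between two unit vectors on the sphere."""
    dot = sum(a * b for a, b in zip(p, q))
    # Clamp to [-1, 1] to guard against rounding before taking acos.
    return math.acos(max(-1.0, min(1.0, dot)))


# Four points on the unit sphere, viewed from the north pole.
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 0.0)]
north = (0.0, 0.0, 1.0)
# A reference point 45 degrees away from the pole fixes the ball radius.
ref = (math.sin(math.pi / 4), 0.0, math.cos(math.pi / 4))

value = empirical_mdf(points, north, ref, great_circle)
print(value)
```

Only the metric enters the computation, so the same function applies to any data type for which a distance can be supplied.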
MicroRNA Regulation and Tissue-Specific Protein Interaction Network
BACKGROUND: 'Fine-tuning' of protein abundance makes microRNAs (miRNAs) pervasively implicated in human biology. Although targeting many mRNAs gives a single miRNA the power to regulate complex biological processes, its functional roles in a particular tissue will inevitably be restricted because only a subset of its target genes is expressed. METHODS: Here, we analyze the characteristics of miRNA regulation of target genes according to tissue-specific gene expression by constructing tissue-specific protein interaction networks for ten main types of tissues in the human body. RESULTS: Commonly expressed proteins are under more intensive but lower-cost miRNA control than proteins with tissue-specific expression. MiRNAs that target more commonly expressed genes usually regulate more tissue-specific genes. This is consistent with the previous finding that tissue-specific proteins tend to be functionally connected with commonly expressed proteins. For a particular miRNA, however, this balance varies among tissues, implying diverse tissue regulation modes executed by miRNAs. CONCLUSION: These results suggest that miRNAs that interact with more commonly expressed genes can be expected to play important tissue-specific roles
Raman tensor analysis of ultra-high molecular weight polyethylene and its application to study retrieved hip joint components
The angular dependences of the polarized Raman intensity of the A_g, B_1g, B_2g, and B_3g modes have been preliminarily investigated on a model fiber sample of ultra-high molecular weight polyethylene (UHMWPE) in order to retrieve the Raman tensor elements, i.e. the intrinsic parameters governing the vibrational behavior of the orthorhombic structure of polyethylene. Based on this Raman analysis, a method is proposed for determining unknown crystallographic orientation patterns in UHMWPE biomedical components concurrently with the orientation distribution functions for orthorhombic lamellae. An application of the method is shown, in which we quantitatively examined the molecular orientation patterns developed on the surface of four in vivo exposed UHMWPE acetabular cups vs. an unused cup. The main findings were: (i) a clear bimodal distribution of orientation angles was observed on worn surfaces; and (ii) a definite and systematic increase in both molecular orientation and crystallinity in main wear zones vs. non-wear zones was found in all retrieved acetabular cups. The present crystallographic analysis is an extension of our previous Raman studies of UHMWPE acetabular cups related to assessments of oxidation and residual strain and suggests a viable path to track back wear-history information from the surface of UHMWPE, thus unfolding the in vivo kinematics of the bearing surfaces in hip joints on the microscopic scale. (C) 2010 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved
Ball: An R Package for Detecting Distribution Difference and Association in Metric Spaces
The rapid development of modern technology has created many complex datasets in non-linear spaces, while most of the statistical hypothesis tests are only available in Euclidean or Hilbert spaces. To properly analyze the data with more complicated structures, efforts have been made to solve the fundamental test problems in more general spaces (Lyons 2013; Pan, Tian, Wang, and Zhang 2018; Pan, Wang, Zhang, Zhu, and Zhu 2020). In this paper, we introduce a publicly available R package Ball for the comparison of multiple distributions and the test of mutual independence in metric spaces, which extends the test procedures for the equality of two distributions (Pan et al. 2018) and the independence of two random objects (Pan et al. 2020). The Ball package is computationally efficient since several novel algorithms as well as engineering techniques are employed in speeding up the ball test procedures. Two real data analyses and diverse numerical studies have been performed, and the results certify that the Ball package can detect various distribution differences and complicated dependencies in complex datasets, e.g., directional data and symmetric positive definite matrix data
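To make the two-sample case concrete, the following Python sketch runs a permutation test built on a sample ball divergence. It is an illustration under stated assumptions, not the Ball package's implementation or API: the statistic follows the two-sample definition of Pan et al. (2018) as we understand it (compare the fractions of each sample that fall in closed balls centered and sized by pairs of points from one sample), the function names `ball_divergence` and `permutation_test` are hypothetical, and the cubic-time loops ignore the fast algorithms the abstract mentions.

```python
import random


def ball_divergence(dist, x_idx, y_idx):
    """Sample ball divergence from a pooled distance matrix and two index lists."""
    n, m = len(x_idx), len(y_idx)

    def part(centers):
        acc = 0.0
        for i in centers:
            for j in centers:
                r = dist[i][j]  # radius of the closed ball centered at point i
                px = sum(dist[i][k] <= r for k in x_idx) / n
                py = sum(dist[i][k] <= r for k in y_idx) / m
                acc += (px - py) ** 2
        return acc / (len(centers) ** 2)

    # One term with balls built from the first sample, one from the second.
    return part(x_idx) + part(y_idx)


def permutation_test(x, y, metric, num_permutations=199, seed=0):
    """Return the observed statistic and a permutation p-value."""
    pooled = list(x) + list(y)
    dist = [[metric(a, b) for b in pooled] for a in pooled]
    n = len(x)
    idx = list(range(len(pooled)))
    observed = ball_divergence(dist, idx[:n], idx[n:])
    rng = random.Random(seed)
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(idx)  # relabel the pooled points at random
        if ball_divergence(dist, idx[:n], idx[n:]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (num_permutations + 1)


def abs_diff(a, b):
    """Toy metric on the real line; any metric on the data type can be used."""
    return abs(a - b)


# Two well-separated toy samples on the real line.
x = [0.1 * i for i in range(8)]
y = [5.0 + 0.1 * i for i in range(8)]
stat, p_value = permutation_test(x, y, abs_diff)
print(stat, p_value)
```

Because the data enter only through the pooled distance matrix, replacing `abs_diff` with a geodesic or matrix distance is enough to apply the same sketch to directional or symmetric positive definite matrix data.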
Layered Functional Network Analysis of Gene Expression in Human Heart Failure
BACKGROUND: Although dilated cardiomyopathy (DCM) is a leading cause of heart failure (HF), the mechanism underlying DCM is not well understood. Previously, it has been demonstrated that an integrative analysis of gene expression and protein-protein interaction (PPI) networks can provide insights into the molecular mechanisms of various diseases. In this study we develop a systems approach by linking publicly available gene expression data on ischemic dilated cardiomyopathy (ICM), a main pathological form of DCM, with data from a layered PPI network. We propose that the use of a layered PPI network, as opposed to a traditional PPI network, provides unique insights into the mechanism of DCM. METHODS: Four Cytoscape plugins including BionetBuilder, NetworkAnalyzer, Cerebral and GenePro were used to establish the layered PPI network, which was based upon validated subcellular protein localization data retrieved from the HPRD and Entrez Gene databases. The DAVID functional annotation clustering tool was used for gene ontology (GO) analysis. RESULTS: The assembled layered PPI network was divided into four layers: extracellular, plasma membrane, cytoplasm and nucleus. The characteristics of the gene expression pattern of the four layers were compared. In the extracellular and plasma membrane layers, there were more proteins encoded by down-regulated genes than by up-regulated genes, but in the other two layers, the opposite trend was found. GO analysis established that proteins encoded by up-regulated genes, reflecting significantly over-represented biological processes, were mainly located in the nucleus and cytoplasm layers, while proteins encoded by down-regulated genes were mainly located in the extracellular and plasma membrane layers.
The PPI network analysis revealed that the Janus family tyrosine kinase-signal transducer and activator of transcription (Jak-STAT) signaling pathway might play an important role in the development of ICM and could be exploited as a therapeutic target of ICM. In addition, glycogen synthase kinase 3 beta (GSK3B) may also be a potential candidate target, but more evidence is required. CONCLUSION: This study illustrated that by incorporating subcellular localization information into a PPI network based analysis, one can derive greater insights into the mechanisms underlying ICM
Structural modifications induced by compressive plastic deformation in single-step and sequentially irradiated UHMWPE for hip joint components
Structural modifications were studied at the molecular scale in two highly crosslinked UHMWPE materials for hip-joint acetabular components, as induced upon application of (uniaxial) compressive strain to the as-manufactured microstructures. The two materials, quite different in their starting resins and belonging to different manufacturing generations, were a single-step irradiated and a sequentially irradiated polyethylene. The latter material represents the most recently launched gamma-ray-irradiated polyethylene material in the global hip implant market. Confocal/polarized Raman spectroscopy was systematically applied to characterize the initial microstructures and the microstructural response of the materials to plastic deformation. Crystallinity fractions and preferential orientation of molecular chains were followed during in vitro deformation tests on unused cups and correlated to plastic strain magnitude and to the recovery capacity of the material. Moreover, analyses of the in vivo deformation behavior of two short-term retrieved hip cups are also presented. Trends of preferential orientation of molecular chains as a function of residual strain were similar for both materials, but distinctly different in their extents. The sequentially irradiated material was more resistant to plastic deformation and, for the same magnitude of residual plastic strain, possessed a higher capacity of recovery as compared to the single-step irradiated one. (C) 2013 Elsevier Ltd. All rights reserved
Controllable and Diverse Data Augmentation with Large Language Model for Low-Resource Open-Domain Dialogue Generation
Data augmentation (DA) is crucial to mitigate model training instability and
over-fitting problems in low-resource open-domain dialogue generation. However,
traditional DA methods often neglect semantic data diversity, restricting the
overall quality. Recently, large language models (LLMs) have been used for DA to
generate diversified dialogues. However, they have limited controllability and
tend to generate dialogues with a distribution shift compared to the seed
dialogues. To maximize the augmentation diversity and address the
controllability problem, we propose Summary-based Dialogue
Augmentation with LLM (SDA). Our approach enhances the controllability
of LLM by using dialogue summaries as a planning tool. Based on summaries, SDA
can generate high-quality and diverse dialogue data even with a small seed
dataset. To evaluate the efficacy of data augmentation methods for open-domain
dialogue, we designed a clustering-based metric to characterize the semantic
diversity of the augmented dialogue data. The experimental results show that
SDA can augment high-quality and semantically diverse dialogues given a small
seed dataset and an LLM, and the augmented data can boost the performance of
open-domain dialogue models. Comment: 13 pages, 5 figures
Polarized Raman analysis of the molecular rearrangement and residual strain on the surface of retrieved polyethylene tibial plates
The response to applied strain of EtO-sterilized and gamma-irradiated polyethylene materials belonging to tibial inserts has been studied by polarized Raman spectroscopy. Initial calibrations on as-received samples from three different makers were employed to clarify the rearrangement of molecular chains under strain, expressed in terms of Euler angular displacements in space and orientation distribution functions. This body of information was then applied to a quantitative analysis of four tibial inserts (from the same three makers of the unused samples) retrieved after in vivo exposures ranging between 7 months and 5 years 8 months. The main results of the Raman analysis can be summarized as follows: (i) gamma-irradiated samples experienced lower texturing on the molecular scale compared to EtO-sterilized samples, likely due to a higher strain recovery capability; and (ii) independent of sterilization method, the amount of plastic strain was mainly developed early after in vivo implantation, whereby out-of-plane molecules rotated under load onto planes parallel to the sample surface until saturation of angular displacements was reached. (C) 2010 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved