1,224 research outputs found
Efficient Processing of k Nearest Neighbor Joins using MapReduce
k nearest neighbor join (kNN join), designed to find k nearest neighbors from
a dataset S for every object in another dataset R, is a primitive operation
widely adopted by many data mining applications. As a combination of the k
nearest neighbor query and the join operation, kNN join is an expensive
operation. Given the increasing volume of data, it is difficult to perform a
kNN join on a centralized machine efficiently. In this paper, we investigate
how to perform kNN join using MapReduce, which is a well-accepted framework for
data-intensive applications over clusters of computers. In brief, the mappers
cluster objects into groups; the reducers perform the kNN join on each group of
objects separately. We design an effective mapping mechanism that exploits
pruning rules for distance filtering, and hence reduces both the shuffling and
computational costs. To reduce the shuffling cost, we propose two approximate
algorithms to minimize the number of replicas. Extensive experiments on our
in-house cluster demonstrate that our proposed methods are efficient, robust
and scalable. Comment: VLDB201
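The map/reduce split described in this abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the paper's algorithm: the function names and the `pivots` parameter are hypothetical, objects are assumed to be coordinate tuples compared by Euclidean distance, and the pruning rules and approximate replica reduction are not reproduced. Here every object of S is copied to every group, which is exactly the shuffling cost the paper aims to reduce.

```python
import heapq
import math
from collections import defaultdict


def nearest_pivot(point, pivots):
    """Index of the pivot closest to `point` (Euclidean distance)."""
    return min(range(len(pivots)), key=lambda i: math.dist(point, pivots[i]))


def map_phase(R, S, pivots):
    """Group R objects by nearest pivot (hypothetical partitioning scheme).

    Assumption: no distance-based pruning, so all of S is replicated
    to every non-empty group.
    """
    groups = defaultdict(lambda: {"R": [], "S": []})
    for r in R:
        groups[nearest_pivot(r, pivots)]["R"].append(r)
    for group in groups.values():
        group["S"] = list(S)
    return groups


def reduce_phase(groups, k):
    """Run a brute-force kNN join inside each group independently."""
    result = {}
    for group in groups.values():
        for r in group["R"]:
            result[r] = heapq.nsmallest(
                k, group["S"], key=lambda s: math.dist(r, s)
            )
    return result


def knn_join(R, S, k, pivots):
    """For every object in R, return its k nearest neighbors from S."""
    return reduce_phase(map_phase(R, S, pivots), k)
```

Because each group receives the whole of S, the result is exact; a pruning rule would only shrink the S list that each group receives.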
Extracting transcription factor binding sites from unaligned gene sequences with statistical models
Background: Transcription factor binding sites (TFBSs) are crucial in the regulation of gene transcription. Recently, chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP-chip array) has been used to identify potential regulatory sequences, but the procedure can only map the probable protein-DNA interaction loci at 1–2 kb resolution. To find the exact binding motifs, it is necessary to build a computational method that examines the ChIP-chip array binding sequences and searches for possible motifs representing the transcription factor binding sites. Results: We developed a program to find accurate motif sites from a set of unaligned DNA sequences in the yeast genome. Compared with MDscan, the prediction results suggest that, overall, our algorithm outperforms MDscan, since the predicted motifs are more consistent with previously known specificities reported in the literature and have better prediction ranks. Our program also outperforms the constraint-less Cosmo program, especially in the elimination of false positives. Conclusion: In this study, an improved sampling algorithm is proposed that incorporates the binomial probability model to build significant initial candidate motif sets. To capture the statistical dependence between base positions in TFBSs, the method combines dependency graphs with their expanded Bayesian networks. The results show that our program satisfactorily extracts transcription factor binding sites from unaligned gene sequences.
Performance evaluation on the implementation of Pre-established Medical Processes for nurse practitioners in the hospitals
In 2015, Taiwan announced the establishment of “Pre-established Medical Processes” and related regulations to assist nurse practitioners in clinical tasks, maintain medical quality and patient safety, and provide protection in clinical practice. However, the effectiveness of implementation still needs to be improved and strengthened. This study adopts the technology acceptance model (TAM) and task-technology fit (TTF) as the research framework, with a cross-sectional design. Questionnaires were administered to professional nurse practitioners in hospitals in central Taiwan. A total of 300 questionnaires were distributed, and Smart PLS 3.0 and SPSS 24.0 were both applied to verify interpretability. The questionnaire recovery rate was 88.3%, and the overall predictive power was 65.2%. Technological characteristics and TTF had a significant impact on perceived usefulness
Self-supervised learning-based general laboratory progress pretrained model for cardiovascular event detection
The inherent nature of patient data poses several challenges. Prevalent cases
amass substantial longitudinal data owing to their patient volume and
consistent follow-ups; however, longitudinal laboratory data are known for
their irregularity, temporality, absenteeism, and sparsity. In contrast,
recruitment for rare or specific cases is often constrained due to their
limited patient size and episodic observations. This study employed
self-supervised learning (SSL) to pretrain a generalized laboratory progress
(GLP) model that captures the overall progression of six common laboratory
markers in prevalent cardiovascular cases, with the intention of transferring
this knowledge to aid in the detection of a specific cardiovascular event. GLP
implemented a two-stage training approach to leverage the information embedded
within interpolated data and amplify the performance of SSL. After
pretraining, GLP is transferred for TVR detection. The proposed two-stage
training improved the performance of pure SSL, and the transferability of GLP
exhibited distinctiveness. After GLP processing, the classification exhibited a
notable enhancement, with averaged accuracy rising from 0.63 to 0.90. All
evaluated metrics demonstrated substantial superiority (p < 0.01) compared with
those before GLP processing. Our study effectively engages in translational
engineering by transferring patient progression of cardiovascular laboratory
parameters from one patient group to another, transcending the limitations of
data availability. The transferability of disease progression optimizes the
strategies of examinations and treatments, and improves patient prognosis while
using commonly available laboratory parameters. The potential for expanding
this approach to encompass other diseases holds great promise. Comment: published in IEEE Journal of Translational Engineering in Health &
Medicin
Metal-free sp(3) C-H functionalization: a novel approach for the syntheses of selenide ethers and thioesters from methyl arenes
A DTBP-promoted metal-free and solvent-free formation of C-Se and C-S bonds through sp(3) C-H functionalization of methyl arenes with diselenides and disulfides is described
Zigzag magnetic order in a novel tellurate compound NaNiTeO with S = 1 chains
NaNiTeO is a rare example in the transition-metal
tellurate family that realizes an S = 1 spin-chain structure. By performing
neutron powder diffraction measurements, the ground-state magnetic structure of
NaNiTeO is determined. These measurements reveal that below
6.8(2) K, the Ni moments form a screwed
ferromagnetic (FM) spin-chain structure running along the crystallographic
axis but these FM spin chains are coupled antiferromagnetically along the
and directions, giving rise to a magnetic propagation vector of (0,
1/2, 1/2). This zigzag magnetic order is well supported by first-principles
calculations. The moment size of Ni spins is determined to be 2.1(1) μ_B
at 3 K, suggesting a significant quenching of the orbital moment
due to the crystalline electric field (CEF) effect. The previously reported
metamagnetic transition near 0.1 T can be understood as a
field-induced spin-flip transition. The relatively easy tunability of the
dimensionality of its magnetism by external parameters makes
NaNiTeO a promising candidate for further exploring various
types of novel spin-chain physics. Comment: 10 pages, 6 figure
Recent Development of Graphene-Based Cathode Materials for Dye-Sensitized Solar Cells
Dye-sensitized solar cells (DSSCs) have attracted extensive attention as potential low-cost alternatives to silicon-based solar cells. As a vital component of a typical DSSC, the counter electrode (CE) is generally employed to collect electrons via the external circuit and speed up the reduction of I3⁻ to I⁻ in the redox electrolyte. The noble metal Pt is usually deposited on a conductive glass substrate as the CE material due to its excellent electrical conductivity, electrocatalytic activity, and electrochemical stability. To achieve cost-efficient DSSCs, considerable efforts have been made to explore Pt-free alternatives. Recently, graphene-based CEs have been intensively investigated as replacements for the high-cost noble Pt CE. In this paper, we provide an overview of studies on the electrochemical and photovoltaic characteristics of graphene-based CEs, including graphene, graphene/Pt, graphene/carbon materials, graphene/conducting polymers, and graphene/inorganic compounds. We also summarize the design and advantages of each graphene-based material and suggest possible directions for designing new graphene-based catalysts in future research for high-performance and low-cost DSSCs
Discriminating Glucose Tolerance Status by Regions of Interest of Dual-Energy X-Ray Absorptiometry: Clinical Implications of Body Fat Distribution
OBJECTIVE. To determine whether measuring body
fat distribution by dual-energy X-ray absorptiometry
(DEXA) can be used to discriminate glucose
tolerance status.
RESEARCH DESIGN AND METHODS. Using a 75-g oral
glucose tolerance test, a total of 1,015 Chinese subjects
(559 men and 456 women) were categorized
as having normal glucose tolerance (NGT), impaired
glucose tolerance (IGT), or diabetes. Blood pressure
and lipid profiles of these subjects were measured.
Waist-to-hip ratio (WHR) and DEXA were used
to evaluate the varying patterns of body fat distribution
among the groups.
RESULTS. Body fat distribution, as reflected by WHR
and the centrality index, showed significant partial
correlation coefficients with glycosylated hemoglobin,
blood pressure, and lipid profiles in all subjects.
After adjusting for age and BMI, there were significant
differences among the three glycemic groups
for all the cardiovascular risk factors except for total
cholesterol level. The diabetic group had a significantly
higher WHR and centrality index, but lower
femoral fat percentage than the NGT and IGT groups.
The diabetic group also showed higher abdominal
fat percentage than the NGT group. Moreover,
the IGT group had a higher centrality index than the
NGT group. However, no significant differences were
found in the percentage of lean tissue mass among
the three groups. Using multiple stepwise logistic
regression models, the centrality index remained a
significant factor for discriminating different glucose
tolerance status independent of the percentage
total body fat.
CONCLUSIONS. Central obesity showed a significant
correlation with cardiovascular risk factors among
the three different glycemic groups. The centrality index
measured by DEXA appears to be a better
predictor of glucose intolerance than
WHR, abdominal fat, and general obesity (reflected
by percentage total body fat or BMI) in a large cohort
of the Chinese population