136 research outputs found
PASS-JOIN: A Partition-based Method for Similarity Joins
As an essential operation in data cleaning, the similarity join has attracted
considerable attention from the database community. In this paper, we study
string similarity joins with edit-distance constraints, which find similar
string pairs from two large sets of strings whose edit distance is within a
given threshold. Existing algorithms are efficient either for short strings or
for long strings, and there is no algorithm that can efficiently and adaptively
support both short strings and long strings. To address this problem, we
propose a partition-based method called Pass-Join. Pass-Join partitions a
string into a set of segments and creates inverted indices for the segments.
Then for each string, Pass-Join selects some of its substrings and uses the
selected substrings to find candidate pairs using the inverted indices. We
devise efficient techniques to select the substrings and prove that our method
can minimize the number of selected substrings. We develop novel pruning
techniques to efficiently verify the candidate pairs. Experimental results show
that our algorithms are efficient for both short strings and long strings, and
outperform state-of-the-art methods on real datasets. Comment: VLDB201
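The partition-then-probe idea in this abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm. It relies on the pigeonhole argument that if two strings are within edit distance tau and one is split into tau + 1 segments, at least one segment must appear unchanged in the other. The function names (`segment_layout`, `edit_distance`, `similarity_join`), the self-join setting, and the simple plus-or-minus tau position window used to pick substrings are all assumptions. The paper's own substring selection is tighter, and its verification step uses pruning that is not reproduced here.

```python
from collections import defaultdict


def segment_layout(length, tau):
    """Return (start, segment_length) for tau + 1 near-equal segments."""
    n = tau + 1
    base, extra = divmod(length, n)
    layout, pos = [], 0
    for i in range(n):
        # The last `extra` segments are one character longer.
        seg_len = base + (1 if i >= n - extra else 0)
        layout.append((pos, seg_len))
        pos += seg_len
    return layout


def edit_distance(a, b):
    """Plain dynamic-programming edit distance (no early termination)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity_join(strings, tau):
    """Self-join: return index pairs (i, j), i < j, with edit distance <= tau."""
    # Inverted index: (string length, segment number, segment text) -> string ids
    index = defaultdict(set)
    pairs = set()
    # Visit strings in order of length so each probe only looks at shorter or equal ones.
    for sid in sorted(range(len(strings)), key=lambda i: len(strings[i])):
        s = strings[sid]
        candidates = set()
        for length in range(max(0, len(s) - tau), len(s) + 1):
            for seg_no, (pos, seg_len) in enumerate(segment_layout(length, tau)):
                # Assumed simple window: a surviving segment shifts by at most tau.
                lo = max(0, pos - tau)
                hi = min(len(s) - seg_len, pos + tau)
                for start in range(lo, hi + 1):
                    key = (length, seg_no, s[start:start + seg_len])
                    candidates |= index.get(key, set())
        # Verify every candidate with the exact distance.
        for rid in candidates:
            if edit_distance(strings[rid], s) <= tau:
                pairs.add((min(rid, sid), max(rid, sid)))
        # Index the segments of s for later, longer strings.
        for seg_no, (pos, seg_len) in enumerate(segment_layout(len(s), tau)):
            index[(len(s), seg_no, s[pos:pos + seg_len])].add(sid)
    return pairs
```

Calling `similarity_join(["vldb", "pvldb", "sigmod", "sigmmod", "icde"], 1)` probes the index once per string and verifies only the pairs that share a segment, instead of comparing all ten pairs.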
MassJoin: A mapreduce-based method for scalable string similarity joins
String similarity join is an essential operation in data integration. The era of big data calls for scalable algorithms to support large-scale string similarity joins. In this paper, we study scalable string similarity joins using MapReduce. We propose a MapReduce-based framework, called MASSJOIN, which supports both set-based similarity functions and character-based similarity functions. We extend the existing partition-based signature scheme to support set-based similarity functions. We utilize the signatures to generate key-value pairs. To reduce the transmission cost, we merge key-value pairs, cutting their number from cubic to linear complexity without sacrificing pruning power. To improve performance, we incorporate "light-weight" filter units into the key-value pairs, which can be used to prune a large number of dissimilar pairs without significantly increasing the transmission cost. Experimental results on real-world datasets show that our method significantly outperforms state-of-the-art approaches.
Pedestrian–bus route and pickup location planning for emergency evacuation
Planning for a bus-based regional evacuation is essential for emergency preparedness, especially for hurricane- or flood-prone urban environments with large numbers of transit-dependent or transit-captive populations. This paper develops an optimization-based decision-support model for pedestrian–bus evacuation planning under bus fleet, pedestrian and bus route, and network constraints. Aiming to minimize the evacuation duration, an optimization model is proposed to determine the optimal pickup nodes where evacuees assemble using existing pedestrian routes, and to allocate the available bus fleet via bus routes and the urban road network to transport the assembled evacuees between the pickup nodes and designated public shelters. Numerical examples with two scenarios based on the Sioux Falls street network in South Dakota (United States) demonstrate that this model can be used to optimize the evacuation duration, the location of pickup nodes, and the bus assignment simultaneously.
First published online 13 October 202
Study on the degradation of T-2 toxin in beer by glow discharge plasma
Objective: To explore the optimal process for degrading T-2 toxin in beer by glow discharge plasma (GDP) and its impact on the physicochemical indicators of beer. Methods: Based on single-factor experiments, a response surface optimization experiment with four factors and three levels was conducted using the Box-Behnken method to determine the optimal degradation conditions for T-2 toxin in beer. Results: The degradation efficiency of T-2 toxin was highest (89.21%) when the discharge voltage was 570 V, the action time was 18 minutes, the discharge current was 99 mA, and the initial concentration of T-2 toxin was 8.5 μg/mL. After GDP treatment, the physical and chemical indicators of the beer were tested; the results showed a significant decrease in foam retention (P<0.05), while the other indicators remained unchanged. Conclusion: The optimal degradation conditions for GDP obtained from the response surface optimization model are accurate and reliable and can be used for the degradation of T-2 toxin in beer. GDP can affect the brewing ability of beer, but it will not have a significant impact on other indicators.
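For readers unfamiliar with the design named in the Methods, a four-factor, three-level Box-Behnken design can be enumerated in coded units as below. This is a sketch under stated assumptions. The function name `box_behnken` and the number of centre points are hypothetical, since the abstract does not report the run table. The factor names are taken from the Results sentence. The pairwise construction shown is the usual one for three to five factors.

```python
from itertools import combinations, product


def box_behnken(n_factors, n_center=1):
    """Coded Box-Behnken runs: each pair of factors at -1/+1, the rest at 0."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    # Replicated centre points (count is an assumption, not from the abstract).
    runs.extend([(0,) * n_factors] * n_center)
    return runs


# Factor names as listed in the abstract; coded levels are -1, 0, +1.
factors = ["discharge voltage", "action time", "discharge current", "initial concentration"]
design = box_behnken(len(factors), n_center=5)
```

With four factors this gives 24 edge runs (6 factor pairs times 4 sign combinations) plus the chosen number of centre points. Mapping the coded levels to physical values would require the level settings, which the abstract does not give.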
The Protective Antibodies Induced by a Novel Epitope of Human TNF-α Could Suppress the Development of Collagen-Induced Arthritis
Tumor necrosis factor alpha (TNF-α) is a major inflammatory mediator whose actions lead to tissue destruction and hamper recovery from damage. At present, two antibodies against human TNF-α (hTNF-α) are available and are widely used for the clinical treatment of certain inflammatory diseases. This work was undertaken to identify a novel functional epitope of hTNF-α. We screened a peptide library against anti-hTNF-α antibodies and used ELISA and competitive ELISA to obtain the epitope of hTNF-α. The key residues of the epitope were identified by means of combinatorial alanine scanning and site-specific mutagenesis. The N terminus (80–91 aa) of hTNF-α proved to be a novel epitope (YG1). Two amino acids of YG1, proline and valine, were identified as the key residues, which were important for hTNF-α biological function. Furthermore, the function of the epitope was examined in an animal model of collagen-induced arthritis (CIA). CIA could be suppressed in the animal model by prevaccination with the derivative peptides of YG1. Antibodies against YG1 could also inhibit the cytotoxicity of hTNF-α. These results demonstrate that YG1 is a novel epitope associated with the biological function of hTNF-α and that antibodies against YG1 can inhibit the development of CIA in an animal model, so it would be a potential target of new therapeutic antibodies.