
    Almost Optimal Streaming Algorithms for Coverage Problems

    Maximum coverage and minimum set cover problems, collectively called coverage problems, have been studied extensively in streaming models. However, previous research not only achieves sub-optimal approximation factors and space complexities, but also studies a restricted set arrival model that makes an explicit or implicit assumption of oracle access to the sets, ignoring the complexity of reading and storing a whole set at once. In this paper, we address these shortcomings and present algorithms with improved approximation factors and improved space complexity, and we prove that our results are almost tight. Moreover, unlike most previous work, our results hold in a more general edge arrival model. More specifically, we present (almost) optimal approximation algorithms for the maximum coverage and minimum set cover problems in the streaming model with an (almost) optimal space complexity of Õ(n), i.e., the space is independent of the size of the sets and of the size of the ground set of elements. These results not only improve over the best known algorithms for the set arrival model, but are also the first such algorithms for the more powerful edge arrival model. To achieve these results, we introduce a new general sketching technique for coverage functions: this sketching scheme can be applied to convert an α-approximation algorithm for a coverage problem into a (1-ε)α-approximation algorithm for the same problem in the streaming or RAM models. We show the significance of our sketching technique by ruling out the possibility of solving coverage problems via black-box access to a (1 ± ε)-approximate oracle (e.g., a sketch function) that estimates the coverage function on any subfamily of the sets.
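
    As a rough illustration of what a coverage sketch buys (a minimal sketch, not the paper's construction): subsampling the ground set of elements shrinks every set, after which a standard algorithm such as greedy maximum coverage, the classic (1 - 1/e)-approximation, runs on the smaller instance. The sampling rate p, the toy sets, and the helper names below are all illustrative assumptions.

    ```python
    import random

    def sketch_sets(sets, p, seed=0):
        """Toy coverage sketch: keep each ground-set element independently
        with probability p. (Illustrative only; the paper's scheme chooses
        its parameters to preserve a (1 - eps) * alpha guarantee.)"""
        rng = random.Random(seed)
        universe = set().union(*sets.values())
        sampled = {e for e in universe if rng.random() < p}
        return {name: s & sampled for name, s in sets.items()}

    def greedy_max_coverage(sets, k):
        """Classic greedy (1 - 1/e)-approximation for maximum coverage."""
        covered, chosen = set(), []
        for _ in range(k):
            best = max(sets, key=lambda name: len(sets[name] - covered))
            chosen.append(best)
            covered |= sets[best]
        return chosen

    sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1, 7}}
    print(greedy_max_coverage(sketch_sets(sets, p=0.5), k=2))
    ```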

    One Table to Count Them All: Parallel Frequency Estimation on Single-Board Computers

    Sketches are probabilistic data structures that can provide approximate results within mathematically proven error bounds while using orders of magnitude less memory than traditional approaches. They are tailored for streaming data analysis on architectures with limited memory, such as the single-board computers widely used for IoT and edge computing. Since these devices offer multiple cores, efficient parallel sketching schemes let them manage high volumes of data streams; however, since their caches are relatively small, careful parallelization is required. In this work, we focus on the frequency estimation problem and evaluate the performance of a high-end server, a 4-core Raspberry Pi, and an 8-core Odroid. As the sketch, we employ the widely used Count-Min Sketch. To hash the stream in parallel and in a cache-friendly way, we apply a novel tabulation approach and rearrange the auxiliary tables into a single one. To parallelize the process efficiently, we modify the workflow and apply a form of buffering between hash computations and sketch updates. Many single-board computers today have heterogeneous processors that pair slow and fast cores. To use all these cores to their full potential, we propose a dynamic load-balancing mechanism that significantly increases the performance of frequency estimation.
    Comment: 12 pages, 4 figures, 3 algorithms, 1 table, submitted to EuroPar'1
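
    The underlying data structure is standard; a sequential Python sketch of the Count-Min Sketch (not the paper's tabulation-hashed, buffered, parallel implementation) shows the update/query cycle the paper parallelizes. The eps and delta defaults below are illustrative choices.

    ```python
    import math
    import random

    class CountMinSketch:
        """Plain Count-Min Sketch: d rows of w counters with one hash per row.
        Overestimates each frequency by at most eps*N with probability 1 - delta."""

        def __init__(self, eps=0.01, delta=0.01, seed=0):
            self.w = math.ceil(math.e / eps)         # counters per row
            self.d = math.ceil(math.log(1 / delta))  # number of rows
            rng = random.Random(seed)
            self.salts = [rng.getrandbits(64) for _ in range(self.d)]
            self.table = [[0] * self.w for _ in range(self.d)]

        def _cols(self, item):
            # The paper replaces this hashing step with a cache-friendly
            # single-table tabulation scheme; Python's hash is a stand-in.
            return (hash((salt, item)) % self.w for salt in self.salts)

        def update(self, item, count=1):
            for row, col in enumerate(self._cols(item)):
                self.table[row][col] += count

        def query(self, item):
            # Every row overestimates, so the minimum is the tightest estimate.
            return min(self.table[row][col]
                       for row, col in enumerate(self._cols(item)))

    cms = CountMinSketch()
    for c in "abracadabra":
        cms.update(c)
    print(cms.query("a"))  # 5: exact here, and never an underestimate
    ```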

    Regulation of Heme Biosynthesis in Neurospora crassa

    The mold Neurospora crassa does not accumulate porphyrins in iron deficiency but instead accumulates the sideramine desferricoprogen. In iron deficiency, δ-aminolevulinic acid accumulates and the δ-aminolevulinic acid dehydratase level is very low. A similar situation exists in cobalt toxicity and zinc deficiency. By comparison, δ-aminolevulinic acid synthetase and ferroprotoporphyrin chelatase show only marginal changes under these conditions. Addition of iron and zinc to the respective metal-deficient cultures results in an induction of δ-aminolevulinic acid dehydratase. The induction of δ-aminolevulinic acid dehydratase is repressed by protoporphyrin and, less effectively, by hemin and hemoglobin. Iron deficiency, zinc deficiency, and cobalt toxicity have been found to interfere with the conversion of protoporphyrin into heme, thus leaving protoporphyrin available to repress the enzyme. This repression can be counteracted by iron and, much more effectively, by coprogen. A model has been proposed in which protoporphyrin acts as the corepressor for the enzyme δ-aminolevulinic acid dehydratase. It is held that iron in the form of coprogen converts protoporphyrin to heme, the latter having a lesser affinity for the aporepressor. Coprogen can inhibit heme binding to the aporepressor and thus render the repressor nonfunctional, leading to a derepression of the enzyme δ-aminolevulinic acid dehydratase.

    Lead molecule identification from Vitex trifolia Linn for helminthiasis using in vitro and in silico methods

    Objective: The study attempts to discover a lead molecule to treat helminthiasis from Vitex trifolia Linn (V. trifolia Linn) through antibacterial, in vitro anthelmintic, and in silico evaluation. Methods: Antibacterial activity was assessed by the Kirby-Bauer disc diffusion method at three different concentrations of extract, and in vitro anthelmintic activity was assessed by the petri dish and organ bath methods. Further, in silico docking studies of 11 phytoconstituents against phosphoethanolamine methyltransferase (4FGZ) were carried out using AutoDock 4.2, which works on the principle of the Lamarckian genetic algorithm. In the docking studies, three important parameters were determined: binding energy, inhibition constant, and intermolecular energy. Results: The extracts showed an antibacterial effect at all three concentrations; at 16 mcg/disc a significant effect was observed compared to blank and ciprofloxacin 5 mcg/disc. For anthelmintic activity in the petri dish method, the mean paralyzing times of Pheretima posthuma at doses of 25, 50, and 100 mg/ml were 13.78, 5.79, and 4.57 min, respectively, while piperazine citrate (10 mg/ml) produced paralysis in 21.58 min. In the organ bath method, the time to paralysis of the worm was recorded on a slow-moving Sherrington rotating drum, and paralyzing time decreased with increasing concentrations of the extract. The in silico studies gave a binding energy of -10.25 kcal/mol, an inhibition constant (Ki) of 30.91 nM, and an intermolecular energy of -10.84 kcal/mol for abietatriene-3-ol, which is lower than the standard ligand phosphoethanolamine (-6.03 kcal/mol, 38.29 µM, -7.82 kcal/mol, respectively). Conclusion: The study concludes that the active constituents of V. trifolia Linn have better anthelmintic activity; these constituents may be optimized to yield a new moiety for the treatment of helminthiasis.
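
    For context on the reported numbers: AutoDock derives the inhibition constant from the estimated free energy of binding via the standard relation Ki = exp(ΔG/RT). A quick check (a hypothetical helper, not part of the study) recovers the reported constants from the reported energies to within rounding.

    ```python
    import math

    R = 1.9872e-3   # gas constant in kcal/(mol*K)
    T = 298.15      # temperature (K) conventionally used by AutoDock

    def inhibition_constant(delta_g):
        """Ki (molar) from an estimated free energy of binding in kcal/mol,
        via the standard relation Ki = exp(dG / RT)."""
        return math.exp(delta_g / (R * T))

    # Abietatriene-3-ol: reported -10.25 kcal/mol and Ki = 30.91 nM
    print(inhibition_constant(-10.25) * 1e9, "nM")  # ~30.7 nM (matches after rounding)
    # Phosphoethanolamine: reported -6.03 kcal/mol and Ki = 38.29 uM
    print(inhibition_constant(-6.03) * 1e6, "uM")   # ~38.0 uM
    ```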

    Oxypred: prediction and classification of oxygen-binding proteins

    This study describes a method for predicting and classifying oxygen-binding proteins. First, support vector machine (SVM) modules were developed using amino acid composition and dipeptide composition for predicting oxygen-binding proteins, achieving maximum accuracies of 85.5% and 87.8%, respectively. Second, an SVM module based on amino acid composition was developed to classify the predicted oxygen-binding proteins into six classes, with accuracies of 95.8%, 97.5%, 97.5%, 96.9%, 99.4%, and 96.0% for erythrocruorin, hemerythrin, hemocyanin, hemoglobin, leghemoglobin, and myoglobin proteins, respectively. Finally, an SVM module using dipeptide composition was developed for classifying the oxygen-binding proteins, achieving maximum accuracies of 96.1%, 98.7%, 98.7%, 85.6%, 99.6%, and 93.3% for the above six classes, respectively. All modules were trained and tested by five-fold cross-validation. Based on this approach, a web server, Oxypred, was developed for predicting and classifying oxygen-binding proteins (available from http://www.imtech.res.in/raghava/oxypred/).
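
    A minimal sketch of the feature side of such a predictor, assuming a scikit-learn SVM as a stand-in for the study's own setup: the amino acid composition used in the first module is a 20-dimensional vector of residue fractions (dipeptide composition is the analogous 400-dimensional vector of residue-pair fractions). The sequences and labels below are placeholders, not the study's dataset.

    ```python
    from collections import Counter

    from sklearn.svm import SVC  # stand-in classifier, not the study's SVM modules

    AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

    def aa_composition(seq):
        """Amino acid composition: 20-dim vector of residue fractions.
        (Dipeptide composition would be the 400-dim pair-fraction analogue.)"""
        counts = Counter(seq.upper())
        return [counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS]

    # Placeholder sequences and labels, NOT the study's dataset
    X = [aa_composition(s) for s in [
        "MVLSPADKTNVKAAW", "GLSDGEWQLVLNVWGK",   # oxygen-binding (illustrative)
        "MKTAYIAKQRQISFV", "ADQLTEEQIAEFKEA",    # non-binding (illustrative)
    ]]
    y = [1, 1, 0, 0]

    clf = SVC(kernel="rbf").fit(X, y)
    print(clf.predict([aa_composition("MVLSEGEWQLVLHVWAK")]))
    ```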

    Stochastic Budget Optimization in Internet Advertising

    Internet advertising is a sophisticated game in which many advertisers "play" to optimize their return on investment. There are many "targets" for the advertisements, and each "target" has a collection of games with a potentially different set of players involved. In this paper, we study how advertisers allocate their budget across these "targets". In particular, we formulate their best-response strategy as an optimization problem. Advertisers have a set of keywords ("targets") and some stochastic information about the future, namely a probability distribution over scenarios of cost vs. click combinations; this summarizes the potential states of the world, assuming that the strategies of the other players are fixed. The best response can then be abstracted as a stochastic budget optimization problem: how to spread a given budget across these keywords to maximize the expected number of clicks. We present the first known non-trivial poly-logarithmic approximation for these problems, as well as the first known hardness results ruling out better-than-logarithmic approximation ratios in the various parameters involved. We also identify several special cases of these problems of practical interest, such as a fixed number of scenarios or polynomial-sized parameters related to cost, which are solvable either in polynomial time or with improved approximation ratios. Stochastic budget optimization with scenarios has sophisticated technical structure; our approximation and hardness results come from relating these problems to a special type of (0/1, bipartite) quadratic program inherent in them. Our research answers some open problems raised by the authors in (Stochastic Models for Budget Optimization in Search-Based Advertising, Algorithmica, 58 (4), 1022-1044, 2010).
    Comment: FINAL version
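
    To make the objective concrete, consider a hypothetical two-keyword toy instance (not from the paper) solved by brute force; the paper's contribution is approximation algorithms for the general problem, where such exhaustive search is infeasible.

    ```python
    # Hypothetical instance: per keyword, scenarios of
    # (probability, cost per click, clicks available).
    scenarios = {
        "shoes": [(0.5, 0.10, 300), (0.5, 0.40, 900)],
        "boots": [(0.7, 0.25, 500), (0.3, 0.05, 100)],
    }

    def expected_clicks(keyword, budget):
        """Expected clicks for one keyword: in each scenario, buy clicks at that
        scenario's cost per click until the budget or the inventory runs out."""
        return sum(p * min(budget / cpc, avail)
                   for p, cpc, avail in scenarios[keyword])

    def best_split(total):
        """Brute-force the split in $1 steps; the paper's algorithms exist
        precisely because exhaustive search does not scale."""
        return max(range(total + 1),
                   key=lambda b: expected_clicks("shoes", b)
                                 + expected_clicks("boots", total - b))

    b = best_split(100)
    print(f"spend {b} on 'shoes' and {100 - b} on 'boots'")
    ```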

    Experimental culture of black tiger shrimp Penaeus monodon Fabricius, 1798 in open sea floating cage

    An experiment was designed and conducted to assess the feasibility of culturing Penaeus monodon in a 6 m dia HDPE circular floating cage, installed at 10-12 m depth off Visakhapatnam in the Bay of Bengal. The cage was fitted with three cylindrical nets: an inner net of 2 mm mesh (6 m dia × 5.5 m height) to rear post-larvae from stocking to day 80; a middle net of 10 mm mesh (6 m × 5.5 m) to rear juveniles from day 81 onwards; and an outer net of 40 mm mesh (8 m dia × 4.5 m height) to prevent entry of predators. Post-larvae (PL23) of P. monodon (mean total length 16.1±3.8 mm) were stocked in the cage at a density of 1179 PL m⁻³. Bottom water parameters, viz. salinity, temperature, dissolved oxygen, pH and ammonia (NH3-N), recorded at the cage site during the culture period were 34-35 ppt, 26-32°C, 3.9-4.6 ml l⁻¹, 8.1-8.3 and 0.05-0.07 mg l⁻¹, respectively. Shrimp feed of sizes suited to the different growth stages was used during the trial. Feed was given twice a day during the first 20 days and thrice a day during days 21-99. The daily feeding rate ranged from 133.3% of shrimp biomass at stocking to 3.1% of shrimp biomass at pre-harvest. P. monodon registered a growth rate of 0.87 mm (TL)/0.018 g (wt) per day during the first 30 days, 0.89 mm/0.069 g per day during days 31-60, and 0.37 mm/0.051 g per day during days 61-90. On day 100, about 207 kg of shrimp (production rate 1.62 kg m⁻³) was harvested; the mean size of harvested shrimp was 81.54±15.20 mm TL/4.50±2.80 g wt. Survival was 31.4% and the feed conversion ratio (FCR) was 3.97. The present study demonstrated the possibility of culturing P. monodon in open sea floating cages. The possible causes of the low survival and high FCR are discussed.

    Approximate String Joins in a Database (Almost) for Free -- Erratum

    In [GIJ+01a, GIJ+01b] we described how to use q-grams in an RDBMS to perform approximate string joins, and we showed how to implement the approximate join using plain SQL queries. Specifically, we described three filters (the count filter, position filter, and length filter) that can be used to execute the approximate join efficiently. The intuition behind the count filter is that strings that are similar have many q-grams in common. In particular, two strings s1 and s2 can have up to max{|s1|, |s2|} + q - 1 common q-grams; when s1 = s2, they have exactly that many q-grams in common. When s1 and s2 are within edit distance k, they share at least (max{|s1|, |s2|} + q - 1) - kq q-grams, since kq is the maximum number of q-grams that can be affected by k edit distance operations.
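
    The count filter is easy to state in code. Below is a small sketch under the padding convention implied by the formula above, where a string is padded with q - 1 sentinel characters on each side so that a string of length |s| yields exactly |s| + q - 1 q-grams; the particular sentinels '#' and '$' are an assumption here, not necessarily the papers' choice.

    ```python
    from collections import Counter

    def qgrams(s, q=3):
        """Multiset of q-grams of s padded with q-1 sentinels on each side,
        so a string of length |s| yields exactly |s| + q - 1 q-grams."""
        padded = "#" * (q - 1) + s + "$" * (q - 1)
        return Counter(padded[i:i + q] for i in range(len(padded) - q + 1))

    def passes_count_filter(s1, s2, k, q=3):
        """Count filter: strings within edit distance k share at least
        (max(|s1|, |s2|) + q - 1) - k*q q-grams, since a single edit
        operation can affect at most q q-grams."""
        common = sum((qgrams(s1, q) & qgrams(s2, q)).values())
        return common >= max(len(s1), len(s2)) + q - 1 - k * q

    print(passes_count_filter("jonathan", "jonhatan", k=2))  # True: kept as a candidate
    print(passes_count_filter("jonathan", "elephant", k=1))  # False: pruned early
    ```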