
    Increasing the performance of the Wetland DEM Ponding Model using multiple GPUs

    Due to the lack of conventional drainage systems on the Canadian Prairies, excess water running off the landscape after snowmelt or heavy rainfall may be trapped in surface depressions ranging in size from puddles to permanent wetlands and may cause local flooding. Hydrological processes play an important role in the Canadian Prairies, and hydrological simulation models help researchers understand past hydrological events and predict future ones. Accurate simulations call for higher-resolution grids and larger simulation areas, which lead to larger-scale problems. However, the size of the problem is often limited by the available computational resources, and solving large systems can result in unacceptable simulation times. Improving computational efficiency and making full use of the available hardware is therefore an urgent task for hydrological researchers and software developers. The Wetland DEM Ponding Model (WDPM) was developed to model the distribution of runoff water on the Canadian Prairies; it helps determine the fraction of Prairie basins contributing flow to streams as that fraction changes dynamically with water storage in the depressions. In the WDPM, the water redistribution module is the most computationally intensive part. Previously, the WDPM was parallelized for a single CPU or a single GPU, which makes the water redistribution module more efficient. Multi-device parallel computing is a common way to increase the available computational resources and, with an appropriate parallel algorithm, can speed up an application substantially. This thesis develops a multiple-GPU parallel algorithm and investigates efficient data transmission methods, comparing them with the CPU-parallel and one-GPU-parallel algorithms. A technique that overlaps communication with computation is applied to optimize the parallel computing process. The thesis then evaluates the new implementation from several aspects. First, the output summary and the output system are compared between the new implementation and the original one; the solution converges as the simulation proceeds, verifying that the new implementation produces correct results. Second, the multiple-GPU code is profiled, confirming that the algorithm can be reorganized to take advantage of multiple GPUs and to carry out efficient data synchronization through the optimized techniques. Finally, numerical experiments show that the new implementation improves performance when using multiple GPUs and scales well. In particular, for a large system, the multiple-GPU implementation produces correct output and achieves roughly a 2.35-times performance improvement with four GPUs compared to one GPU.
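    A common way to organize such a multiple-GPU computation (an assumption for illustration, not necessarily the exact scheme developed in this thesis) is a slab decomposition of the DEM with one-row halos exchanged between devices every iteration. The sketch below shows only that communication pattern, on the CPU with NumPy; the per-slab update is a placeholder stencil rather than WDPM's actual water-redistribution rule, and all names are hypothetical.

```python
# Minimal sketch (not the WDPM source): slab decomposition of a DEM-sized grid
# with one-row halos, the exchange pattern a multi-GPU redistribution step
# would need each iteration.  The per-slab update is a placeholder stencil.
import numpy as np

def split_into_slabs(grid, n_dev):
    """Split rows of `grid` across n_dev devices, adding one halo row on
    each interior boundary."""
    rows = np.array_split(np.arange(grid.shape[0]), n_dev)
    slabs = []
    for r in rows:
        lo = max(r[0] - 1, 0)                # one halo row above (if any)
        hi = min(r[-1] + 2, grid.shape[0])   # one halo row below (if any)
        slabs.append(grid[lo:hi].copy())
    return slabs

def exchange_halos(slabs):
    """Copy boundary rows between neighbouring slabs.  On multiple GPUs this
    is the device-to-device transfer that gets overlapped with computation."""
    for i in range(len(slabs) - 1):
        slabs[i][-1] = slabs[i + 1][1]       # bottom halo <- neighbour's top owned row
        slabs[i + 1][0] = slabs[i][-2]       # top halo    <- neighbour's bottom owned row

def redistribution_step(slab):
    """Placeholder neighbour-averaging update standing in for WDPM's
    water-redistribution kernel."""
    out = slab.copy()
    out[1:-1] = (slab[:-2] + slab[2:] + 2.0 * slab[1:-1]) / 4.0
    return out

grid = np.random.rand(1024, 1024)
slabs = split_into_slabs(grid, n_dev=4)
for _ in range(10):                          # a few redistribution iterations
    exchange_halos(slabs)
    slabs = [redistribution_step(s) for s in slabs]
```

    In a GPU version, each slab would reside on its own device and the two halo copies would be issued asynchronously (for example on a separate stream) while that device updates the interior rows of its slab; this is what overlapping communication with computation means in this context.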

    Strong convergence of adaptive time-stepping schemes for the stochastic Allen--Cahn equation

    It is known from \cite{beccari} that the standard explicit Euler-type schemes (such as the exponential Euler and the linear-implicit Euler schemes) with a uniform timestep, though computationally efficient, may diverge for the stochastic Allen--Cahn equation. To overcome the divergence, this paper proposes and analyzes adaptive time-stepping schemes, which adapt the timestep at each iteration to keep the numerical solution away from instability. The \textit{a priori} estimates of the numerical solutions in the $\mathcal{C}(\mathcal{O})$-norm and the $\dot{H}^{\beta}(\mathcal{O})$-norm are established provided that the adaptive timestep function is suitably bounded, and these estimates play a key role in the convergence analysis. We show that the adaptive time-stepping schemes converge strongly with order $\frac{\beta}{2}$ in time and $\frac{\beta}{d}$ in space, where $d$ ($d=1,2,3$) is the dimension and $\beta\in(0,2]$. Numerical experiments show that the adaptive time-stepping schemes are simple to implement and have a lower computational cost than schemes with a uniform timestep.
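    To make the adaptivity idea concrete, the following is a minimal sketch (not the paper's scheme): a linear-implicit Euler discretization of the one-dimensional stochastic Allen--Cahn equation $\mathrm{d}u = (\partial_x^2 u + u - u^3)\,\mathrm{d}t + \mathrm{d}W$ with homogeneous Dirichlet boundary conditions, where the timestep shrinks whenever the sup-norm of the numerical solution grows. The particular timestep function $\tau = \tau_{\max}/(1+\|u\|_\infty^2)$ is illustrative only.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact scheme):
# linear-implicit Euler for the 1-D stochastic Allen--Cahn equation
#     du = (u_xx + u - u^3) dt + dW,  u(0) = u(1) = 0,
# with an adaptive timestep that shrinks when the solution gets large.
import numpy as np

N = 128                                    # interior grid points
h = 1.0 / (N + 1)
x = np.linspace(h, 1 - h, N)
# discrete Laplacian with homogeneous Dirichlet boundary conditions
A = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
     + np.diag(np.ones(N - 1), -1)) / h**2

u = np.sin(np.pi * x)                      # initial condition
rng = np.random.default_rng(0)
t, T, tau_max = 0.0, 1.0, 1e-3
I = np.eye(N)

while t < T:
    # adaptive timestep: small when the solution is large, capped by tau_max
    tau = min(tau_max / (1.0 + np.max(np.abs(u)) ** 2), T - t)
    # space-time white noise increment on the grid (scaled by 1/sqrt(h))
    dW = np.sqrt(tau / h) * rng.standard_normal(N)
    # linear-implicit Euler: Laplacian treated implicitly, the rest explicitly
    rhs = u + tau * (u - u ** 3) + dW
    u = np.linalg.solve(I - tau * A, rhs)
    t += tau

print("final sup-norm:", np.max(np.abs(u)))
```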

    Bounds for Blind Rate Adaptation

    A core challenge in wireless communication is choosing appropriate transmission rates for packets. This rate selection problem is well understood in the context of unicast communication from a sender to a known receiver that can reply with acknowledgments. The problem is more difficult, however, in the multicast scenario, where a sender must communicate with a potentially large and changing group of receivers with varied link qualities. In such settings, gathering feedback is inefficient, and achieving good performance for every receiver is complicated by the potential diversity of their link conditions. This paper tackles the problem from an algorithmic perspective, identifying near-optimal strategies for selecting rates that guarantee every receiver achieves throughput within a reasonable factor of the optimal capacity of its link to the sender. Our algorithms have the added benefit of being blind: they assume the sender has no information about the network and receives no feedback on its transmissions. We then prove new lower bounds on the fundamental difficulty of achieving good performance in the presence of fast fading (rapid and frequent changes to link quality), and conclude by studying strategies for achieving good throughput over multiple hops. We argue that our algorithms should be easy to implement precisely because they are blind: they are independent of the network structure and link qualities, and therefore robust to changes. Our theoretical framework yields many new open problems within the important general topic of distributed transmission rate selection.
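    As a concrete illustration of the setting, the sketch below simulates one very simple blind schedule: the sender cycles deterministically through power-of-two rates with no feedback, and a receiver whose link sustains rate c decodes every packet sent at a rate at most c, so its throughput stays within a logarithmic factor of c. This is a baseline for intuition only, not the paper's near-optimal algorithms; the rate set and channel model are assumptions.

```python
# Minimal sketch of a simple blind multicast schedule (an illustration of the
# problem setting, not the paper's algorithms): round-robin over power-of-two
# rates with no feedback from the receivers.
import math

R_MAX = 1024                               # assumed maximum transmission rate
RATES = [2 ** i for i in range(int(math.log2(R_MAX)) + 1)]   # 1, 2, ..., 1024

def blind_schedule(num_slots):
    """Deterministic, feedback-free rate schedule: round-robin over RATES."""
    return [RATES[t % len(RATES)] for t in range(num_slots)]

def throughput(schedule, capacity):
    """Bits delivered per slot to a receiver that decodes rates <= capacity."""
    delivered = sum(rate for rate in schedule if rate <= capacity)
    return delivered / len(schedule)

schedule = blind_schedule(num_slots=10_000)
for capacity in (3, 40, 700):              # heterogeneous receivers
    tput = throughput(schedule, capacity)
    print(f"capacity {capacity:4d}: throughput {tput:7.2f} "
          f"(gap factor {capacity / tput:.1f})")
```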

    Probabilistic power flow calculation using principal component analysis-based compressive sensing

    The increasing scale of renewable energy injection has brought great uncertainty to the operation of the power grid. In this situation, probabilistic power flow (PPF) calculation has been introduced to mitigate the low accuracy of traditional deterministic power flow calculation in describing the operating status and power flow distribution of power systems. The polynomial chaos expansion (PCE) method has become popular in PPF analysis due to its high efficiency and accuracy, and sparse PCE has increased its capability to cope with the curse of dimensionality. In this paper, we propose a principal component analysis-based compressive sensing (PCA-CS) algorithm to solve the PPF problem. The l1-optimization of compressive sensing is used to tackle the high dimensionality of sparse PCE, and PCA is included to further increase the sparsity of the expansion coefficient matrix. Theoretical and numerical simulation results show that the proposed method can effectively improve the efficiency of PPF calculation for random inputs of higher dimension.
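    The sketch below illustrates, on a toy multi-output response rather than an actual power-flow model, how the two ingredients fit together: PCA compresses the vector of outputs into a few principal components, and l1-regularized regression (scikit-learn's Lasso here, as a stand-in for a compressive-sensing solver) recovers sparse PCE coefficients for each retained component from fewer samples than basis terms. The basis, degrees, and sample sizes are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of the PCA + l1 (compressive-sensing) idea for sparse PCE,
# using a toy response in place of a power-flow solver.
import itertools
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
d, degree, n_samples, n_outputs = 6, 3, 80, 20

# multi-index set of total degree <= degree (84 terms for d=6, degree=3)
multi_indices = [m for m in itertools.product(range(degree + 1), repeat=d)
                 if sum(m) <= degree]

def pce_basis(xi):
    """Evaluate the tensor-product (probabilists') Hermite basis at samples xi."""
    cols = []
    for m in multi_indices:
        col = np.ones(len(xi))
        for j, deg in enumerate(m):
            col *= hermeval(xi[:, j], [0] * deg + [1])
        cols.append(col)
    return np.column_stack(cols)

xi = rng.standard_normal((n_samples, d))           # random (Gaussian) inputs
Psi = pce_basis(xi)                                # fewer samples than basis terms

# toy multi-output response standing in for bus voltages / line flows
W = rng.standard_normal((d, n_outputs))
Y = np.tanh(xi @ W) + 0.2 * (xi ** 2) @ W

pca = PCA(n_components=4).fit(Y)                   # compress the outputs
scores = pca.transform(Y)

# l1-regularized recovery of sparse PCE coefficients per principal component
coeffs = np.column_stack([
    Lasso(alpha=1e-2, max_iter=50_000).fit(Psi, scores[:, k]).coef_
    for k in range(scores.shape[1])
])
print("basis terms:", Psi.shape[1], "samples:", n_samples,
      "non-zero coefficients per PC:", (np.abs(coeffs) > 1e-8).sum(axis=0))
```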

    The characterization of actions at the superordinate, basic and subordinate level

    Objects can be categorized at different levels of abstraction, ranging from the superordinate (e.g., fruit) and the basic (e.g., apple) to the subordinate level (e.g., golden delicious). The basic level is assumed to play a key role in categorization, e.g., in terms of the number of features used to describe category members and the speed of processing. To what degree do these principles also apply to the categorization of observed actions? To address this question, we first selected a range of actions at the superordinate (e.g., locomotion), basic (e.g., to swim) and subordinate level (e.g., to swim breaststroke), using verbal material (Experiments 1–3). Experiments 4–6 aimed to determine the characteristics of these actions across the three taxonomic levels. Using a feature-listing paradigm (Experiment 4), we determined the number of features that were provided by at least six out of twenty participants (common features), separately for the three levels. In addition, we examined the number of shared features (i.e., provided for more than one category) and distinct features (i.e., provided for one category only). Participants produced the highest number of common features for actions at the basic level. Actions at the subordinate level shared more features with other actions at the same level than actions at the superordinate level did. Actions at the superordinate and basic level were described with more distinct features than actions at the subordinate level. Using an auditory priming paradigm (Experiment 5), we observed that participants responded faster to action images preceded by a matching auditory cue at the basic and subordinate level, but not at the superordinate level, suggesting that the basic level is the most abstract level at which verbal cues facilitate the processing of an upcoming action. Using a category verification task (Experiment 6), we found that participants verified action categories (depicted as images) faster and more accurately at the basic and subordinate level than at the superordinate level. Together, in line with the object categorization literature, our results suggest that information about action categories is maximized at the basic level.

    Overlapping representations of observed actions and action‐related features

    The lateral occipitotemporal cortex (LOTC) has been shown to capture the representational structure of observed actions, but so far only for a relatively small range of actions. In the current study, we carried out an fMRI experiment in which we presented human participants with images depicting 100 different actions and used representational similarity analysis (RSA) to determine which brain regions capture the semantic action space established using judgments of action similarity. Moreover, to determine the contribution of a wide range of action-related features to the neural representation of the semantic action space, we constructed an action feature model on the basis of ratings of 44 different features. We found that the semantic action space model and the action feature model were best captured by overlapping activation patterns in bilateral LOTC and ventral occipitotemporal cortex (VOTC). An RSA on eight dimensions resulting from a principal component analysis of the action feature model revealed partly overlapping representations within bilateral LOTC, VOTC, and the parietal lobe. Our results suggest spatially overlapping representations of the semantic action space of a wide range of actions and of the corresponding action-related features. Together, these results add to our understanding of the kinds of representations along the LOTC that support action understanding.
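    The central analysis step, correlating a model representational dissimilarity matrix (RDM) with a neural RDM, can be sketched as follows on simulated data. This is a generic RSA illustration, not the study's searchlight pipeline; the pattern and feature matrices are made up, with sizes chosen to echo the 100 actions and 44 features mentioned above.

```python
# Minimal, generic RSA sketch on toy data: build a neural RDM from simulated
# voxel patterns for 100 actions, a model RDM from simulated feature ratings,
# and correlate their condition-pair dissimilarities.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_actions, n_voxels, n_features = 100, 300, 44

features = rng.random((n_actions, n_features))            # e.g. 44 feature ratings
# simulated ROI patterns that partly reflect the feature structure
patterns = (features @ rng.standard_normal((n_features, n_voxels))
            + 5.0 * rng.standard_normal((n_actions, n_voxels)))

neural_rdm = pdist(patterns, metric="correlation")         # 1 - Pearson r per pair
model_rdm = pdist(features, metric="correlation")

rho, p = spearmanr(neural_rdm, model_rdm)
print(f"model-neural RDM correlation: rho = {rho:.3f}, p = {p:.2g}")
```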

    Lesion segmentation on 18F-fluciclovine PET/CT images using deep learning

    Background and purpose: A novel radiotracer, 18F-fluciclovine (anti-3-18F-FACBC), has been demonstrated to be associated with significantly improved survival when it is used in PET/CT imaging to guide post-prostatectomy salvage radiotherapy for prostate cancer. We aimed to investigate the feasibility of using a deep learning method to automatically detect and segment lesions on 18F-fluciclovine PET/CT images. Materials and methods: We retrospectively identified 84 patients who were enrolled in Arm B of the Emory Molecular Prostate Imaging for Radiotherapy Enhancement (EMPIRE-1) trial. All 84 patients had prostate adenocarcinoma and underwent prostatectomy and 18F-fluciclovine PET/CT imaging, with lesions identified and delineated by physicians. Three neural networks with increasing levels of complexity (U-net, Cascaded U-net, and a cascaded detection-segmentation network) were trained and tested on the 84 patients with a fivefold cross-validation strategy and a hold-out test, using manual contours as the ground truth. We also investigated using both PET and CT, or PET only, as input to the neural network. Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance (HD95), center-of-mass distance (CMD), and volume difference (VD) were used to quantify the quality of the segmentation results against the ground-truth contours provided by physicians. Results: The three deep learning methods detected 144/155 lesions with PET+CT as input and 153/155 lesions with PET only as input. Quantitative results demonstrated that the best-performing network segmented lesions with an average DSC of 0.68 ± 0.15 and an HD95 of 4 ± 2 mm. The center of mass of the segmented contours deviated from the physician contours by approximately 2 mm on average, and the volume difference was less than 1 cc. The novel network proposed here achieved the best performance among the networks compared. Adding CT as input to the neural network led to more failure cases (DSC = 0); among cases with DSC > 0, there was no statistically significant difference between PET+CT and PET-only input for our proposed method. Conclusion: These quantitative results demonstrate the feasibility of deep learning methods for automatically segmenting lesions on 18F-fluciclovine PET/CT images. This indicates the great potential of 18F-fluciclovine PET/CT combined with deep learning to provide a second check in identifying lesions and to save physicians time and effort in contouring.
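    For reference, the two headline metrics can be computed as follows for 3-D binary masks. This is a generic, simplified implementation on toy arrays (spheres standing in for predicted and manual contours), not the study's networks or data; the HD95 here uses full-mask distance transforms as a common approximation of the surface-based definition.

```python
# Minimal sketch of the Dice similarity coefficient (DSC) and the
# 95th-percentile Hausdorff distance (HD95) for 3-D binary segmentation masks.
import numpy as np
from scipy.ndimage import distance_transform_edt

def dice(pred, gt):
    """DSC = 2|A ∩ B| / (|A| + |B|); returns 0 if both masks are empty."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom else 0.0

def hd95(pred, gt, spacing=(1.0, 1.0, 1.0)):
    """95th-percentile symmetric Hausdorff distance (in mm, via voxel spacing)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    d_to_gt = distance_transform_edt(~gt, sampling=spacing)[pred]     # pred -> gt
    d_to_pred = distance_transform_edt(~pred, sampling=spacing)[gt]   # gt -> pred
    return np.percentile(np.concatenate([d_to_gt, d_to_pred]), 95)

# toy example: two overlapping spheres in place of predicted / manual contours
zz, yy, xx = np.mgrid[:64, :64, :64]
gt = (zz - 32) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 < 10 ** 2
pred = (zz - 30) ** 2 + (yy - 33) ** 2 + (xx - 32) ** 2 < 9 ** 2
print(f"DSC  = {dice(pred, gt):.3f}")
print(f"HD95 = {hd95(pred, gt, spacing=(2.0, 1.0, 1.0)):.1f} mm")
```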