Using Hierarchical Data Mining to Characterize Performance of Wireless System Configurations
This paper presents a statistical framework for assessing wireless systems
performance using hierarchical data mining techniques. We consider WCDMA
(wideband code division multiple access) systems with two-branch STTD (space
time transmit diversity) and 1/2 rate convolutional coding (forward error
correction codes). Monte Carlo simulation estimates the bit error probability
(BEP) of the system across a wide range of signal-to-noise ratios (SNRs). A
performance database of simulation runs is collected over a targeted space of
system configurations. This database is then mined to obtain regions of the
configuration space that exhibit acceptable average performance. The shape of
the mined regions illustrates the joint influence of configuration parameters
on system performance. The role of data mining in this application is to
provide explainable and statistically valid design conclusions. The research
issue is to define statistically meaningful aggregation of data in a manner
that permits efficient and effective data mining algorithms. We achieve a good
compromise between these goals and help establish the applicability of data
mining for characterizing wireless systems performance.
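The Monte Carlo BEP estimation described above can be illustrated with a minimal sketch. This is not the paper's WCDMA/STTD simulator, which is far more elaborate; it estimates the bit error probability of uncoded BPSK over an AWGN channel, purely to show the estimation principle of sweeping SNR and counting decision errors.

```python
import numpy as np

def estimate_bep(snr_db, n_bits=100_000, rng=None):
    """Monte Carlo estimate of bit error probability (BEP) at one SNR point."""
    rng = rng or np.random.default_rng(0)
    snr = 10.0 ** (snr_db / 10.0)                   # convert dB to linear Eb/N0
    bits = rng.integers(0, 2, n_bits)
    symbols = 2.0 * bits - 1.0                      # BPSK mapping: 0 -> -1, 1 -> +1
    noise = rng.normal(0.0, np.sqrt(1.0 / (2.0 * snr)), n_bits)
    decisions = (symbols + noise) > 0.0             # hard decision at the receiver
    return np.mean(decisions != bits)               # fraction of bit errors

# Sweep a range of SNRs, as the paper's performance database does per configuration.
beps = {snr: estimate_bep(snr) for snr in range(0, 10, 2)}
```

Each configuration in the paper's database would correspond to one such sweep; the mined regions are then sets of configurations whose BEP curves meet an acceptance criterion.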
A Multivariate Discriminant-Rule Discovery Algorithm for Decision and Regression Trees
Hiroshima University, Doctor of Engineering dissertation
Mining Optimized Association Rules for Numeric Attributes
Abstract: Given a huge database, we address the problem of finding association rules for numeric attributes, such as (Balance ∈ I) ⇒ (CardLoan = yes), which implies that bank customers whose balances fall in a range I are likely to use a card loan with a probability greater than p. The above rule is interesting only if the range I has some special feature with respect to the interrelation between Balance and CardLoan. It is required that the number of customers whose balances are contained in I (called the support of I) is sufficient and also that the probability p of the condition CardLoan = yes being met (called the confidence ratio) be much higher than the average probability of the condition over all the data. Our goal is to realize a system that finds such appropriate ranges automatically. We mainly focus on computing two optimized ranges: one that maximizes the support on the condition that the confidence ratio is at least a given threshold value, and another that maximizes the confidence ratio on the condition that the support is at least a given threshold number. Using techniques from computational geometry, we present novel algorithms that compute the optimized ranges in linear time if the data are sorted. Since sorting data with respect to each numeric attribute is expensive in the case of huge databases that occupy much more space than the main memory, we instead apply randomized bucketing as the preprocessing method and thus obtain an efficient rule-finding system. Tests show that our implementation is fast not only in theory but also in practice. The efficiency of our algorithm enables us to compute optimized rules for all combinations of hundreds of numeric and Boolean attributes in a reasonable time.
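The support-maximizing variant of the problem can be sketched directly from its definition. The brute-force O(n²) scan below is only for clarity; the paper attains linear time on sorted data via computational-geometry techniques, and the data values here are hypothetical.

```python
def optimized_support_range(records, theta):
    """Find the range I over a numeric attribute that maximizes support,
    subject to confidence >= theta.

    records: list of (value, met) pairs, e.g. (balance, card_loan == "yes").
    Returns (support, range_low, range_high) or None if no range qualifies.
    """
    records = sorted(records)          # order by the numeric attribute
    n = len(records)
    best = None
    for i in range(n):
        hits = 0
        for j in range(i, n):
            hits += records[j][1]      # count records meeting the consequent
            support = j - i + 1
            if hits / support >= theta and (best is None or support > best[0]):
                best = (support, records[i][0], records[j][0])
    return best

# Toy data: (balance, uses card loan) pairs, hypothetical values.
data = [(100, 0), (200, 1), (300, 1), (400, 1), (500, 0), (600, 1), (700, 0)]
best = optimized_support_range(data, theta=0.6)   # -> (6, 100, 600)
```

The confidence-maximizing variant swaps the roles of the two constraints: fix a minimum support and take the qualifying range with the highest hit ratio.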
A Similarity Measure for GPU Kernel Subgraph Matching
Accelerator architectures specialize in executing SIMD (single instruction,
multiple data) in lockstep. Because the majority of CUDA applications are
parallelized loops, control flow information can provide an in-depth
characterization of a kernel. CUDAflow is a tool that statically separates CUDA
binaries into basic block regions and dynamically measures instruction and
basic block frequencies. CUDAflow captures this information in a control flow
graph (CFG) and performs subgraph matching across various kernels' CFGs to gain
insights into an application's resource requirements, based on the shape and
traversal of the graph, the instruction operations executed, and the registers
allocated, among other information. The utility of CUDAflow is demonstrated
with SHOC and Rodinia application case studies on a variety of GPU
architectures, revealing novel thread divergence characteristics that help
end users, autotuners, and compilers generate high-performing code.
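A much-simplified stand-in for the kernel-matching idea: reduce each kernel to a profile of dynamic instruction counts and compare profiles by cosine similarity. CUDAflow's actual measure operates on CFG subgraphs; the opcode names and counts below are hypothetical.

```python
import math
from collections import Counter

def kernel_similarity(freq_a, freq_b):
    """Cosine similarity between two kernels' instruction-frequency profiles;
    kernels with similar resource behavior score close to 1.0."""
    keys = set(freq_a) | set(freq_b)
    dot = sum(freq_a.get(k, 0) * freq_b.get(k, 0) for k in keys)
    na = math.sqrt(sum(v * v for v in freq_a.values()))
    nb = math.sqrt(sum(v * v for v in freq_b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical dynamic instruction counts for three kernels.
k1 = Counter({"FFMA": 900, "LDG": 120, "BRA": 40})   # compute-bound
k2 = Counter({"FFMA": 850, "LDG": 150, "BRA": 60})   # compute-bound
k3 = Counter({"LDG": 700, "STG": 650, "BRA": 10})    # memory-bound
```

Here k1 and k2 score as near-duplicates while k3 stands apart, the kind of grouping that lets an autotuner reuse tuning decisions across similar kernels.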
BioScript: programming safe chemistry on laboratories-on-a-chip
This paper introduces BioScript, a domain-specific language (DSL) for programmable biochemistry that executes on emerging microfluidic platforms. The goal of this research is to provide a simple, intuitive, and type-safe DSL that is accessible to life science practitioners. The novel feature of the language is its syntax, which aims to optimize human readability; the technical contributions of the paper include the BioScript type system and relevant portions of its compiler. The type system ensures that certain classes of errors specific to biochemistry, including interactions between chemicals that may be unsafe, do not occur. The compiler includes novel optimizations that place biochemical operations to execute concurrently on a spatial 2D array platform at the granularity of a control flow graph, as opposed to individual basic blocks. Results are obtained using both a cycle-accurate microfluidic simulator and a software interface to a real-world platform.
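The safety guarantee can be sketched in miniature: tag each chemical with a reactivity group and reject mixes whose group pair is flagged unsafe. BioScript's real type system is far richer and rules such errors out statically at compile time; the groups and pairings below are hypothetical.

```python
# Hypothetical reactivity groups; a real table would follow a chemical
# compatibility standard rather than these two illustrative pairs.
UNSAFE_PAIRS = {frozenset({"acid", "base"}), frozenset({"oxidizer", "flammable"})}

class Chemical:
    def __init__(self, name, group):
        self.name, self.group = name, group

def mix(a, b):
    """Refuse mixes whose reactivity groups are flagged unsafe, mimicking
    the class of error BioScript's type checker prevents."""
    if frozenset({a.group, b.group}) in UNSAFE_PAIRS:
        raise TypeError(f"unsafe interaction: {a.name} + {b.name}")
    return Chemical(f"{a.name}+{b.name}", "neutral")
```

In a statically typed DSL the `TypeError` surfaces before any fluid moves, which is precisely the point of checking at the language level rather than on the device.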
Fuzzy-Granular Based Data Mining for Effective Decision Support in Biomedical Applications
Due to the complexity of biomedical problems, adaptive and intelligent knowledge discovery and data mining systems are needed to help humans understand the inherent mechanisms of diseases. For biomedical classification problems, it is typically impossible to build a perfect classifier with 100% prediction accuracy. Hence, a more realistic target is to build an effective Decision Support System (DSS). In this dissertation, a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, is proposed to build such a DSS for binary classification problems in the biomedical domain. Empirical studies show that FARM-DS is competitive with state-of-the-art classifiers in terms of prediction accuracy. More importantly, FARs can provide strong decision support on disease diagnoses due to their easy interpretability. This dissertation also proposes a fuzzy-granular method to select informative and discriminative genes from huge microarray gene expression data. With fuzzy granulation, information loss in the process of gene selection is decreased. As a result, more informative genes for cancer classification are selected and more accurate classifiers can be modeled. Empirical studies show that the proposed method is more accurate than traditional algorithms for cancer classification, and hence we expect the selected genes to be more helpful for further biological studies.
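The building blocks of a fuzzy association rule can be sketched briefly. This shows only the generic FAR idea (fuzzy memberships and a min-based confidence), not the FARM-DS algorithm itself; the membership parameters and sample values are hypothetical.

```python
def tri(x, a, b, c):
    """Triangular fuzzy membership with peak at b over the support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_confidence(samples, antecedent, consequent):
    """Confidence of a fuzzy rule: sum of min(antecedent, consequent)
    memberships divided by the sum of antecedent memberships."""
    num = sum(min(antecedent(x), consequent(y)) for x, y in samples)
    den = sum(antecedent(x) for x, y in samples)
    return num / den if den else 0.0

# Rule sketch: IF expression is "high" THEN class is positive.
high = lambda x: tri(x, 0.5, 1.0, 1.5)            # hypothetical "high" fuzzy set
samples = [(0.9, 1.0), (0.8, 1.0), (0.2, 0.0)]    # (expression, class membership)
conf = fuzzy_confidence(samples, high, lambda y: y)
```

The interpretability claim in the abstract rests on exactly this readability: each mined rule is an IF-THEN statement over named fuzzy sets rather than an opaque decision surface.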
Initial detailed routing algorithms
In this work, we present a study of the problem of routing in the context of the VLSI physical synthesis flow. We study fundamental routing algorithms such as maze routing, A*, and Steiner tree-based algorithms, as well as some global routing algorithms, namely FastRoute 4.0 and BoxRouter 2.0. We dissect some of the major state-of-the-art initial detailed routing tools, such as RegularRoute, TritonRoute, SmartDR, and Dr.CU 2.0. We also propose an initial detailed routing flow and present an implementation of it, with a track assignment technique that models the problem as an instance of the maximum weighted independent set (MWIS) problem and uses integer linear programming (ILP) as a solver. The implementation of the proposed flow also includes a multiple-source, multiple-target A* for terminal and net connection with adjustable rules and weights. Finally, we present a study of the results obtained by the implementation of the proposed initial detailed routing flow and a comparison with the ISPD 2019 contest winners, using the ISPD 2019 benchmark suite and evaluation tools.
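The A*-based connection step referenced above can be sketched minimally. This is a single-source, single-target grid A* with a Manhattan-distance heuristic, not the multi-source, multi-target, rule-weighted variant the flow implements; the grid and terminals are hypothetical.

```python
import heapq

def astar(grid, start, goal):
    """Shortest path on a routing grid (0 = free, 1 = blocked) via A*."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # admissible heuristic
    frontier = [(h(start), 0, start, [start])]               # (f, g, node, path)
    seen = set()
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                heapq.heappush(frontier,
                               (g + 1 + h((nr, nc)), g + 1, (nr, nc), path + [(nr, nc)]))
    return None  # terminals are not connectable
```

A detailed router would extend the cost model with per-layer track costs, via penalties, and design-rule weights, which is where the adjustable rules and weights of the proposed flow come in.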
Scalable stellar evolution forecasting: Deep learning emulation vs. hierarchical nearest neighbor interpolation
Many astrophysical applications require efficient yet reliable forecasts of
stellar evolution tracks. One example is population synthesis, which generates
forward predictions of models for comparison with observations. The majority of
state-of-the-art population synthesis methods are based on analytic fitting
formulae to stellar evolution tracks that are computationally cheap to sample
statistically over a continuous parameter range. Running detailed stellar
evolution codes, such as MESA, over wide and densely sampled parameter grids is
prohibitively expensive computationally, while stellar-age based linear
interpolation in-between sparsely sampled grid points leads to intolerably
large systematic prediction errors. In this work, we provide two automated
interpolation methods that find satisfactory trade-off points between
cost-efficiency and accuracy. We construct a timescale-adapted evolutionary
coordinate and use it in a two-step interpolation scheme that traces the
evolution of stars from zero age main sequence all the way to the end of core
helium burning while covering a mass range from to . The feedforward neural network regression model (first
solution) that we train to predict stellar surface variables can make millions
of predictions, sufficiently accurate over the entire parameter space, within
tens of seconds on a 4-core CPU. The hierarchical nearest neighbor
interpolation algorithm (second solution) that we hard-code to the same end
achieves even higher predictive accuracy and remains applicable
to all stellar variables evolved over time, but it is two orders of magnitude
slower. Our methodological framework is demonstrated to work on the MIST data
set. Finally, we discuss prospective applications and provide guidelines how to
generalize our methods to higher dimensional parameter spaces.
Comment: Submitted to A&
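The two-step interpolation idea can be sketched in miniature: map each track onto a normalized evolutionary coordinate, then interpolate between the bracketing grid masses at fixed coordinate. The scheme below is a toy stand-in for the paper's hierarchical method, with hypothetical track functions.

```python
import bisect

def interpolate_track(grids, mass, s):
    """Predict a stellar surface variable at an off-grid mass.

    grids: {grid_mass: track}, where track(s) returns the variable at
    evolutionary coordinate s in [0, 1] (0 = ZAMS, 1 = end of track).
    Masses outside the grid are clamped to the nearest grid track.
    """
    masses = sorted(grids)
    i = bisect.bisect_left(masses, mass)
    if i == 0:
        return grids[masses[0]](s)          # below the grid: nearest track
    if i == len(masses):
        return grids[masses[-1]](s)         # above the grid: nearest track
    m0, m1 = masses[i - 1], masses[i]
    w = (mass - m0) / (m1 - m0)             # linear weight between grid points
    return (1 - w) * grids[m0](s) + w * grids[m1](s)
```

Interpolating at fixed evolutionary coordinate rather than fixed stellar age is what suppresses the systematic errors that plain age-based interpolation incurs between sparsely sampled grid points.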