
    Agent Based Models of Language Competition: Macroscopic descriptions and Order-Disorder transitions

    We investigate the dynamics of two agent-based models of language competition. In the first model, each individual can be in one of two possible states, using either language $X$ or language $Y$, while the second model incorporates a third state $XY$, representing individuals that use both languages (bilinguals). We analyze the models on complex networks and on two-dimensional square lattices by analytical and numerical methods, and show that they exhibit a transition from one-language dominance to language coexistence. We find that the coexistence of languages is harder to maintain in the bilinguals model, where the presence of bilinguals facilitates the ultimate dominance of one of the two languages. A stability analysis reveals that coexistence is less likely in poorly connected than in fully connected networks, and that the dominance of a single language is enhanced as the connectivity decreases. This dominance effect is even stronger in two-dimensional space, where domain coarsening tends to drive the system towards language consensus. (30 pages, 11 figures)
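
    To make the microscopic update rule concrete, the following C sketch simulates the two-state model in its simplest setting: unbiased, voter-like dynamics on a fully connected (mean-field) population. The population size, the seed, and the reduction to this special case are assumptions made for illustration; the paper's models also cover complex networks, lattices, and the bilingual XY state.

    /* Minimal mean-field sketch (assumed special case, not the paper's full
       model): each update, a random agent adopts the language of a random
       other agent, so switching probabilities are proportional to the
       current speaker fractions. The run drifts to one-language consensus. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        enum { N = 1000 };               /* population size (assumed) */
        srand(1234);
        int nx = N / 2;                  /* agents speaking X; the rest speak Y */
        long step = 0;
        while (nx != 0 && nx != N) {     /* iterate until consensus */
            double px = (double)nx / N;  /* current fraction of X speakers */
            int agent_is_x    = ((double)rand() / RAND_MAX) < px;
            int neighbor_is_x = ((double)rand() / RAND_MAX) < px;
            if (agent_is_x && !neighbor_is_x)      nx--;  /* agent adopts Y */
            else if (!agent_is_x && neighbor_is_x) nx++;  /* agent adopts X */
            step++;
        }
        printf("consensus on language %s after %ld updates\n",
               nx ? "X" : "Y", step);
        return 0;
    }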

    «Pocket» medicine


    Analytical Solution of the Voter Model on Disordered Networks

    We present a mathematical description of the voter model dynamics on heterogeneous networks. When the average degree of the graph is $\mu \leq 2$, the system reaches complete order exponentially fast. For $\mu > 2$, a finite system falls, before it fully orders, into a quasistationary state in which the average density of active links (links between opposite-state nodes) in surviving runs is constant and equal to $\frac{\mu-2}{3(\mu-1)}$, while an infinitely large system stays forever in a partially ordered stationary active state. The mean lifetime of the quasistationary state is proportional to the mean time to reach the fully ordered state $T$, which scales as $T \sim \frac{(\mu-1)\,\mu^2 N}{(\mu-2)\,\mu_2}$, where $N$ is the number of nodes of the network and $\mu_2$ is the second moment of the degree distribution. We find good agreement between these analytical results and numerical simulations on random networks with various degree distributions. (20 pages, 8 figures)
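
    As a worked instance of the two formulas above, the following C snippet evaluates the quasistationary active-link density and the ordering-time scaling for an assumed degree-regular random graph with $\mu = 8$ (hence $\mu_2 = \mu^2 = 64$) and $N = 10^4$ nodes. Since $T$ is given only up to the proportionality constant hidden by the $\sim$, the printed value is meaningful only in relative terms.

    /* Worked example of the paper's formulas for assumed network parameters. */
    #include <stdio.h>

    int main(void)
    {
        double mu = 8.0, mu2 = 64.0, N = 1e4;         /* assumed values */
        double rho = (mu - 2.0) / (3.0 * (mu - 1.0)); /* active-link density */
        double T   = (mu - 1.0) * mu * mu * N
                   / ((mu - 2.0) * mu2);              /* ordering-time scaling */
        printf("rho = %.4f, T ~ %.1f (arbitrary units)\n", rho, T);
        return 0;
    }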

    Analysis of the history and philosophy of science approach in chemistry textbooks: the case of atomic structure

    This work analyzes, from the perspective of history and philosophy of science (HPS), how the topic of atomic structure is presented in five textbooks recommended for first-year chemistry courses at the University of Barcelona. The analysis applied a methodological framework composed of four heterogeneous methodologies, with the aim of obtaining a more accurate account of the topic under study. The results show that most of the textbooks present an image of science disconnected from other contexts, including the scientific one: an image characterized by isolated discoveries produced by the work of individual scientists. This representation does not reflect what science really is or how scientific knowledge is actually generated. Moreover, the textbooks analyzed emphasize experimental work but cover theoretical details only superficially, if at all.

    Parallelization of the Stampack v7.10 code

    The main goal of this report is to show the performance improvements obtained in the Stampack code when it is used together with OpenMP on multi-core machines (e.g., Intel Core2 Duo, Intel Core2 Quad) or multi-threaded machines (e.g., Intel Core i3, i5, i7). The latter processor technology (multi-threading) significantly increases the available compute power since, in addition to providing more than one core, it raises the number of simultaneous task threads per core. For example, an Intel Core i3 processor can have two cores and execute two simultaneous tasks per core, which is equivalent to having a machine with four processors. The Intel Core i7 processor employed in this work has four cores and can execute two simultaneous tasks per core, which is equivalent to having eight processors. The parallelization of the Stampack software followed the basic principle of modifying the original source files of the serial version as little as possible. Under these conditions, a timing analysis was performed to detect the regions of the code that account for a significant percentage of the total computation time. Finally, it should be noted that the parallelized software must also run correctly on single-processor computers, so the parallel version must remain fully compatible with the original serial version of the Stampack code.
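
    As a rough illustration of the approach described above (minimal intervention in the serial sources, with the parallel build remaining fully compatible with the serial one), here is a generic C/OpenMP sketch. The kernel, the name internal_forces, and all values are hypothetical stand-ins, not Stampack source; the pragma is simply ignored when OpenMP is disabled, so the same file builds and runs correctly on a single-processor machine.

    /* Hypothetical hot loop parallelized with one OpenMP pragma; compiling
       without OpenMP support leaves behavior identical to the serial code. */
    #include <stdio.h>
    #ifdef _OPENMP
    #include <omp.h>
    #endif

    /* Toy element loop standing in for a time-dominant kernel. */
    static void internal_forces(int n, const double *strain, double *force)
    {
        #pragma omp parallel for schedule(static)
        for (int i = 0; i < n; i++)
            force[i] = 2.0e11 * strain[i];   /* toy constitutive update */
    }

    int main(void)
    {
        enum { N = 1000000 };
        static double strain[N], force[N];
        for (int i = 0; i < N; i++) strain[i] = 1.0e-6 * i;
        internal_forces(N, strain, force);
    #ifdef _OPENMP
        printf("ran with up to %d threads\n", omp_get_max_threads());
    #else
        printf("serial build\n");
    #endif
        return 0;
    }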

    Efficient and portable Winograd convolutions for multi-core processors

    We take a step towards developing high-performance codes for the convolution operator, based on the Winograd algorithm, that are easy to customise for general-purpose processor architectures. In our approach, the portability of the solution is increased by introducing vector instructions from Intel SSE/AVX2/AVX512 and ARM NEON/SVE to exploit the single-instruction multiple-data capabilities of current processors, as well as OpenMP pragmas to exploit multi-threaded parallelism. While this comes at the cost of sacrificing a fraction of the computational performance, our experimental results on three distinct processors (Intel Xeon Skylake, ARM Cortex-A57 and Fujitsu A64FX) show that the impact is affordable and still renders a Winograd-based solution that is competitive with the lowering-based GEMM convolution.
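
    For readers unfamiliar with the technique, the following self-contained C sketch shows the 1-D Winograd kernel F(2,3), the building block that the common 2-D F(2x2,3x3) variant nests: four multiplications yield two convolution outputs where the direct method needs six. The example is scalar and purely illustrative; the paper's codes apply such transforms to many tiles at once through SSE/AVX/NEON/SVE intrinsics and OpenMP.

    /* Winograd F(2,3): y[0..1] is the correlation of d[0..3] with the
       3-tap filter g[0..2], using 4 multiplications instead of 6. */
    #include <stdio.h>

    static void winograd_f23(const double d[4], const double g[3], double y[2])
    {
        double m1 = (d[0] - d[2]) * g[0];
        double m2 = (d[1] + d[2]) * 0.5 * (g[0] + g[1] + g[2]);
        double m3 = (d[2] - d[1]) * 0.5 * (g[0] - g[1] + g[2]);
        double m4 = (d[1] - d[3]) * g[2];
        y[0] = m1 + m2 + m3;
        y[1] = m2 - m3 - m4;
    }

    int main(void)
    {
        double d[4] = {1, 2, 3, 4}, g[3] = {0.25, 0.5, 0.25}, y[2];
        winograd_f23(d, g, y);
        /* direct check: y0 = 1*.25 + 2*.5 + 3*.25 = 2,
                         y1 = 2*.25 + 3*.5 + 4*.25 = 3 */
        printf("y = [%g, %g]\n", y[0], y[1]);
        return 0;
    }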

    Analyzing the impact of the MPI allreduce in distributed training of convolutional neural networks

    For many distributed applications, data communication poses an important bottleneck in terms of both performance and energy consumption. As more cores are integrated per node, the global performance of the system generally increases, yet it eventually becomes limited by the interconnection network. This is the case for distributed data-parallel training of convolutional neural networks (CNNs), which usually proceeds on a cluster with a small to moderate number of nodes. In this paper, we analyze the performance of the Allreduce collective communication primitive, a key to efficient data-parallel distributed training of CNNs. Our study targets the distinct realizations of this primitive in three high-performance instances of the Message Passing Interface (MPI), namely MPICH, OpenMPI, and IntelMPI, and employs a cluster equipped with state-of-the-art processor and network technologies. In addition, we apply the insights gained from the experimental analysis to the optimization of the TensorFlow framework when running on top of Horovod. Our study reveals that a careful selection of the most convenient MPI library and Allreduce (ARD) realization accelerates the training throughput by a factor of 1.2x compared with the default algorithm in the same MPI library, and by up to 2.8x when comparing distinct MPI libraries, for a number of relevant CNN model and dataset combinations.
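
    A minimal C sketch of the exchange being profiled, assuming the usual data-parallel pattern: every rank sums its local gradient with MPI_Allreduce and then divides by the world size to obtain the average. The buffer size and gradient contents are invented for illustration.

    /* Gradient averaging with the Allreduce primitive analyzed in the paper. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        enum { NPARAMS = 1 << 20 };            /* stand-in for a CNN's weights */
        static float grad[NPARAMS];
        for (int i = 0; i < NPARAMS; i++)
            grad[i] = (float)rank;             /* fake local gradient */

        /* In-place sum across all ranks, then scale to get the average. */
        MPI_Allreduce(MPI_IN_PLACE, grad, NPARAMS, MPI_FLOAT, MPI_SUM,
                      MPI_COMM_WORLD);
        for (int i = 0; i < NPARAMS; i++)
            grad[i] /= (float)size;

        if (rank == 0)
            printf("avg grad[0] = %g (expect %g)\n", grad[0], (size - 1) / 2.0);
        MPI_Finalize();
        return 0;
    }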

    Using machine learning to model the training scalability of convolutional neural networks on clusters of GPUs

    In this work, we build a general piece-wise model to analyze the data-parallel (DP) training costs of convolutional neural networks (CNNs) on clusters of GPUs. This general model is based on i) multi-layer perceptrons (MLPs) that model the NVIDIA cuDNN/cuBLAS library kernels involved in the training of some state-of-the-art CNNs; and ii) an analytical model of the NVIDIA NCCL Allreduce collective primitive using the Ring algorithm. The CNN training scalability study performed with this model, in combination with the Roofline technique, across varying batch sizes, node (floating-point) arithmetic performance, node memory bandwidth, network link bandwidth, and cluster dimension unveils crucial bottlenecks at both the GPU and the cluster level. To provide evidence for this analysis, we validate the accuracy of the proposed model against a Python library for distributed deep-learning training. Funding for open access charge: CRUE-Universitat Jaume I.
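
    As a sketch of what the analytical Allreduce component can look like, the following C snippet implements the standard alpha-beta cost model for a ring allreduce, T(p, n) = 2(p-1)(alpha + (n/p)/beta): a reduce-scatter plus an allgather, each of p-1 steps moving n/p bytes. The latency, bandwidth, and message-size values are assumptions for illustration, not parameters taken from the paper.

    /* Alpha-beta cost model of a ring allreduce over p nodes and n bytes. */
    #include <stdio.h>

    static double ring_allreduce_time(int p, double n_bytes,
                                      double alpha, double beta)
    {
        return 2.0 * (p - 1) * (alpha + (n_bytes / p) / beta);
    }

    int main(void)
    {
        const double alpha = 5e-6;    /* 5 us link latency (assumed)   */
        const double beta  = 12.5e9;  /* ~100 Gb/s bandwidth (assumed) */
        const double grad  = 100e6;   /* 100 MB of gradients (assumed) */
        for (int p = 2; p <= 32; p *= 2)
            printf("p=%2d  T=%.3f ms\n",
                   p, 1e3 * ring_allreduce_time(p, grad, alpha, beta));
        return 0;
    }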