
    Data Science for Entrepreneurship Research: Studying Demand Dynamics for Entrepreneurial Skills in the Netherlands

    The recent rise of big data and artificial intelligence (AI) is changing markets, politics, organizations, and societies. It also affects the domain of research. Supported by new statistical methods that rely on computational power and computer science --- data science methods --- we are now able to analyze data sets that can be huge, multidimensional, unstructured, and diversely sourced. In this paper, we describe the most prominent data science methods suitable for entrepreneurship research and provide links to literature and Internet resources for self-starters. We survey how data science methods have been applied in the entrepreneurship research literature. As a showcase of data science techniques, based on a dataset covering 95% of all job vacancies in the Netherlands over a 6-year period, comprising 7.7 million data points, we provide an original analysis of the demand dynamics for entrepreneurial skills in the Netherlands. We show which entrepreneurial skills are particularly important for which type of profession. Moreover, we find that demand for both entrepreneurial and digital skills has increased for managerial positions, but not for others. We also find that entrepreneurial skills were in significantly higher demand than digital skills over the entire period 2012-2017, and that the absolute importance of entrepreneurial skills has increased even more than that of digital skills for managers, despite the impact of datafication on the labor market. We conclude that further study of entrepreneurial skills in the general population --- outside the domain of entrepreneurs --- is a rewarding subject for future research.
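    As a rough, hypothetical illustration of the kind of demand-dynamics analysis described above (not the authors' actual pipeline), the sketch below flags skill mentions in vacancy texts and tracks the share of vacancies demanding each skill type per profession and year; the column names and keyword lists are invented for the example.

        import pandas as pd

        # Hypothetical vacancy data: one row per job posting.
        vacancies = pd.DataFrame({
            "year": [2012, 2012, 2017, 2017],
            "profession": ["manager", "clerk", "manager", "clerk"],
            "text": [
                "seeking an entrepreneurial self-starter",
                "data entry and filing",
                "entrepreneurial mindset, python and sql skills",
                "customer service experience",
            ],
        })

        # Illustrative keyword lists; a real study would use a validated skill taxonomy.
        ENTREPRENEURIAL = ["entrepreneurial", "self-starter", "initiative"]
        DIGITAL = ["python", "sql", "data analysis"]

        def mentions_any(text, keywords):
            # True if the vacancy text contains any of the given skill keywords.
            lower = text.lower()
            return any(kw in lower for kw in keywords)

        vacancies["entrepreneurial"] = vacancies["text"].apply(lambda t: mentions_any(t, ENTREPRENEURIAL))
        vacancies["digital"] = vacancies["text"].apply(lambda t: mentions_any(t, DIGITAL))

        # Share of vacancies demanding each skill type, per profession and year.
        print(vacancies.groupby(["profession", "year"])[["entrepreneurial", "digital"]].mean())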

    Preface: Swarm Intelligence, Focus on Ant and Particle Swarm Optimization

    In the era of globalisation, emerging technologies are pushing engineering industries toward an increasingly multifaceted state. The escalating complexity has driven researchers to find ways of easing the solution of such problems, motivating them to draw ideas from nature and transplant them into the engineering sciences. This way of thinking has led to the emergence of many biologically inspired algorithms that have proven efficient in handling computationally complex problems, such as the Genetic Algorithm (GA), Ant Colony Optimization (ACO), and Particle Swarm Optimization (PSO). Motivated by the capability of biologically inspired algorithms, the present book, "Swarm Intelligence: Focus on Ant and Particle Swarm Optimization", aims to present recent developments and applications concerning optimization with swarm intelligence techniques. The papers selected for this book comprise a cross-section of topics that reflect a variety of perspectives and disciplinary backgrounds. In addition to introducing new concepts of swarm intelligence, the book presents selected representative case studies covering power plant maintenance scheduling; geotechnical engineering; design and machining tolerances; layout problems; manufacturing process planning; job-shop scheduling; structural design; environmental dispatching problems; wireless communication; water distribution systems; multi-plant supply chains; fault diagnosis of airplane engines; and process scheduling. I believe the 27 chapters presented in this book adequately reflect these topics.
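    For readers new to the area, the velocity/position iteration below is the canonical PSO update rule that many of the chapters build on; this is a generic textbook sketch on a toy objective, not code from the book.

        import random

        def sphere(x):
            # Toy objective: sum of squares, minimized at the origin.
            return sum(xi * xi for xi in x)

        def pso(f, dim=2, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
            # Minimal particle swarm optimizer with the textbook update rule:
            #   v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x);  x <- x + v
            X = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
            V = [[0.0] * dim for _ in range(n_particles)]
            pbest = [x[:] for x in X]      # best position found by each particle
            gbest = min(pbest, key=f)      # best position found by the whole swarm
            for _ in range(iters):
                for i in range(n_particles):
                    for d in range(dim):
                        r1, r2 = random.random(), random.random()
                        V[i][d] = (w * V[i][d]
                                   + c1 * r1 * (pbest[i][d] - X[i][d])
                                   + c2 * r2 * (gbest[d] - X[i][d]))
                        X[i][d] += V[i][d]
                    if f(X[i]) < f(pbest[i]):
                        pbest[i] = X[i][:]
                gbest = min(pbest, key=f)
            return gbest

        best = pso(sphere)
        print(best, sphere(best))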

    Developing Efficient Metaheuristics for Communication Network Problems by using Problem-specific Knowledge

    Metaheuristics, such as evolutionary algorithms or simulated annealing, are widely applicable heuristic optimization strategies that have shown encouraging results for a large number of difficult optimization problems. To achieve high performance, metaheuristics need to be adapted to the properties of the problem at hand. This paper illustrates how efficient metaheuristics can be developed for communication network problems by utilizing problem-specific knowledge in the design of a high-quality problem representation. The minimum communication spanning tree (MCST) problem seeks a communication spanning tree that connects all nodes and satisfies their communication requirements at minimum total cost. An investigation into the properties of the problem reveals that optimal solutions are similar to the minimum spanning tree (MST). Consequently, a problem-specific representation, the link-biased (LB) encoding, is developed, which represents trees as lists of floats. The LB encoding exploits the knowledge that optimal solutions are similar to the MST by encoding trees similar to the MST with higher probability. Experimental results for different types of metaheuristics show that metaheuristics using the LB encoding efficiently solve existing MCST problem instances from the literature, as well as randomly generated MCST problems of different sizes and types.
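    To make the representation concrete, here is a minimal sketch of a link-biased decoder under common textbook assumptions: the modified-cost formula c'_ij = c_ij + P * b_ij * c_max and the parameter P are taken from the general LB literature, not necessarily this paper's exact definition. The genotype is one float per link, and decoding runs an MST algorithm on costs shifted toward the original MST.

        import heapq

        def decode_lb(costs, biases, P=1.0):
            # Decode a link-biased genotype into a spanning tree.
            #   costs[i][j]  : original link costs (symmetric matrix)
            #   biases[i][j] : float genotype, one bias value per link
            #   P            : bias strength; P = 0 decodes to the plain MST
            # The tree is the MST under the modified costs
            #   c'_ij = c_ij + P * b_ij * c_max,
            # so small biases keep the decoded tree close to the original MST.
            n = len(costs)
            c_max = max(max(row) for row in costs)
            mod = [[costs[i][j] + P * biases[i][j] * c_max for j in range(n)]
                   for i in range(n)]
            # Lazy Prim's algorithm on the modified costs.
            in_tree = [False] * n
            in_tree[0] = True
            edges = []
            heap = [(mod[0][j], 0, j) for j in range(1, n)]
            heapq.heapify(heap)
            while heap and len(edges) < n - 1:
                c, i, j = heapq.heappop(heap)
                if in_tree[j]:
                    continue
                in_tree[j] = True
                edges.append((i, j))
                for k in range(n):
                    if not in_tree[k]:
                        heapq.heappush(heap, (mod[j][k], j, k))
            return edges

    With P = 0, every genotype decodes to the plain MST; increasing P lets the float values push the decoded tree further away from it, which matches the bias toward MST-like solutions described above.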

    Improving the translation environment for professional translators

    When using computer-aided translation systems in a typical professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological side. This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
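    As a toy illustration of the fuzzy matching step (the project's own matchers are considerably more sophisticated), a translation memory lookup can rank stored segments by an edit-based similarity score; the threshold and example entries below are invented.

        from difflib import SequenceMatcher

        def fuzzy_lookup(query, memory, threshold=0.7):
            # Return translation-memory entries whose source segment is
            # similar to the query, ranked by a 0..1 similarity score.
            scored = []
            for source, target in memory:
                score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
                if score >= threshold:
                    scored.append((score, source, target))
            return sorted(scored, reverse=True)

        tm = [
            ("The printer is out of paper.", "De printer heeft geen papier meer."),
            ("The scanner is out of order.", "De scanner is defect."),
        ]
        for score, src, tgt in fuzzy_lookup("The printer is out of order.", tm):
            print(f"{score:.2f}  {src}  ->  {tgt}")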

    Estimating heritability in plant breeding programs

    Heritability is an important notion in, e.g., human genetics, animal breeding, and plant breeding, since the focus of these fields lies on the relationship between phenotypes and genotypes. A phenotype is the composite of an organism's observable traits, which is determined by its underlying genotype, by environmental factors, and by genotype-environment interactions. For a set of genotypes, heritability expresses the proportion of the phenotypic variance that is attributable to the genotypic variance. Furthermore, as it is an intraclass correlation, heritability can also be interpreted as, e.g., the squared correlation between phenotypic and genotypic values. It is important to note that heritability was originally proposed in the context of animal breeding, where the individual animal represents the basic unit of observation. This stands in contrast to plant breeding, where multiple observations for the same genotype are obtained in replicated trials. Furthermore, trials are usually conducted as multi-environment trials (MET), where an environment denotes a year × location combination and represents a random sample from a target population of environments. Hence, the observations for each genotype first need to be aggregated in order to obtain a single phenotypic value, which is usually done by taking some form of mean across trials and replicates. As a consequence, heritability in the context of plant breeding is referred to as heritability on an entry-mean basis, and its standard estimation method is a linear combination of variances and trial dimensions. Ultimately, I find that there are two main uses for heritability in plant breeding: the first is to predict the response to selection, and the second is as a descriptive measure for the usefulness and precision of cultivar trials. Heritability on an entry-mean basis is suited for both purposes as long as three main assumptions hold: (i) the trial design is completely balanced/orthogonal, (ii) genotypic effects are independent, and (iii) variances and covariances are constant. In recent decades, however, many advances have taken place in the methodology of experimental design and statistical analysis for plant breeding trials. As a consequence, it is seldom the case that all three of the above-mentioned assumptions are met. Instead, the application of linear mixed models enables the breeder to straightforwardly analyze unbalanced data with complex variance structures. Chapter 2 demonstrates some of the flexibility and benefits of the mixed-model framework for typically unbalanced MET, using a bivariate mixed-model analysis to jointly analyze two MET for cultivar evaluation that differ in multiple crucial aspects such as plot size, trial design, and general purpose. Such an approach can lead to higher accuracy and precision of the analysis and thus to more efficient and successful breeding programs. It is not clear, however, how to define and estimate a generalized heritability on an entry-mean basis for such settings. Therefore, multiple alternative methods for the estimation of heritability on an entry-mean basis have been proposed. In Chapter 3, six alternative methods are applied to four typically unbalanced MET for cultivar evaluation and compared to the standard method. The outcome suggests that the standard method overestimates heritability, while all of the alternative methods show similar, lower estimates and thus seem able to handle this kind of unbalanced data.
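    For reference, the standard estimation method mentioned above is commonly written as the following variance ratio (a textbook form, assuming a balanced trial series with E environments and R replicates per environment, where \sigma_g^2, \sigma_{ge}^2 and \sigma_e^2 denote the genotypic, genotype-environment interaction, and error variances):

        H^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_{ge}^2/E + \sigma_e^2/(ER)}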
Finally, it is argued in Chapter 4 that heritability in plant breeding is not actually based on or aimed at entry means, but on the differences between them. Moreover, an estimation method for this new proposal of heritability on an entry-difference basis ($H_\Delta^2/h_\Delta^2$) is derived and discussed, as well as exemplified and compared to other methods by analyzing four datasets for cultivar evaluation which differ in their complexity. I argue that, regarding the use of heritability as a descriptive measure, $H_\Delta^2/h_\Delta^2$ can on the one hand give more detailed and meaningful insight than all other heritability methods, and on the other hand reduces to the other methods under certain circumstances. When it comes to the use of heritability as a means to predict the response to selection, the outcome of this work discourages its use altogether. Instead, response to selection should be simulated directly, without using any ad hoc heritability measure.

    Formation of Si Nanocrystals for Single Electron Transistors by Ion Beam Mixing and Self-Organization – Modeling and Simulation

    Get PDF
    The replacement of the conventional field-effect transistor (FET) by single-electron transistors (SET) would lead to large energy savings and to devices with significantly longer battery life. There are many production approaches, but mostly for specimens in the laboratory. Most of them suffer from the fact that they either only work at cryogenic temperatures, have a low production yield, or are not reproducible, with each unit working in a unique way. A room-temperature (RT) operating SET can be configured by inserting a small (a few nm in diameter) Si nanocrystal (NC) into a thin (<10 nm) SiO2 interlayer in Si. Industrial production has so far been ruled out by the lack of suitable manufacturing processes. Classical technologies such as lithography fail to produce structures at this small scale; even electron-beam lithography or extreme-ultraviolet lithography is far from being able to realize these structures in mass production. However, self-organization processes enable structures to be produced at any order of magnitude down to atomic sizes. Earlier studies realized similar systems using a layer of Si NCs to fabricate a non-volatile memory, using the charge of the NCs for data storage. Building on this, it is very promising to use the same approach for the realization of the SET. The self-organization depends only on the initial configuration of the system and the boundary conditions during the process; these macroscopic conditions control the self-formed structures. In this work, ion beam irradiation is used to form the initial configuration, and thermal annealing is used to drive self-organization. A Si/SiO2/Si stack is irradiated, which transforms the stack into Si/SiOx/Si by ion beam mixing (IBM) of the two Si/SiO2 interfaces. The oxide becomes metastable, and the subsequent thermal treatment induces self-organization, which can leave a single Si NC in the SiO2 layer for a sufficiently small mixing volume. The transformation of the planar SiOx layer (restricted in only one dimension) into a small SiOx volume (restricted in all three dimensions) is achieved by etching nanopillars with a diameter of less than 10 nm. This forms a small SiOx plate embedded between two Si layers. The challenge is to control the self-organization process. In this work, simulation was used to investigate dependencies and to optimize parameters. The ion mixing simulations were performed using the binary collision approximation (BCA), followed by kinetic Monte Carlo (KMC) simulations of the decomposition process, which gave good qualitative agreement with the structures observed in related experiments. Quantitatively, however, the BCA simulation seemed to overestimate the mixing effect. This is because it neglects the chemical immiscibility of the Si-SiO2 system, which counteracts the collisional mixing. The influence of this mechanism increases with increasing ion fluence. Compared to the combined BCA and KMC simulations, a larger ion mixing fluence therefore has to be applied experimentally to obtain the predicted nanocluster morphology. To model the ion beam mixing of the Si/SiO2 interface, phase-field methods were applied to describe the influence of chemical effects during the irradiation of buried SiO2 layers by 60 keV Si+ ions at RT and thermal annealing at 1050°C. The ballistic collisional mixing was modeled by an approach based on Fick's diffusion equation, and the chemical effects and the annealing were described by the Cahn-Hilliard equation.
With this, it is now possible to predict composition profiles of Si/SiO2 interfaces during irradiation. The results are in good agreement with experiment and are used to predict NC formation in the nanopillar. For the thermal treatment, model extensions were also necessary. Past KMC simulations of Si-SiO2 systems were based on normalized time and temperature, so that the diffusion velocities of the components were not considered. However, the diffusivities of Si in SiO2 and of SiO2 in Si differ by several orders of magnitude. This cannot be neglected in the thermal treatment of the Si/SiO2 interface, because the processes differing in speed by these orders of magnitude take place only a few nanometers apart. The KMC method was therefore extended to include the different diffusion coefficients of the Si-SiO2 system, which allows the influence of diffusion to be investigated extensively. The phase diagram over temperature and composition was examined with regard to decomposition (nucleation as well as spinodal decomposition) and the growth of NCs. Using these methods and the knowledge gained about the system, basic simulations of individual NC formation in the nanopillar were carried out. The influence of temperature, diameter, and irradiation fluence is discussed in detail on the basis of the simulation results.
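    For reference, the phase-field treatment described above couples a Fickian term for ballistic mixing with the Cahn-Hilliard equation for the chemical driving force; a common generic form (not necessarily the exact parameterization used in this work) is:

        \frac{\partial c}{\partial t} = \nabla \cdot \left[ M \, \nabla \left( \frac{\partial f}{\partial c} - \kappa \nabla^2 c \right) \right] + D_b \nabla^2 c

    where c is the local composition, M the mobility, f(c) the bulk free-energy density, \kappa the gradient-energy coefficient, and D_b an effective irradiation-driven (ballistic) diffusivity.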

    Exploring data sharing obligations in the technology sector

    This report addresses the question: What is the role of data in the technology sector, and what are the opportunities and risks of mandatory data sharing? The answer provides insights into the costs and benefits of variants of data sharing obligations with and between technology companies.

    Classical-to-Quantum Sequence Encoding in Genomics

    DNA sequencing allows for the determination of the genetic code of an organism and is therefore an indispensable tool with applications in Medicine, Life Sciences, Evolutionary Biology, Food Sciences and Technology, and Agriculture. In this paper, we present several novel methods of performing classical-to-quantum data encoding inspired by various mathematical fields, and we demonstrate these ideas within Bioinformatics. In particular, we introduce algorithms that draw inspiration from diverse fields such as Electrical and Electronic Engineering, Information Theory, Differential Geometry, and Neural Network architectures. We provide a complete overview of the existing data encoding schemes and show how to use them in Genomics. The algorithms provided utilise lossless compression, wavelet-based encoding, and information entropy. Moreover, we propose a contemporary method for testing encoded DNA sequences using Quantum Boltzmann Machines. To evaluate the effectiveness of our algorithms, we discuss a potential dataset that serves as a sandbox environment for testing against real-world scenarios. Our research contributes to developing classical-to-quantum data encoding methods in Bioinformatics by introducing innovative algorithms that draw on diverse fields and advanced techniques. Our findings offer insights into the potential of Quantum Computing in Bioinformatics and have implications for future research in this area.
    Comment: 58 pages, 14 figures
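    As a simple, generic illustration of classical-to-quantum data encoding in this setting (basis encoding, one of the standard schemes such surveys cover, and not necessarily the authors' preferred method), a DNA string can be mapped to a computational basis state using two qubits per nucleotide:

        import numpy as np

        # Two bits (two qubits) per nucleotide -- a plain basis encoding.
        NUCLEOTIDE_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}

        def basis_encode(sequence):
            # Map a DNA string to the statevector |b>, where b is the
            # concatenated 2-bit code of each base. The vector has
            # 4**len(sequence) entries, so this only scales to short reads.
            bits = "".join(NUCLEOTIDE_BITS[base] for base in sequence.upper())
            index = int(bits, 2)
            state = np.zeros(2 ** len(bits))
            state[index] = 1.0
            return state

        # Example: "GAT" -> 6 qubits -> one basis state out of 64.
        psi = basis_encode("GAT")
        print(np.nonzero(psi)[0])  # index of the occupied basis state (35)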