Developments from enquiries into the learnability of the pattern languages from positive data
The pattern languages are languages that are generated from patterns, and were first proposed by Angluin as a non-trivial class that is inferable from positive data [D. Angluin, Finding patterns common to a set of strings, Journal of Computer and System Sciences 21 (1980) 46–62; D. Angluin, Inductive inference of formal languages from positive data, Information and Control 45 (1980) 117–135]. In this paper we chronologize some results that developed from the investigations on the inferability of the pattern languages from positive data.
: Symbolic Inference Methods for Databases (Méthodes d'Inférence Symbolique pour les Bases de Données)
This dissertation is a summary of a line of research, in which I was actively involved, on learning in databases from examples. This research focused on traditional as well as novel database models and languages for querying, transforming, and describing the schema of a database. In the case of schemas, our contributions involve proposing original languages for the emerging data models of Unordered XML and RDF. We have studied learning from examples of schemas for Unordered XML, schemas for RDF, twig queries for XML, join queries for relational databases, and XML transformations defined with a novel model of tree-to-word transducers. Investigating learnability of the proposed languages required us to examine closely a number of their fundamental properties, often of independent interest, including normal forms, minimization, containment and equivalence, consistency of a set of examples, and finite characterizability. A good understanding of these properties allowed us to devise learning algorithms that explore a possibly large search space with the help of a diligently designed set of generalization operations in search of an appropriate solution. Learning (or inference) is a problem that has two parameters: the precise class of languages we wish to infer and the type of input that the user can provide. We focused on the setting where the user input consists of positive examples, i.e., elements that belong to the goal language, and negative examples, i.e., elements that do not belong to the goal language. In general, using both negative and positive examples allows one to learn richer classes of goal languages than using positive examples alone. However, using negative examples is often difficult because together with positive examples they may cause the search space to take a very complex shape, and its exploration may turn out to be computationally challenging.
Reasoning & Querying – State of the Art
Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications, e.g. search engines, has familiarized casual users with using keyword queries to retrieve information on the internet. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming at enabling simple querying of semi-structured data, which is relevant e.g. in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.
Query Answering in Probabilistic Data and Knowledge Bases
Probabilistic data and knowledge bases are becoming increasingly important in academia and industry. They are continuously extended with new data, powered by modern information extraction tools that associate probabilities with knowledge base facts. The state of the art to store and process such data is founded on probabilistic database systems, which are widely and successfully employed. Beyond all the success stories, however, such systems still lack the fundamental machinery to convey some of the valuable knowledge hidden in them to the end user, which limits their potential applications in practice. In particular, in their classical form, such systems are typically based on strong, unrealistic limitations, such as the closed-world assumption, the closed-domain assumption, the tuple-independence assumption, and the lack of commonsense knowledge. These limitations not only lead to unwanted consequences, but also put such systems on weak footing in important tasks, query answering being a very central one. In this thesis, we enhance probabilistic data and knowledge bases with more realistic data models, thereby allowing for better means for querying them. Building on the long endeavor of unifying logic and probability, we develop different rigorous semantics for probabilistic data and knowledge bases, analyze their computational properties, identify sources of (in)tractability, and design practical scalable query answering algorithms whenever possible. To achieve this, the current work brings together some recent paradigms from logics, probabilistic inference, and database theory.
Representation Learning for Shape Decomposition, By Shape Decomposition
The ability to parse 3D objects into their constituent parts is essential for humans to understand and interact with the surrounding world. Imparting this skill in machines is important for various computer graphics, computer vision, and robotics tasks. Machines endowed with this skill can better interact with their surroundings and perform shape editing, texturing, recomposing, tracking, and animation. In this thesis, we ask two questions. First, how can machines decompose 3D shapes into their fundamental parts? Second, does the ability to decompose the 3D shape into these parts help learn useful 3D shape representations?
In this thesis, we focus on parsing the shape into compact representations, such as parametric surface patches and Constructive Solid Geometry (CSG) primitives, which are also widely used representations in 3D modeling in computer graphics. Inspired by the advances in neural networks for 3D shape processing, we develop neural network approaches to tackle shape decomposition. First, we present CSGNet, a network architecture to parse shapes into CSG programs, which is trained using a combination of supervised and reinforcement learning. Second, we present ParSeNet, a network architecture to decompose a shape into parametric surface patches (B-Spline) and geometric primitives (plane, cone, cylinder and sphere), trained on a large set of CAD models using supervised learning.
The training of deep neural network architectures for 3D recognition and generation tasks requires a large amount of labeled data. We explore ways to alleviate this problem by relying on shape decomposition methods to guide the learning process. Towards that end, we first study the use of freely available metadata, albeit inconsistent, from shape repositories to learn 3D shape features. Later we show that learning to decompose a 3D shape into geometric primitives also helps in learning shape representations useful for semantic segmentation tasks. Finally, since most 3D shapes encountered in real life are textured, consisting of several fine-grained semantic parts, we propose a method to learn fine-grained representations for textured 3D shapes in a self-supervised manner by incorporating 3D geometric priors.
Essays in political text: new actors, new data, new challenges
The essays in this thesis explore diverse manifestations and different aspects of political text. The two main contributions on the methodological side are bringing forward novel data on political actors who were overlooked by the existing literature and applying new approaches in text analysis to address substantive questions about them. On the theoretical side, this thesis contributes to the literatures on lobbying, government transparency, post-conflict studies and gender in politics. In the first paper, on interest groups in the UK, I argue that, contrary to much of the theoretical and empirical literature, mechanisms of attaining access to government in pluralist systems critically depend on the presence of limits on campaign spending. When such limits exist, political candidates invest few resources in fund-raising and, thus, most organizations make only very few, if any, political donations. I collect and analyse transparency data on government department meetings and show that economic importance is one of the mechanisms that can explain variation in the level of access attained by different groups. Furthermore, I show that Brexit had a diminishing effect on this relationship between economic importance and the level of access. I also study the reported purpose of meetings and, using dynamic topic models, show the temporary shifts in policy agenda during this period. The second paper argues that civil society in post-conflict settings is capable of high-quality deliberation and that, while differing in their focus, both male and female participants can deliver arguments pertaining to the interests of broader societal groups. 
Using the transcripts of civil society public consultation meetings across former Yugoslavia, I show that the lack of gender-sensitive transitional justice instruments could stem not from the lack of women's physical or verbal participation, but from the dynamic of speech enclaves and topical focus on different aspects of the transitional justice process between genders. And, finally, the third paper maps the challenges that lie ahead with the proliferation of research that relies on multiple datasets. In a simulation study I show that, when the linking information is limited to text, the noise can potentially occur at different levels and is often hard to anticipate in practice. Thus, the choice of record linkage requires balancing between these different scenarios. Taken together, the papers in this thesis advance the field of "text as data" and contribute to our understanding of multiple political phenomena.
The evolution of the general certificate of secondary education to 1986
The evolution of the G.C.S.E. was a phase both in the history of examinations and also in the social and political interaction of education with its environment. Each subject discipline has its own development. The turbulent development of modern languages appears to have experienced a more easily discernible phase of progression in the period approaching the G.C.S.E. than at other times over the century and more especially in the post-war period; in fact 'languages' reached a greater spread of effective contact of the school population than ever before. Such an incidence of events merits some attention even though alternative sequences were occurring in other subject disciplines. The G.C.S.E. followed in the tradition of the School Certificate, the G.C.E. and the C.S.E., yet it also mirrored major movements in British society and its expectation of public education. Competition became paramount. Differentiation resolved, somewhat, the problems of a common system for the high and low achievers. The irony was that the G.C.S.E. suited the comprehensive schools but the comprehensive schools did not suit everybody. The teaching profession, whilst trying to deal with this problem sensitively, felt its national profile deteriorate. These fundamental changes took place at a time of growing concern over the education system. Yet fundamental changes in society were the key to fundamental changes in education. Languages, throughout, democratized down the hierarchy of learning; other subjects followed the pattern. World War II had polarized for languages a pacific, literature-civilization from a message-communication. These became the opposing sides of the battleground, the victory being a merger of the two. This century's main lost soul of the curriculum found its resting-place in G.C.S.E. 'practicability'. The post-war extension to the whole ability range forced a lonesome mental introversion. 
Sound experienced psychoanalysis and therapy by the subject association with basic guidance from the examination boards brought restoration to a new state of health. In fact restoratives primarily for the low achiever had been vital. The new government in 1979 encouraged practicality and usefulness of school subjects. Having advised throughout, the subject associations, like others, took the initiative in the teachers' cold war lull, to sound out true opinion (which could not be done publicly due to the intractability of positions) and made recommendations to the government. The contribution of the low achiever was finally acknowledged. The subject associations, uniquely, were in a position to test opinion and act with speed. The disappearance of Ordinary Level and Grammar Schools had proved a strong brake, yet the post World War II period up to the 1980s was inevitably between staging posts of major educational reform and nothing was to stop the G.C.S.E. being by accident or design the frontrunner of a series of reforms. The sources for this study have been the professional literature and reviews underpinned by personal interviews with relevant and representative personnel.
Economic Growth and Subjective Well-Being: Reassessing the Easterlin Paradox
The "Easterlin paradox" suggests that there is no link between a society's economic development and its average level of happiness. We re-assess this paradox analyzing multiple rich datasets spanning many decades. Using recent data on a broader array of countries, we establish a clear positive link between average levels of subjective well-being and GDP per capita across countries, and find no evidence of a satiation point beyond which wealthier countries have no further increases in subjective well-being. We show that the estimated relationship is consistent across many datasets and is similar to the relationship between subjective well-being and income observed within countries. Finally, examining the relationship between changes in subjective well-being and income over time within countries, we find economic growth associated with rising happiness. Together these findings indicate a clear role for absolute income and a more limited role for relative income comparisons in determining happiness.
Keywords: happiness, subjective well-being, Easterlin Paradox, life satisfaction, economic growth, well-being-income gradient, hedonic treadmill
Open City Data Pipeline
Statistical data about cities, regions and at country level is collected for various purposes and from various institutions. Yet, while access to high quality and recent such data is crucial both for decision makers as well as for the public, all too often such collections of data remain isolated and not re-usable, let alone properly integrated. In this paper we present the Open City Data Pipeline, a focused attempt to collect, integrate, and enrich statistical data collected at city level worldwide, and republish this data in a reusable manner as Linked Data. The main features of the Open City Data Pipeline are: (i) we integrate and cleanse data from several sources in a modular and extensible, always up-to-date fashion; (ii) we use both Machine Learning techniques as well as ontological reasoning over equational background knowledge to enrich the data by imputing missing values; (iii) we assess the estimated accuracy of such imputations per indicator. Additionally, (iv) we make the integrated and enriched data available both in a web browser interface and as machine-readable Linked Data, using standard vocabularies such as QB and PROV, and linking to e.g. DBpedia. Lastly, in an exhaustive evaluation of our approach, we compare our enrichment and cleansing techniques to a preliminary version of the Open City Data Pipeline presented at ISWC2015: firstly, we demonstrate that the combination of equational knowledge and standard machine learning techniques significantly helps to improve the quality of our missing value imputations; secondly, we arguably show that the more data we integrate, the more reliable our predictions become. Hence, over time, the Open City Data Pipeline shall provide a sustainable effort to serve Linked Data about cities in increasing quality.
Series: Working Papers on Information Systems, Information Business and Operation