42 research outputs found
Multiparadigm programming: Novel devices for implementing functional and logic programming constructs in C++
Constructs for functional and logic programming can be smoothly integrated into an existing object-oriented language. We demonstrate this in the context of C++ (a statically-typed object-oriented language with effects and parametric polymorphism) via two libraries: FC++ and LC++. FC++ is a library for functional programming in C++; FC++ supports higher-order polymorphic functions, lazy lists, and a small lambda language; it also contains a large library of useful functions, datatypes, combinators, and monads. LC++ is a library for logic programming in C++; LC++ provides the same general functionality as Prolog, including the ability to return query results lazily (one at a time). Both
libraries are embedded in C++ so that they share C++'s static type system, and the library interfaces provide straightforward ways for
code from within one paradigm to ``call out' to another.
Our work describes the techniques used to implement these libraries in C++ and shows that the resulting multiparadigm language has useful
applications in real-world domains. We also describe how many of the implementation techniques can be generalized from C++ and applied to
other programming languages to yield similar results.Ph.D.Committee Chair: Yannis Smaragdakis; Committee Member: Mary Jean Harrold; Committee Member: Olin Shivers; Committee Member: Philip Wadler; Committee Member: Spencer Rugabe
A submodular optimization framework for never-ending learning : semi-supervised, online, and active learning.
The revolution in information technology and the explosion in the use of computing devices in people\u27s everyday activities has forever changed the perspective of the data mining and machine learning fields. The enormous amounts of easily accessible, information rich data is pushing the data analysis community in general towards a shift of paradigm. In the new paradigm, data comes in the form a stream of billions of records received everyday. The dynamic nature of the data and its sheer size makes it impossible to use the traditional notion of offline learning where the whole data is accessible at any time point. Moreover, no amount of human resources is enough to get expert feedback on the data. In this work we have developed a unified optimization based learning framework that approaches many of the challenges mentioned earlier. Specifically, we developed a Never-Ending Learning framework which combines incremental/online, semi-supervised, and active learning under a unified optimization framework. The established framework is based on the class of submodular optimization methods. At the core of this work we provide a novel formulation of the Semi-Supervised Support Vector Machines (S3VM) in terms of submodular set functions. The new formulation overcomes the non-convexity issues of the S3VM and provides a state of the art solution that is orders of magnitude faster than the cutting edge algorithms in the literature. Next, we provide a stream summarization technique via exemplar selection. This technique makes it possible to keep a fixed size exemplar representation of a data stream that can be used by any label propagation based semi-supervised learning technique. The compact data steam representation allows a wide range of algorithms to be extended to incremental/online learning scenario. Under the same optimization framework, we provide an active learning algorithm that constitute the feedback between the learning machine and an oracle. Finally, the developed Never-Ending Learning framework is essentially transductive in nature. Therefore, our last contribution is an inductive incremental learning technique for incremental training of SVM using the properties of local kernels. We demonstrated through this work the importance and wide applicability of the proposed methodologies
Stream Differential Equations: Specification Formats and Solution Methods
Streams, or innite sequences, are innite objects of a very simple type, yet they
have a rich theory partly due to their ubiquity in mathematics and computer science.
Stream dierential equations are a coinductive method for specifying streams and stream
operations, and their theory has been developed in many papers over the past two decades.
In this paper we present a survey of the many results in this area. Our focus is on the
classication of dierent formats of stream dierential equations, their solution methods,
and the classes of streams they can dene. Moreover, we describe in detail the connection
between the so-called syntactic solution method and abstract GSOS
Stream differential equations: Specification formats and solution methods
Streams, or infinite sequences, are infinite objects of a very simple type, yet they have a rich theory partly due to their ubiquity in mathematics and computer science. Stream differential equations are a coinductive method for specifying streams and stream operations, and their theory has been developed in many papers over the past two decades. In this paper we present a survey of the many results in this area. Our focus is on the classification of different formats of stream differential equations, their solution methods, and the classes of streams they can define. Moreover, we describe in detail the connection between the so-called syntactic solution method and abstract GSOS
Compilation of bottom-up evaluation for a pure logic programming language
Abstraction in programming languages is usually achieved at the price of run time efficiency. This thesis presents a compilation scheme for the Starlog logic programming language. In spite of being very abstract, Starlog can be compiled to an efficient executable form. Starlog implements stratified negation and includes logically pure facilities for input and output, aggregation and destructive assignment. The main new work described in this thesis is (1) a bottom-up evaluation technique which is optimised for Starlog programs, (2) a static indexing structure that allows significant compile time optimisation, (3) an intermediate language to represent bottom-up logic programs and (4) an evaluation of automatic data structure selection techniques. It is shown empirically that the performance of compiled Starlog programs can be competitive with that of equivalent hand-coded programs
The Impact of Overfitting and Overgeneralization on the Classification Accuracy in Data Mining
Current classification approaches usually do not try to achieve a balance between fitting and generalization when they infer models from training data. Such approaches ignore the possibility of different penalty costs for the false-positive, false-negative, and unclassifiable types. Thus, their performances may not be optimal or may even be coincidental. This dissertation analyzes the above issues in depth. It also proposes two new approaches called the Homogeneity-Based Algorithm (HBA) and the Convexity-Based Algorithm (CBA) to address these issues. These new approaches aim at optimally balancing the data fitting and generalization behaviors of models when some traditional classification approaches are used. The approaches first define the total misclassification cost (TC) as a weighted function of the three penalty costs and their corresponding error rates. The approaches then partition the training data into regions. In the HBA, the partitioning is done according to some homogeneous properties derivable from the training data. Meanwhile, the CBA employs some convex properties to derive regions. A traditional classification method is then used in conjunction with the HBA and CBA. Finally, the approaches apply a genetic approach to determine the optimal levels of fitting and generalization. The TC serves as the fitness function in this genetic approach. Real-life datasets from a wide spectrum of domains were used to better understand the effectiveness of the HBA and CBA. The computational results have indicated that both the HBA and CBA might potentially fill a critical gap in the implementation of current or future classification approaches. Furthermore, the results have also shown that when the penalty cost of an error type was changed, the corresponding error rate followed stepwise patterns. The finding of stepwise patterns of classification errors can assist researchers in determining applicable penalties for classification errors. Thus, the dissertation also proposes a binary search approach (BSA) to produce those patterns. Real-life datasets were utilized to demonstrate for the BSA
Computer Aided Verification
This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decisions procedures, concurrency, and CPS, hardware, industrial applications
Computer Aided Verification
This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decisions procedures, concurrency, and CPS, hardware, industrial applications