1,291 research outputs found

    A 2007 Model Curriculum For A Liberal Arts Degree In Computer Science

    Granular computing based approach of rule learning for binary classification

    Rule learning is one of the most popular types of machine-learning approaches and typically follows one of two main strategies: ‘divide and conquer’ and ‘separate and conquer’. The former induces rules in the form of a decision tree, whereas the latter induces if–then rules directly. Because the divide and conquer strategy can result in the replicated sub-tree problem, which not only leads to overfitting but also increases the computational complexity of classifying unseen instances, researchers have been motivated to develop rule learning approaches based on the separate and conquer strategy. In this paper, we investigate the Prism algorithm, a representative of the separate and conquer strategy that learns a set of rules for each class in the setting of granular computing, where each class (referred to as the target class) is viewed as a granule. The Prism algorithm shows performance highly comparable to that of the most popular algorithms that follow the divide and conquer strategy, such as ID3 and C4.5. However, because it needs to learn a rule set for each class, Prism usually produces very complex rule-based classifiers. Many real applications involve only one target class, so it is not necessary to learn a rule set for each class: only a set of rules for the target class needs to be learned, and a default rule covers the non-target classes. To address these issues, we propose a new version of the algorithm referred to as PrismSTC, where ‘STC’ stands for ‘single target class’. Our experimental results show that PrismSTC produces simpler rule-based classifiers than Prism without loss of accuracy. PrismSTC also demonstrates sufficiently good performance in comparison with C4.5.
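
The separate and conquer idea for a single target class can be sketched roughly as follows. This is only an illustration written from the description in the abstract, not the authors' PrismSTC implementation: the function names, the tie-breaking rule, the toy weather data, and the `"non-target"` default label are all assumptions. The data set is assumed to be non-empty and to use categorical attributes only.

```python
from typing import Dict, List, Tuple

Instance = Dict[str, str]
Rule = List[Tuple[str, str]]  # a conjunction of (attribute, value) tests


def covers(rule: Rule, x: Instance) -> bool:
    """A rule covers an instance when every test in the conjunction holds."""
    return all(x[a] == v for a, v in rule)


def learn_target_rules(data: List[Instance], label: str, target: str) -> List[Rule]:
    """Learn rules for one target class only (hypothetical sketch)."""
    attrs = [a for a in data[0] if a != label]
    remaining = list(data)
    rules: List[Rule] = []
    # "Separate": keep learning rules while target instances remain uncovered.
    while any(x[label] == target for x in remaining):
        rule: Rule = []
        covered = remaining
        # "Conquer": specialise the rule until it covers only target
        # instances or every attribute has been used once.
        while any(x[label] != target for x in covered) and len(rule) < len(attrs):
            used = {a for a, _ in rule}
            best, best_score = None, (-1.0, 0)
            for a in attrs:
                if a in used:
                    continue
                for v in sorted({x[a] for x in covered}):
                    subset = [x for x in covered if x[a] == v]
                    pos = sum(x[label] == target for x in subset)
                    # Score by P(target | a = v); break ties by coverage.
                    score = (pos / len(subset), pos)
                    if score > best_score:
                        best, best_score = (a, v), score
            rule.append(best)
            covered = [x for x in covered if covers([best], x)]
        rules.append(rule)
        remaining = [x for x in remaining if not covers(rule, x)]
    return rules


def predict(rules: List[Rule], x: Instance, target: str,
            default: str = "non-target") -> str:
    """Fire the target class if any rule matches; otherwise use the default rule."""
    return target if any(covers(r, x) for r in rules) else default


# Toy data invented for this example.
data = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
rules = learn_target_rules(data, label="play", target="yes")
```

Because only the target class gets rules and everything else falls through to the default, the resulting classifier has a single rule set rather than one per class, which is the simplification the abstract describes.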

    Deriving divide-and-conquer dynamic programming algorithms using solver-aided transformations

    We introduce a framework that allows domain experts to manipulate computational terms in order to derive better, more efficient implementations. It employs deductive reasoning to generate provably correct, efficient implementations from a very high-level specification of an algorithm, and inductive constraint-based synthesis to improve automation. Semantic information is encoded into program terms through the use of refinement types. In this paper, we develop the technique in the context of a system called Bellmania that uses solver-aided tactics to derive parallel divide-and-conquer implementations of dynamic programming algorithms that have better locality and are significantly more efficient than traditional loop-based implementations. Bellmania includes a high-level language for specifying dynamic programming algorithms and a calculus that facilitates gradual transformation of these specifications into efficient implementations. These transformations formalize the divide-and-conquer technique; a visualization interface helps users to interactively guide the process, while an SMT-based back-end verifies each step and takes care of the low-level reasoning required for parallelism. We have used the system to generate provably correct implementations of several algorithms, including some important algorithms from computational biology, and show that the performance is comparable to that of the best manually optimized code.
    Funding: National Science Foundation (U.S.) (CCF-1139056, CCF-1439084, CNS-1553510)
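
To make the contrast between loop-based and divide-and-conquer dynamic programming concrete, here is a small hand-written sketch. It is not Bellmania output and does not use Bellmania's specification language; the choice of longest common subsequence as the example problem, the function names, and the `base` cutoff are assumptions made for illustration. The recursive version fills the table quadrant by quadrant, and the top-right and bottom-left quadrants do not depend on each other, which is where a parallel implementation could run them concurrently (here they simply run in sequence).

```python
def lcs_loop(a: str, b: str) -> int:
    """Traditional loop-based table fill for longest common subsequence."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcs_dc(a: str, b: str, base: int = 2) -> int:
    """Same recurrence, filled by recursive quadrant decomposition."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]

    def fill(i0: int, i1: int, j0: int, j1: int) -> None:
        # Fill cells with i in [i0, i1) and j in [j0, j1).
        if i1 <= i0 or j1 <= j0:
            return
        if i1 - i0 <= base and j1 - j0 <= base:
            for i in range(i0, i1):
                for j in range(j0, j1):
                    if a[i - 1] == b[j - 1]:
                        table[i][j] = table[i - 1][j - 1] + 1
                    else:
                        table[i][j] = max(table[i - 1][j], table[i][j - 1])
            return
        im, jm = (i0 + i1) // 2, (j0 + j1) // 2
        fill(i0, im, j0, jm)  # top-left: needed by all other quadrants
        fill(i0, im, jm, j1)  # top-right: independent of bottom-left
        fill(im, i1, j0, jm)  # bottom-left: independent of top-right
        fill(im, i1, jm, j1)  # bottom-right: needs the other three

    fill(1, n + 1, 1, m + 1)
    return table[n][m]
```

Both functions compute the same table; the recursive order touches smaller blocks at a time, which is the locality benefit the abstract refers to.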

    Inductive programming meets the real world

    © Gulwani, S. et al. | ACM 2015. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Communications of the ACM, http://dx.doi.org/10.1145/2736282
    Since most end users lack programming skills, they often spend considerable time and effort performing tedious and repetitive tasks, such as capitalizing a column of names manually. Inductive Programming has a long research tradition, and recent developments demonstrate that it can liberate users from many tasks of this kind.
    Gulwani, S.; Hernández-Orallo, J.; Kitzelmann, E.; Muggleton, SH.; Schmid, U.; Zorn, B. (2015). Inductive programming meets the real world. Communications of the ACM. 58(11):90-99. doi:10.1145/2736282
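
The name-capitalizing task mentioned in this entry gives a convenient toy case for learning a program from input-output examples. The sketch below is a deliberately naive enumerative search over a tiny, invented set of string primitives; it is not how the systems surveyed in the article work internally, and the primitive set, the `synthesize` and `run_program` names, and the example data are all assumptions.

```python
from itertools import product
from typing import List, Optional, Tuple

# Hypothetical mini-language: a program is a sequence of these primitives.
PRIMITIVES = {
    "strip": str.strip,
    "lower": str.lower,
    "upper": str.upper,
    "title": str.title,
    "capitalize": str.capitalize,
}


def run_program(names: List[str], text: str) -> str:
    """Apply the named primitives to the text from left to right."""
    for name in names:
        text = PRIMITIVES[name](text)
    return text


def synthesize(examples: List[Tuple[str, str]],
               max_len: int = 3) -> Optional[List[str]]:
    """Return the shortest primitive sequence consistent with all examples."""
    for length in range(1, max_len + 1):
        for names in product(PRIMITIVES, repeat=length):
            if all(run_program(list(names), i) == o for i, o in examples):
                return list(names)
    return None


# Two examples a user might give for cleaning a column of names.
examples = [(" alice smith", "Alice Smith"), ("BOB JONES ", "Bob Jones")]
program = synthesize(examples)
```

Once a consistent program is found, it can be applied to the rest of the column with `run_program(program, cell)`. Real systems use much richer languages and ranking schemes; the point here is only the shape of the search.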

    A novel fuzzy first-order logic learning system.

    Tse, Ming Fun. Thesis submitted in December 2001. Thesis (M.Phil.), Chinese University of Hong Kong, 2002. Includes bibliographical references (leaves 142-146). Abstracts in English and Chinese.
    Chapter 1 Introduction
        1.1 Problem Definition
        1.2 Contributions
        1.3 Thesis Outline
    Chapter 2 Literature Review
        2.1 Representing Inexact Knowledge
            2.1.1 Nature of Inexact Knowledge
            2.1.2 Probability Based Reasoning
            2.1.3 Certainty Factor Algebra
            2.1.4 Fuzzy Logic
        2.2 Machine Learning Paradigms
            2.2.1 Classifications
            2.2.2 Neural Networks and Gradient Descent
        2.3 Related Learning Systems
            2.3.1 Relational Concept Learning
            2.3.2 Learning of Fuzzy Concepts
        2.4 Fuzzy Logic
            2.4.1 Fuzzy Set
            2.4.2 Basic Notations in Fuzzy Logic
            2.4.3 Basic Operations on Fuzzy Sets
            2.4.4 Fuzzy Relations, Projection and Cylindrical Extension
            2.4.5 Fuzzy First Order Logic and Fuzzy Prolog
    Chapter 3 Knowledge Representation and Learning Algorithm
        3.1 Knowledge Representation
            3.1.1 Fuzzy First-order Logic: A Powerful Language
            3.1.2 Literal Forms
            3.1.3 Continuous Variables
        3.2 System Architecture
            3.2.1 Data Reading
            3.2.2 Preprocessing and Postprocessing
    Chapter 4 Global Evaluation of Literals
        4.1 Existing Closeness Measures between Fuzzy Sets
        4.2 The Error Function and the Normalized Error Functions
            4.2.1 The Error Function
            4.2.2 The Normalized Error Functions
        4.3 The Nodal Characteristics and the Error Peaks
            4.3.1 The Nodal Characteristics
            4.3.2 The Zero Error Line and the Error Peaks
        4.4 Quantifying the Nodal Characteristics
            4.4.1 Information Theory
            4.4.2 Applying the Information Theory
            4.4.3 Upper and Lower Bounds of CE
            4.4.4 The Whole Heuristics of FF99
        4.5 An Example
    Chapter 5 Partial Evaluation of Literals
        5.1 Importance of Covering in Inductive Learning
            5.1.1 The Divide-and-conquer Method
            5.1.2 The Covering Method
            5.1.3 Effective Pruning in Both Methods
        5.2 Fuzzification of FOIL
            5.2.1 Analysis of FOIL
            5.2.2 Requirements on System Fuzzification
            5.2.3 Possible Ways in Fuzzifying FOIL
        5.3 The α Covering Method
            5.3.1 Construction of Partitions by α-cut
            5.3.2 Adaptive-α Covering
        5.4 The Probabilistic Covering Method
    Chapter 6 Results and Discussions
        6.1 Experimental Results
            6.1.1 Iris Plant Database
            6.1.2 Kinship Relational Domain
            6.1.3 The Fuzzy Relation Domain
            6.1.4 Age Group Domain
            6.1.5 The NBA Domain
        6.2 Future Development Directions
            6.2.1 Speed Improvement
            6.2.2 Accuracy Improvement
            6.2.3 Others
    Chapter 7 Conclusion
    Bibliography
    Chapter A C4.5 to FOIL File Format Conversion
    Chapter B FF99 example

    Automatically evolving rule induction algorithms with grammar-based genetic programming

    In the last 30 years, research on rule induction has produced a large number of algorithms. However, these algorithms are usually obtained by combining a basic rule induction algorithm (typically following the sequential covering approach) with new evaluation functions, pruning methods and stopping criteria for refining or producing rules, generating many "new" and more sophisticated sequential covering algorithms. We cannot deny that these attempts to improve the basic sequential covering approach have succeeded. Hence, if manually changing these major components of rule induction algorithms can result in new, significantly better ones, why not automate this process to make it more cost-effective? This is the core idea of this work: to automate the process of designing rule induction algorithms by means of grammar-based genetic programming. Grammar-based Genetic Programming (GGP) is a special type of evolutionary algorithm used to automatically evolve computer programs. The most interesting feature of this type of algorithm is that it incorporates a grammar into its search mechanism, which expresses prior knowledge about the problem being solved. Since we have a lot of previous knowledge about how humans design rule induction algorithms, this type of algorithm is intuitively a suitable tool to automatically evolve rule induction algorithms. The grammar given to the proposed GGP system includes knowledge about how humans design rule induction algorithms, and also presents some new elements which could work in rule induction algorithms but, to the best of our knowledge, were not previously tested. The GGP system aims to evolve rule induction algorithms under two different frameworks. In the first framework, the GGP is used to evolve robust rule induction algorithms, i.e., algorithms designed to be applied to virtually any classification data set, like a manually designed rule induction algorithm. In the second framework, the GGP is applied to evolve rule induction algorithms tailored to a specific application domain, i.e., rule induction algorithms tailored to a single data set. Note that the latter framework is hardly feasible on a large scale in the case of conventional, manually designed algorithms, since the number of classification data sets greatly exceeds the number of rule induction algorithm designers. However, it is clearly feasible on a large scale when using the proposed system, which automates the process of rule induction algorithm design and implementation. Overall, extensive computational experiments with 20 UCI data sets and 5 bioinformatics data sets showed that effective rule induction algorithms can be automatically generated using the GGP in both frameworks. Moreover, the automatically evolved rule induction algorithms were shown to be competitive with (and overall slightly better than) four well-known manually designed rule induction algorithms when comparing their predictive accuracies. The proposed GGP system was also compared to a grammar-based hill-climbing system, and experimental results showed that the GGP system is a more effective method to evolve rule induction algorithms than the grammar-based hill-climbing method. Finally, a multi-objective version of the GGP (based on the concept of Pareto dominance) was also proposed, and experiments were performed to evolve robust rule induction algorithms which generate both accurate and simple models. The results showed that in most cases the GGP system can produce rule induction algorithms which are competitive in predictive accuracy with well-known human-designed rule induction algorithms, but generate simpler classification models, i.e., smaller rule sets that are intuitively easier for the user to interpret.
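
As a rough illustration of how a grammar can constrain the candidates that a grammar-based search considers, the sketch below derives a random rule induction "design" from a small grammar. The grammar, its symbols and the `derive` function are invented for this example and are far simpler than the grammar described in the thesis; the sketch covers only the step of creating one valid candidate, not the evolutionary loop (selection, crossover, mutation, fitness evaluation).

```python
import random
from typing import Dict, List

# Hypothetical toy grammar: each non-terminal maps to its alternative productions.
GRAMMAR: Dict[str, List[List[str]]] = {
    "<algorithm>": [["<search>", "<evaluation>", "<stopping>", "<pruning>"]],
    "<search>": [["top-down"], ["bottom-up"]],
    "<evaluation>": [["accuracy"], ["laplace"], ["information-gain"]],
    "<stopping>": [["stop-when-pure"], ["min-coverage", "<n>"]],
    "<pruning>": [["no-pruning"], ["post-prune"]],
    "<n>": [["2"], ["5"], ["10"]],
}


def derive(symbol: str, rng: random.Random) -> List[str]:
    """Expand a symbol into terminals by picking one production at random."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol: emit it unchanged
    production = rng.choice(GRAMMAR[symbol])
    tokens: List[str] = []
    for part in production:
        tokens.extend(derive(part, rng))
    return tokens


rng = random.Random(0)
candidate = derive("<algorithm>", rng)
```

Every candidate produced this way is a syntactically valid combination of components, so the search never wastes evaluations on malformed designs; this is the sense in which the grammar encodes prior knowledge.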