32 research outputs found

    Compositional Taylor Model Based Validated Integration

    INTEGRATING MACHINE LEARNING WITH SOFTWARE DEVELOPMENT LIFECYCLES: INSIGHTS FROM EXPERTS

    This paper examines the challenges of integrating machine learning (ML) development with software development lifecycle (SDLC) models. Data-intensive development and the use of ML are gaining popularity in information systems development (ISD). To date, there is little empirical research exploring the challenges ISD practitioners encounter when integrating ML development with SDLC frameworks. In this work we conducted a series of expert interviews in which we asked the informants to reflect on how four archetypal SDLC models support ML development. Three high-level trends in ML systems development emerged from the analysis, namely: (1) redefining the prescribed roles and responsibilities within development work; (2) the SDLC as a frame for creating a shared understanding and commitment by management, customers, and software development teams; and (3) method tailoring. This study advances the body of knowledge on the integration of conceptual SDLC models and ML engineering.

    Data types as a more ergonomic frontend for Grammar-Guided Genetic Programming

    Genetic Programming (GP) is a heuristic method that can be applied to many machine learning, optimization, and engineering problems. In particular, it has been widely used in software engineering for test-case generation, program synthesis, and genetic improvement of software (GI). Grammar-Guided Genetic Programming (GGGP) approaches allow the user to refine the domain of valid program solutions. Backus–Naur Form (BNF) is the most popular interface for describing Context-Free Grammars (CFGs) for GGGP. BNF and its derivatives have the disadvantage of interleaving the grammar language with the target language of the program. We propose to embed the grammar as an internal Domain-Specific Language in the host language of the framework. This approach has the same expressive power as BNF and EBNF while using the host language's type system to take advantage of all the existing tooling: linters, formatters, type-checkers, autocomplete, and legacy code support. These tools have practical utility in designing software in general, and GP systems in particular. We also present Meta-Handlers, user-defined overrides of the tree-generation system. This technique extends our object-oriented encoding with more practicality and expressive power than existing CFG approaches, achieving the same expressive power as Attribute Grammars, but without the grammar-versus-target-language duality. Furthermore, we show that this approach is feasible, presenting an example Python implementation as proof. We also compare our approach against textual BNF representations with respect to expressive power and ergonomics. These advantages do not come at the cost of performance, as shown by our empirical evaluation of our example implementation against PonyGE2 on five benchmarks. We conclude that our approach has better ergonomics with the same expressive power and performance as textual BNF-based grammar encodings.
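    The core idea of the abstract, encoding grammar productions as data types of the host language, can be sketched in Python (the implementation language the authors mention). The class names and the naive tree generator below are illustrative assumptions, not the paper's actual framework or API:

```python
import random
from abc import ABC
from dataclasses import dataclass

class Expr(ABC):            # plays the role of the nonterminal <expr>
    pass

@dataclass
class Lit(Expr):            # production: <expr> ::= <int>
    value: int

@dataclass
class Add(Expr):            # production: <expr> ::= <expr> "+" <expr>
    left: Expr
    right: Expr

def generate(depth: int = 3) -> Expr:
    """Naive random tree generation driven by the class hierarchy."""
    if depth <= 0 or random.random() < 0.3:
        return Lit(random.randint(0, 9))
    return Add(generate(depth - 1), generate(depth - 1))

def evaluate(e: Expr) -> int:
    """Interpret a generated tree; a fitness function would call this."""
    if isinstance(e, Lit):
        return e.value
    return evaluate(e.left) + evaluate(e.right)

tree = generate()
result = evaluate(tree)
```

    Because the grammar is ordinary typed code, a type-checker or linter flags an ill-formed individual (e.g. `Add(Lit(1), "x")`) at design time, which is the tooling advantage the abstract claims over textual BNF.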

    Feature learning for stock price prediction shows a significant role of analyst rating

    Data Availability Statement: The code is available from https://mkhushi.github.io/ (accessed on 1 February 2021). Dataset License: CC0.
    The Efficient Market Hypothesis states that stock prices reflect all the information present in the world, and that excess returns cannot be generated by merely analysing trade data that is already available to the public. To further the research rejecting this idea, a rigorous literature review was conducted and a set of five technical indicators and 23 fundamental indicators was identified to establish the possibility of generating excess returns on the stock market. Leveraging these data points and various classification machine learning models, trading data of the 505 equities in the US S&P 500 over the past 20 years was analysed to develop an effective classifier. From any given day, we were able to predict the direction of a 1% change in price up to 10 days into the future. The predictions had an overall accuracy of 83.62%, with a precision of 85% for buy signals and a recall of 100% for sell signals. Moreover, we grouped equities by sector and repeated the experiment to see if grouping similar assets together positively affected the results, but concluded that it showed no significant improvement in performance, rejecting the idea of sector-based analysis. Using feature ranking, we identified an even smaller set of six indicators while maintaining accuracy similar to that of the original 28 features, and uncovered the importance of buy, hold, and sell analyst ratings, which emerged as the top contributors in the model. Finally, to evaluate the effectiveness of the classifier in real-life situations, it was backtested on FAANG (Facebook, Amazon, Apple, Netflix & Google) equities using a modest trading strategy, where it generated high returns of above 60% over the term of the testing dataset.
    In conclusion, our proposed methodology, with its combination of purposefully picked features, shows an improvement over previous studies, and our model predicts the direction of 1% price changes on the 10th day with high confidence and with enough buffer to even build a robotic trading system. This research received no external funding.
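    The pipeline described in this abstract (indicator features, a classifier predicting the direction of a 1% move up to 10 days ahead, and feature ranking) can be sketched as follows. This is a minimal illustration on synthetic data; the study's actual 28 indicators, chosen model, and reported results are not reproduced here, and every variable name is an illustrative assumption:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic feature matrix: rows = trading days, columns = 28 stand-in
# indicators (technical indicators, fundamentals, analyst ratings).
X = rng.normal(size=(500, 28))

# Synthetic label: 1 if the price is at least 1% higher 10 days ahead,
# else 0 (derived from an artificial rule so the example is runnable).
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:400], y[:400])                 # train on the first 400 days
accuracy = clf.score(X[400:], y[400:])    # evaluate on the held-out 100 days

# Feature ranking, mirroring the study's reduction from 28 to 6 indicators
top6 = np.argsort(clf.feature_importances_)[::-1][:6]
```

    In the study, a ranking like `top6` is what surfaced analyst buy/hold/sell ratings as the strongest contributors; a backtest would then trade on `clf.predict` signals over a held-out period.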

    The Language Means of Comicality in Clickbait Headings

    The analysis of material from media discourse demonstrates significant changes in the intentionality of the journalistic text, reflected in establishing contacts so as to grab and retain the reader's attention. This feature of the modern media text manifests itself in changing genre preferences, speech tactics and strategies, and, consequently, in the selection and combination of linguistic means. One manifestation of this trend is the phenomenon of clickbait, a communicative act of promising to continue communication. This article is dedicated to clickbait with the semantics of comicality. The research material, collected from the Russian-language Internet, includes clickbait headings that promise certain funny content. The study revealed that a clickbait model includes the following semantic components: a stimulating utterance by the subject of speech seeking to involve the reader in the humorous nature of the hypertext; verbal and non-verbal markers of the object of laughter; and markers that reflect the Internet user's involvement in the communicative act. The analysis of the relationships between the components of the clickbait model resulted in specifying four types of clickbait headlines: 1) narrative headlines, which invite the reader to laugh at what other readers have already laughed at; 2) offering headlines, which suggest some comic entertainment; 3) allusive clickbaits, which hint at the possibility of continuing amusing reading; and 4) nominative clickbaits, which name the expected laughing reaction to the presentation of some objects.