MARIO: A Versatile and User-Friendly Software for Building Input-Output Models
MARIO (Multi-Regional Analysis of Regions through Input-Output) is a Python-based framework for building input-output models. It automates the parsing of well-known databases (e.g. EXIOBASE, EORA, Eurostat) and of customized tables. Compared with similar tools such as pymrio, it broadens the scope of application to supply-use tables and handles both monetary and physical units. Through an intuitive Excel-based API, it facilitates advanced table manipulations and allows additional supply chains to be modelled with a hybrid LCA approach. It provides built-in functions for footprinting and scenario analyses as well as for visualizing model outcomes. Results can be exported in various formats, optionally supplemented by a metadata file that tracks the full history of applied changes. MARIO comes with extensive documentation, is available on Zenodo and GitHub, and can be installed via PyPI.
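As an illustration of the kind of calculation an input-output model performs, the sketch below solves the standard Leontief quantity model x = A x + y for total output x. It is plain Python and does not use MARIO's API; the two-sector coefficient matrix `A` and final demand `y` are hypothetical values chosen for the example.

```python
def leontief_output(A, y, tol=1e-12, max_iter=10_000):
    """Solve x = A x + y by fixed-point iteration.

    This is equivalent to summing the power series of the Leontief
    inverse (I - A)^-1 applied to y, and converges when the spectral
    radius of A is below 1.
    """
    n = len(y)
    x = list(y)
    for _ in range(max_iter):
        x_new = [y[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge; check the coefficients in A")


# Hypothetical technical coefficients: A[i][j] is the input from
# sector i needed per unit of output of sector j.
A = [[0.2, 0.3],
     [0.1, 0.4]]
y = [100.0, 50.0]  # hypothetical final demand per sector

x = leontief_output(A, y)
print([round(v, 3) for v in x])
```

For realistic table sizes a linear solver would normally replace the iteration; the loop is used here only to keep the example dependency-free.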
ASSESSING THE QUALITY OF SOFTWARE DEVELOPMENT TUTORIALS AVAILABLE ON THE WEB
Both expert and novice software developers frequently access software development resources available on the Web in order to look up or learn new APIs, tools and techniques. Software quality is affected negatively when developers fail to find high-quality information relevant to their problem. While a substantial amount of freely available resources can be accessed online, some of them contain information that is error-prone, infringes copyright, raises security concerns, or targets incompatible versions. Use of such toxic information can have a strong negative effect on developers' efficacy. This dissertation focuses specifically on software tutorials, aiming to automatically evaluate the quality of such documents available on the Web. In order to achieve this goal, we present two contributions: 1) scalable detection of duplicated code snippets; 2) automatic identification of valid version ranges.
Software tutorials consist of a combination of source code snippets and natural language text. The code snippets in a tutorial can originate from different sources, perhaps carrying stringent licensing requirements or known security vulnerabilities. Developers, typically unaware of this, can reuse these code snippets in their project. First, in this thesis, we present our work on a Web-scale code clone search technique that is able to detect duplicate code snippets between large scale document and source code corpora in order to trace toxic code snippets.
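One common way to detect duplicated code snippets is to compare sets of hashed token windows (shingles). The sketch below illustrates that general idea only; it is not the technique developed in the thesis, and the function names, the tokenizer regex, and the window size `k` are assumptions made for the example.

```python
import hashlib
import re


def token_shingles(code, k=5):
    """Return the set of hashes of all k-token windows in a snippet."""
    # Crude tokenizer: identifiers, integers, or any single non-space character.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    windows = max(len(tokens) - k + 1, 1)
    return {
        hashlib.sha1(" ".join(tokens[i:i + k]).encode()).hexdigest()
        for i in range(windows)
    }


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


tutorial_snippet = "for (int i = 0; i < n; i++) { sum += a[i]; }"
project_snippet = "for (int i=0;i<n;i++){sum+=a[i];}"   # same code, different spacing
unrelated_snippet = "print('hello world')"

same = jaccard(token_shingles(tutorial_snippet), token_shingles(project_snippet))
different = jaccard(token_shingles(tutorial_snippet), token_shingles(unrelated_snippet))
print(same, different)
```

Because the comparison works on tokens rather than raw text, whitespace changes do not affect the score. Scaling this to Web-size corpora would need an index over the hashes, which is outside the scope of this sketch.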
As software libraries and APIs evolve over time, existing software development tutorials can become outdated. It is difficult for software developers, and especially novices, to determine the software version that a specific tutorial implicitly expects, and therefore to decide whether the tutorial applies to their software development environment. To overcome this challenge, in this thesis we present a novel technique for automatic identification of the valid version range of software development tutorials on the Web.
ADVANTAGES OF USING OBJECT-ORIENTED TECHNOLOGIES IN MODELING COSTS
The project was created to help managers whose objective is to optimize the use of resources so that they obtain the desired profit. In the first paragraphs we presented the theoretical concepts underlying the application. We pointed out the need to move to object-oriented programming, underlining the main advantages that made us choose this type of programming. Next, we showed the importance of production cost in the decision process and the methods for calculating it.
Keywords: object-oriented technologies, cost management, optimization, linear programming, object-oriented programming
Learning to predict closed questions on Stack Overflow
The paper deals with the problem of predicting whether a user's question will be closed by a moderator on Stack Overflow, a popular question-answering service devoted to software programming. The task, along with data and evaluation metrics, was offered as an open machine learning competition on the Kaggle platform. To solve this problem, we employed a wide range of classification features related to users, their interactions, and post content. Classification was carried out using several machine learning methods. According to the results of the experiment, the most important features are characteristics of the user and topical features of the question. The best results were obtained using Vowpal Wabbit, an implementation of online learning based on stochastic gradient descent. Our results are among the best in the overall ranking, although they were obtained after the official competition was over.
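Online learning with stochastic gradient descent updates the model after every single example instead of after a pass over the whole data set. The sketch below shows this for logistic regression in plain Python. It is not Vowpal Wabbit and does not reproduce the paper's features or data: the two features (a scaled reputation score and a has-code flag) and the four training examples are hypothetical toy values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Estimated probability that the question gets closed."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def sgd_logistic(examples, n_features, lr=0.5, epochs=50):
    """Train logistic regression with one gradient step per example."""
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, label in examples:
            g = predict(w, b, x) - label          # gradient of log loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


# Hypothetical toy data: ([scaled_reputation, has_code_block], closed?)
examples = [
    ([0.1, 0.0], 1),
    ([0.2, 0.0], 1),
    ([0.9, 1.0], 0),
    ([0.8, 1.0], 0),
]

w, b = sgd_logistic(examples, n_features=2)
p_closed = predict(w, b, [0.15, 0.0])   # low reputation, no code
p_open = predict(w, b, [0.85, 1.0])     # high reputation, with code
print(round(p_closed, 3), round(p_open, 3))
```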
Unified functional network and nonlinear time series analysis for complex systems science: The pyunicorn package
We introduce the pyunicorn (Pythonic unified complex network and recurrence analysis toolbox) open source software package for applying and combining modern methods of data analysis and modeling from complex network theory and nonlinear time series analysis. pyunicorn is a fully object-oriented and easily parallelizable package written in the language Python. It allows for the construction of functional networks such as climate networks in climatology or functional brain networks in neuroscience representing the structure of statistical interrelationships in large data sets of time series and, subsequently, investigating this structure using advanced methods of complex network theory such as measures and models for spatial networks, networks of interacting networks, node-weighted statistics or network surrogates. Additionally, pyunicorn provides insights into the nonlinear dynamics of complex systems as recorded in uni- and multivariate time series from a non-traditional perspective by means of recurrence quantification analysis (RQA), recurrence networks, visibility graphs and construction of surrogate time series. The range of possible applications of the library is outlined, drawing on several examples mainly from the field of climatology.
Comment: 28 pages, 17 figures
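To make the idea behind recurrence quantification analysis concrete, the sketch below builds a recurrence matrix for a univariate series and computes the recurrence rate, the simplest RQA measure. It is plain Python and does not use pyunicorn's API; the function names, the threshold `eps`, and the toy series are assumptions made for the example, and no time-delay embedding is applied.

```python
def recurrence_matrix(series, eps):
    """R[i][j] = 1 when the states at times i and j are within eps of each other."""
    n = len(series)
    return [
        [1 if abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def recurrence_rate(R):
    """Fraction of recurrent points in the matrix (main diagonal included)."""
    n = len(R)
    return sum(map(sum, R)) / (n * n)


# Hypothetical period-2 signal: every state recurs two steps later.
series = [0.0, 1.0, 0.0, 1.0]
R = recurrence_matrix(series, eps=0.1)
rate = recurrence_rate(R)
print(rate)
```

Some definitions of the recurrence rate exclude the main diagonal, since every state trivially recurs with itself; the version above keeps it for simplicity.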
fastmat: efficient linear transforms in Python
Scientific computing requires handling large linear models, which are often composed of structured matrices. With increasing model size, dense representations quickly become infeasible to compute or store. Matrix-free implementations mitigate this problem at the cost of additional implementation overhead, which can delay research and development by months when applied to practical research problems. Fastmat is a framework for handling large structured matrices by offering an easy-to-use abstraction model. It allows matrix-free linear operators to be expressed in a mathematically intuitive way while retaining their benefits in computational performance and memory efficiency. A built-in hierarchical unit-test system boosts debugging productivity, and run-time execution path optimization improves the performance of highly structured operators. The architecture is completed with an interface for abstractly describing algorithms that apply such matrix-free linear operators, while maintaining clear separation of their respective implementation levels. Fastmat establishes a close relationship between implementation code and the actual mathematical notation of a given problem, promoting readable, portable and reusable scientific code.
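A matrix-free operator stores only the parameters that define a structured matrix and applies it to a vector without ever building the dense array. The sketch below illustrates this for a circulant matrix in plain Python. The class `CirculantOperator` and its methods are hypothetical names for this example and are not fastmat's API.

```python
class CirculantOperator:
    """Matrix-free circulant operator defined by its first column c.

    The dense matrix would have entries C[i][j] = c[(i - j) mod n], so
    only n numbers are stored instead of n * n.
    """

    def __init__(self, c):
        self.c = list(c)
        self.n = len(self.c)

    def matvec(self, x):
        """Compute C @ x directly from the first column."""
        n = self.n
        return [sum(self.c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]

    def to_dense(self):
        """Build the full matrix; only for checking small cases."""
        n = self.n
        return [[self.c[(i - j) % n] for j in range(n)] for i in range(n)]


op = CirculantOperator([1.0, 2.0, 3.0])
x = [1.0, 0.0, 2.0]

fast = op.matvec(x)
dense = op.to_dense()
reference = [sum(row[j] * x[j] for j in range(op.n)) for row in dense]
print(fast, reference)
```

This version saves memory but still takes a quadratic number of operations. Circulant products can also be computed with the fast Fourier transform, which is the kind of structure-specific speed-up a matrix-free framework can exploit.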