35 research outputs found

    Interpol: An R package for preprocessing of protein sequences

    Background: Most machine learning techniques currently applied in the literature require input data of fixed dimensionality. However, real input data such as DNA and protein sequences frequently violate this requirement, because they often differ in length due to insertions and deletions. Moreover, numerical encoding of amino acids often improves classification and regression performance compared to the commonly used sparse encoding. Results: The software "Interpol" encodes amino acid sequences as numerical descriptor vectors using a database of currently 532 descriptors (mainly from AAindex), and normalizes sequences to uniform length with one of five linear or non-linear interpolation algorithms. Interpol is distributed as an open-source, platform-independent R package. It is typically used for preprocessing of amino acid sequences for classification or regression. Conclusions: The functionality of Interpol widens the spectrum of machine learning methods that can be applied to biological sequences, and it will in many cases improve their performance in classification and regression.
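The two steps named in the Interpol abstract (numerical encoding of residues, then interpolation to a uniform length) can be illustrated with a minimal Python sketch. This is not Interpol's R API: the function names `encode` and `resample_linear` are hypothetical, the Kyte-Doolittle hydropathy scale is chosen purely for illustration as a single descriptor, and plain linear interpolation stands in for whichever of the five algorithms a user would select.

```python
# Illustrative sketch only; names and choices here are assumptions,
# not the Interpol package's actual interface.

# Kyte-Doolittle hydropathy values, used as one example descriptor.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence, scale=KYTE_DOOLITTLE):
    """Map each residue of an amino acid sequence to its descriptor value."""
    return [scale[residue] for residue in sequence.upper()]


def resample_linear(values, target_len):
    """Stretch or compress a numeric vector to target_len by linear interpolation."""
    n = len(values)
    if n == 0:
        raise ValueError("cannot resample an empty vector")
    if target_len < 1:
        raise ValueError("target_len must be at least 1")
    if n == 1 or target_len == 1:
        # Degenerate cases: nothing to interpolate between.
        return [values[0]] * target_len
    step = (n - 1) / (target_len - 1)  # spacing of new points on the old index axis
    result = []
    for i in range(target_len):
        pos = i * step
        lo = min(int(pos), n - 1)      # left neighbour (clamped against rounding)
        hi = min(lo + 1, n - 1)        # right neighbour
        frac = pos - lo
        result.append(values[lo] * (1 - frac) + values[hi] * frac)
    return result


if __name__ == "__main__":
    # Two sequences of different length end up as vectors of equal length.
    for seq in ("ACDEFGHIK", "MKV"):
        print(seq, resample_linear(encode(seq), 5))
```

After this step, sequences of any length yield vectors of identical dimensionality, which is what fixed-input learners require. The first and last residues keep their exact descriptor values, because the new sample points span the original index range end to end.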

    Homology-based inference sets the bar high for protein function prediction

    Background: Any method that predicts protein function de novo should do better than random. More challenging, it also ought to outperform simple homology-based inference. Methods: Here, we describe a few methods that predict protein function exclusively through homology. Together, they set the bar, or lower limit, for future improvements. Results and conclusions: During the development of these methods, we faced two surprises. Firstly, our most successful implementation for the baseline ranked very high at CAFA1. In fact, our best combination of homology-based methods fared only slightly worse than the top-of-the-line prediction method from the Jones group. Secondly, although the concept of homology-based inference is simple, this work revealed that the precise details of the implementation are crucial: not only did the methods span from top to bottom performers at CAFA, but the reasons for these differences were also unexpected. In this work, we also propose a new rigorous measure to compare predicted and experimental annotations. It puts more emphasis on the details of protein function than the other measures employed by CAFA and may best reflect the expectations of users. Clearly, the definition of proper goals remains one major objective for CAFA.

    Task-specific architecture documentation for developers: Why separation of concerns in architecture documentation is counterproductive for developers

    It is widely agreed that architecture documentation, independent of its form, is necessary to prescribe architectural concepts for development and to conserve architectural information over time. However, architecture documentation is very often perceived as inadequate, too long, too abstract, too detailed, or simply outdated. While developers are tasked with developing certain features or parts of a system, they are confronted with architecture documents that describe the architecture globally and use concepts like separation of concerns. The developers then face the hard task of finding all the information spread across the separated concerns and synthesizing the excerpt relevant to their concrete task. Ideally, they would get an architecture document that is tailored exactly to the architectural information they need for the task at hand. However, architects cannot create such documentation in reasonable time. In this paper, we propose an approach for modeling architecture and automatically synthesizing tailored architecture documentation for each developer and each development task. To this end, architectural concepts are selected from the model based on the task and then interleaved. This makes explicit, for example, all interfaces that a component has to implement in order to comply with concepts such as security and availability. The required modeling and automation are realized in the tool Enterprise Architect. We have already received very positive feedback on this idea from practitioners and expect a significant improvement in implementation quality and architecture compliance.

    Optimierte Agilität: Qualitätssteigerung durch Best Practices (Optimized agility: improving quality through best practices)

    The goal of agile development approaches is the efficient creation of high-quality software products. In agile projects, quality considerations usually focus on the functional completeness and correctness of the software product. To keep the other quality characteristics from suffering, teams can draw on proven best practices from various software engineering disciplines. In many cases, however, these practices must be adapted so that they integrate seamlessly into agile development processes.

    The impact of social class on top managers’ attitudes towards employee downsizing

    In this paper, we examine the impact of top managers' social class on their attitude towards employee downsizing. Mobilizing Bourdieu's concept of social class as a unique social position defined by the combination of economic, cultural, and social capital, we develop hypotheses about the effects of different capital endowments, which we test with unique data on more than 2500 top managers in Germany. We find that both higher economic and higher social capital increase openness towards employee dismissals, while higher cultural capital reduces it. We also find that the overall effect of a top manager's social position is an aggregate of the effects of the individual types of capital: managers with high cultural, low social, and low economic capital are least open to employee dismissals, while those with low cultural, high social, and high economic capital are most open, with the other combinations lying between the two extremes.