6 research outputs found

    Mining Task-Specific Lines of Code Counters

    No full text
    Context: Lines of code (LOC) is a fundamental software code measure that is widely used as a proxy for software development effort or as a normalization factor in many other software-related measures (e.g., defect density). Unfortunately, it is not clear which lines of code should be counted: all of them, or only specific ones depending on the project context and the task in mind. Objective: To design a generator of task-specific LOC measures and their counters, mined directly from data, that optimize the correlation between the LOC measures and the variables they proxy for (e.g., code-review duration). Method: We use Design Science Research as our research methodology to build and validate a generator of task-specific LOC measures and their counters. The generated LOC counters take the form of binary decision trees inferred from historical data using Genetic Programming. The proposed tool was validated on three tasks, i.e., mining LOC measures to proxy for code readability, number of assertions in unit tests, and code-review duration. Results: Task-specific LOC measures showed a “strong” to “very strong” negative correlation with code-readability score (Kendall’s τ ranging from −0.83 to −0.76) compared to a “weak” to “strong” negative correlation for the best among the standard LOC measures (τ ranging from −0.36 to −0.13). For the problem of proxying for the number of assertions in unit tests, correlation coefficients were also higher for task-specific LOC measures by ca. 11% to 21% (τ ranged from 0.31 to 0.34). Finally, task-specific LOC measures showed a stronger correlation with code-review duration than the best among the standard LOC measures (τ = 0.31, 0.36, and 0.37 compared to 0.11, 0.08, and 0.16, respectively). Conclusions: Our study shows that it is possible to mine task-specific LOC counters from historical datasets using Genetic Programming. Task-specific LOC measures obtained that way show stronger correlations with the variables they proxy for than the standard LOC measures.
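
    To make the comparison in the abstract above concrete, the following minimal sketch scores two hand-written LOC counters by their Kendall's τ with a target variable. It does not reproduce the paper's method: no Genetic Programming or decision-tree inference is performed, ties are not handled (this is the simple tau-a variant), and the function names, the comment-skipping rule, the snippets, and the review durations are all hypothetical illustrations.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        prod = (x1 - x2) * (y1 - y2)
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
        score += (prod > 0) - (prod < 0)
    n = len(xs)
    return score / (n * (n - 1) / 2)


def count_all_lines(source):
    """Standard LOC measure: every physical line counts."""
    return len(source.splitlines())


def count_code_lines(source):
    """Hypothetical task-specific rule: skip blank lines and '#' comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


# Hypothetical toy data: four snippets and made-up review durations (minutes).
snippets = [
    "a = 1\n",
    "# note\n" * 5 + "b = 2\nc = 3\n",
    "d = 4\ne = 5\nf = 6\n",
    "# note\n\ng = 7\nh = 8\ni = 9\nj = 10\n",
]
review_minutes = [5, 8, 12, 20]

all_counts = [count_all_lines(s) for s in snippets]
code_counts = [count_code_lines(s) for s in snippets]

tau_all = kendall_tau_a(all_counts, review_minutes)
tau_code = kendall_tau_a(code_counts, review_minutes)
print(f"all lines:  tau = {tau_all:.2f}")
print(f"code lines: tau = {tau_code:.2f}")
```

    On this toy data the comment-skipping counter tracks the target more closely than the count of all lines, which mirrors the kind of gap the abstract reports, although the numbers here carry no empirical weight.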

    1.45 Å resolution crystal structure of recombinant PNP in complex with a pM multisubstrate analogue inhibitor bearing one feature of the postulated transition state

    No full text
    Low molecular mass purine nucleoside phosphorylases (PNPs, E.C. 2.4.2.1) are homotrimeric enzymes that are tightly inhibited by immucillins. Due to the positive charge on the ribose-like part (iminoribitol moiety) and protonation of the N7 atom of the purine ring, immucillins are believed to act as transition state analogues. Over a wide range of concentrations, immucillins bind with strong negative cooperativity to PNPs, so that only every third binding site of the enzyme is occupied (third-of-the-sites binding). 9-(5',5'-difluoro-5'-phosphonopentyl)-9-deazaguanine (DFPP-DG) shares with immucillins the protonation of the N7, but not the positive charge on the ribose-like part of the molecule. We have previously shown that DFPP-DG interacts with PNPs with a subnanomolar inhibition constant. Here, we report additional biochemical experiments demonstrating that the inhibitor can be bound with the same Kd (approximately 190 pM) to all three substrate binding sites of the trimeric PNP, and a crystal structure of PNP in complex with DFPP-DG at 1.45 Å resolution, the highest resolution published for PNPs so far. The crystals contain the full PNP homotrimer in the asymmetric unit. DFPP-DG molecules are bound in a superimposable manner and with full occupancies to all three PNP subunits. Thus, the postulated third-of-the-sites binding of immucillins should rather be attributed to the second feature of the transition state, the ribooxocarbenium ion character of the ligand, or to the coexistence of both features characteristic of the transition state. The DFPP-DG/PNP complex structure confirms the earlier observations that the loop from Pro57 to Gly66 covering the phosphate-binding site cannot be stabilized by phosphonate analogues. The loop from Glu250 to Gln266 covering the base-binding site is organized by the interactions of Asn243 with the Hoogsteen edge of the purine base of analogues bearing one feature of the postulated transition state (protonated N7 position).

    Activities of Topoisomerase I in Its Complex with SRSF1

    No full text
    Human DNA topoisomerase I (topo I) catalyzes DNA relaxation and phosphorylates SRSF1. Whereas the structure of topo I complexed with DNA has been resolved, the structure of topo I in complex with SRSF1 and the structural determinants of topo I activities in this complex are not known. The main obstacle to resolving the structure is the contribution of unfolded domains of topo I and SRSF1 to the formation of the complex. To overcome this difficulty, we employed a three-step strategy: identifying the interaction regions, modeling the complex, and validating the model with biochemical methods. The binding sites in both topo I and SRSF1 are localized in the structured regions as well as in the unfolded domains. The binding sites cooperate in topo I but not in SRSF1. Our results indicate two features of the unfolded RS domain of SRSF1 containing phosphorylated residues that are critical for the kinase activity of topo I: its spatial arrangement relative to topo I and the organization of its sequence. The efficiency of phosphorylation of SRSF1 depends on the length and flexibility of the spacer between the two RRM domains, which uniquely determine the arrangement of the RS domain relative to topo I. The spacer also influences inhibition of DNA nicking, a prerequisite for DNA relaxation. To be phosphorylated, the RS domain has to include a short sequence recognized by topo I. The lack of this sequence in the mutants of SRSF1, or its spatial inaccessibility in SRSF9, makes them inadequate as topo I/kinase substrates.