
    Data quality: Some comments on the NASA software defect datasets

    Background: Self-evidently, empirical analyses rely upon the quality of their data. Likewise, replications rely upon accurate reporting and upon using the same, rather than similar, versions of datasets. In recent years, there has been much interest in using machine learners to classify software modules into defect-prone and not defect-prone categories. The publicly available NASA datasets have been extensively used as part of this research. Objective: This short note investigates the extent to which published analyses based on the NASA defect datasets are meaningful and comparable. Method: We analyze the five studies published in the IEEE Transactions on Software Engineering since 2007 that have utilized these datasets and compare the two versions of the datasets currently in use. Results: We find important differences between the two versions of the datasets, implausible values in one dataset, and generally insufficient detail documented on dataset preprocessing. Conclusions: It is recommended that researchers 1) indicate the provenance of the datasets they use, 2) report any preprocessing in sufficient detail to enable meaningful replication, and 3) invest effort in understanding the data prior to applying machine learners.

    Banking system soundness is the key to more SME financing. Bruegel Policy Contribution 2013/10, July 2013

    The SME access-to-finance problem is not universal in the European Union and there are reasons for the fall in credit aggregates and higher SME lending rates in southern Europe. Possible market failures, high unemployment and externalities justify making greater and easier access to finance for SMEs a top priority. Previous European initiatives were able to support only a tiny fraction of Europe’s SMEs; merely stepping-up these programmes is unlikely to result in a breakthrough. Without repairing bank balance sheets and resuming economic growth, initiatives to help SMEs get access to finance will have limited success. The European Central Bank can foster bank recapitalisation by performing in the toughest possible way the asset quality review before it takes over the single supervisory role. Of the possible initiatives for fostering SME access to finance, a properly designed scheme for targeted central bank lending seems to be the best complement to the banking clean-up, but other options, such as increased European Investment Bank lending and the promotion of securitisation of SME loans, should also be explored

    Mining Domain Knowledge: Using Functional Dependencies to Profile Data

    Poor data quality is one of the primary issues facing big data projects. Cleaning data and improving quality can be expensive and time-intensive. In data warehouse projects, data cleaning is estimated to account for 30% to 80% of the project's development time and budget. Data quality mining is one method used to identify errors that has become increasingly popular in the past 20 years. Our research-in-progress aims to identify multi-field errors via the mining of functional dependencies. Existing research on data quality mining and functional dependencies has focused on improving algorithms to identify a higher percentage of complex errors. The proposed process strives to introduce an efficient method for expediting error identification and increasing a user's domain knowledge in order to reduce the costs associated with cleaning; the process will also include an assessment of when further cleaning is unlikely to be cost effective.
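
    As a minimal sketch of how a functional dependency can expose multi-field errors, the snippet below checks one candidate dependency (`zip` determines `city`) against a handful of records and reports the groups that contradict it. The field names, the sample records, and the helper `fd_violations` are hypothetical illustrations, not the process proposed in the abstract, which mines dependencies rather than checking a given one.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Return {lhs values: set of rhs values} for groups that break lhs -> rhs."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[col] for col in lhs)
        seen[key].add(row[rhs])
    # A dependency lhs -> rhs holds only if every lhs key maps to one rhs value.
    return {key: values for key, values in seen.items() if len(values) > 1}


# Hypothetical sample data: the third record conflicts with the first two.
records = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "90001", "city": "Los Angeles"},
]

violations = fd_violations(records, ["zip"], "city")
```

    Here `violations` contains a single entry for the key `("10001",)`, pointing a reviewer at the records where `zip` and `city` disagree instead of requiring a field-by-field inspection.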

    ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

    Entity resolution (ER), an important and common data cleaning problem, is about detecting duplicate representations of the same external entities and merging them into single representations. Relatively recently, declarative rules called "matching dependencies" (MDs) have been proposed for specifying similarity conditions under which attribute values in database records are merged. In this work we show the process and the benefits of integrating four components of ER: (a) building a classifier for duplicate/non-duplicate record pairs using machine learning (ML) techniques; (b) the use of MDs for supporting the blocking phase of ML; (c) record merging on the basis of the classifier results; and (d) the use of the declarative language "LogiQL" (an extended form of Datalog supported by the "LogicBlox" platform) for all activities related to data processing, and for the specification and enforcement of MDs. Comment: Final journal version, with some minor technical corrections. Extended version of arXiv:1508.0601
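
    The blocking phase mentioned in (b) can be illustrated with a short sketch: records are grouped by a cheap key, and only records in the same block become candidate pairs for the classifier. The blocking key used here (the first two letters of the name), the sample records, and the helper `candidate_pairs` are assumptions for illustration; ERBlox itself derives blocks from MDs expressed in LogiQL, which this Python sketch does not reproduce.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, block_key):
    """Group records by block_key and return id pairs that share a block."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec)].append(rec["id"])
    pairs = []
    for ids in blocks.values():
        # Only records inside the same block are compared with each other.
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


# Hypothetical records; ids 1 and 2 are plausible duplicates.
records = [
    {"id": 1, "name": "Smith, John"},
    {"id": 2, "name": "Smith, Jon"},
    {"id": 3, "name": "Stone, Ann"},
    {"id": 4, "name": "Brown, Al"},
]

pairs = candidate_pairs(records, lambda r: r["name"][:2].lower())
```

    With four records there are six possible pairs, but blocking leaves only the pair `(1, 2)` for the (more expensive) duplicate/non-duplicate classification step.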

    Possible and certain SQL keys


    We Make the Spring Rolls, They Make Their Own Rules: Filipina Domestic Workers’ Fight for Labor Rights in New York City and Los Angeles

    This article provides a multidimensional examination of Filipina domestic workers’ efforts to promote workers’ rights nationally and globally. Through their own experiences as transnational workers, Filipina activists were able to translate their knowledge of labor dynamics into practical and effective tactics such as the demand for labor contracts as an industry standard. Combining ethnographic research and interviews conducted with New York–based Filipina domestic worker activists with primary and secondary sources from Los Angeles, recent advocacy work in New York is compared with efforts in Los Angeles and California more broadly. Key points of comparison—demographics and organizing histories, geography and usage of public space, and political contexts and legislation—illuminate significant divergences and continuities between the two regions. The marchers participated in the first National Domestic Worker Congress, forging alliances with workers from across the country and taking to the streets to support the proposed New York State Domestic Workers Bill of Rights (see Figure 1). Domestic worker rights organizing within the United States and globally has become a leading example of a multiracial women-led labor movement. New York’s Domestic Workers United (DWU) emerged as a leader in promoting successful strategies and network building, developing out of Committee Against Anti-Asian Violence (CAAAV): Organizing Asian Communities’ Kalayaan [Freedom] Women Workers’ Project. (Kalayaan is a common name under which Filipinas/os have organized globally for women workers’ rights.) CAAAV’s Women Workers’ Project (WWP) was initially led primarily by Filipina domestic workers but incubated into DWU, a multiracial organization that has succeeded in changing local and state labor laws and coordinating domestic worker rights organizing across the country. 
Immigrant women generally have played crucial leadership roles in DWU, which has reported that 99 percent of domestic workers in New York City were foreign-born and 76 percent were not U.S. citizens (Domestic Workers United and Data Center, 2006, 10). Through their efforts, domestic workers in general and, more specifically, the Filipina workers in New York and Los Angeles analyzed here have countered popular assumptions that they are satisfied with their conditions or too isolated to change them. Because U.S. domestic workers are denied the right to organize and lack many other labor protections, they have generally had access to few legal means to protect themselves. Moreover, until the efforts of groups such as DWU, domestic workers were often dismissed as “unorganizable” by unions because of the highly gendered, private, and isolating character of the job (Mercado and Poo, 2009, 9). Household employment produces situations in which workers are not able to claim the value of their work, offering a striking repetition of the public/private division that so long led to the undervaluation of domestic labor, paid or unpaid. However, the dual processes of feminization and casualization of work in the United States, in conjunction with women’s increasing visibility within labor and feminist movements, have produced new opportunities for women to organize. By using a range of tactics, such as “[mobilizing] public opinion, political action, and community organizing,” women are working within, in alliance with, and outside of unions (Cobble, 2007). Although street protests like the one shown in Figure 1 publicly expose domestic workers’ frustrations with their ongoing exclusion from a range of federal labor protections and civil rights laws, their turn toward activism is often based in personal experiences that spill beyond the individual workplace. 
On a Saturday morning during the spring of 2009, I met with CAAAV’s WWP organizer Carolyn De Leon and five WWP members at a coffee shop on Manhattan’s Upper East Side. All six were initially drawn to the project because of their own negative experiences as workers or their concern for friends and other women in their community. In 2004, Nancy Vedic’s employer attempted to forcibly send her back to the Philippines when he terminated her employment after she complained about working conditions, which included ninety hours a week for two to three dollars an hour. She was able to escape at the last minute when De Leon and other members met her outside her employers’ building, quickly taking her bags when her employer went back inside for a moment and escorting her to De Leon’s apartment. With the support of CAAAV’s WWP, Vedic brought a lawsuit against her former employer for back pay that gained local news coverage (Casimir and Shin, 2004, 8). Similarly, Nita Asuncion, after working for a family for seven years in Hong Kong and for seven more years in the United States, was fired. Her employers offered to send her “maybe one hundred dollars, maybe one hundred and fifty dollars a month” if she went back to the Philippines. These experiences drew both women to CAAAV’s WWP, but they continued to participate because they recognized that their experiences were not unique. As Asuncion stated, “We make the spring rolls, they make their own rules.” This comment received resounding laughter from the group, suggesting their familiarity with a dynamic faced by Filipina domestic workers in cities around the globe. Because of the international scope of Filipina employment in domestic work, transnational practices are key to analyzing their organizing in New York City and Los Angeles. Workers often share migration experiences whatever their location, particularly in light of U.S.-Philippine state relations and labor policies. 
Thus activists collaborate under large umbrella organizations such as the General Assembly Binding Women for Reforms, Integrity, Equality, Leadership, and Action (GABRIELA) and the recently formed National Domestic Workers Alliance. Nonetheless, local contexts inform different histories and current activities in Los Angeles and New York City. Combining ethnographic research and interviews conducted with New York–based Filipina domestic worker activists with primary and secondary sources from Los Angeles, I compare recent advocacy work in New York with that in Los Angeles and California more broadly. Efforts on both coasts are read through the ongoing experience of Filipina domestic workers as transnational laborers and the growing efforts on national and international levels to promote domestic workers’ rights. Key points of comparison—demographics and organizing histories, geography and usage of public space, and political contexts and legislation—illuminate significant divergences and continuities between the two regions

    Applying the FAHP to Improve the Performance Evaluation Reliability of Software Defect Classifiers

    Today's software complexity makes developing defect-free software almost impossible. Consequently, developing classifiers that sort software modules into defective and non-defective before release has attracted great interest in academia and the software industry alike. Although many classifiers have been proposed, none has been proven superior to the others. The major reason is that while one study shows that classifier A is better than classifier B, other studies show the opposite. These conflicts are usually triggered when researchers report results using their preferred performance evaluation measures, such as recall and precision. Although this approach is valid, it does not examine all possible facets of a classifier's performance characteristics. Thus, the performance evaluation might improve or deteriorate if researchers choose other performance measures. As a result, software developers usually struggle to select the most suitable classifier to use in their projects. The goal of this paper is to apply the fuzzy analytical hierarchy process (FAHP), a popular multicriteria decision-making technique, to reliably evaluate classifiers' performance. This evaluation framework incorporates a wider spectrum of performance measures to evaluate classifier performance rather than relying on selected preferred measures. The results show that this approach will increase software developers' confidence in research outcomes and help them avoid false conclusions and infer reasonable boundaries for them. We exploited 22 popular performance measures and 11 software defect classifiers. The analysis was carried out using the KNIME data mining platform and 12 software defect data sets provided by the NASA metrics data program (MDP) repository. https://doi.org/10.1109/ACCESS.2019.291596
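
    The conflict described above, where classifier A beats B on one measure and loses on another, can be shown with a small arithmetic sketch. The confusion-matrix counts below are invented for illustration and do not come from the paper or from the NASA MDP data; the sketch also does not implement FAHP, only the ranking flip that motivates using many measures.

```python
def precision(tp, fp):
    """Fraction of modules flagged as defective that really are defective."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of truly defective modules that the classifier flags."""
    return tp / (tp + fn)


# Hypothetical counts for two classifiers on the same 80 defective modules.
a = {"tp": 40, "fp": 10, "fn": 40}
b = {"tp": 60, "fp": 60, "fn": 20}

scores = {
    "A": (precision(a["tp"], a["fp"]), recall(a["tp"], a["fn"])),
    "B": (precision(b["tp"], b["fp"]), recall(b["tp"], b["fn"])),
}
# A: precision 0.80, recall 0.50; B: precision 0.50, recall 0.75
```

    A study that reports only precision would rank A first, while one that reports only recall would rank B first, which is the kind of disagreement a multicriteria evaluation is meant to expose.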

    On the Limits of Liberalism in Participatory Environmental Governance: Conflict and Conservation in Ukraine's Danube Delta

    Participatory management techniques are widely promoted in environmental and protected area governance as a means of preventing and mitigating conflict. The World Bank project that created Ukraine’s Danube Biosphere Reserve included such ‘community participation’ components. The Reserve, however, has been involved in conflicts and scandals in which rumour, denunciation and prayer have played a prominent part. The cases described in this article demonstrate that the way conflict is escalated and mitigated differs according to foundational assumptions about what ‘the political’ is and what counts as ‘politics’. The contrasting forms of politics at work in the Danube Delta help to explain why a 2005 World Bank assessment report could only see failure in the Reserve’s implementation of participatory management, and why liberal participatory management approaches may founder when introduced in settings where relationships are based on non-liberal political ontologies. The author argues that environmental management needs to be rethought in ways that take ontological differences seriously rather than assuming the universality of liberal assumptions about the individual, the political and politics