
    Personalized First Issue Recommender for Newcomers in Open Source Projects

    Many open source projects provide good first issues (GFIs) to attract and retain newcomers. Although several automated GFI recommenders have been proposed, existing recommenders are limited to recommending generic GFIs without considering differences between individual newcomers. However, we observe mismatches between generic GFIs and the diverse backgrounds of newcomers, resulting in failed attempts, discouraged onboarding, and delayed issue resolution. To address this problem, we assume that personalized first issues (PFIs) for newcomers could help reduce the mismatches. To justify the assumption, we empirically analyze 37 newcomers and their first issues resolved across multiple projects. We find that the first issues resolved by the same newcomer share similarities in task type, programming language, and project domain. These findings underscore the need for a PFI recommender to improve over state-of-the-art approaches. For that purpose, we identify features that influence newcomers' personalized selection of first issues by analyzing the relationship between possible features of the newcomers and the characteristics of the newcomers' chosen first issues. We find that the expertise preference, OSS experience, activeness, and sentiment of newcomers drive their personalized choice of first issues. Based on these findings, we propose a Personalized First Issue Recommender (PFIRec), which employs LambdaMART to rank candidate issues for a given newcomer by leveraging the identified influential features. We evaluate PFIRec using a dataset of 68,858 issues from 100 GitHub projects. The evaluation results show that PFIRec outperforms existing first issue recommenders, potentially doubling the probability that the top recommended issue is suitable for a specific newcomer and reducing a newcomer's unsuccessful attempts to identify suitable first issues by one third, in the median.
    Comment: The 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023)
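    To make the ranking step concrete, below is a minimal LambdaMART-style sketch using LightGBM's LGBMRanker; the four features and the toy data are illustrative assumptions, not PFIRec's actual feature schema.

```python
# A minimal sketch of LambdaMART-style ranking with LightGBM's LGBMRanker.
# The four features (expertise match, OSS experience, activeness, sentiment)
# and the toy data are illustrative assumptions, not PFIRec's real schema.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)

# Toy training data: each row is a (newcomer, candidate issue) pair.
X = rng.random((100, 4))               # feature vector per pair
y = rng.integers(0, 2, 100)            # 1 if the newcomer resolved that issue
group = [10] * 10                      # 10 newcomers x 10 candidate issues each

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50)
ranker.fit(X, y, group=group)

# Rank unseen candidate issues for one newcomer: higher score = better fit.
candidates = rng.random((10, 4))
scores = ranker.predict(candidates)
print("top recommended issue index:", int(np.argmax(scores)))
```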

    Influence of Social Network Integration on the Online Review Helpfulness

    Online consumer reviews are important for consumers when they make purchasing decisions. However, the large volume of online reviews makes it difficult for consumers to identify the helpful ones. The factors influencing online review helpfulness have drawn great attention from different research fields. In recent years, online review websites have started to exhibit more features of social media; for example, some websites allow users to link their accounts with other social media accounts. The influence of such social factors, however, has rarely been studied in the literature. Drawing on a dataset from Qunar.com, this paper explores how social network integration and reviewer network centrality influence online review helpfulness through a negative binomial regression model. Our results show that both factors have a positive effect on review helpfulness, and that network centrality positively moderates the effect of social network integration. Our research results provide important implications for reviewers, industry practitioners, and online review websites.
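    As a rough illustration of the modeling approach, the sketch below fits a negative binomial regression with the moderation (interaction) term using statsmodels; the variable names and toy data are hypothetical, since the Qunar.com dataset is not reproduced here.

```python
# A hedged sketch of a negative binomial regression with an interaction term;
# variable names and data are hypothetical stand-ins for the paper's dataset.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "helpful_votes": rng.poisson(3, 200),        # count outcome: helpfulness votes
    "sns_integrated": rng.integers(0, 2, 200),   # 1 if reviewer linked a social account
    "network_centrality": rng.random(200),       # reviewer's position in the network
})

# Main effects plus the interaction testing whether network centrality
# moderates the effect of social network integration on helpfulness.
model = smf.negativebinomial(
    "helpful_votes ~ sns_integrated * network_centrality", data=df
).fit()
print(model.summary())
```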

    From Smart Meter Data to Pricing Intelligence -- Visual Data Mining towards Real-Time BI

    The deployment of smart metering in the electricity industry has opened up opportunities for real-time BI-enabled innovative business applications, such as demand response. Taking a holistic view of BI, this study introduces a visual data mining driven application to exemplify the potential of real-time BI for electricity businesses. The empirical findings indicate that such an application is capable of extracting actionable insights about customers' electricity consumption patterns, turning timely measured data into pricing intelligence. Based on the findings, we propose a real-time BI framework and discuss how it can facilitate the formulation of strategic initiatives for transforming an electricity utility towards sustainable growth. Our research follows the design science research paradigm; by addressing an emerging issue in the problem domain, it adds empirical knowledge to the BI research landscape.
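    As one hedged example of what such visual data mining could look like, the sketch below clusters daily load profiles with k-means and plots the centroid patterns; the data, cluster count, and normalization are assumptions, not the study's actual pipeline.

```python
# Illustrative sketch: cluster daily smart-meter load profiles and visualize
# the typical consumption patterns. Data and parameters are assumptions.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Toy data: 500 customers x 24 hourly readings (normalized) for one day.
rng = np.random.default_rng(42)
profiles = rng.random((500, 24))

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(profiles)

# Each centroid curve is a candidate consumption pattern that an analyst
# could inspect when designing demand-response or pricing initiatives.
for k, centroid in enumerate(kmeans.cluster_centers_):
    plt.plot(range(24), centroid, label=f"pattern {k}")
plt.xlabel("hour of day")
plt.ylabel("normalized consumption")
plt.legend()
plt.show()
```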

    FIN-DM: A Data Mining Process Model for Financial Services

    Data mining is a set of rules, processes, and algorithms that allow companies to increase revenues, reduce costs, optimize products and customer relationships, and achieve other business goals by extracting actionable insights from the data they collect on a day-to-day basis. Data mining and analytics projects require a well-defined methodology and processes. Several standard process models for conducting data mining and analytics projects are available. Among them, the most notable and widely adopted standard model is CRISP-DM. It is industry-agnostic and is often adapted to meet sector-specific requirements. Industry-specific adaptations of CRISP-DM have been proposed across several domains, including healthcare, education, industrial and software engineering, and logistics. However, until now, there has been no adaptation of CRISP-DM for the financial services industry, which has its own set of domain-specific requirements. This PhD thesis addresses this gap by designing, developing, and evaluating a sector-specific data mining process for financial services (FIN-DM). The thesis investigates how standard data mining processes are used across various industry sectors and in financial services. The examination identified a number of adaptation scenarios of traditional frameworks. It also suggested that these approaches do not pay sufficient attention to turning data mining models into software products integrated into organizations' IT architectures and business processes. In the financial services domain, the main discovered adaptation scenarios concerned technology-centric (scalability), business-centric (actionability), and human-centric (mitigating discriminatory effects) aspects of data mining. Next, a case study in an actual financial services organization revealed 18 perceived gaps in the CRISP-DM process. Using the data and results from these studies, the thesis outlines an adaptation of CRISP-DM for the financial sector, named the Financial Industry Process for Data Mining (FIN-DM). FIN-DM extends CRISP-DM to support privacy-compliant data mining, to tackle AI ethics risks, to fulfill risk management requirements, and to embed quality assurance as part of the data mining life cycle.
    https://www.ester.ee/record=b547227
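    Purely as an illustrative sketch, the relationship between the standard CRISP-DM phases and the four cross-cutting extensions described above could be captured as follows; the labels paraphrase the abstract and are not FIN-DM's actual notation.

```python
# Hypothetical sketch: CRISP-DM's six standard phases plus the four
# cross-cutting concerns that FIN-DM adds, as described in the abstract.
# All names are illustrative paraphrases, not FIN-DM's official terminology.
CRISP_DM_PHASES = [
    "business understanding",
    "data understanding",
    "data preparation",
    "modeling",
    "evaluation",
    "deployment",
]

FIN_DM_EXTENSIONS = {
    "privacy": "privacy-compliant data mining",
    "ai_ethics": "identification and mitigation of AI ethics risks",
    "risk_management": "fulfilment of risk management requirements",
    "quality_assurance": "QA embedded throughout the life cycle",
}

# Each extension applies as a checklist item over every phase.
for phase in CRISP_DM_PHASES:
    for key, concern in FIN_DM_EXTENSIONS.items():
        print(f"{phase}: check {key} -> {concern}")
```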

    Strategies and Approaches for Exploiting the Value of Open Data

    Data is increasingly permeating all dimensions of our society and has become an indispensable commodity that serves as a basis for many products and services. Traditional sectors, such as health, transport, and retail, are all benefiting from digital developments. In recent years, governments have also started to participate in the open data venture, usually with the motivation of increasing transparency. In fact, governments are among the largest producers and collectors of data in many different domains. As the increasing number of open data and open government data initiatives shows, it is becoming ever more vital to identify the means and methods to exploit the value of this data, which ultimately affects various dimensions. In this thesis we therefore focus on researching how open data can be exploited to its highest value potential, and how we can enable stakeholders to create value upon data accordingly. Despite radical advances in technology enabling data and knowledge sharing, and the lowering of barriers to information access, raw data has only recently been given the attention and relevance it merits. Moreover, even though the publishing of data is increasing at an enormously fast rate, many challenges hinder its exploitation and consumption. Technical issues hinder the re-use of data, whilst policy, economic, organisational and cultural issues hinder entities from participating or collaborating in open data initiatives. Our focus is thus to contribute to the topic by researching current approaches towards the use of open data. We explore methods for creating value upon open (government) data, and identify the strengths and weaknesses that subsequently influence the success of an open data initiative. This research then acts as a baseline for the value creation guidelines, methodologies, and approaches that we propose. Our contribution is based on the premise that if stakeholders are provided with adequate means and models to follow, they will be encouraged to create value and exploit data products. Our subsequent contribution in this thesis therefore enables stakeholders to easily access and consume open data, as the first step towards creating value. Thereafter we proceed to identify and model the various value creation processes through the definition of a Data Value Network, and also provide a concrete implementation that allows stakeholders to create value. Ultimately, by creating value on data products, stakeholders participate in the global data economy and impact not only the economic dimension, but also the technical, societal and political dimensions.

    Three Essays on Law Enforcement and Emergency Response Information Sharing and Collaboration: An Insider Perspective

    This dissertation identifies what may be done to overcome barriers to information sharing among federal, tribal, state, and local law enforcement agencies and emergency responders. Social, technical, and policy factors related to information sharing and collaboration in the law enforcement and emergency response communities are examined. This research aims to improve information sharing and cooperation in this area. Policing in most societies exists in a state of dynamic tension between forces that tend to isolate it and those that tend to integrate its functioning with other social structures (Clark, 1965). Critical incidents and crimes today cross jurisdictions and involve multiple stakeholders and levels. Law enforcement and emergency response agencies at the federal, tribal, state, and local levels, including private sector entities, gather information and resources but do not effectively share them with each other. Despite mandates to improve information sharing and cooperation, gaps remain, perhaps because there is no clear understanding of what the barriers to information sharing are. Information sharing is examined using a multi-method, primarily qualitative, approach. A model for information sharing is presented that identifies social, technical, and policy factors as influencers. Facets of General Systems Theory, Socio-technical Theory, and Stakeholder Theory (among others) are considered in this context. Information sharing is the subject of the first work of the dissertation: a theoretical piece arguing for use of a conceptual framework consisting of social, technical, and policy factors. Social, technology, and policy factors are investigated in the second essay. That essay introduces a new transformative technology, edgeware, which allows for unprecedented connectivity among devices. Social and policy implications for crisis response are examined in light of having technological barriers to sharing resources reduced. Human and other factors relevant to information sharing and collaboration are further examined through a case study of the Central New York Interoperable Communications Consortium (CNYICC) Network, a five-county collaboration involving law enforcement, public safety, government, and non-government participants. The three included essays share a common focus on information sharing and collaboration in law enforcement and emergency response. The propositions are: (P1) Information sharing is affected by social, technical, and policy factors, and this conceptualization frames the problem of information sharing in a way that can be commonly understood by government and non-government stakeholders. The next proposition involves the role of technology, policy, and social systems in information sharing: (P2) Social and policy factors influence information sharing more than technical factors (assuming it is physically possible to connect and/or share). A third proposition investigated is: (P3) Social factors play the greatest role in creating and sustaining information sharing relationships. The findings provide a greater understanding of the forces that impact public safety agencies as they consider information sharing and will, it is hoped, lead to identifiable solutions to the problem from a new perspective.

    Internal Crowdsourcing in Companies

    This open access book examines the implications of internal crowdsourcing (IC) in companies. Presenting an employee-oriented, cross-sector reference model for good IC practice, it discusses the core theoretical foundations, and offers guidelines for process management and blueprints for the implementation of IC. Furthermore, it examines solutions for employee training and competence development based on crowdsourcing. As such, the book will appeal to scholars of management science, work studies, and organizational and participation research, as well as to readers interested in inclusive approaches for cooperative change management and the IT implications of IC platforms.

    Hard and soft IT governance maturity

    The goal of the research in this thesis is to determine how the IT governance (ITG) of an organisation can grow in maturity to become more effective. ITG comprises “the structures, process, cultures and systems that engender the successful operation of the IT of the (complete) organization”. ITG is thus not restricted to the IT organisation. The research presented here follows the stream in which ITG is considered an integral part of corporate governance, focusing on the performance perspective. The proposition is that improving “ITG maturity” results in improving ITG. Given that ITG is an integral part of corporate governance, the assumption is that improving ITG results in improving corporate governance and thus organisational performance. The research methodology is based on design science and a combination of systematic literature studies, Delphi workshops and case studies. Organisations can be defined as social units of people that are structured and managed to pursue collective goals. ITG can be seen from two perspectives: an organisational perspective, referred to as “hard governance”, and a social perspective, referred to as “soft governance”. This research is grounded in the assumption that, in order to advance in maturity, organisations should pay attention to both the hard and soft sides of ITG. To improve ITG, a maturity model for hard and soft ITG, the MIG model, was designed. The end result was a model consisting of three parts: soft governance, hard governance and the context. The thesis provides a detailed description of the design process between 2014 and 2017. A further literature review demonstrated that all 12 focus areas of the MIG model are also covered by the corporate governance literature. The assessment instrument was used in case studies conducted by students and the researchers. Between 2015 and 2017, 28 case studies were conducted using three versions of the instrument. The evaluations revealed that combining the instrument with semi-structured interviews results in an enhanced and usable instrument for determining the current level of hard and soft ITG of an organisation.