Personalized First Issue Recommender for Newcomers in Open Source Projects
Many open source projects provide good first issues (GFIs) to attract and
retain newcomers. Although several automated GFI recommenders have been
proposed, existing recommenders are limited to recommending generic GFIs
without considering differences between individual newcomers. However, we
observe mismatches between generic GFIs and the diverse background of
newcomers, resulting in failed attempts, discouraged onboarding, and delayed
issue resolution. To address this problem, we assume that personalized first
issues (PFIs) for newcomers could help reduce the mismatches. To justify the
assumption, we empirically analyze 37 newcomers and their first issues resolved
across multiple projects. We find that the first issues resolved by the same
newcomer share similarities in task type, programming language, and project
domain. These findings underscore the need for a PFI recommender to improve
over state-of-the-art approaches. For that purpose, we identify features that
influence newcomers' personalized selection of first issues by analyzing the
relationship between possible features of the newcomers and the characteristics
of the newcomers' chosen first issues. We find that the expertise preference,
OSS experience, activeness, and sentiment of newcomers drive their personalized
choice of the first issues. Based on these findings, we propose a Personalized
First Issue Recommender (PFIRec), which employs LambdaMART to rank candidate
issues for a given newcomer by leveraging the identified influential features.
We evaluate PFIRec using a dataset of 68,858 issues from 100 GitHub projects.
The evaluation results show that PFIRec outperforms existing first issue
recommenders, potentially doubling the probability that the top recommended
issue is suitable for a specific newcomer and reducing a newcomer's
unsuccessful attempts to identify suitable first issues by one third, in the
median.
Comment: The 38th IEEE/ACM International Conference on Automated Software
Engineering (ASE 2023)
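The ranking step described above can be sketched in miniature. The paper's PFIRec uses a learned LambdaMART model; here a hand-weighted linear scorer stands in for it, purely to illustrate scoring and ranking candidate issues for one newcomer. The feature names and weights are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the ranking step in a personalized first-issue
# recommender. PFIRec itself uses LambdaMART; this hand-weighted linear
# scorer is a stand-in. Feature names and weights are illustrative.

WEIGHTS = {
    "expertise_match": 0.4,   # overlap between newcomer skills and issue tags
    "oss_experience": 0.2,
    "activeness": 0.2,
    "sentiment": 0.2,
}

def score(issue_features):
    """Score one candidate issue for a given newcomer."""
    return sum(WEIGHTS[k] * issue_features[k] for k in WEIGHTS)

def rank_issues(candidates):
    """Return candidate issue ids, best match first."""
    return [iid for iid, _ in sorted(
        candidates.items(), key=lambda kv: score(kv[1]), reverse=True)]

candidates = {
    "issue-1": {"expertise_match": 0.9, "oss_experience": 0.1,
                "activeness": 0.5, "sentiment": 0.6},
    "issue-2": {"expertise_match": 0.2, "oss_experience": 0.8,
                "activeness": 0.9, "sentiment": 0.4},
}
print(rank_issues(candidates))  # ['issue-1', 'issue-2']
```

A learning-to-rank model would replace the fixed weights with scores learned from which issues newcomers actually resolved, but the surrounding ranking machinery is the same.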
Influence of Social Network Integration on the Online Review Helpfulness
Online consumer reviews are important for consumers when they make purchasing decisions. However, the large volume of online reviews makes it difficult for consumers to identify the helpful ones. The factors influencing online review helpfulness have drawn great attention from different research fields. In recent years, online review websites have started to exhibit more features of social media. For example, some websites allow users to integrate other social media accounts. The influence of such social factors, however, is rarely studied in the literature. Drawing on a dataset from Qunar.com, this paper explores how social network integration and reviewer network centrality influence online review helpfulness through a negative binomial regression model. Our results show that both factors have a positive effect on review helpfulness, and that network centrality positively moderates the effect of social network integration. Our research results provide important implications for reviewers, industry practitioners, and online review websites.
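The moderation claim above corresponds to an interaction term in the regression design. The sketch below shows that design: helpful-vote counts are modeled on integration, centrality, and their product, with a positive interaction coefficient reproducing the reported moderation. Variable names and coefficient values are hypothetical; the actual model would be fit with a negative binomial GLM in a statistics package.

```python
# Sketch of the moderation design behind a negative binomial model of
# review helpfulness. Coefficients are hypothetical; in practice they
# would be estimated (e.g. via a negative binomial GLM).

def design_row(integration, centrality):
    """One design-matrix row: intercept, main effects, interaction."""
    return [1.0, integration, centrality, integration * centrality]

def linear_predictor(row, coefs):
    """Log of the expected helpful-vote count under the NB model."""
    return sum(x * b for x, b in zip(row, coefs))

# Positive interaction coefficient (last entry): the payoff of social
# network integration grows with the reviewer's network centrality,
# i.e. centrality positively moderates integration.
coefs = [0.5, 0.3, 0.2, 0.4]

low = linear_predictor(design_row(1.0, 0.1), coefs)
high = linear_predictor(design_row(1.0, 0.9), coefs)
print(high > low)  # True: integration helps central reviewers more
```

With a zero interaction coefficient the two predictors would differ only through centrality's main effect; the interaction term is what encodes the moderation finding.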
Commit to Be Fit: Antecedents and Consequences of Goal-Directed Health and Fitness Technology Use
This three-essay dissertation explores the antecedents and consequences of goal-directed health and fitness technology use. My first essay is an interdisciplinary review of consumer health IT research in the IS and health informatics fields. To explore the nature of consumer health IT research, I conducted a thematic analysis using Orlikowski and Iacono's (2001) IT artifact framework. I found that the nominal, proxy, and tool views of the IT artifact are the most widely used perspectives when scholars frame how consumer health IT, as an IT artifact, is designed, deployed, and used. This study summarizes current consumer health IT research trends and suggests promising directions for future work.
Inspired by my first essay, my second and third essays focus on extending current knowledge of fitness technologies, a subtype of consumer health IT. In my second essay, I look at fitness technology use behavior from a goal-directed behavior perspective, asking how fitness technologies help users achieve their fitness goals. Drawing on the concepts of IT affordance and engagement, I propose that actualized fitness technology affordances influence users' cognitive and emotional exercise engagement, leading to fitness goal attainment. The results show that only emotional exercise engagement exerts a significant influence on fitness goal attainment, and that only actualized self-appraisal affordance and actualized social appraisal affordance significantly impact users' emotional exercise engagement.
My third essay focuses on understanding individuals' discontinuance and habitual use of fitness trackers from the perspectives of fitness tracker identity and exercise identity. Drawing on Carter and Grover's (2015) theoretical model of IT identity, this research investigates how the technological, personal, and social aspects of users' fitness technology use experiences shape their fitness tracker identity to influence their fitness tracker use behaviors.
My results show that individuals who positively identify with a fitness tracker are less likely to discontinue it and more likely to develop habitual use. Further, for the same fitness tracker identity, employees with a stronger exercise identity have lower discontinuance intention. Also, for the same fitness tracker identity, employees with more monetary rewards from their organizations have lower discontinuance intention. Moreover, I found various significant antecedents of fitness tracker identity.
From Smart Meter Data to Pricing Intelligence -- Visual Data Mining towards Real-Time BI
The deployment of smart metering in the electricity industry has opened up the opportunity for real-time BI-enabled innovative business applications, such as demand response. Taking a holistic view of BI, this study introduced a visual data mining driven application in order to exemplify the potential of real-time BI for electricity businesses. The empirical findings indicate that such an application is capable of extracting actionable insights about customers' electricity consumption patterns, helping to turn timely measured data into pricing intelligence. Based on the findings, we proposed a real-time BI framework and discussed how it can facilitate the formulation of strategic initiatives for transforming the electricity utility towards sustainable growth. Our research follows the design science research paradigm. By addressing an emerging issue in the problem domain, it adds empirical knowledge to the BI research landscape.
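The abstract does not specify the mining technique, but a common first step in turning smart meter data into pricing intelligence is segmenting customers by their daily load patterns. The sketch below is an illustrative (not the paper's) example: it groups customers by when their daily consumption peaks, a pattern feature that time-of-use pricing could act on. All customer data here is made up.

```python
# Illustrative consumption-pattern mining on hourly smart meter data:
# segment customers by the hour at which their daily load peaks.
# This is a generic example, not the technique used in the study.

def peak_hour(hourly_load):
    """Hour (0-23) at which a customer's average daily load peaks."""
    return max(range(len(hourly_load)), key=lambda h: hourly_load[h])

def segment(customers):
    """Group customer ids into morning / evening / other peak segments."""
    groups = {"morning": [], "evening": [], "other": []}
    for cid, load in customers.items():
        h = peak_hour(load)
        if 6 <= h < 12:
            groups["morning"].append(cid)
        elif 17 <= h < 23:
            groups["evening"].append(cid)
        else:
            groups["other"].append(cid)
    return groups

customers = {
    "c1": [0.2] * 7 + [1.5] + [0.3] * 16,   # peaks at 07:00
    "c2": [0.2] * 18 + [2.0] + [0.4] * 5,   # peaks at 18:00
}
print(segment(customers))  # c1 -> morning, c2 -> evening
```

A visual data mining application would plot such segments so analysts can eyeball the patterns; richer approaches cluster whole load curves rather than a single peak-hour feature.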
FIN-DM: A Data Mining Process Model for Financial Services
Data mining is a set of rules, processes, and algorithms that allow companies to increase revenues, reduce costs, optimize products and customer relationships, and achieve other business goals by extracting actionable insights from the data they collect on a day-to-day basis. Data mining and analytics projects require well-defined methodologies and processes. Several standard process models for conducting data mining and analytics projects are available. Among them, the most notable and widely adopted standard model is CRISP-DM. It is industry-agnostic and is often adapted to meet sector-specific requirements. Industry-specific adaptations of CRISP-DM have been proposed across several domains, including healthcare, education, industrial and software engineering, and logistics. However, until now there has been no adaptation of CRISP-DM for the financial services industry, which has its own set of domain-specific requirements.
This PhD thesis addresses this gap by designing, developing, and evaluating a sector-specific data mining process for financial services (FIN-DM). The thesis investigates how standard data mining processes are used across various industry sectors and in financial services. The examination identified a number of adaptation scenarios of traditional frameworks. It also showed that these approaches do not pay sufficient attention to turning data mining models into software products integrated into organizations' IT architectures and business processes. In the financial services domain, the main adaptation scenarios discovered concerned technology-centric (scalability), business-centric (actionability), and human-centric (mitigating discriminatory effects) aspects of data mining. Next, a case study in an actual financial services organization revealed 18 perceived gaps in the CRISP-DM process.
Using the data and results from these studies, the PhD thesis outlines an adaptation of
CRISP-DM for the financial sector, named the Financial Industry Process for Data Mining
(FIN-DM). FIN-DM extends CRISP-DM to support privacy-compliant data mining, to tackle AI ethics risks, to fulfill risk management requirements, and to embed quality assurance as part of the data mining life cycle.
https://www.ester.ee/record=b547227
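The extension described above can be pictured as cross-cutting concerns layered onto the six standard CRISP-DM phases. The sketch below models that structure; the phase names are the standard CRISP-DM ones, while the gate names paraphrase the thesis's extensions, and the mapping of gates to phases is an assumption made purely for illustration.

```python
# Illustrative model of FIN-DM as CRISP-DM phases plus sector-specific
# gates. Which gates attach to which phases is an assumption here, not
# a claim about the thesis's actual process design.

CRISP_DM_PHASES = [
    "business understanding", "data understanding", "data preparation",
    "modeling", "evaluation", "deployment",
]

FIN_DM_GATES = {
    "privacy compliance": {"data understanding", "data preparation"},
    "ai ethics review": {"modeling", "evaluation"},
    "risk management": {"business understanding", "deployment"},
    "quality assurance": set(CRISP_DM_PHASES),  # spans the whole life cycle
}

def gates_for(phase):
    """Extension gates that apply to a given CRISP-DM phase."""
    return sorted(g for g, phases in FIN_DM_GATES.items() if phase in phases)

print(gates_for("modeling"))  # ['ai ethics review', 'quality assurance']
```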
Strategies and Approaches for Exploiting the Value of Open Data
Data is increasingly permeating all dimensions of our society and has become an indispensable commodity that serves as a basis for many products and services. Traditional sectors such as health, transport, and retail are all benefiting from digital developments. In recent years, governments have also started to participate in the open data venture, usually with the motivation of increasing transparency. In fact, governments are among the largest producers and collectors of data in many different domains. As the increasing number of open data and open government data initiatives shows, it is becoming ever more vital to identify means and methods to exploit the value of this data, which ultimately affects various dimensions. In this thesis we therefore focus on researching how open data can be exploited to its highest value potential, and how we can enable stakeholders to create value upon data accordingly. Despite radical advances in technology that enable data and knowledge sharing and lower barriers to information access, raw data has only recently been given the attention and relevance it merits. Moreover, even though the publishing of data is increasing at an enormously fast rate, many challenges hinder its exploitation and consumption. Technical issues hinder the re-use of data, whilst policy, economic, organisational, and cultural issues hinder entities from participating or collaborating in open data initiatives. Our focus is thus to contribute to the topic by researching current approaches to the use of open data. We explore methods for creating value upon open (government) data, and identify the strengths and weaknesses that subsequently influence the success of an open data initiative. This research then acts as a baseline for the value creation guidelines, methodologies, and approaches that we propose.
Our contribution is based on the premise that if stakeholders are provided with adequate means and models to follow, they will be encouraged to create value and exploit data products. Our subsequent contribution in this thesis therefore enables stakeholders to easily access and consume open data, as the first step towards creating value. Thereafter we identify and model the various value creation processes through the definition of a Data Value Network, and also provide a concrete implementation that allows stakeholders to create value. Ultimately, by creating value on data products, stakeholders participate in the global data economy and impact not only the economic dimension but also other dimensions, including the technical, societal, and political.
Three Essays on Law Enforcement and Emergency Response Information Sharing and Collaboration: An Insider Perspective
This dissertation identifies what may be done to overcome barriers to information sharing among federal, tribal, state, and local law enforcement agencies and emergency responders. Social, technical, and policy factors related to information sharing and collaboration in the law enforcement and emergency response communities are examined. This research aims to improve information sharing and cooperation in this area. Policing in most societies exists in a state of dynamic tension between forces that tend to isolate it and those that tend to integrate its functioning with other social structures (Clark, 1965). Critical incidents and crimes today cross jurisdictions and involve multiple stakeholders and levels. Law enforcement and emergency response agencies at federal, tribal, state, and local levels, including private sector entities, gather information and resources but do not effectively share them with each other. Despite mandates to improve information sharing and cooperation, gaps remain, perhaps because there is no clear understanding of what the barriers to information sharing are. Information sharing is examined using a multi-method, primarily qualitative, approach. A model for information sharing is presented that identifies social, technical, and policy factors as influencers. Facets of General Systems Theory, Socio-technical Theory, and Stakeholder Theory (among others) are considered in this context. Information sharing is the subject of the first work of the dissertation: a theoretical piece arguing for the use of a conceptual framework consisting of social, technical, and policy factors. Social, technology, and policy factors are investigated in the second essay. That essay introduces a new transformative technology, edgeware, that allows for unprecedented connectivity among devices. Social and policy implications for crisis response are examined in light of having technological barriers to sharing resources reduced.
Human and other factors relevant to information sharing and collaboration are further examined through a case study of the Central New York Interoperable Communications Consortium (CNYICC) Network, a five-county collaboration involving law enforcement, public safety, government, and non-government participants. The three included essays have a common focus vis-à-vis information sharing and collaboration in law enforcement and emergency response. The propositions here include: (P1) Information sharing is affected by social, technical, and policy factors, and this conceptualization frames the problem of information sharing in a way that can be commonly understood by government and non-government stakeholders. The next proposition involves the role of technology, policy, and social systems in information sharing: (P2) Social and policy factors influence information sharing more than technical factors (assuming it is physically possible to connect and/or share). A third proposition investigated is: (P3) Social factors play the greatest role in creating and sustaining information sharing relationships. The findings provide a greater understanding of the forces that impact public safety agencies as they consider information sharing and will, it is hoped, lead to identifiable solutions to the problem from a new perspective.
Internal Crowdsourcing in Companies
This open access book examines the implications of internal crowdsourcing (IC) in companies. Presenting an employee-oriented, cross-sector reference model for good IC practice, it discusses the core theoretical foundations, offers guidelines for process management, and provides blueprints for the implementation of IC. Furthermore, it examines solutions for employee training and competence development based on crowdsourcing. As such, the book will appeal to scholars of management science, work studies, and organizational and participation research, and to readers interested in inclusive approaches to cooperative change management and the IT implications of IC platforms.
Hard and soft IT governance maturity
The goal of the research in this thesis is to determine how the IT governance (ITG) of an organisation can grow in maturity to become more effective. ITG comprises "the structures, process, cultures and systems that engender the successful operation of the IT of the (complete) organization". ITG is thus not restricted to the IT organisation. The research presented here follows the stream in which ITG is considered an integral part of corporate governance, focusing on the performance perspective. The proposition is that improving "ITG maturity" results in improving ITG. Given that ITG is an integral part of corporate governance, the assumption is that improving ITG improves corporate governance and thus organisational performance. The research methodology is based on design science and a combination of systematic literature studies, Delphi workshops, and case studies. Organisations can be defined as social units of people that are structured and managed to pursue collective goals. ITG can be seen from two perspectives:
- an organisational perspective, referred to as "hard governance";
- a social perspective, referred to as "soft governance".
This research is grounded in the assumption that, in order to advance in maturity, organisations should pay attention to both the hard and soft sides of ITG. To improve ITG, a maturity model for hard and soft ITG was designed. The end result was a model consisting of three parts: soft governance, hard governance, and the context. The thesis provides a detailed description of the design process between 2014 and 2017. A further literature review demonstrated that all 12 focus areas of the MIG model are also covered by the corporate governance literature. The assessment instrument was used in case studies conducted by students and the researchers. Between 2015 and 2017, 28 case studies were conducted using three versions of the instrument.
The evaluations revealed that combining the instrument with semi-structured interviews results in an enhanced and usable instrument for determining the current level of hard and soft ITG of an organisation.
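An assessment instrument of this kind scores an organisation on each focus area and aggregates the scores per side of governance. The sketch below illustrates one plausible aggregation rule (the weakest focus area dominates, as in many staged maturity models); the focus-area names, the 1-5 scale, and the aggregation rule are all assumptions for illustration, not the thesis's actual instrument, which covers 12 focus areas.

```python
# Illustrative hard/soft ITG maturity aggregation. Focus-area names,
# the 1-5 scale, and the min-based rule are assumptions; the actual
# MIG model defines 12 focus areas and its own scoring.

ASSESSMENT = {
    "hard": {"structures": 3, "processes": 2, "relational mechanisms": 4},
    "soft": {"culture": 2, "leadership": 3, "informal organisation": 2},
}

def maturity(scores):
    """Overall maturity of one side: the weakest focus area dominates."""
    return min(scores.values())

def report(assessment):
    """Per-side maturity levels for an assessed organisation."""
    return {side: maturity(scores) for side, scores in assessment.items()}

print(report(ASSESSMENT))  # {'hard': 2, 'soft': 2}
```

Under a min-based rule an organisation cannot claim a level until every focus area on that side reaches it, which is what makes such assessments useful for prioritising improvement work.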