Designing Guidelines to Discover Causes of Delays in Construction Projects: The Case of Lebanon
Construction projects in developing countries take priority over other projects because they are considered safe investments in an unpredictable market. Owing to this prioritization, it has become increasingly important that such projects are managed in accordance with internationally accepted management best practice. Project managers of construction projects in developing countries face difficulties in effectively monitoring the progress of the projects they are responsible for, due to many variables. The purpose of this study is to investigate the causes of delay identified in the reviewed literature and to examine their relevance through qualitative research, interviewing project managers of actual projects in Lebanon. Based on the literature review and on the recommendations recorded during the interviews, the researcher aims to create a set of guidelines that will improve the manner in which project managers can adapt to, discover, and deal with project delays. These guidelines can be utilized as an early warning system for delays in construction projects.
Using an Instant Visual and Text-Based Feedback Tool to Teach Path-Finding Algorithms: A Concept
SEENG 2021 was held remotely as an integral component of the Joint Track on Software Engineering Education and Training (JSEET) at the 43rd IEEE/ACM International Conference on Software Engineering (ICSE).
Methods of teaching path-finding algorithms based purely on programming pose an additional challenge to students; indeed, many courses use graphs and other visualisations to help students grasp concepts quickly. Globally, we are rapidly adapting our teaching tools to the current blended or remote learning style brought about by the COVID-19 pandemic. We propose a games-based method that provides instant feedback showing students how their programmed path-finding algorithm works. The tool will also provide feedback to the student about their code quality. Together with an element of gamification, we aim to improve both initial understanding of and further exploration into the algorithms taught. The tool aims to provide useful feedback to students in the absence of immediate laboratory support and gives students the flexibility to complete laboratory worksheets outside of scheduled laboratory slots.
Position: Software tools and teaching assistants heavily assist undergraduate students in learning how to program. By developing enhanced software tools, we can provide immediate feedback to learners, allowing them to gain an initial understanding of an algorithm before facilitated sessions. This further enriches their experience and learning during contact hours with teaching assistants.
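To make the concept concrete, here is a minimal sketch of the kind of exercise such a tool could give feedback on: a breadth-first-search path finder on a small grid, with the result echoed back as instant text feedback. The grid layout, symbols, and feedback format are illustrative assumptions, not the paper's actual tool.

```python
from collections import deque

def bfs_path(grid, start, goal):
    """Breadth-first search on a 2D grid of strings; '#' cells are walls."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([start])
    came_from = {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] != '#' and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    if goal not in came_from:
        return None
    # Walk back from goal to start to recover the path.
    path, cell = [], goal
    while cell is not None:
        path.append(cell)
        cell = came_from[cell]
    return path[::-1]

def show(grid, path):
    """Instant text feedback: overlay the found path with '*'."""
    marks = set(path or [])
    for r, row in enumerate(grid):
        print(''.join('*' if (r, c) in marks else ch
                      for c, ch in enumerate(row)))

grid = ["....#", ".##.#", "....."]
show(grid, bfs_path(grid, (0, 0), (2, 4)))
```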
An Empirical Analysis of the Seasonal Patterns in Aggregate Directors' Trades
This paper examines the seasonal patterns in aggregate insider trading transactions; specifically, do insiders prefer to trade on any particular day of the week or month of the year? Given that such seasonal patterns exist, it also attempts to relate these patterns to explanations drawn from the literature on calendar anomalies in returns (and volumes). The results of this paper include the following. There is a day-of-the-week anomaly in aggregate insider activity (as measured by the number and value of insider transactions); in particular, relative to other days, insiders tend to trade more on Fridays and less on Tuesdays. The distribution of the average value of directors' trades (buys and sells) across the weekdays forms a U shape, i.e., trading value is high at the beginning of the week (Monday) and at the end of the week (Friday). There is also a month-of-the-year anomaly in aggregate insider activity (as measured by the number of insider transactions): insiders tend to trade most frequently in March and least frequently in August. The results of the OLS regression model indicate that there is no monthly anomaly in aggregate insider selling activity as measured by the aggregate value of insider transactions. The results of the Tobit regression model show that the average value of directors' selling activity in March is significantly higher relative to other months of the year. The OLS results are also confirmed by the Kruskal-Wallis (K-W) test, which supports the non-existence of a monthly anomaly in aggregate director trading (measured by the value of director transactions).
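As an illustration of the kind of day-of-week test the paper describes, the sketch below regresses a daily aggregate trade value on weekday dummies using OLS. The data are randomly generated placeholders and all column names are assumptions, not the paper's dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical daily aggregates of directors' trade value (placeholder data).
rng = np.random.default_rng(0)
df = pd.DataFrame({"date": pd.date_range("2020-01-01", periods=500, freq="B")})
df["value"] = rng.lognormal(mean=10, sigma=1, size=len(df))
df["weekday"] = df["date"].dt.day_name()

# OLS with Wednesday as the baseline category; a significant coefficient on
# C(weekday)[T.Friday] would indicate a Friday effect relative to mid-week.
model = smf.ols('value ~ C(weekday, Treatment(reference="Wednesday"))', data=df)
print(model.fit().summary())
```

A Tobit specification, as used in the paper for trade values, would replace the OLS fit; statsmodels has no built-in Tobit model, so that step would require a custom likelihood or another package.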
On the Relationship Between Coupling and Refactoring: An Empirical Viewpoint
[Background] Refactoring has matured over the past twenty years to become part of a developer's toolkit. However, many fundamental research questions remain largely unexplored. [Aim] The goal of this paper is to investigate the highest and lowest quartiles of refactoring-based data using two coupling metrics, the Coupling Between Objects metric and the more recent Conceptual Coupling Between Classes metric, to answer one such question: can refactoring trends and patterns be identified based on the level of class coupling? [Method] We analyze over six thousand refactoring operations drawn from releases of three open-source systems. [Results] Results showed no meaningful difference in the types of refactoring applied across either the lower or the upper quartile of coupling for both metrics; refactorings usually associated with coupling removal were actually more numerous in the lower quartile in some cases. A lack of inheritance-related refactorings across all systems was also noted. [Conclusions] The emerging message (and a perplexing one) is that developers seem to be largely indifferent to classes with high coupling when it comes to refactoring types; they treat classes with relatively low coupling in almost the same way.
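A minimal sketch of the quartile analysis described in [Method] and [Results] might look as follows; the CSV file and its column names (one row per refactoring operation, tagged with the coupling value of the class it touches) are assumptions for illustration.

```python
import pandas as pd

# Hypothetical input: one row per refactoring, with the CBO value of its class.
df = pd.read_csv("refactorings.csv")  # assumed columns: refactoring_type, cbo

q1, q3 = df["cbo"].quantile([0.25, 0.75])
lower = df[df["cbo"] <= q1]   # least-coupled classes
upper = df[df["cbo"] >= q3]   # most-coupled classes

# Compare the distribution of refactoring types across the two quartiles.
summary = pd.DataFrame({
    "lower_quartile": lower["refactoring_type"].value_counts(),
    "upper_quartile": upper["refactoring_type"].value_counts(),
}).fillna(0).astype(int)
print(summary)
```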
Do developers really worry about refactoring re-test? An empirical study of open-source systems
In this paper, we explore the extent to which a set of over 12,000 refactorings fell into one of four re-test categories defined by van Deursen and Moonen; the 'least disruptive' of the four categories contains refactorings requiring only minimal re-test, while the 'most disruptive' category requires significant re-test effort. We used multiple versions of three open-source systems to answer one research question: do developers prefer to undertake refactorings in the least disruptive categories or in the most disruptive? The simple answer is that they prefer to do both. We provide insights into these refactoring patterns across the systems and highlight a fundamental weakness with software metrics that try to capture the refactoring process.
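The categorisation step can be sketched as a simple lookup from refactoring type to re-test category and a tally over the mined operations. The mapping below is purely illustrative; the actual assignments follow van Deursen and Moonen's taxonomy.

```python
from collections import Counter

# Illustrative mapping only; the paper assigns each refactoring type to one of
# van Deursen and Moonen's four re-test categories.
RETEST_CATEGORY = {
    "Rename Method": "least disruptive",
    "Extract Method": "moderately disruptive",
    "Move Method": "moderately disruptive",
    "Pull Up Method": "most disruptive",
}

# Stand-in for the 12,000+ mined refactoring operations.
refactorings = ["Rename Method", "Extract Method",
                "Rename Method", "Pull Up Method"]

counts = Counter(RETEST_CATEGORY.get(r, "uncategorised") for r in refactorings)
for category, n in counts.most_common():
    print(f"{category}: {n}")
```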
On the Link between Refactoring Activity and Class Cohesion through the Prism of Two Cohesion-Based Metrics
The practice of refactoring has evolved over the past thirty years to become standard developer practice; for almost the same amount of time, proposals for measuring object-oriented cohesion have also been suggested. Yet we still know very little, empirically, about their inter-relationship, despite the fact that classes exhibiting low cohesion would be strong candidates for refactoring. In this paper, we use a large set of refactorings to understand the characteristics of two cohesion metrics from a refactoring perspective: firstly, the well-known LCOM metric of Chidamber and Kemerer and, secondly, the C3 metric proposed more recently by Marcus et al. Our research question is motivated by the premise that different refactorings will be applied to classes with low cohesion compared with those applied to classes with high cohesion. We used three open-source systems as the basis of our analysis, drawing on data from the lower and upper quartiles of the metric data. Results showed that the set of refactoring types across both upper and lower quartiles was broadly the same, although very different in actual numbers. The 'rename method' refactoring stood out from the rest, being applied over three times as often to classes with low cohesion as to classes with high cohesion.
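For reference, the LCOM metric of Chidamber and Kemerer can be computed as LCOM = max(P - Q, 0), where P is the number of method pairs sharing no instance attributes and Q is the number of pairs sharing at least one. The example class below is a made-up illustration.

```python
from itertools import combinations

def lcom(method_attrs):
    """method_attrs maps each method name to the set of attributes it uses."""
    p = q = 0
    for a, b in combinations(method_attrs.values(), 2):
        if a & b:
            q += 1  # pair shares at least one attribute
        else:
            p += 1  # pair shares nothing
    return max(p - q, 0)

# Hypothetical class: no method shares attributes with another, so cohesion
# is low and LCOM is high (3 disjoint pairs, 0 sharing pairs).
print(lcom({"get_x": {"x"}, "get_y": {"y"}, "helper": {"z"}}))  # -> 3
```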
Opening the black box: Personalizing type 2 diabetes patients based on their latent phenotype and temporal associated complication rules
It is widely considered that approximately 10% of the population suffers from type 2 diabetes; unfortunately, the impact of this disease is underestimated. Patient mortality often occurs due to complications caused by the disease rather than the disease itself. Many techniques utilized in modeling diseases take the form of a 'black box', whose internal workings and complexities are extremely difficult to understand, from both practitioners' and patients' perspectives. In this work, we address this issue and present an informative model/pattern, known as a 'latent phenotype', which aims to capture the complexities of the associated complications over time. We further extend this idea by using a combination of temporal association rule mining and unsupervised learning to find explainable subgroups of patients with more personalized prediction. Our extensive findings show how uncovering the latent phenotype aids in distinguishing the disparities among subgroups of patients based on their complication patterns. We also gain insight into how best to enhance prediction performance and reduce bias in the applied models by using uncertainty in the patients' data.
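One half of the approach, grouping patients by their complication profiles with unsupervised learning, can be sketched as follows. The complication set, the indicator matrix, and the choice of k are illustrative assumptions, and the temporal association rule mining step is omitted.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical patient-by-complication indicator matrix (1 = complication seen).
# Columns: retinopathy, neuropathy, nephropathy, cardiovascular disease.
X = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for label in sorted(set(kmeans.labels_)):
    members = X[kmeans.labels_ == label]
    # Per-subgroup complication prevalence stands in for a latent phenotype.
    print(f"subgroup {label}: prevalence =", members.mean(axis=0).round(2))
```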
MedT5SQL: a transformers-based large language model for text-to-SQL conversion in the healthcare domain
Data availability statement: Publicly available datasets were analyzed in this study. This data can be found at: https://github.com/wangpinggl/TREQS/tree/master/mimicsql_data/mimicsql_natural_v2; https://huggingface.co/datasets/wikisql.
Supplementary material: The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fdata.2024.1371680/full#supplementary-material.
Introduction: In response to the increasing prevalence of electronic medical records (EMRs) stored in databases, healthcare staff are encountering difficulties retrieving these records due to their limited technical expertise in database operations. As these records are crucial for delivering appropriate medical care, there is a need for an accessible method for healthcare staff to access EMRs.
Methods: To address this, natural language processing (NLP) for Text-to-SQL has emerged as a solution, enabling non-technical users to generate SQL queries using natural language text. This research assesses existing work on Text-to-SQL conversion and proposes the MedT5SQL model specifically designed for EMR retrieval. The proposed model utilizes the Text-to-Text Transfer Transformer (T5) model, a Large Language Model (LLM) commonly used in various text-based NLP tasks. The model is fine-tuned on the MIMICSQL dataset, the first Text-to-SQL dataset for the healthcare domain. Performance evaluation involves benchmarking the MedT5SQL model on two optimizers, varying numbers of training epochs, and using two datasets, MIMICSQL and WikiSQL.
Results: For the MIMICSQL dataset, the model demonstrates considerable effectiveness in generating question-SQL pairs, achieving accuracies of 80.63%, 98.937%, and 90% for the exact-match accuracy metric, approximate string matching, and manual evaluation, respectively. When tested on the WikiSQL dataset, the model demonstrates efficiency in generating SQL queries, with an accuracy of 44.2% and 94.26% for approximate string matching.
Discussion: Results indicate improved performance with increased training epochs. This work highlights the potential of a fine-tuned T5 model to convert medical questions written in natural language into Structured Query Language (SQL) in the healthcare domain, providing a foundation for future research in this area.
Funding statement: The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.
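As a rough illustration of the inference side of the approach, the snippet below runs a T5 checkpoint through the Hugging Face transformers API. The base checkpoint, task prefix, and question are placeholders; MedT5SQL itself would load weights fine-tuned on MIMICSQL.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "t5-base"  # placeholder; MedT5SQL would use its fine-tuned weights
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Hypothetical clinical question; the task prefix is an assumption.
question = "translate English to SQL: how many patients were admitted in 2100?"
inputs = tokenizer(question, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```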
Estimating the Optimal Number of Clusters from Subsets of Ensembles
This research estimates the optimal number of clusters in a dataset using a novel ensemble technique, a preferred alternative to relying on the output of a single clustering. Combining clusterings from different algorithms can lead to a more stable and robust solution, often unattainable by any single clustering solution. Technically, we created subsets of ensembles as candidate estimates and evaluated them using a quality metric to obtain the best subset. We tested our method on publicly available datasets of varying types, sources, and clustering difficulty to establish the accuracy and performance of our approach against eight standard methods. Our method outperforms all of these techniques in the number of clusters estimated correctly. Due to the exhaustive nature of the initial algorithm, it is slow as the number of ensembles (and hence the solution space) increases; we have therefore provided an updated version, based on the single-digit difference property of Gray codes, that runs in linear time in terms of the subset size.
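The Gray-code idea in the closing sentence can be sketched as below: walking subsets in Gray-code order changes exactly one member per step, so a subset quality score can be updated incrementally rather than recomputed from scratch. The toy ensemble names are placeholders.

```python
def gray_code_subsets(items):
    """Yield every subset of items; consecutive subsets differ by one element."""
    current = set()
    yield frozenset(current)
    for i in range(1, 2 ** len(items)):
        # Index of the single bit that differs between gray(i-1) and gray(i),
        # where gray(n) = n ^ (n >> 1).
        flip = ((i ^ (i >> 1)) ^ ((i - 1) ^ ((i - 1) >> 1))).bit_length() - 1
        current.symmetric_difference_update({items[flip]})
        yield frozenset(current)

ensembles = ["kmeans", "spectral", "agglomerative"]  # placeholder members
for subset in gray_code_subsets(ensembles):
    print(sorted(subset))
```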