289 research outputs found

    Fast Exact String Pattern-matching Algorithms Adapted to the Characteristics of the Medical Language

    Objective: The authors consider the problem of exact string pattern matching using algorithms that do not require any preprocessing. To choose the most appropriate algorithm, distinctive features of the medical language must be taken into account. The characteristics of medical language are emphasized in this regard, the best algorithm of those reviewed is proposed, and detailed evaluations of time complexity for processing medical texts are provided. Design: The authors first illustrate and discuss the techniques of various string pattern-matching algorithms. Next, the source code and the behavior of representative exact string pattern-matching algorithms are presented in a comprehensive manner to promote their implementation. Detailed explanations of the use of various techniques to improve performance are given. Measurements: Real-time measures of time complexity with English medical texts are presented. They lead to results distinct from those found in the computer science literature, which are typically computed with normally distributed texts. Results: The Boyer-Moore-Horspool algorithm achieves the best overall results when used with medical texts. This algorithm usually performs at least twice as fast as the other algorithms tested. Conclusion: The time performance of exact string pattern matching can be greatly improved if an efficient algorithm is used. Considering the growing amount of text handled in the electronic patient record, it is worth implementing this efficient algorithm.
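
As a hedged illustration of the Boyer-Moore-Horspool technique named in the abstract above, the following Python sketch shows the bad-character shift idea. It is not the authors' source code: the function name `horspool_search` and the sample sentence are assumptions made here for illustration. The only setup is a small shift table built from the pattern; the text itself is never indexed.

```python
def horspool_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Shift table: for each character of the pattern except the last one,
    # the distance from its rightmost occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    matches = []
    pos = 0
    while pos <= n - m:
        # Compare the current window with the pattern.
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        # Shift by the table entry for the text character aligned with the
        # last pattern position; characters absent from the table allow a
        # full jump of m positions.
        pos += shift.get(text[pos + m - 1], m)
    return matches


if __name__ == "__main__":
    # Hypothetical sample sentence, not taken from the study's corpus.
    sample = "acute myocardial infarction; old myocardial scar"
    print(horspool_search(sample, "myocardial"))
```

Long patterns with characters that are rare in the text produce large shifts, which is one plausible reason the approach suits long medical terms, although the abstract itself does not state the cause.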

    Handheld vs. Laptop Computers for Electronic Data Collection in Clinical Research: A Crossover Randomized Trial

    Objective: To compare users' speed, number of entry errors and satisfaction in using two current devices for electronic data collection in clinical research: handheld and laptop computers. Design: The authors performed a randomized cross-over trial using 160 different paper-based questionnaires representing altogether 45,440 variables. Four data coders were instructed to record, according to a random predefined and equally balanced sequence, the content of these questionnaires either on a laptop or on a handheld computer. Instructions on the kind of device to be used were provided to data coders in individual sealed and opaque envelopes. Study conditions were controlled and the data entry process performed in a quiet environment. Measurements: The authors compared the duration of the data recording process, the number of errors and users' satisfaction with the two devices. The authors divided errors into two separate categories, typing and missing data errors. The original paper-based questionnaire was used as a gold standard. Results: The overall duration of the recording process was significantly reduced (2.0 versus 3.3 min) when data were recorded on the laptop computer (p < 0.001). Data accuracy also improved. There were 5.8 typing errors per 1,000 entries with the laptop compared to 8.4 per 1,000 with the handheld computer (p < 0.001). The difference was even more important for missing data, which decreased from 22.8 to 2.9 per 1,000 entries when a laptop was used (p < 0.001). Users found the laptop easier, faster and more satisfying to use than the handheld computer. Conclusions: Despite the increasing use of handheld computers for electronic data collection in clinical research, these devices should be used with caution. They double the duration of the data entry process and significantly increase the risk of typing errors and missing data. This may become a particularly crucial issue in studies where these devices are provided to patients or healthcare workers unfamiliar with computer technologies for self-reporting or research data collection processes.

    The Earth as an extrasolar transiting planet - II: HARPS and UVES detection of water vapor, biogenic O$_2$, and O$_3$

    The atmospheric composition of transiting exoplanets can be characterized during transit by spectroscopy. For the transit of an Earth twin, models predict that biogenic O$_2$ and O$_3$ should be detectable, as well as water vapour, a molecule linked to habitability as we know it on Earth. The aim is to measure the Earth radius versus wavelength $\lambda$ - or the atmosphere thickness $h(\lambda)$ - at the highest spectral resolution available to fully characterize the signature of Earth seen as a transiting exoplanet. We present observations of the Moon eclipse of 21-12-2010. Seen from the Moon, the Earth eclipses the Sun and opens access to the Earth atmosphere transmission spectrum. We used HARPS and UVES spectrographs to take penumbra and umbra high-resolution spectra from 3100 to 10400 Ang. A change of the quantity of water vapour above the telescope compromised the quality of the UVES data. We corrected for this effect in the data processing. We analyzed the data by 3 different methods. The 1st method is based on the analysis of pairs of penumbra spectra. The 2nd makes use of a single penumbra spectrum, and the 3rd of all penumbra and umbra spectra. Profiles $h(\lambda)$ are obtained with the three methods for both instruments. The 1st method gives the best result, in agreement with a model. The second method seems to be more sensitive to the Doppler shift of solar spectral lines with respect to the telluric lines. The 3rd method makes use of umbra spectra which bias the result, but it can be corrected for this a posteriori from results with the first method. The 3 methods clearly show the spectral signature of the Rayleigh scattering in the Earth atmosphere and the bands of H$_2$O, O$_2$, and O$_3$. Sodium is detected. Assuming no atmospheric perturbations, we show that the E-ELT is theoretically able to detect the O$_2$ A-band in 8 h of integration for an Earth twin at 10 pc.
    Comment: Final version accepted for publication in A&A - 21 pages, 27 figures. Abstract above slightly shortened wrt the original. The ArXiv version has low resolution figures, but a version with full resolution figures is available here: http://www.obs-hp.fr/~larnold/publi_to_download/eclipse2010_AA_v5_final.pd

    FRASIMED: a Clinical French Annotated Resource Produced through Crosslingual BERT-Based Annotation Projection

    Natural language processing (NLP) applications such as named entity recognition (NER) for low-resource corpora do not benefit from recent advances in the development of large language models (LLMs), where there is still a need for larger annotated datasets. This research article introduces a methodology for generating translated versions of annotated datasets through crosslingual annotation projection. Leveraging a language-agnostic BERT-based approach, it is an efficient solution to enlarge low-resource corpora with little human effort and by using only already available open data resources. Quantitative and qualitative evaluations are often lacking when it comes to evaluating the quality and effectiveness of semi-automatic data generation strategies. The evaluation of our crosslingual annotation projection approach showed both effectiveness and high accuracy in the resulting dataset. As a practical application of this methodology, we present the creation of the French Annotated Resource with Semantic Information for Medical Entities Detection (FRASIMED), an annotated corpus comprising 2,051 synthetic clinical cases in French. The corpus is now available for researchers and practitioners to develop and refine French NLP applications in the clinical field (https://zenodo.org/record/8355629), making it the largest open annotated corpus with linked medical concepts in French.

    Matching Study to Registry data: Maintaining Data Privacy in a Study on Family based Colorectal Cancer

    Maintaining the confidentiality of patient data is an important task in medical informatics. Sensitive information leaked from such data can harm a patient or be used against them. Therefore, when working with medical data, appropriate and secure models that serve as guidelines for different applications are needed. Consequently, this work presents a model for performing privacy-preserving record linkage between study and registry data. The model takes into account seven requirements related to data privacy. Furthermore, this model is exemplified with a study on family-based colorectal cancer in Germany. The model is very strict and excludes possible violations of data privacy protection to a reasonable degree. It should be applicable to similar use cases that need a mapping between medical data of a study and a registry database.

    Social media and internet search data to inform drug utilization: A systematic scoping review

    Introduction: Drug utilization is currently assessed through traditional data sources such as big electronic medical records (EMRs) databases, surveys, and medication sales. Social media and internet data have been reported to provide more accessible and more timely access to medications' utilization. Objective: This review aims at providing evidence comparing web data on drug utilization to other sources before the COVID-19 pandemic. Methods: We searched Medline, EMBASE, Web of Science, and Scopus until November 25th, 2019, using a predefined search strategy. Two independent reviewers conducted screening and data extraction. Results: Of 6,563 (64%) deduplicated publications retrieved, 14 (0.2%) were included. All studies showed positive associations between drug utilization information from web and comparison data using very different methods. A total of nine (64%) studies found positive linear correlations in drug utilization between web and comparison data. Five studies reported association using other methods: one study reported similar drug popularity rankings using both data sources, two studies developed prediction models for future drug consumption, including both web and comparison data, and two studies conducted ecological analyses but did not quantitatively compare data sources. According to the STROBE, RECORD, and RECORD-PE checklists, overall reporting quality was mediocre. Many items were left blank as they were out of scope for the type of study investigated. Conclusion: Our results demonstrate the potential of web data for assessing drug utilization, although the field is still in a nascent period of investigation. Ultimately, social media and internet search data could be used to get a quick preliminary quantification of drug use in real time. Additional studies on the topic should use more standardized methodologies on different sets of drugs in order to confirm these findings. In addition, currently available checklists for study quality of reporting would need to be adapted to these new sources of scientific information.

    Evaluation of a Command-line Parser-based Order Entry Pathway for the Department of Veterans Affairs Electronic Patient Record

    Objective: To improve and simplify electronic order entry in an existing electronic patient record, the authors developed an alternative system for entering orders, which is based on a command-line interface using robust and simple natural-language techniques. Design: The authors conducted a randomized evaluation of the new entry pathway, measuring time to complete a standard set of orders, and users' satisfaction measured by questionnaire. A group of 16 physician volunteers from the staff of the Department of Veterans Affairs Puget Sound Health Care System-Seattle Division participated in the evaluation. Results: Thirteen of the 16 physicians (81%) were able to enter medical orders more quickly using the natural-language-based entry system than the standard graphical user interface that uses menus and dialogs (mean time spared, 16.06 ± 4.52 minutes; P=0.029). Compared with the graphical user interface, the command-line-based pathway was perceived as easier to learn (P<0.01), was considered easier to use and faster (P<0.01), and was rated better overall (P<0.05). Conclusion: Physicians found the command-line interface easier to learn and faster to use than the usual menu-driven system. The major advantage of the system is that it combines an intuitive graphical user interface with the power and speed of a natural-language analyzer.

    A National Survey Comparing Patients' and Transplant Professionals' Research Priorities in the Swiss Transplant Cohort Study

    We aimed to identify, assess, compare and map research priorities of patients and professionals in the Swiss Transplant Cohort Study. The project followed 3 steps. 1) Focus group interviews identified patients' (n = 22) research priorities. 2) A nationwide survey assessed and compared the priorities in 292 patients and 175 professionals. 3) Priorities were mapped to the 4 levels of Bronfenbrenner's ecological framework. The 13 research priorities (financial pressure, medication taking, continuity of care, emotional well-being, return to work, trustful relationships, person-centredness, organization of care, exercise and physical fitness, graft functioning, pregnancy, peer contact and public knowledge of transplantation) addressed all framework levels: patient (n = 7), micro (n = 3), meso (n = 2), and macro (n = 1). Comparing each group's top 10 priorities revealed that continuity of care received the highest importance rating from both (92.2% patients, 92.5% professionals), with 3 more agreements between the groups. Otherwise, perspectives were more diverse than congruent: patients emphasized patient-level priorities (emotional well-being, graft functioning, return to work), professionals those on the meso level (continuity of care, organization of care). Patients' research priorities highlighted a need to expand research to the micro, meso and macro level. Discrepancies should be recognized to avoid understudying topics that are more important to professionals than to patients.