
    Evolution of Efficient Symbolic Communication Codes

    The paper explores how the structure of human natural language can be seen as a product of the evolution of an inter-personal communication code, targeting the maximisation of culture-agnostic and cross-lingual metrics such as anti-entropy, compression factor and cross-split F1 score. The exploration is done as part of a larger unsupervised language learning effort, in which an attempt is made to perform meta-learning in a space of hyper-parameters, maximising the F1 score computed against the "ground truth" language structure by means of maximising the metrics mentioned above. The paper presents preliminary results of a cross-lingual word-level segmentation (tokenisation) study for Russian, Chinese and English, as well as a subword segmentation (morphological parsing) study for English. It is found that language structure at the level of word segmentation or tokenisation can be seen as driven by all of these metrics, with anti-entropy more relevant to English and Russian and compression factor more specific to Chinese. The study of subword segmentation or morphological parsing on the English lexicon revealed a direct association with compression factor while, surprisingly, the association with anti-entropy turned out to be the inverse.
    Comment: 9 pages, 6 figures
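
    The abstract does not define these metrics precisely; as a rough illustration only, the sketch below shows one plausible reading of two of them (compression factor as characters per token, anti-entropy as the negated Shannon entropy of the token frequency distribution) applied to a candidate segmentation. Both definitions are assumptions for illustration, not the paper's.

```python
import math
from collections import Counter

def compression_factor(text: str, tokens: list[str]) -> float:
    """Characters per token: higher means each token covers more raw text
    (an assumed reading of the paper's compression factor)."""
    return len(text) / max(len(tokens), 1)

def anti_entropy(tokens: list[str]) -> float:
    """Negated Shannon entropy of the token frequency distribution, so more
    repetitive, more predictable codes score higher (an assumed reading)."""
    total = len(tokens)
    return sum((c / total) * math.log2(c / total)
               for c in Counter(tokens).values())

text = "the cat sat on the mat"
tokens = text.split()                      # trivial word-level segmentation
print(compression_factor(text, tokens))    # 22 chars / 6 tokens ~ 3.67
print(anti_entropy(tokens))                # ~ -2.25; nearer 0 = more repetitive
```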

    Artificial intelligence surgery: how do we get to autonomous actions in surgery?

    Most surgeons are skeptical as to the feasibility of autonomous actions in surgery. Interestingly, many examples of autonomous actions already exist and have been around for years. Since the beginning of this millennium, the field of artificial intelligence (AI) has grown exponentially with the development of machine learning (ML), deep learning (DL), computer vision (CV) and natural language processing (NLP). All of these facets of AI will be fundamental to the development of more autonomous actions in surgery; unfortunately, only a limited number of surgeons have or seek expertise in this rapidly evolving field. As opposed to AI in medicine, AI surgery (AIS) involves autonomous movements. Fortuitously, as the field of robotics in surgery has improved, more surgeons are becoming interested in technology and in the potential of autonomous actions in procedures such as interventional radiology, endoscopy and surgery. The lack of haptics, or the sensation of touch, has hindered the wider adoption of robotics by many surgeons; however, now that the true potential of robotics can be comprehended, the embracing of AI by the surgical community is more important than ever before. Although current complete surgical systems are mainly examples of tele-manipulation, haptics is perhaps not the most important aspect on the path to more autonomously functioning robots. If the goal is for robots to ultimately become more and more independent, research should perhaps focus not on haptics as it is perceived by humans but on haptics as it is perceived by robots/computers. This article discusses aspects of ML, DL, CV and NLP as they pertain to the modern practice of surgery, with a focus on current AI issues and advances that will enable us to get to more autonomous actions in surgery. Ultimately, a paradigm shift may need to occur in the surgical community, as more surgeons with expertise in AI may be needed to fully unlock the potential of AIS in a safe, efficacious and timely manner.

    Building a biomedical tokenizer using the token lattice design pattern and the adapted Viterbi algorithm

    Abstract: Background: Tokenization is an important component of language processing, yet there is no widely accepted tokenization method for English texts, including biomedical texts. Other than rule-based techniques, tokenization in the biomedical domain has been regarded as a classification task. Biomedical classifier-based tokenizers either split or join textual objects through classification to form tokens. The idiosyncratic nature of each biomedical tokenizer’s output complicates adoption and reuse. Furthermore, biomedical tokenizers generally lack guidance on how to apply an existing tokenizer to a new domain (subdomain). We identify and complete a novel tokenizer design pattern and suggest a systematic approach to tokenizer creation. We implement a tokenizer based on our design pattern that combines regular expressions and machine learning. Our machine learning approach differs from the previous split-join classification approaches. We evaluate our approach against three other tokenizers on the task of tokenizing biomedical text. Results: Medpost and our adapted Viterbi tokenizer performed best, with 92.9% and 92.4% accuracy respectively. Conclusions: Our evaluation of our design pattern and guidelines supports our claim that they are a viable approach to tokenizer construction, producing tokenizers matching leading custom-built tokenizers in a particular domain. Our evaluation also demonstrates that ambiguous tokenizations can be disambiguated through POS tagging; POS tag sequences and training data thus have a significant impact on proper text tokenization.
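
    The abstract names the token lattice design pattern and an adapted Viterbi algorithm without giving details; as a generic illustration only, the sketch below runs a standard Viterbi-style dynamic program over a lattice of candidate tokens scored by unigram costs. The vocabulary and costs are hypothetical, and this is not the paper's adapted algorithm (which disambiguates via POS tagging).

```python
import math

# Hypothetical unigram token costs (e.g. negative log probabilities).
COSTS = {"anti": 4.0, "body": 4.5, "antibody": 6.0, "ant": 5.0, "i": 7.0}
MAX_LEN = max(len(t) for t in COSTS)

def viterbi_tokenize(text: str) -> list[str]:
    """Pick the minimum-cost path through the lattice of candidate tokens.
    best[i] is the cheapest cost of segmenting text[:i]; back[i] stores the
    start index of the last token on that cheapest path."""
    n = len(text)
    best = [math.inf] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_LEN), end):
            tok = text[start:end]
            if tok in COSTS and best[start] + COSTS[tok] < best[end]:
                best[end] = best[start] + COSTS[tok]
                back[end] = start
    tokens, i = [], n
    while i > 0:                       # walk back-pointers to recover tokens
        tokens.append(text[back[i]:i])
        i = back[i]
    return tokens[::-1]

print(viterbi_tokenize("antibody"))    # ['antibody']: cost 6.0 beats 4.0 + 4.5
```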

    Unsupervised learning methods for identifying and evaluating disease clusters in electronic health records

    Introduction: Clustering algorithms are a class of algorithms that can discover groups of observations in complex data and are often used to identify subtypes of heterogeneous diseases in electronic health records (EHR). Evaluating clustering experiments for biological and clinical significance is a vital but challenging task due to the lack of consensus on best practices. As a result, the translation of findings from clustering experiments to clinical practice is limited.
    Aim: The aim of this thesis was to investigate and evaluate approaches that enable the evaluation of clustering experiments using EHR.
    Methods: We conducted a scoping review of clustering studies in EHR to identify common evaluation approaches. We systematically investigated the performance of the identified approaches using a cohort of Alzheimer's Disease (AD) patients as an exemplar, comparing four different clustering methods (K-means, Kernel K-means, Affinity Propagation and Latent Class Analysis). Using the same population, we developed and evaluated a method (MCHAMMER) that tests whether clusterable structures exist in EHR. To develop this method, we tested several cluster validation indices and methods of generating null data to see which are best at discovering clusters. To enable the robust benchmarking of evaluation approaches, we created a tool that generates synthetic EHR data containing known cluster labels across a range of clustering scenarios.
    Results: Across 67 EHR clustering studies, the most popular evaluation approach was comparing cluster results across multiple algorithms (30% of studies). We examined this approach by conducting a clustering experiment on a population of 10,065 AD patients with 21 demographic, symptom and comorbidity features. K-means found 5 clusters, Kernel K-means found 2, Affinity Propagation found 5 and Latent Class Analysis found 6. K-means was found to have the best clustering solution, with the highest silhouette score (0.19), and was more predictive of outcomes. The five clusters found were: typical AD (n=2026), non-typical AD (n=1640), a cardiovascular disease cluster (n=686), a cancer cluster (n=1710) and a cluster of mental health issues, smoking and early disease onset (n=1528), which has been found in previous research as well as in the results of other clustering methods. We created a synthetic data generation tool which allows for the generation of realistic EHR clusters that can vary in separation and in the number of noise variables, to alter the difficulty of the clustering problem. We found that decreasing cluster separation increased cluster difficulty significantly, whereas adding noise variables increased cluster difficulty but not significantly. To develop the tool for assessing cluster existence, we tested different methods of null dataset generation and different cluster validation indices; the best performing null dataset method was the min-max method, and the best performing indices were the Calinski-Harabasz index, which had an accuracy of 94%, the Davies-Bouldin index (97%), the silhouette score (93%) and the BWC index (90%). We further found that when clusters were identified using the Calinski-Harabasz index, they were more likely to have significantly different outcomes between clusters. Lastly, we repeated the initial clustering experiment, comparing 10 different pre-processing methods. The three best performing methods were the RBF kernel (2 clusters), MCA (4 clusters), and MCA combined with PCA (6 clusters). The MCA approach gave the best results, with the highest silhouette score (0.23) and meaningful clusters, producing 4 clusters: a heart and circulatory cluster (n=1379), early-onset mental health (n=1761), a male cluster with memory loss (n=1823) and a female cluster with more problems (n=2244).
    Conclusion: We have developed and tested a series of methods and tools to enable the evaluation of EHR clustering experiments. We developed and proposed a novel cluster evaluation metric and provided a tool for benchmarking evaluation approaches in synthetic but realistic EHR data.
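
    As a rough illustration of the null-data idea behind the cluster-existence test (not the thesis's MCHAMMER implementation), the sketch below generates min-max null data by sampling each feature uniformly within its observed range and flags clusterable structure when the real Calinski-Harabasz score beats most null scores. The threshold, cluster count and toy data are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

rng = np.random.default_rng(0)

def min_max_null(X: np.ndarray) -> np.ndarray:
    """Null data: each feature sampled uniformly within its observed
    min-max range, which destroys any joint cluster structure."""
    return rng.uniform(X.min(axis=0), X.max(axis=0), size=X.shape)

def index_for(X: np.ndarray, k: int) -> float:
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    return calinski_harabasz_score(X, labels)

def clusterable(X: np.ndarray, k: int = 2, n_null: int = 50) -> bool:
    """Flag structure if the real index beats 95% of null indices
    (the threshold is an illustrative choice, not the thesis's)."""
    nulls = [index_for(min_max_null(X), k) for _ in range(n_null)]
    return index_for(X, k) > np.quantile(nulls, 0.95)

# Two well-separated Gaussian blobs should be flagged as clusterable.
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(6, 1, (100, 5))])
print(clusterable(X))   # True (with overwhelming probability)
```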

    System Designs for Diabetic Foot Ulcer Image Assessment

    For individuals with type 2 diabetes, diabetic foot ulcers represent a significant health issue and the cost of wound care is high. Currently, clinicians and nurses mainly base their wound assessment on visual examination of wound size and the status of the wound tissue. This method is potentially inaccurate and requires extra clinical workload. In view of the prevalence of smartphones with high-resolution digital cameras, assessing wound healing by analyzing real-time images using the significant computational power of today’s mobile devices is an attractive approach to managing foot ulcers. Alternatively, the smartphone may be used just for image capture and wireless transfer to a PC or laptop for image processing. To achieve accurate foot ulcer image assessment, we have developed and tested a novel automatic wound image analysis system that satisfies the following conditions: 1) an easy-to-use image capture system that makes the image capture process comfortable for the patient and provides well-controlled image capture conditions; 2) efficient and accurate algorithms for real-time wound boundary determination to measure the wound area size; 3) a quantitative method to assess the wound healing status based on a foot ulcer image sequence for a given patient; and 4) a wound image assessment and management system that can be used both in the patient’s home and in the clinical environment, in a tele-medicine fashion. In our work, the wound image is captured by the camera on the smartphone while the patient’s foot is held in place by an image capture box, which is specially designed to aid patients in photographing ulcers occurring on the soles of their feet. The experimental results show that our image capture system guarantees consistent illumination and a fixed distance between the foot and the camera. These properties greatly reduce the complexity of the subsequent wound recognition and assessment. The most significant contribution of our work is the development of five different wound boundary determination approaches based on different computer vision algorithms. The first approach employs the level set algorithm to determine the wound boundary directly from a manually set initial curve. The second and third approaches are mean-shift segmentation based methods augmented by foot outline detection and analysis. These two approaches have been shown to be efficient to implement (especially on smartphones), independent of prior knowledge, and able to provide reasonably accurate wound segmentation results given a set of well-tuned parameters; however, they lack self-adaptivity because they are not based on machine learning. Consequently, a two-stage Support Vector Machine (SVM) binary classifier based wound recognition approach was developed and implemented. This approach consists of three major steps: 1) unsupervised super-pixel segmentation; 2) feature descriptor extraction for each super-pixel; and 3) supervised classifier based wound boundary determination. The experimental results show that this approach provides promising performance (sensitivity: 73.3%, specificity: 95.6%) when dealing with foot ulcer images captured with our image capture box. In the final approach, we further relax the image capture constraints and generalize the application of our wound recognition system by applying a conditional random field (CRF) based model to solve the wound boundary determination problem.
    The key modules in this approach are TextonBoost based potential learning at different scales and efficient CRF model inference to find the optimal labeling. Finally, the standard K-means clustering algorithm is applied to the determined wound area for color based wound tissue classification. To train the models used in the last two approaches, as well as to evaluate the different methods, we collected about 100 wound images at the wound clinic at UMass Medical School by tracking 15 patients over a 2-year period, following an IRB approved protocol. The wound recognition results were compared with ground truth generated by combining clinical labeling from three experienced clinicians. Specificity and sensitivity based measures indicate that the CRF based approach is the most reliable method despite its implementation complexity and computational demands. In addition, sample images of Moulage wound simulations were used to increase the evaluation flexibility. The advantages and disadvantages of the different approaches are described. Another important contribution of this work is the development of a healing score based mechanism for quantitative wound healing status assessment. The wound size and color composition measurements are converted to a score ranging from 0 to 10 that indicates the healing trend, based on comparisons of subsequent images to an initial foot ulcer image. By comparing the results of the healing score algorithm to the healing scores determined by experienced clinicians, we assess the clinical validity of our healing score algorithm. The level of agreement between our healing score and the three assessing clinicians was quantified using Krippendorff’s Alpha Coefficient (KAC). Finally, a collaborative wound image management system between the PC and smartphone was designed and successfully applied in the wound clinic for tracking patients’ wounds. This system has proven to be applicable in the clinical environment and capable of providing interactive foot ulcer care in a telemedicine fashion.
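
    The final tissue-classification step described above applies standard K-means to the colors of the segmented wound area; the sketch below shows only that generic step. The file path, k=3 and raw-RGB feature space are illustrative assumptions (red granulation, yellow slough and black necrosis is a common three-class convention), not details taken from the system.

```python
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

# Cluster wound-area pixels by color. The path and k=3 are assumptions
# (red granulation / yellow slough / black necrosis is a common convention).
img = np.asarray(Image.open("wound_crop.png").convert("RGB"), dtype=float)
pixels = img.reshape(-1, 3) / 255.0

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(pixels)

# Color composition of the wound area, e.g. as input to a healing score.
fractions = np.bincount(km.labels_, minlength=3) / km.labels_.size
for c, (center, frac) in enumerate(zip(km.cluster_centers_, fractions)):
    print(f"tissue cluster {c}: mean RGB {center.round(2)}, {frac:.1%} of area")
```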

    Review of Wearable Devices and Data Collection Considerations for Connected Health

    Wearable sensor technology has gradually extended its usability into a wide range of well-known applications. Wearable sensors can typically assess and quantify the wearer’s physiology and are commonly employed for human activity detection and quantified self-assessment. They are increasingly utilised to monitor patient health, rapidly assist with disease diagnosis, and help predict and often improve patient outcomes. Clinicians use various self-report questionnaires and well-known tests to record patient symptoms and assess functional ability. These assessments are time-consuming and costly and depend on subjective patient recall; moreover, the measurements may not accurately reflect the patient’s functional ability at home. Wearable sensors can be used to detect and quantify specific movements in different applications, and the volume of data they collect during long-term assessment of ambulatory movement can become immense. This paper discusses current techniques used to track and record various human body movements, as well as techniques used to measure activity and sleep from long-term data collected by wearable technology devices.

    Quantitative imaging in radiation oncology

    Artificially intelligent eyes, built on machine and deep learning technologies, can empower our capability to analyse patients’ images. By revealing information invisible to our own eyes, we can build decision aids that help clinicians provide more effective treatment while reducing side effects. The power of these decision aids rests on biologically unique properties of each patient’s tumour, referred to as biomarkers. To fully translate this technology into the clinic, we need to overcome barriers related to the reliability of image-derived biomarkers, trust in AI algorithms, and privacy-related issues that hamper the validation of the biomarkers. This thesis developed methodologies to address these issues, defining a road map for the responsible use of quantitative imaging in the clinic as a decision support system for better patient care.