11,816 research outputs found

    Analysis of radiation dose using DICOM metadata

    Data is the world’s most valuable resource [1], and it can be found everywhere. In medical imaging, data covers not only gigapixel images but also metadata and quantitative measurements [2]. DICOM (Digital Imaging and Communications in Medicine) is a clear source of medical data, since it is the current standard for storing and transmitting medical images [2] and related information [3]; this means it contains raw imaging data and the metadata related to the procedures of image acquisition and curation [2]. Some of the most relevant information found in DICOM files is the set of radiation dose parameters. At present, neither the European Directives nor the Spanish Regulations stipulate limits on the radiation dose for patients undergoing diagnostic or treatment procedures. It is scientifically established that ionizing radiation has harmful effects on human health [4], which is why measures need to be taken as soon as possible. These actions start with being able to quantify the radiation received by a patient across studies over time, which can be done by means of a table of DICOM metadata. The main focus of this project is to analyze the metadata generated by Computed Tomography (CT) scans in order to check the quality of the data and to assess whether it is possible to estimate the dosimetric quantity that accounts for the biological sensitivity of the irradiated tissue and reflects the risk of a non-uniform whole-body exposure: the Effective Dose [5]. Finally, with this evaluation, it is possible to define future steps for the development of a digital tool capable of analyzing radiation-related data and controlling the risk of ionizing radiation for any type of medical examination.
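
    As a rough illustration of how such metadata could feed a dose-tracking tool, the sketch below reads dose-related attributes from a CT slice with pydicom and converts a dose-length product (DLP) into an effective dose estimate via region-specific conversion coefficients (E ≈ k × DLP). The attribute names and k-factor values are assumptions included for illustration only; they are not taken from this project, and the coefficients should be checked against whichever published reference is adopted.

        # Sketch: estimating CT effective dose from DICOM dose metadata.
        # Assumes pydicom is installed and that the series carries CTDIvol
        # (tag (0018,9345)); the k-factors below are commonly cited adult
        # coefficients used here purely as illustrative placeholders.
        import pydicom

        # Illustrative conversion factors in mSv per (mGy*cm), by body region.
        K_FACTORS = {"head": 0.0021, "neck": 0.0059, "chest": 0.014,
                     "abdomen": 0.015, "pelvis": 0.015}

        def effective_dose_from_dlp(dlp_mgy_cm, region):
            """Effective dose (mSv) ~ k * DLP, with a region-specific k."""
            return K_FACTORS[region] * dlp_mgy_cm

        def read_dose_metadata(dicom_path):
            """Pull dose-related fields from a single CT slice, if present."""
            ds = pydicom.dcmread(dicom_path, stop_before_pixels=True)
            return {"CTDIvol": getattr(ds, "CTDIvol", None),    # mGy
                    "KVP": getattr(ds, "KVP", None),            # tube voltage
                    "Exposure": getattr(ds, "Exposure", None),  # mAs
                    "StudyDate": getattr(ds, "StudyDate", None)}

        # Example: a 40 cm chest scan at CTDIvol = 10 mGy gives DLP = 400 mGy*cm,
        # i.e. an estimated effective dose of 0.014 * 400 = 5.6 mSv.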

    Building a biomedical tokenizer using the token lattice design pattern and the adapted Viterbi algorithm

    Background: Tokenization is an important component of language processing, yet there is no widely accepted tokenization method for English texts, including biomedical texts. Other than rule-based techniques, tokenization in the biomedical domain has been regarded as a classification task. Biomedical classifier-based tokenizers either split or join textual objects through classification to form tokens. The idiosyncratic nature of each biomedical tokenizer’s output complicates adoption and reuse. Furthermore, biomedical tokenizers are generally not accompanied by guidance on how to apply an existing tokenizer to a new domain (subdomain). We identify and complete a novel tokenizer design pattern and suggest a systematic approach to tokenizer creation. We implement a tokenizer based on our design pattern that combines regular expressions and machine learning. Our machine learning approach differs from the previous split-join classification approaches. We evaluate our approach against three other tokenizers on the task of tokenizing biomedical text. Results: Medpost and our adapted Viterbi tokenizer performed best, with 92.9% and 92.4% accuracy respectively. Conclusions: Our evaluation of our design pattern and guidelines supports our claim that the design pattern and guidelines are a viable approach to tokenizer construction (producing tokenizers matching leading custom-built tokenizers in a particular domain). Our evaluation also demonstrates that ambiguous tokenizations can be disambiguated through POS tagging, and that POS tag sequences and training data have a significant impact on proper text tokenization.
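
    To make the lattice design concrete, the sketch below runs a Viterbi-style best-path search over a toy token lattice: candidate tokens are edges over character positions, each carries a POS tag and an emission probability, and tag-transition probabilities let the POS model resolve split-versus-join ambiguities. The tokens, tags and probabilities are invented for illustration; this is only the shape of the computation, not the paper's trained tokenizer.

        # Sketch: best-path (Viterbi-style) decoding over a token lattice.
        import math
        from collections import defaultdict

        def viterbi_tokenize(text, lattice, trans):
            """lattice: (start, end, token, tag, p_emit) edges over `text`;
            trans: (prev_tag, tag) -> transition probability.
            The DP state is (character position, tag of the last token)."""
            edges = defaultdict(list)
            for start, end, tok, tag, p in lattice:
                edges[start].append((end, tok, tag, p))
            best = {(0, "<s>"): (0.0, [])}        # state -> (log score, path)
            for pos in range(len(text) + 1):
                for (p0, prev_tag), (score, path) in list(best.items()):
                    if p0 != pos:
                        continue
                    for end, tok, tag, p_emit in edges[pos]:
                        p_trans = trans.get((prev_tag, tag), 1e-6)  # smoothing
                        cand = score + math.log(p_emit) + math.log(p_trans)
                        if (end, tag) not in best or cand > best[(end, tag)][0]:
                            best[(end, tag)] = (cand, path + [(tok, tag)])
            finals = [v for (p, _), v in best.items() if p == len(text)]
            return max(finals, key=lambda v: v[0])[1] if finals else []

        # Toy example: "2.5mg" read either as one token or as number + unit;
        # the tag-transition model prefers the split reading.
        lattice = [(0, 3, "2.5", "CD", 0.9), (3, 5, "mg", "NN", 0.9),
                   (0, 5, "2.5mg", "NN", 0.3)]
        trans = {("<s>", "CD"): 0.4, ("CD", "NN"): 0.6, ("<s>", "NN"): 0.3}
        print(viterbi_tokenize("2.5mg", lattice, trans))
        # -> [('2.5', 'CD'), ('mg', 'NN')]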

    Addressing the Higher Level Language Skills for the Common Core State Standards in Kindergarten

    Kindergarten is a critical year, providing a foundation for children’s success in school. With a common set of standards, the Common Core State Standards (CCSS), finalized and made available to states for adoption, critical skills in numeracy and literacy will be uniform from kindergarten through high school. Some children enter school with a sufficient foundation to support success in kindergarten and subsequent years. However, some children, either because of a lack of exposure during the preschool years (e.g., Aikens & Barbarin, 2008; Hart & Risley, 1995; Schacter, 1979; Snow, Burns & Griffin, 1998) or because of language delays associated with developmental disabilities or delays (e.g., Catts, Adolf & Weismer, 2006; Gough & Tunmer, 1986; Kuhn & Stahl, 2003; Nation & Snowling, 1998; Yuill & Oakhill, 1991), are already far behind their peers upon entrance into kindergarten. The current study investigated the effects of a multilevel approach to storybook reading on a broad range of language skills over 32 weeks of intervention for children at risk for reading. Specifically, growth in overall language, semantics, syntax, letter awareness, and phonology was explored. Thirty-six at-risk kindergarten students either received an intervention utilizing scaffolded talk across a continuum of increasingly more decentered meanings or served as a comparison group. The results of the study revealed that the intervention group made statistically significant gains in overall language, semantic, and syntax skills. A visual inspection of gain composite scores revealed that the majority of the intervention group increased by nearly or at least one standard deviation of change from pre- to posttest; these gains were not evident in the comparison group. The results of the study indicated that utilizing scaffolded talk across a continuum of increasingly more decentered meanings in kindergarten holds potential to address the language goals of the CCSS.

    Phraseology in Corpus-Based Translation Studies: A Stylistic Study of Two Contemporary Chinese Translations of Cervantes's Don Quijote

    The present work sets out to investigate the stylistic profiles of two modern Chinese versions of Cervantes’s Don Quijote (I): by Yang Jiang (1978), the first direct translation from Castilian to Chinese, and by Liu Jingsheng (1995), which is one of the most commercially successful versions of the Castilian literary classic. This thesis focuses on a detailed linguistic analysis carried out with the help of the latest textual analytical tools, natural language processing applications and statistical packages. The type of linguistic phenomenon singled out for study is four-character expressions (FCEXs), which are a very typical category of Chinese phraseology. The work opens with the creation of a descriptive framework for the annotation of linguistic data extracted from the parallel corpus of Don Quijote. Subsequently, the classified and extracted data are put through several statistical tests. The results of these tests prove to be very revealing regarding the different use of FCEXs in the two Chinese translations. The computational modelling of the linguistic data would seem to indicate that among other findings, while Liu’s use of archaic idioms has followed the general patterns of the original and also of Yang’s work in the first half of Don Quijote I, noticeable variations begin to emerge in the second half of Liu’s more recent version. Such an idiosyncratic use of archaisms by Liu, which may be defined as style shifting or style variation, is then analyzed in quantitative terms through the application of the proposed context-motivated theory (CMT). The results of applying the CMT-derived statistical models show that the detected stylistic variation may well point to the internal consistency of the translator in rendering the second half of Part I of the novel, which reflects his freer, more creative and experimental style of translation. Through the introduction and testing of quantitative research methods adapted from corpus linguistics and textual statistics, this thesis has made a major contribution to methodological innovation in the study of style within the context of corpus-based translation studies
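
    As a simple illustration of the corpus preprocessing such a study depends on, the sketch below harvests candidate four-character sequences by sliding a window over runs of CJK characters and counting their frequencies; true FCEXs would then be confirmed against an idiom dictionary and the aligned parallel corpus. This heuristic is an assumption made for illustration, not the annotation framework used in the thesis.

        # Sketch: collecting candidate four-character expressions (FCEXs).
        import re
        from collections import Counter

        CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")   # contiguous Chinese characters

        def candidate_fcexs(text):
            """Count every four-character window inside runs of CJK characters."""
            counts = Counter()
            for run in CJK_RUN.findall(text):
                for i in range(len(run) - 3):
                    counts[run[i:i + 4]] += 1
            return counts

        # Frequent candidates would next be filtered against an idiom list
        # before being annotated and submitted to the statistical tests.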


    Automated Analysis of Metacarpal Cortical Thickness in Serial Hand Radiographs

    To understand the roles of various genes that influence skeletal bone accumulation and loss, accurate measurement of bone mineralization is needed. However, it is a challenging task to accurately assess bone growth over a person’s lifetime. Traditionally, manual analysis of hand radiographs has been used to quantify bone growth, but these measurements are tedious and may be impractical for a large-scale growth study. The aim of this project was to develop a tool to automate the measurement of metacarpal cortical bone thickness in standard hand-wrist radiographs of humans aged 3 months to 70+ years that would be more accurate, precise and efficient than manual radiograph analysis. The task was divided into two parts: development of automatic analysis software and the implementation of the routines in a Graphical User Interface (GUI). The automatic analysis was to ideally execute without user intervention, but we anticipated that not all images would be successfully analyzed. The GUI, therefore, provides the interface for the user to execute the program, review results of the automated routines, make semi-automated and manual corrections, view the quantitative results and growth trend of the participant and save the results of all analyses. The project objectives were attained. Of a test set of about 350 images from participants in a large research study, automatic analysis was successful in approximately 75% of the reasonable-quality images and manual intervention allowed the remaining 25% of these images to be successfully analyzed. For images of poorer quality, including many that the Lifespan Health Research Center (LHRC) clients would not expect to be analyzed successfully, the inputs provided by the user allowed approximately 80% to be analyzed, but the remaining 20% could not be analyzed with the software. The developed software tool provides results that are more accurate and precise than those from manual analyses. Measurement accuracy, as assessed by phantom measurements, was approximately 0.5%, and interobserver and intraobserver agreement were 92.1% and 96.7%, respectively. Interobserver and intraobserver correlation values for automated analysis were 0.9674 and 0.9929, respectively, versus 0.7000 and 0.7820 for manual analysis. The automated analysis process is also approximately 87.5% more efficient than manual image analysis and automatically generates an output file containing over 160 variables of interest. The software is currently being used successfully to analyze over 17,000 images in a study of human bone growth.
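
    As a loose illustration of the measurement being automated, the sketch below estimates combined cortical thickness from a single intensity profile sampled perpendicular to the metacarpal shaft: above-threshold samples are treated as bone, and the darker medullary cavity between the outer edges is subtracted from the overall bone width. The thresholding rule, the synthetic profile and the pixel spacing are illustrative assumptions, not the routines of the software described above.

        # Sketch: cortical thickness from one line profile across the shaft.
        import numpy as np

        def cortical_thickness(profile, pixel_spacing_mm, threshold=None):
            """Return (total bone width, combined cortical width) in mm."""
            profile = np.asarray(profile, dtype=float)
            if threshold is None:
                threshold = profile.mean()       # crude bone/soft-tissue split
            bone = profile > threshold
            idx = np.flatnonzero(bone)
            if idx.size == 0:
                return 0.0, 0.0
            total = (idx[-1] - idx[0] + 1) * pixel_spacing_mm
            # Medullary cavity: below-threshold samples between the bone edges.
            medullary = np.count_nonzero(~bone[idx[0]:idx[-1] + 1]) * pixel_spacing_mm
            return total, total - medullary

        # Synthetic profile: soft tissue, cortex, medullary cavity, cortex, soft tissue.
        profile = [10, 10, 90, 95, 92, 40, 38, 42, 91, 94, 89, 10, 10]
        print(cortical_thickness(profile, pixel_spacing_mm=0.1))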

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.

    Give the Fans What They Want: A Market Segmentation Approach to Sport Fans’ Social Media Usage

    The purpose of this study was to construct a model that segments fans of professional sport based on the type of social media platform they preferred to use as well as their social media usage motivations. In addition, the current study sought to investigate whether previously identified motives, like escape and socialization, have transformed into more selfish motives such as narcissism. Convenience and snowball sampling techniques were used to collect data from fans of professional sport who specifically used social media to consume sport, resulting in a total sample size of 176. The online survey instrument comprised items from the previously validated Motivation Scale for Sport Online Consumption (MSSOC; Seo & Green, 2008) and the Narcissistic Personality Inventory-16 (NPI-16; Ames, Rose, & Anderson, 2006). In addition, several frequency, usage, and duration items, including how often respondents used Facebook, Twitter, Instagram, and Snapchat, were generated to gauge how often respondents spent time on social media consuming sport. Composite scores were calculated for the MSSOC and NPI-16 responses. Hierarchical cluster analysis revealed three distinct social media preference groups labeled a) Facebook Devotees (n=51), b) Infrequent Users (n=71), and c) Social Media Aficionados (n=54). Facebook Devotees generally preferred to use Facebook more than any other social media platform, while the Social Media Aficionados had the highest mean usage rates for Twitter, Instagram, and Snapchat. Descriptive discriminant analysis indicated that 67% of the differences among Facebook Devotees, Infrequent Users, and Social Media Aficionados could be attributed to social media preference. With regard to social media usage motivation, hierarchical cluster analysis identified two groups labeled a) Multifaceted Fans (n=72) and b) Casual Supporters (n=104). Multifaceted Fans exhibited high levels of motivation for nearly all usage motivations, while Casual Supporters had high motivation mean scores for only two motivations, “passing the time” and “information.” Descriptive discriminant analysis revealed that 61% of the differences between Multifaceted Fans and Casual Supporters were explained by social media usage motivation. Finally, a Pearson correlation analysis (two-tailed) revealed no statistically significant correlations between narcissism and social media usage motivation. Overall, the findings from this study provide sport organizations with valuable marketing and communication information. The fan segments uncovered in the results reveal that fans have different motivations for consuming sport via social media. Sport organizations can use this information to tailor their social media strategy to specific fan segments, increasing engagement, strengthening fans’ brand loyalty, and ultimately generating more revenue.
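
    For readers unfamiliar with the segmentation procedure, the sketch below shows one way such fan segments could be derived: standardize the composite scores, apply agglomerative (Ward) hierarchical clustering, and cut the tree into a fixed number of clusters. The random data, the variable layout and the choice of three clusters are illustrative assumptions rather than the study's actual analysis.

        # Sketch: hierarchical (Ward) clustering of respondent composite scores.
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.stats import zscore

        def segment_fans(scores, n_clusters=3):
            """scores: (n_respondents, n_variables) array of composite scores.
            Returns one integer cluster label per respondent."""
            z = zscore(scores, axis=0)           # put all scales on equal footing
            tree = linkage(z, method="ward")     # agglomerative clustering
            return fcluster(tree, t=n_clusters, criterion="maxclust")

        # Random data standing in for platform-usage composites (e.g. Facebook,
        # Twitter, Instagram, Snapchat frequencies) for 176 respondents.
        rng = np.random.default_rng(0)
        labels = segment_fans(rng.normal(size=(176, 4)), n_clusters=3)
        print(np.bincount(labels)[1:])           # respondents per segment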

    Linking social media, medical literature, and clinical notes using deep learning.

    Researchers analyze data, information, and knowledge through many sources, formats, and methods. The dominant data formats include text and images. In the healthcare industry, professionals generate a large quantity of unstructured data. The complexity of this data and the lack of computational power cause delays in analysis. However, with emerging deep learning algorithms and access to computational power such as graphics processing units (GPUs) and tensor processing units (TPUs), processing text and images is becoming more accessible. Deep learning algorithms achieve remarkable results in natural language processing (NLP) and computer vision. In this study, we focus on NLP in the healthcare industry and collect data not only from electronic medical records (EMRs) but also from medical literature and social media. We propose a framework for linking social media, medical literature, and EMR clinical notes using deep learning algorithms. Connecting data sources requires defining a link between them, and our key is finding concepts in the medical text. The National Library of Medicine (NLM) provides the Unified Medical Language System (UMLS), and we use this system as the foundation of our own. We recognize social media’s dynamic nature and apply supervised and semi-supervised methodologies to generate concepts. Named entity recognition (NER) allows efficient extraction of information, or entities, from medical literature, and we extend the model to process the EMRs’ clinical notes via transfer learning. The results include an integrated, end-to-end, web-based system solution that unifies social media, literature, and clinical notes, and improves access to medical knowledge for the public and experts.
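
    The sketch below shows the linking idea in miniature: documents from different sources are mapped to shared concept identifiers and joined through them. The tiny synonym table stands in for a UMLS-derived lookup, and exact string matching stands in for the NER and transfer-learning models described above; all names and examples are illustrative.

        # Sketch: joining heterogeneous documents through shared medical concepts.
        from collections import defaultdict

        # Toy concept table: CUI -> surface forms (a UMLS export would supply these).
        CONCEPTS = {"C0011849": {"diabetes", "diabetes mellitus"},
                    "C0020538": {"hypertension", "high blood pressure"}}

        def extract_concepts(text):
            """Return the CUIs whose surface forms occur in the text."""
            lowered = text.lower()
            return {cui for cui, forms in CONCEPTS.items()
                    if any(form in lowered for form in forms)}

        def link_sources(documents):
            """documents: (source, doc_id, text) triples. Returns CUI -> documents."""
            index = defaultdict(list)
            for source, doc_id, text in documents:
                for cui in extract_concepts(text):
                    index[cui].append((source, doc_id))
            return index

        docs = [("twitter", "t1", "My diabetes numbers are finally improving"),
                ("pubmed", "p9", "Long-term outcomes of diabetes mellitus treatment"),
                ("emr", "note3", "History of hypertension and diabetes.")]
        print(dict(link_sources(docs)))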