
    Mining Frequency of Drug Side Effects Over a Large Twitter Dataset Using Apache Spark

    Despite clinical trials by pharmaceutical companies and current FDA reporting systems, some drug side effects still go undetected. One way to obtain a larger sample of reports is to mine online social media. With its widespread use, social media such as Twitter has given rise to massive amounts of data, which can serve as reports of drug side effects. To process these large datasets, Apache Spark has become popular for fast, distributed batch processing. In this work, we improve on previous sentiment analysis-based pipelines for mining, processing, and extracting tweets that report drug-caused side effects. We also add a new ensemble classifier that combines sentiment analysis features to increase the accuracy of identifying drug-caused side effects. In addition, a frequency count of the side effects is provided. Furthermore, we implement the same pipeline in Apache Spark, which speeds up tweet processing by 2.5 times and supports the processing of large tweet datasets. As the frequency count of drug side effects opens the door to further analysis, we present a preliminary study on this issue, including the side effects of using two drugs simultaneously and the potential danger of using less common drug combinations. We believe the pipeline design and the results presented in this work have significant implications for studying drug side effects and for big data analysis in general.
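    The frequency-count step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses plain Python rather than Apache Spark, it assumes the tweets have already been classified as reporting a drug-caused side effect, and the lexicons (`SIDE_EFFECTS`, `DRUGS`), the helper names, and the sample tweets are all hypothetical, invented for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical lexicons and tweets, invented for illustration only.
SIDE_EFFECTS = {"nausea", "headache", "dizziness"}
DRUGS = {"metformin", "ibuprofen", "prednisone"}


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return set(cleaned.split())


def count_side_effects(tweets):
    """Count side-effect mentions per drug and per drug pair, once per tweet."""
    single = Counter()  # (drug, effect) -> number of tweets
    paired = Counter()  # ((drug_a, drug_b), effect) -> number of tweets
    for tweet in tweets:
        tokens = tokenize(tweet)
        drugs = sorted(tokens & DRUGS)
        effects = tokens & SIDE_EFFECTS
        for effect in effects:
            for drug in drugs:
                single[(drug, effect)] += 1
            # Tweets naming two drugs also feed the drug-combination count.
            for pair in combinations(drugs, 2):
                paired[(pair, effect)] += 1
    return single, paired


tweets = [
    "Metformin gave me nausea again",
    "took ibuprofen and prednisone, now dizziness and nausea",
    "metformin nausea is the worst",
]
single, paired = count_side_effects(tweets)
print(single.most_common(3))
```

    Keeping a separate counter for drug pairs mirrors the abstract's interest in side effects of two drugs used together; a rarely seen pair with a high count would be a candidate for closer inspection.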

    CIPP-Based Evaluation on English for Sport Science at Sport Education Study Program of the University of Ma’arif Nahdlatul Ulama (UMNU) Kebumen

    The objective of this research is to evaluate the effectiveness of the English for Sport Science program at the Sport Education Study Program of Universitas Ma’arif Nahdlatul Ulama Kebumen (UMNU), based on the CIPP model. The study focuses on the four elements of CIPP defined by Stufflebeam (1971): context, input, process, and product. It is a qualitative evaluation study, conducted in 2016 at the Sport Education Study Program of UMNU Kebumen, Central Java. The participants were the students, the English teacher, and the head of the program. Data were gathered through in-depth interviews with the participants, analysis of existing documents, and observation. The data were analyzed in four steps using the interactive model proposed by Miles and Huberman (1994): 1) data collection, 2) data reduction, 3) data display, and 4) verification. Source triangulation was used to validate the data. The study found that the program was ineffective, for four main reasons. First, the teaching context did not support good teaching practices. Second, the inputs for the program lacked quality, as indicated by unprofessional teachers. Third, the teaching and learning process did not meet the stakeholders’ expectations. Last, the product showed that the students lacked the required communication skills. In other words, the program did not achieve its goal, which clearly supports Dunkin and Biddle’s theory (1974) that context variables, presage variables (the input teachers), process variables, and product variables interrelatedly affect the success of teaching in a course program.

    Complementing the US Food and Drug Administration Adverse Event Reporting System With Adverse Drug Reaction Reporting From Social Media: Comparative Analysis

    Background: Adverse drug reactions (ADRs) can occur any time someone uses a medication. ADRs are systematically tracked and cataloged, with varying degrees of success, in order to better understand their etiology and develop methods of prevention. The US Food and Drug Administration (FDA) has developed the FDA Adverse Event Reporting System (FAERS) for this purpose. FAERS collects information from myriad sources, but the primary reporters have traditionally been medical professionals and pharmacovigilance data from manufacturers. Recent studies suggest that information shared publicly on social media platforms related to medication use could complement FAERS data, giving a richer picture of how medications are actually being used and of the experiences people are having across large populations. Objective: The aim of this study is to validate the accuracy and precision of social media methodology and conduct evaluations of Twitter ADR reporting for commonly used pharmaceutical agents. Methods: ADR data from the 10 most prescribed medications according to pharmacy claims data were collected from both FAERS and Twitter. To obtain data from FAERS, the SafeRx database, a curated collection of FAERS data, was used to collect data from March 1, 2016, to March 31, 2017. Twitter data were manually scraped during the same time period to extract similar data using an algorithm designed to minimize noise and false signals in social media data. Results: A total of 40,539 FAERS ADR reports were obtained via SafeRx and more than 40,000 tweets containing the drug names were obtained from Twitter's Advanced Search engine. While the FAERS data were specific to ADRs, the Twitter data were more limited. Only hydrocodone/acetaminophen, prednisone, amoxicillin, gabapentin, and metformin had a sufficient volume of ADR content for review and comparison. For metformin, diarrhea was the side effect that showed no difference between the two platforms (P=.30). For hydrocodone/acetaminophen, ineffectiveness was the ADR that showed no difference (P=.60). For gabapentin, there were no differences in terms of the ADRs ineffectiveness and fatigue (P=.15 and P=.67, respectively). For amoxicillin, hypersensitivity, nausea, and rash shared similar profiles between platforms (P=.35, P=.05, and P=.31, respectively). Conclusions: FAERS and Twitter shared similarities in the types of data reported, and each data set also contained a few unique items. The use of Twitter as an ADR pharmacovigilance platform should continue to be studied as a unique and complementary source of information rather than a validation tool of existing ADR databases.

    A study comparing table-based and list-based smartphone interface usability

    Never before has society seen a technology or medium advance as quickly as the smartphone. As smartphone technology advances, many daily tasks can be accomplished more easily and quickly on smartphone devices, which requires more and more people from numerous backgrounds to use a variety of interface layouts. This study aims to contribute to a framework for conducting usability studies, and so to a foundation for smartphone interface development, by evaluating the effectiveness of two commonly used mobile website interfaces. Since smartphone usability studies are relatively new, there is no smartphone software to record or track this kind of information. Three usability methods were reviewed in this study: demographics, a usability study through video recordings, and evaluation through an exit survey. With regard to usability, the table interface is more effective than the list interface. User testing of the two navigation prototypes, as well as user comparison of one prototype to the other, gave feedback that will help improve the mobile website navigation experience for users.

    Visualising the structure of document search results: A comparison of graph theoretic approaches

    This is the post-print of the article. Copyright © 2010 Sage Publications. Previous work has shown that distance-similarity visualisation or ‘spatialisation’ can provide a potentially useful context in which to browse the results of a query search, enabling the user to adopt a simple local foraging or ‘cluster growing’ strategy to navigate through the retrieved document set. However, faithfully mapping feature-space models to visual space can be problematic owing to their inherent high dimensionality and non-linearity. Conventional linear approaches to dimension reduction tend to fail at this kind of task, sacrificing local structure in order to preserve a globally optimal mapping. In this paper the clustering performance of a recently proposed algorithm called isometric feature mapping (Isomap), which deals with non-linearity by transforming dissimilarities into geodesic distances, is compared to that of non-metric multidimensional scaling (MDS). Various graph pruning methods for geodesic distance estimation are also compared. Results show that Isomap is significantly better at preserving local structural detail than MDS, suggesting it is better suited to cluster growing and other semantic navigation tasks. Moreover, it is shown that applying a minimum-cost graph pruning criterion can provide a parameter-free alternative to the traditional K-neighbour method, resulting in spatial clustering that is equivalent to or better than that achieved using an optimal-K criterion.
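    The geodesic distance estimation that the abstract above attributes to Isomap can be sketched as follows, assuming the traditional K-neighbour pruning and Floyd-Warshall shortest paths. This is an illustrative sketch rather than the paper's implementation: the function names, the value of `k`, and the U-shaped sample points are assumptions, the minimum-cost pruning criterion is not shown, and the final embedding step that Isomap applies to these distances is omitted.

```python
import math


def knn_graph(points, k):
    """Build a symmetric graph that keeps only each point's k nearest neighbours."""
    n = len(points)
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_distances(graph):
    """Estimate geodesic distances as shortest paths (Floyd-Warshall)."""
    n = len(graph)
    dist = [row[:] for row in graph]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Hypothetical points laid out along an upside-down "U": the two end points
# are close in straight-line distance but far apart along the curve.
points = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (3, 0)]
geo = geodesic_distances(knn_graph(points, k=2))

euclidean_ends = math.dist(points[0], points[-1])
geodesic_ends = geo[0][-1]
print(euclidean_ends, geodesic_ends)
```

    The gap between the two printed values is the non-linearity the abstract refers to: a linear projection treats the end points as near neighbours, while the graph-based estimate keeps them apart. If `k` is too small the graph can become disconnected, leaving infinite entries, which is one reason a parameter-free pruning criterion is attractive.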

    Context-aware Document-clustering Technique

    Document clustering is an intentional act that should reflect individuals’ preferences with regard to the semantic coherency or relevant categorization of documents and should conform to the context of a target task under investigation. Thus, effective document-clustering techniques need to take into account a user’s categorization context defined by or relevant to the target task under consideration. However, existing document-clustering techniques are generally anchored in pure content-based analysis and therefore cannot facilitate context-aware document clustering. In response, we propose a Context-Aware document-Clustering (CAC) technique that takes into consideration a user’s categorization preference (expressed as a list of anchoring terms) relevant to the context of a target task and subsequently generates a set of document clusters from this specific contextual perspective. Our empirical evaluation results suggest that our proposed CAC technique outperforms the pure content-based document-clustering technique.