147 research outputs found

    Growing a list

    It is easy to find expert knowledge on the Internet on almost any topic, but obtaining a complete overview of a given topic is not always easy: Information can be scattered across many sources and must be aggregated to be useful. We introduce a method for intelligently growing a list of relevant items, starting from a small seed of examples. Our algorithm takes advantage of the wisdom of the crowd, in the sense that there are many experts who post lists of things on the Internet. We use a collection of simple machine learning components to find these experts and aggregate their lists to produce a single complete and meaningful list. We use experiments with gold standards and open-ended experiments without gold standards to show that our method significantly outperforms the state of the art. Our method uses the clustering algorithm Bayesian Sets even when its underlying independence assumption is violated, and we provide a theoretical generalization bound to motivate its use.
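
The scoring step the abstract refers to, Bayesian Sets, reduces to a linear score over binary features under a Beta-Bernoulli model. The sketch below is a minimal, illustrative implementation with symmetric Beta(2, 2) priors; the paper's full pipeline (finding expert lists and aggregating them) is omitted, and items are represented simply as sets of feature indices:

```python
import math

def bayesian_sets_scores(seed, candidates, n_features, alpha=2.0, beta=2.0):
    """Score candidates against a seed set with Bayesian Sets.

    Each item is a set of binary feature indices; alpha/beta are the
    Beta prior hyperparameters. Returns log score(x) = log p(x|seed)/p(x),
    which is linear in the candidate's features.
    """
    N = len(seed)
    # Per-feature counts over the seed set.
    n = [0] * n_features
    for item in seed:
        for j in item:
            n[j] += 1
    # Precompute the constant c and per-feature log-weights q[j].
    c = 0.0
    q = [0.0] * n_features
    for j in range(n_features):
        a_t = alpha + n[j]           # posterior alpha
        b_t = beta + N - n[j]        # posterior beta
        c += math.log(alpha + beta) - math.log(alpha + beta + N)
        c += math.log(b_t) - math.log(beta)
        q[j] = (math.log(a_t) - math.log(alpha)) - (math.log(b_t) - math.log(beta))
    return [c + sum(q[j] for j in item) for item in candidates]
```

Candidates sharing features with the seed score higher, so growing the list is just ranking candidates by this score and taking the top items.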

    The detection of fraud activities on the stock market through forward analysis methodology of financial discussion boards

    Financial discussion boards (FDBs) have been widely used for a variety of financial knowledge exchange activities through the posting of comments. Popular public FDBs are prone to being used as a medium to spread false financial information because they reach a large audience. Although online forums in general are usually integrated with anti-spam tools such as Akismet, moderation of posted content still relies heavily on human moderators. Unfortunately, popular FDBs attract many comments per day, which realistically prevents human moderators from continuously monitoring and moderating possibly fraudulent content; such manual moderation is extremely time-consuming. Moreover, in the absence of useful tools, no relevant authorities are actively monitoring and handling potential financial crimes on FDBs. This paper presents a novel forward analysis methodology implemented in an Information Extraction (IE) prototype system named FDBs Miner (FDBM). The methodology aims to detect potentially illegal comments on FDBs and integrates share prices into the detection process, which helps to categorise the potentially illegal comments into different risk levels for investigation priority. The IE prototype system first extracts the public comments and per-minute share prices for selected companies listed on the London Stock Exchange (LSE). In the forward analysis process, the comments are flagged against a predefined Pump and Dump financial crime keyword template. Flagging against the keyword template alone yields an average of 9.82% potentially illegal comments, and it is unrealistic for human moderators to read all of these comments on a daily basis in the long run. Hence, integrating share price hikes and falls to categorise the flagged comments by risk level saves time and allows the relevant authorities to prioritise and investigate the higher-risk flagged comments, which can potentially indicate real Pump and Dump crimes on FDBs.
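
A rough sketch of this forward-analysis idea, flagging comments against a keyword template and then ranking by price movement, might look as follows. The keywords and the 10% price-movement threshold here are illustrative placeholders, not the paper's actual template or thresholds:

```python
def flag_and_rank(comments, keywords, high_risk_pct=10.0):
    """Flag comments matching a Pump-and-Dump keyword template, then
    assign a risk level from the share price movement around each post.

    comments: list of {"text": str, "price_change_pct": float} dicts,
    where price_change_pct is the price move around the posting time.
    Returns flagged comments, highest risk first.
    """
    flagged = []
    for c in comments:
        text = c["text"].lower()
        if any(kw in text for kw in keywords):
            # A large hike or fall near a flagged comment raises its priority.
            risk = "high" if abs(c["price_change_pct"]) >= high_risk_pct else "low"
            flagged.append({**c, "risk": risk})
    # Stable sort: "high" risk entries come first for investigation.
    return sorted(flagged, key=lambda c: c["risk"] != "high")
```

The point of the two-stage design is that keyword flagging alone over-generates (the 9.82% figure above), while the price signal supplies the prioritisation that makes manual investigation tractable.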

    An Analysis of How Interactive Technology Supports the Appreciation of Traditional Chinese Puppetry: A Review of Case Studies

    From the perspective of safeguarding Chinese Cultural Heritage, this paper discusses how to enhance the appreciation of traditional Chinese puppetry through the support of interactive technology. The author analyses extensive and current case studies of interactive systems for puppetry performance and interactive technology for puppetry appreciation. The author summarises four aspects of how to enhance the appreciation of, and engagement with, traditional Chinese puppetry: (1) maintaining originality is necessary in the design phase; (2) it is crucial to explore how interactive technology can be used to design ways for adults to appreciate this form of art; (3) it is also necessary to determine ways to support adult audiences in grasping the cultural significance and folk customs of traditional Chinese puppetry; and (4) the study's further main research goals are to investigate ways to use emotional expressions, digital storytelling and other methods in conjunction with interactive technology to help multi-cultural users comprehend traditional Chinese puppetry.

    A Methodological Reflection: Deconstructing Cultural Elements for Enhancing Cross-cultural Appreciation of Chinese Intangible Cultural Heritage

    This paper presents a practical method of deconstructing cultural elements, based on a Human Computer Interaction (HCI) perspective, to enhance cross-cultural appreciation of Chinese Intangible Cultural Heritage (ICH). The author pioneered this approach while conducting two case studies as a means to enhance appreciation of, and engagement with, Chinese ICH, including the extraction of elements from traditional Chinese painting and puppetry with the potential to support cross-cultural appreciation, as well as the establishment of an elements archive. By integrating a series of HCI research methods, this approach provides a specific foundational framework that assists non-Chinese people in better understanding the cultural significance of Chinese ICH.

    Full Spectrum Archaeology

    Full Spectrum Archaeology (FSA) is an aspiration stemming from the convergence of archaeology’s fundamental principles with international heritage policies and community preferences. FSA encompasses study and stewardship of the full range of heritage resources in accord with the full range of associated values and through the application of treatments selected from the full range of appropriate options. Late modern states, including British Columbia, Canada, nominally embrace de jure heritage policies consonant with international standards yet also resist de facto heritage management practice grounded in professional ethics and local values and preferences. In response, inheritor communities and their allies in archaeology are demonstrating the benefits of FSA and reclaiming control over cultural heritage. Archaeology and heritage management driven by altruistic articulation of communal, educational, scientific and other values further expose shortcomings and vulnerabilities of late modern states, as well as the public goods in and from FSA.

    Preparation of name and address data for record linkage using hidden Markov models

    BACKGROUND: Record linkage refers to the process of joining records that relate to the same entity or event in one or more data collections. In the absence of a shared, unique key, record linkage involves the comparison of ensembles of partially-identifying, non-unique data items between pairs of records. Data items with variable formats, such as names and addresses, need to be transformed and normalised in order to validly carry out these comparisons. Traditionally, deterministic rule-based data processing systems have been used to carry out this pre-processing, which is commonly referred to as "standardisation". This paper describes an alternative approach to standardisation, using a combination of lexicon-based tokenisation and probabilistic hidden Markov models (HMMs). METHODS: HMMs were trained to standardise typical Australian name and address data drawn from a range of health data collections. The accuracy of the results was compared to that produced by rule-based systems. RESULTS: Training of HMMs was found to be quick and did not require any specialised skills. For addresses, HMMs produced equal or better standardisation accuracy than a widely-used rule-based system. However, accuracy was worse when used with simpler name data. Possible reasons for this poorer performance are discussed. CONCLUSION: Lexicon-based tokenisation and HMMs provide a viable and effort-effective alternative to rule-based systems for pre-processing more complex, variably formatted data such as addresses. Further work is required to improve the performance of this approach with simpler data such as names. Software which implements the methods described in this paper is freely available under an open source license for other researchers to use and improve.
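
The decoding step at the heart of this kind of standardisation can be sketched as lexicon-based tokenisation (mapping raw tokens to observation classes) followed by Viterbi decoding over field labels. The token classes, field states, and probabilities below are toy values for illustration only; a system like the one described estimates them from labelled Australian address data:

```python
import math

# Tiny illustrative lexicon of street-type words.
STREET_TYPES = {"st", "street", "rd", "road", "ave", "avenue", "way"}

def token_class(tok):
    """Lexicon-based tokenisation: map a raw token to an observation class."""
    if tok.isdigit():
        return "number"
    if tok.lower() in STREET_TYPES:
        return "street_type_word"
    return "word"

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely field-label sequence for a sequence of token classes."""
    def log(p):
        return math.log(p) if p > 0 else -1e9  # guard unseen events
    V = [{s: log(start_p.get(s, 0)) + log(emit_p[s].get(obs[0], 0)) for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p] + log(trans_p[p].get(s, 0)))
            V[t][s] = V[t - 1][prev] + log(trans_p[prev].get(s, 0)) + log(emit_p[s].get(obs[t], 0))
            new_path[s] = path[prev] + [s]
        path = new_path
    return path[max(states, key=lambda s: V[-1][s])]

# Toy parameters; a real system trains these from labelled data.
STATES = ["house_no", "street_name", "street_type", "locality"]
START = {"house_no": 0.8, "street_name": 0.1, "street_type": 0.05, "locality": 0.05}
TRANS = {
    "house_no": {"house_no": 0.05, "street_name": 0.95},
    "street_name": {"street_name": 0.35, "street_type": 0.6, "locality": 0.05},
    "street_type": {"street_type": 0.1, "locality": 0.9},
    "locality": {"locality": 1.0},
}
EMIT = {
    "house_no": {"number": 0.95, "word": 0.05},
    "street_name": {"word": 0.9, "street_type_word": 0.05, "number": 0.05},
    "street_type": {"street_type_word": 0.95, "word": 0.05},
    "locality": {"word": 0.9, "number": 0.1},
}
```

Given an address string, each token is reduced to its class and the decoder assigns one field label per token, which is the "standardisation" output the paper compares against rule-based systems.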

    Association of eGFR-Related Loci Identified by GWAS with Incident CKD and ESRD

    Family studies suggest a genetic component to the etiology of chronic kidney disease (CKD) and end-stage renal disease (ESRD). Previously, we identified 16 loci for eGFR in genome-wide association studies, but the associations of these single nucleotide polymorphisms (SNPs) with incident CKD or ESRD are unknown. We thus investigated the association of these loci with incident CKD in 26,308 individuals of European ancestry free of CKD at baseline, drawn from eight population-based cohorts followed for a median of 7.2 years (including 2,122 incident CKD cases defined as eGFR <60 ml/min/1.73 m2 at follow-up), and with ESRD in four case-control studies in subjects of European ancestry (3,775 cases, 4,577 controls). SNPs at 11 of the 16 loci (UMOD, PRKAG2, ANXA9, DAB2, SHROOM3, DACH1, STC1, SLC34A1, ALMS1/NAT8, UBE2Q2, and GCKR) were associated with incident CKD; p-values ranged from p = 4.1e-9 in UMOD to p = 0.03 in GCKR. After adjusting for baseline eGFR, six of these loci remained significantly associated with incident CKD (UMOD, PRKAG2, ANXA9, DAB2, DACH1, and STC1). SNPs in UMOD (OR = 0.92, p = 0.04) and GCKR (OR = 0.93, p = 0.03) were nominally associated with ESRD. In summary, the majority of eGFR-related loci are either associated or show a strong trend towards association with incident CKD, but have modest associations with ESRD in individuals of European descent. Additional work is required to characterize the association of genetic determinants of CKD and ESRD at different stages of disease progression.
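
As a minimal numerical sketch of how per-SNP effect sizes like the UMOD and GCKR odds ratios above are summarised, the function below computes an odds ratio and Wald confidence interval from a 2x2 table of allele counts. This is illustrative only; the analyses in the paper use (adjusted) regression models, not raw 2x2 tables:

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio with a 95% Wald confidence interval from a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    Returns (odds_ratio, (ci_lower, ci_upper)).
    """
    or_ = (a * d) / (b * c)
    # Standard error of log(OR) via the delta method.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, (lo, hi)
```

An OR below 1 (as reported for the UMOD and GCKR risk alleles and ESRD) indicates the allele is associated with lower odds of the outcome in that comparison.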