14 research outputs found

    ORSP Research Newsletter - Fall 2011

    Get PDF

    ORSP Research Newsletter - Fall 2010

    Get PDF

    Mining Reaction and Diffusion Dynamics in Social Activities

    Full text link
Large quantities of online user activity data, such as weekly web search volumes, which co-evolve through the mutual influence of multiple queries and locations, serve as an important social sensor. Accurately forecasting future activity by discovering latent interactions in such data, i.e., the ecosystems among queries and the flow of influence between areas, is an important task. However, this is a difficult problem in terms of both data quantity and the complex patterns governing the dynamics. To tackle the problem, we propose FluxCube, an effective mining method that forecasts large collections of co-evolving online user activity while providing good interpretability. Our model extends a combination of two mathematical models: a reaction-diffusion system provides a framework for modeling the flow of influence between local area groups, and an ecological system models the latent interactions among queries. By leveraging the concept of physics-informed neural networks, FluxCube achieves both high interpretability, obtained from its parameters, and high forecasting performance. Extensive experiments on real datasets showed that FluxCube outperforms comparable models in forecasting accuracy and that each component of FluxCube contributes to the improved performance. We also present case studies showing that FluxCube can extract useful latent interactions between queries and area groups. Comment: Accepted by CIKM 202
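The abstract combines a reaction-diffusion system (influence flowing between area groups) with an ecological system (interactions among queries). A minimal toy sketch of that combination, not the authors' FluxCube, might couple a discrete Lotka-Volterra reaction step with graph diffusion; all names and parameter values here are illustrative assumptions:

```python
import numpy as np

def toy_reaction_diffusion_step(X, r, A, W, d, dt=0.01):
    """One forecasting step for activity X of shape (queries, areas).

    Reaction: per-area Lotka-Volterra dynamics among queries
    (illustrative stand-in for the ecological component).
    Diffusion: graph-Laplacian flow of each query's activity across
    areas (stand-in for the reaction-diffusion component).
    """
    reaction = X * (r[:, None] + A @ X)     # growth + query interactions
    laplacian = np.diag(W.sum(axis=1)) - W  # area adjacency -> Laplacian
    diffusion = -d * (X @ laplacian)        # flow between area groups
    return X + dt * (reaction + diffusion)

rng = np.random.default_rng(0)
X = rng.random((3, 4))            # 3 queries x 4 area groups
r = np.array([0.5, 0.3, 0.4])     # intrinsic growth rates
A = -0.1 * np.ones((3, 3))        # mild competition among queries
W = np.ones((4, 4)) - np.eye(4)   # fully connected area groups
X_next = toy_reaction_diffusion_step(X, r, A, W, d=0.2)
print(X_next.shape)
```

The point of the sketch is only the structure: the reaction term acts within each area, the diffusion term acts across areas, and FluxCube's contribution (per the abstract) is fitting such coupled dynamics with a physics-informed neural network.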

    An Exploratory Empirical Assessment of Italian Open Government Data Quality With an eye to enabling linked open data

    Get PDF
Context: The diffusion of Linked Data and Open Data has proceeded at a very fast pace in recent years. However, evidence from practitioners shows that disclosing data without proper quality control may jeopardize dataset reuse in terms of apps, linking, and other transformations. Objective: Our goals are to understand the practical problems open data users experience when using and integrating such data, and to build a set of concrete metrics to assess the quality of disclosed data and better support the transition towards linked open data. Method: We focus on Open Government Data (OGD), collecting problems experienced by developers and mapping them to a data quality model available in the literature. We then derive a set of metrics and apply them to evaluate a few samples of Italian OGD. Result: We present empirical evidence concerning the common quality problems open data users experience when using and integrating datasets. The measurement effort revealed a few established good practices, common weaknesses, and a set of discriminant factors among datasets. Conclusion: The study represents the first empirical attempt to evaluate the quality of open datasets at an operational level. Our long-term goal is to support the transition towards Linked Open Government Data (LOGD) with a quality improvement process in the wake of current practices in Software Quality.
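The abstract does not list the derived metrics. As a hedged illustration only, a typical operational data-quality metric such as completeness (the share of non-empty cells) could be computed over a published CSV dataset like this; the function and sample data are inventions, not the study's metrics:

```python
import csv
import io

def completeness(csv_text: str) -> float:
    """Fraction of non-empty data cells in a CSV sample.

    A generic data-quality metric, not necessarily one of the
    metrics derived in the study.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    cells = [c for row in rows[1:] for c in row]  # skip header row
    if not cells:
        return 0.0
    filled = sum(1 for c in cells if c.strip() != "")
    return filled / len(cells)

sample = "city,population\nRome,2873000\nMilan,\n"
print(round(completeness(sample), 2))  # 0.75
```

A battery of such measures (completeness, syntactic validity, link resolvability, etc.) is the kind of operational assessment the abstract describes.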

    ORSP Research Newsletter - Fall 2012

    Get PDF

    GI Systems for public health with an ontology based approach

    Get PDF
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies. Health is an indispensable attribute of human life, and applying technology to health is an emerging concern in several applied fields; computer science and (geographic) information systems are among the interdisciplinary fields that motivate this thesis. The inspiring idea of the study originates from a rhetorical disease, DbHd (Database Hugging Disorder), defined by Hans Rosling in a World Bank Open Data speech in May 2010. A cure for this disease can be offered in the form of linked open data, which includes ontologies for health science, diseases, genes, drugs, GEO species, etc. Linked Open Data (LOD) enables the systematic application of information by publishing and connecting structured data on the Web. In this study we aim to reduce the boundaries between the semantic web and the geo web. For this purpose we study use-case data from CSISP Valencia (Research Center of Public Health), in which mortality rates for particular diseases are represented spatio-temporally. The use-case data is divided into three conceptual domains (health, spatial, statistical) and enhanced with semantic relations and descriptions following the Linked Data principles. Finally, in order to convey complex health-related information, we propose an infrastructure integrating the geo web and the semantic web. Based on the established outcome, user access methods are introduced and future research is outlined.
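Following the Linked Data principles, an observation bridging the three conceptual domains would be published as RDF triples. A stdlib-only sketch of the idea, in which every URI and property name is an invented placeholder rather than the thesis's actual vocabulary:

```python
# Hypothetical mortality observation linking the health, spatial and
# statistical domains; all URIs below are invented placeholders.
EX = "http://example.org/"

triples = [
    (EX + "obs/1", EX + "disease", EX + "health/ischaemicHeartDisease"),
    (EX + "obs/1", EX + "region", EX + "spatial/valencia"),
    (EX + "obs/1", EX + "year", '"2010"'),
    (EX + "obs/1", EX + "mortalityRate", '"123.4"'),
]

def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines.

    Objects starting with '"' are treated as literals, all others as URIs.
    """
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

print(to_ntriples(triples))
```

Connecting such statements to existing health and geographic ontologies is what lets the observation participate in the wider LOD cloud.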

    WHISPER – service integrated incident management system

    Get PDF
ABSTRACT: This paper presents a cohesive summary of existing emergency response systems. We investigate and integrate principles, theories, and practices from four diverse yet related fields of knowledge with respect to the information representation and decision support capabilities required for emergency planning and response (EPR) systems. This enables cooperation between constituent agencies (e.g., fire, police, and medical) and surrounding municipalities, which operate with assorted decision support protocols, system architectures, networking strategies, and different levels of data security needs. Based upon our investigation, we have built a service architectural framework for providing and disseminating an integrated platform of knowledge capable of serving as an intelligent interconnect between distributed EPR systems. Such a framework can support affordable integration for municipalities of all sizes, in particular smaller municipalities that often cannot afford costly off-the-shelf software solutions built on proprietary logic and requiring extensive customization and support costs. We also present a prototype web-service-based implementation and summarize the limitations of such an approach. Index: emergency response system, emergency planning and response, emergency management, decision support, web service
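Interconnecting heterogeneous agency systems through web services typically relies on a shared, serializable incident representation. A minimal sketch of such an interchange record, where the class and field names are assumptions for illustration and not the WHISPER schema:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class IncidentReport:
    """A hypothetical cross-agency incident record; the actual WHISPER
    message format is not specified in this summary."""
    incident_id: str
    agency: str      # e.g. "fire", "police", "medical"
    location: str
    severity: int    # 1 (minor) .. 5 (critical)

def to_wire(report: IncidentReport) -> str:
    """Serialize the record for exchange between distributed EPR systems."""
    return json.dumps(asdict(report), sort_keys=True)

msg = to_wire(IncidentReport("inc-001", "fire", "Main St 12", 3))
print(msg)
```

A neutral wire format like this is what lets agencies with different internal architectures cooperate without sharing proprietary logic.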

    Transforming Web Data Into Knowledge - Implications for Management

    Get PDF
Much of one’s online behavior, including browsing, shopping, and posting, is recorded daily in databases on companies’ computers. Such data sets are referred to as web data. Within them are stored patterns that indicate an individual’s interests, habits, preferences, and behaviors. More useful than any individual indicator is a company’s ability to record data on all its users and gain insight into their collective habits and tendencies. Detecting and interpreting such patterns can help managers make informed decisions and serve their customers better. Applying data mining to web data is said to turn it into web knowledge. The research study presented in this paper demonstrates how data mining methods and models can be applied to web-based forms of data, on the one hand, and what the implications of uncovering patterns in web content, structure, and usage are for management, on the other.
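As a hedged toy illustration of web usage mining, not the specific methods used in the study, frequent navigation patterns can be extracted from clickstream logs by counting page-to-page transitions; the session data is invented:

```python
from collections import Counter

# Invented clickstream: one browsing session per user.
sessions = [
    ["home", "product", "cart", "checkout"],
    ["search", "product", "cart"],
    ["home", "product", "cart"],
]

# Count consecutive page pairs across all sessions.
transitions = Counter(
    (a, b)
    for session in sessions
    for a, b in zip(session, session[1:])
)

# The most common transition hints at a dominant navigation habit.
print(transitions.most_common(1))  # [(('product', 'cart'), 3)]
```

Aggregated over all users, such transition counts are one of the simplest "web knowledge" artifacts a manager can act on, e.g. by streamlining the most-traveled path.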

    Gesetz zur Bestimmung des Wortschatzumfangs von Texten. Das Heaps'sche Gesetz und die Bestimmung der Wortschatzgröße in kroatischen Texten

    Get PDF
    The existing formula Vr(n) = Kn^β of Heaps' Law for the size of a text's vocabulary is not universal, so the law needs to be redefined in order to be usable for analyzing corpora in different languages. The analysis of a corpus of texts in the Croatian language confirms the hypothesis that the number of functional items (F) in a text is constant and amounts to 21% of the text size n (English texts contain 26% functional items). The author shows that the percentage of functional items in a text can be used as the value of the parameter K, and that K is a constant for every language corpus. Empirical research confirms the author's thesis that the number of functional items in a text can be calculated by the formula F = nK/100, and that the value of the most frequent item (MF) follows the formula MF = n(K/100)². The value of the other parameter of Heaps' Law can also be accurately determined: β = log K/100. The author therefore proposes a new form of the text vocabulary size law: Vr(n) = (Kn)^β. The number of words appearing only once in the text (HL) can be calculated according to the formula HL = ((Kn)/2)^β. Research confirms a very high correlation between the calculated and real values of the vocabulary size, and likewise between the real and calculated counts of single-occurrence words in the text. Interpreted and defined in this way, the law of text vocabulary size enables the calculation of a text's vocabulary size in any language, provided the percentage of functional words, which is constant for that language, is known. Beyond the vocabulary size itself, this interpretation of the law also yields the number of functional items in a text, the frequency of the most frequent word, and the number of single-occurrence items comprising the text's vocabulary.
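The abstract's formulas can be applied directly. A sketch that evaluates them as stated, assuming a base-10 logarithm for β = log K/100 (the abstract does not give the base) and the Croatian value K = 21:

```python
import math

def heaps_quantities(n: int, K: float):
    """Quantities from the redefined Heaps' Law as stated in the abstract.

    n - text size in tokens
    K - percentage of functional items (21 for Croatian, 26 for English)
    The logarithm base for beta is assumed here to be 10.
    """
    F = n * K / 100             # number of functional items: F = nK/100
    MF = n * (K / 100) ** 2     # most frequent item: MF = n(K/100)^2
    beta = math.log10(K / 100)  # second parameter: beta = log K/100
    Vr = (K * n) ** beta        # vocabulary size: Vr(n) = (Kn)^beta
    HL = ((K * n) / 2) ** beta  # words occurring only once: HL = ((Kn)/2)^beta
    return F, MF, beta, Vr, HL

F, MF, beta, Vr, HL = heaps_quantities(100_000, 21)
print(F)  # 21000.0
```

For a 100,000-token Croatian text this gives F = 21,000 functional items and an MF of about 4,410 occurrences for the most frequent item, per the abstract's formulas.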