Search CORE

688,065 research outputs found

DTD level authorization in XML documents with usage control

Author: Li Yan
Sun Lili
Publication venue: International Journal of Computer Science and Network Security
Publication date: 25/11/2006
Field of study

[Summary]: In recent years an increasing amount of semi-structured data has become important to humans and programs. XML promoted by the World Wide Web Consortium (W3C) is rapidly emerging as the new standard language for semi-structured data representation and exchange on the Internet. XML documents may contain private information that cannot be shared by all user communities. So securing XML data is becoming increasingly important and several approaches have been designed to protect information in a website. However, these approaches typically are used at file system level, rather than for the data in XML documents. Usage control has been considered as the next generation access control model with distinguishing properties of decision continuity. Usage control enables finer-grained control over usage of digital objects than that of traditional access control policies and models. In this paper, we present a usage control model to protect information distributed on the web, which allows the access restrictions directly at DTD-level and XML document-level. Finally, comparisons with related works are analysed

University of Southern Queensland ePrints

Semi-parametric regression: Efficiency gains from modeling the nonparametric part

Author: Mammen Enno
Park Byeong U.
Yu Kyusang
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 22/04/2011
Field of study

It is widely admitted that structured nonparametric modeling that circumvents the curse of dimensionality is important in nonparametric estimation. In this paper we show that the same holds for semi-parametric estimation. We argue that estimation of the parametric component of a semi-parametric model can be improved essentially when more structure is put into the nonparametric part of the model. We illustrate this for the partially linear model, and investigate efficiency gains when the nonparametric part of the model has an additive structure. We present the semi-parametric Fisher information bound for estimating the parametric part of the partially linear additive model and provide semi-parametric efficient estimators for which we use a smooth backfitting technique to deal with the additive nonparametric part. We also present the finite sample performances of the proposed estimators and analyze Boston housing data as an illustration.Comment: Published in at http://dx.doi.org/10.3150/10-BEJ296 the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

arXiv.org e-Print Archive

Crossref

A performance of comparative study for semi-structured web data extraction model

Author: Man Mustafa
Sabri Ily Amalina Ahmad
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/12/2019
Field of study

The extraction of information from multi-sources of web is an essential yet complicated step for data analysis in multiple domains. In this paper, we present a data extraction model based on visual segmentation, DOM tree and JSON approach which is known as Wrapper Extraction of Image using DOM and JSON (WEIDJ) for extracting semi-structured data from biodiversity web. The large number of information from multiple sources of web which is image’s information will be extracted using three different approach; Document Object Model (DOM), Wrapper image using Hybrid DOM and JSON (WHDJ) and Wrapper Extraction of Image using DOM and JSON (WEIDJ). Experiments were conducted on several biodiversity website. The experiment results show that WEIDJ approach promising results with respect to time analysis values. WEIDJ wrapper has successfully extracted greater than 100 images of data from the multi-source web biodiversity of over 15 different websites

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

The Xeros data model: tracking interpretations of archaeological finds

Author: Costanza Enrico
Earl Graeme
Frankland Tom
Jewell Michael O.
Moreau Luc
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012
Field of study

At an archaeological dig, interpretations are built around discovered artifacts based on measurements and informed intuition. These interpretations are semi-structured and organic, yet existing tools do not capture their creation or evolution. Patina of Notes (PoN) is an application designed to tackle this, and is underpinned by the Xeros data model. Xeros is a graph structure and a set of operations that can deal with the addition, edition, and removal of interpretations. This data model is a specialisation of the W3C PROV provenance data model, tracking the evolution of interpretations. The model is presented, with operations defined formally, and characteristics of the representation that are beneficial to implementations are discussed

Southampton (e-Prints Soton)

Crossref

UCL Discovery

King's Research Portal

Multimodal Machine Learning for Automated ICD Coding

Author: Band Charlotte
Gao Xin
Lam Mike
MD Ashish K. Khanna
MD Frank Papay
MD Jacek B. Cywinski
MD Kamal Maheshwari
MD Piyush Mathur
Pang Jingzhi
Xie Pengtao
Xing Eric
Xu Keyang
Publication venue
Publication date: 06/08/2019
Field of study

This study presents a multimodal machine learning model to predict ICD-10 diagnostic codes. We developed separate machine learning models that can handle data from different modalities, including unstructured text, semi-structured text and structured tabular data. We further employed an ensemble method to integrate all modality-specific models to generate ICD-10 codes. Key evidence was also extracted to make our prediction more convincing and explainable. We used the Medical Information Mart for Intensive Care III (MIMIC -III) dataset to validate our approach. For ICD code prediction, our best-performing model (micro-F1 = 0.7633, micro-AUC = 0.9541) significantly outperforms other baseline models including TF-IDF (micro-F1 = 0.6721, micro-AUC = 0.7879) and Text-CNN model (micro-F1 = 0.6569, micro-AUC = 0.9235). For interpretability, our approach achieves a Jaccard Similarity Coefficient (JSC) of 0.1806 on text data and 0.3105 on tabular data, where well-trained physicians achieve 0.2780 and 0.5002 respectively.Comment: Machine Learning for Healthcare 201

arXiv.org e-Print Archive

Why Can’t Tyrone Write: Reconceptualizing Flower and Hayes for African-American Adolescent Male Writers

Author: Stormer Kimberly J
Publication venue: UVM ScholarWorks
Publication date: 30/12/2017
Field of study

Using qualitative methods and a case study design, the perceptions and writing processes of three African-American eighth grade males were explored. Data were derived from semi-structured and informal interviews; and document analysis. The study concluded that the perceptions of the three participants’ writing processes did not adhere to the steps depicted by the cognitive process model of writing (Flower and Hayes, 1981) that has become a dominant model for describing the composing processes of students. Recommendations are made for altering the Flower and Hayes model to depict how these three, African-American eighth graders perceive school writing

UVM ScholarWorks