XML content warehousing: Improving sociological studies of mailing lists and web data
In this paper, we present the guidelines for an XML-based approach for the
sociological study of Web data such as the analysis of mailing lists or
databases available online. The use of an XML warehouse is a flexible solution
for storing and processing this kind of data. We propose an implemented
solution and show possible applications with our case study of profiles of
experts involved in W3C standard-setting activity. We illustrate the
sociological use of semi-structured databases by presenting our XML Schema for
mailing-list warehousing. An XML Schema allows data sources to be added or
combined without modifying existing data sets, while still allowing the
structure to evolve. We also show that the existence of hidden data implies
increased complexity for traditional SQL users. XML content warehousing allows
both exhaustive warehousing and recursive queries over content, with
far less dependence on the initial storage. Finally, we present the possibility
of exporting the data stored in the warehouse to commonly used advanced
software devoted to sociological analysis.
Identification of delivery models for the provision of predictive genetic testing in Europe: protocol for a multicentre qualitative study and a systematic review of the literature
Introduction: The appropriate application of genomic technologies in healthcare is surrounded by many concerns. In particular, there is a lack of evidence on what constitutes an optimal genetic service delivery model, which depends on the type of genetic test and healthcare context considered. The present project aims to identify, classify, and evaluate delivery models for the provision of predictive genetic testing in Europe and in selected Anglophone extra-European countries (the USA, Canada, Australia, and New Zealand). It also sets out to survey the European public health community’s readiness to incorporate public health genomics into their practice.
Materials and equipment: The project consists of (i) a systematic review of published literature and selected country websites, (ii) structured interviews with health experts on the genetic service delivery models in their respective countries, and (iii) a survey of European Public Health Association (EUPHA) members’ knowledge and attitudes toward genomics applications in clinical practice. The inclusion criteria for the systematic review are that articles be published in the period 2000–2015; be in English or Italian; and be from European countries or from Canada, the USA, Australia, or New Zealand. Additional policy documents will be retrieved from represented countries’ government-affiliated websites. The results of the research will be disseminated through the EUPHA network, the Italian Network for Genomics in Public Health (GENISAP), and seminars and workshops.
Expected impact of the study on public health: The transfer of genomic technologies from research to clinical application is influenced not only by several factors inherent to research goals and delivery of healthcare but also by external and commercial interests that may cause the premature introduction of genetic tests in the public and private sectors. Furthermore, current genetic services are delivered without a standardized set of process and outcome measures, which makes the evaluation of healthcare services difficult. The present study will identify and classify delivery models and, subsequently, establish which are appropriate for the provision of predictive genetic testing in Europe by comparing sets of process and outcome measures. In this way, the study will provide a basis for future recommendations to decision makers involved in the financing, delivery, and consumption of genetic services.
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
Design and Development of a User Specific Dynamic E-Magazine
With the Internet and electronic media gaining popularity due to their ease
and speed, the number of Internet users has increased tremendously. The world
is moving faster each day, with several events taking place at once, and the
Internet is flooded with information in every field. This information ranges
from highly relevant to a given user to less relevant or entirely irrelevant.
In such a scenario, retrieving the information most relevant to the user is
indispensable for saving time. Our solution is motivated by the idea of
automatically optimizing the search for information. The information is
delivered to the user through an interactive GUI. The content served to the
user is optimized based on the user's social networking profiles and on the
user's reading habits within the proposed solution. The aim is to obtain the
user's profile information from a social networking profile, on the assumption
that almost every Internet user has one. This helps us personalize the content
delivered to the user, producing what is most relevant in the form of a
personalized e-magazine. Further, the proposed solution learns the user's
reading habits, for example the news items the user saves or clicks on most
often, and uses them to decide which content to provide.
Comment: 19 pages, 6 figures
Curating E-Mails; A life-cycle approach to the management and preservation of e-mail messages
E-mail forms the backbone of communications in many modern institutions and organisations and is a valuable type of organisational, cultural, and historical record. Successful management and preservation of valuable e-mail messages and collections is therefore vital if organisational accountability is to be achieved and historical or cultural memory retained for the future. This requires attention by all stakeholders across the entire life-cycle of the e-mail records.
This instalment of the Digital Curation Manual reports on the several issues involved in managing and curating e-mail messages for both current and future use. Although there is no 'one-size-fits-all' solution, this instalment outlines a generic framework for e-mail curation and preservation, provides a summary of current approaches, and addresses the technical, organisational and cultural challenges to successful e-mail management and longer-term curation.
Automatic detection of change in address blocks for reply forms processing
In this paper, we present an automatic method for detecting on-line erasures, scribbles, corrections, and over-writing in the address block of various types of subscription and utility payment forms. The proposed approach employs bottom-up segmentation of the address block. Heuristic rules based on structural features are used to automate the detection process. The algorithm is applied to a large dataset of 5,780 real-world document forms scanned at a resolution of 200 dots per inch. The proposed algorithm achieves an average processing time of 108 milliseconds per document and a detection accuracy of 98.96%.
Links between the personalities, styles and performance in computer programming
There are recurring patterns in the strategies programmers use to manipulate
source code. For example, modifying source code before acquiring knowledge of
how the code works is a depth-first style, whereas reading and understanding
the code before modifying it is a breadth-first style. To the best of our
knowledge, no study has examined the influence of personality on these styles.
The objective of this study is to understand the influence of personality on
programming styles. We conducted a correlational study with 65 programmers at
the University of Stuttgart. Academic achievement, programming experience,
attitude towards programming, and five personality factors were measured via a
self-assessment survey. The programming styles were either elicited in the
survey or mined from the software repositories. Performance in programming was
composed of the programmers' bug-proneness, which was mined from software
repositories, the grades they received in a software project course, and their
estimate of their own programming ability. Our statistical analysis found that
Openness to Experience has a positive association with the breadth-first style
and Conscientiousness has a positive association with the depth-first style.
We also found that, in addition to more programming experience and better
academic achievement, working depth-first and saving coarse-grained revisions
improve performance in programming.
Comment: 27 pages, 6 figures