
    Improving Data on Race and Ethnicity: A Roadmap to Measure and Advance Health Equity

Achieving health equity begins with the ability to identify health disparities and their causes. To do that, we must have complete and accurate data on race, ethnicity, and other drivers of health. For far too long, large percentages of race and ethnicity data have been missing from federal and state health programs, with little progress towards closing the gaps. To identify the barriers and opportunities, Grantmakers In Health, in collaboration with the National Committee for Quality Assurance, interviewed a variety of stakeholders across the country, representing all levels of the health system. The second of two reports, Improving Data on Race and Ethnicity: A Roadmap to Measure and Advance Health Equity builds on an earlier report, Federal Action Is Needed to Improve Race and Ethnicity Data in Health Programs, by providing more detail about race and ethnicity data collection in federally administered health programs and an expanded list of recommendations for improving the data. The recommendations consider actions for states and the private sector as well as for the federal government. Philanthropy has a critical role to play in ensuring that health disparities are acknowledged and addressed, and can work directly with state and federal governments to support the implementation of the actions outlined in this report.

    The Use of Mobility Data for Responding to the COVID-19 Pandemic

As the COVID-19 pandemic continues to upend the way people move, work, and gather, governments, businesses, and public health researchers have looked increasingly to mobility data to support pandemic response. These data assets, which describe human location and movement, have generally been collected for purposes directly related to a company's business model, such as optimizing the delivery of consumer services, managing supply chains, or targeting advertisements. However, call detail records, smartphone mobility data, vehicle-derived GPS, and other mobility data assets can also be used to study patterns of movement, and those patterns have in turn been used by organizations to forecast disease spread and inform decisions on how best to manage activity in certain locations.

Researchers at The GovLab and Cuebiq, supported by the Open Data Institute, identified 51 notable projects from around the globe launched by public sector and research organizations with companies that use mobility data for these purposes, and curated five projects from this listing that highlight the specific opportunities (and risks) of using this asset. Few of the highlighted projects have produced public outputs, which makes assessing project success difficult; even so, the organizations interviewed considered mobility data a useful asset that enabled better public health surveillance, supported existing decision-making processes, or otherwise allowed groups to achieve their research goals.

The report below summarizes some of the major points identified in those case studies. While acknowledging that location data can be a highly sensitive data type that can facilitate surveillance or expose data subjects if used carelessly, it finds that mobility data can support research and inform decisions when applied to narrowly defined research questions through frameworks that acknowledge and proactively mitigate risk. These frameworks can vary based on the individual circumstances facing data users, suppliers, and subjects. However, a few conditions can enable users and suppliers to promote publicly beneficial and responsible data use and overcome the serious obstacles facing them.

For data users (governments and research institutions), functional access to real-time and contextually relevant data can support research goals, even though a lack of data science competencies and of both short- and long-term funding sources represents a major obstacle. Data suppliers (largely companies), meanwhile, need governance structures and mechanisms that facilitate responsible re-use, including data re-use agreements that define who, what, where, when, and under what conditions data can be shared. A lack of regulatory clarity and the absence of universal governance and privacy standards have impeded effective and responsible dissemination of mobility data for research and humanitarian purposes. Finally, for both data users and suppliers, we note that collaborative research networks that allow organizations to seek out and provide data can serve as enablers of project success by facilitating the exchange of methods and resources and closing the gap between research and practice.

Based on these findings, we recommend the development of clear governance and privacy frameworks, increased capacity building around data use within the public sector, and more regular convenings of ecosystem stakeholders (including the public and data subjects) to broaden collaborative networks. We also propose solutions for making the responsible use of mobility data more sustainable for long-term impact beyond the current pandemic. A failure to develop regulatory and governance frameworks that can responsibly manage mobility data could lead to a regression to the ad hoc and uncoordinated approaches that previously defined mobility data applications, and to disparate standards about organizations' responsibilities to the public.
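The "who, what, where, when, and under what conditions" elements of such a data re-use agreement can be captured in machine-readable form so that they are auditable in code. Below is a minimal, hypothetical sketch in Python; the field names and example values are invented for illustration and are not drawn from any existing standard or from the projects surveyed.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical data re-use agreement as a machine-readable record, mirroring
# the who/what/where/when/under-what-conditions elements described above.
# All field names and values are illustrative assumptions.

@dataclass(frozen=True)
class ReuseAgreement:
    who: tuple[str, ...]         # parties authorised to access the data
    what: str                    # the data asset covered by the agreement
    where: tuple[str, ...]       # jurisdictions where processing may occur
    valid_from: date             # start of the agreement's validity window
    valid_until: date            # end of the validity window
    conditions: tuple[str, ...]  # obligations attached to any re-use

agreement = ReuseAgreement(
    who=("public-health-agency", "university-research-group"),
    what="aggregated, anonymised mobility indices (no raw location traces)",
    where=("EU",),
    valid_from=date(2020, 4, 1),
    valid_until=date(2021, 4, 1),
    conditions=(
        "purpose limited to COVID-19 response research",
        "no re-identification attempts",
        "outputs published at district level or coarser",
        "data deleted at the end of the validity window",
    ),
)
```

Encoding the terms this way makes them checkable programmatically (for example, refusing an export once `valid_until` has passed), which is one route to the governance mechanisms described above.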

    Keeping Research Data Safe 2: Final Report

The first Keeping Research Data Safe study, funded by JISC, made a major contribution to the understanding of long-term preservation costs for research data by developing a cost model and identifying cost variables for preserving research data in UK universities (Beagrie et al, 2008). However, it was completed over a very constrained timescale of four months, with little opportunity to follow up other major issues or sources of preservation cost information it identified. It noted that digital preservation costs are notoriously difficult to address, in part because of the absence of good case studies and longitudinal information on digital preservation costs or cost variables. In January 2009, JISC issued an ITT (invitation to tender) for a study on the identification of long-lived digital datasets for the purposes of cost analysis. The aim of this work was to provide a larger body of material and evidence against which existing and future data preservation cost modelling exercises could be tested and validated. The proposal for the KRDS2 study was submitted in response by a consortium consisting of four partners involved in the original Keeping Research Data Safe study (the Universities of Cambridge and Southampton, Charles Beagrie Ltd, and OCLC Research) and four new partners with significant data collections and interests in preservation costs (the Archaeology Data Service, University of London Computer Centre, University of Oxford, and the UK Data Archive). A range of supplementary materials in support of this main report has been made available on the KRDS2 project website at http://www.beagrie.com/jisc.php. That website will be maintained and continuously updated with future work as a resource for KRDS users.

    Have health inequalities changed during childhood in the New Labour generation? Findings from the UK Millennium Cohort Study

Objectives: To examine how population-level socioeconomic health inequalities developed during childhood for children born at the turn of the 21st century, who grew up with major initiatives to tackle health inequalities (under the New Labour Government). Setting: The UK. Participants: Singleton children in the Millennium Cohort Study at ages 3 (n=15 381), 5 (n=15 041), 7 (n=13 681) and 11 (n=13 112) years. Primary outcomes: Relative (prevalence ratios (PR)) and absolute (prevalence differences (PD)) health inequalities were estimated in longitudinal models by socioeconomic circumstances (SEC; using highest maternal academic attainment, ranging from ‘no academic qualifications’ to ‘degree’ (baseline)). Three health outcomes were examined: overweight (including obesity), limiting long-standing illness (LLSI) and socio-emotional difficulties (SED). Results: Relative and absolute inequalities in overweight, across the social gradient, emerged by age 5 and increased with age. By age 11, children with mothers who had no academic qualifications were considerably more likely to be overweight compared with those with degree-educated mothers (PR=1.6 (95% CI 1.4 to 1.8), PD=12.9% (9.1% to 16.8%)). For LLSI, inequalities emerged by age 7 and remained at 11, but only for children whose mothers had no academic qualifications (PR=1.7 (1.3 to 2.3), PD=4.8% (2% to 7.5%)). Inequalities in SED (observed across the social gradient and at all ages) declined between 3 and 11, although remained large at 11 (eg, PR=2.4 (1.9 to 2.9), PD=13.4% (10.2% to 16.7%) comparing children whose mothers had no academic qualifications with those of degree-educated mothers). Conclusions: Although health inequalities have been well documented in cross-sectional and trend data in the UK, it is less clear how they develop during childhood. We found that relative and absolute health inequalities persisted, and in some cases widened, for a cohort of children born at the turn of the century. Further research examining and comparing the pathways through which SECs influence health may further our understanding of how inequalities could be prevented in future generations of children.
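The two inequality measures used above have simple definitions: the prevalence ratio is the outcome prevalence in the exposed group divided by that in the reference group, and the prevalence difference is the gap between the two in percentage points. A minimal sketch follows; the example prevalences are back-calculated so that the outputs match the reported age-11 overweight result (PR=1.6, PD=12.9%) and are not figures taken from the study, which estimates these quantities in longitudinal models.

```python
# Prevalence ratio (PR): relative inequality between an exposed group
# (here, children of mothers with no academic qualifications) and a
# reference group (children of degree-educated mothers).
def prevalence_ratio(p_exposed: float, p_reference: float) -> float:
    return p_exposed / p_reference

# Prevalence difference (PD): absolute inequality, in percentage points.
def prevalence_difference(p_exposed: float, p_reference: float) -> float:
    return p_exposed - p_reference

# Illustrative prevalences only, back-calculated from the reported PR and
# PD for overweight at age 11 (PR ~ 1.6, PD ~ 12.9 points).
p_no_qualifications = 0.344
p_degree = 0.215

print(f"PR = {prevalence_ratio(p_no_qualifications, p_degree):.1f}")  # 1.6
pd_points = 100 * prevalence_difference(p_no_qualifications, p_degree)
print(f"PD = {pd_points:.1f} percentage points")                      # 12.9
```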

    Establishing a colorectal cancer research database from routinely collected health data: the process and potential from a pilot study.

OBJECTIVE: Colorectal cancer is a common cause of death and morbidity. A significant amount of data is routinely collected during patient treatment, but it is not generally available for research. The National Institute for Health Research Health Informatics Collaborative in the UK is developing infrastructure to enable routinely collected data to be used for collaborative, cross-centre research. This paper presents an overview of the process for collating colorectal cancer data and explores the potential of using this data source. METHODS: Clinical data were collected from three pilot Trusts, standardised and collated. Not all data were collected in a readily extractable format for research, so natural language processing (NLP) was used to extract relevant information from pseudonymised imaging and histopathology reports. Combining data from many sources allowed reconstruction of a longitudinal history for each patient that could be presented graphically. RESULTS: Three pilot Trusts submitted data, covering 12 903 patients with a diagnosis of colorectal cancer since 2012, with NLP implemented for 4150 patients. Timelines showing individual patient longitudinal histories can be grouped into common treatment patterns, visually presenting clusters and outliers for analysis. Difficulties and gaps in data sources have been identified and addressed. DISCUSSION: Algorithms for analysing routinely collected data from a wide range of sites and sources have been developed and refined to provide a rich data set that will be used to better understand the natural history, treatment variation and optimal management of colorectal cancer. CONCLUSION: The data set has great potential to facilitate research into colorectal cancer.
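To make the NLP step concrete, here is a minimal, hypothetical sketch of the kind of rule-based extraction that can structure a pseudonymised histopathology report; the report text, patterns and field names are invented for illustration and do not represent the Collaborative's actual pipeline.

```python
import re

# Hypothetical rule-based extraction of TNM staging and nodal counts from a
# pseudonymised histopathology report. Patterns and fields are illustrative.

REPORT = """Specimen: sigmoid colectomy.
Moderately differentiated adenocarcinoma, pT3 N1 M0.
3 of 15 lymph nodes involved. Resection margins clear."""

PATTERNS = {
    "t_stage": re.compile(r"\bp?T([0-4][ab]?)\b"),
    "n_stage": re.compile(r"\bN([0-3][a-c]?)\b"),
    "m_stage": re.compile(r"\bM([01])\b"),
    "nodes":   re.compile(r"(\d+)\s+of\s+(\d+)\s+lymph nodes"),
}

def extract(report: str) -> dict:
    """Pull staging fields out of free text into a structured record."""
    record = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(report)
        if match is None:
            continue
        if field == "nodes":
            record[field] = {"positive": int(match.group(1)),
                             "examined": int(match.group(2))}
        else:
            record[field] = match.group(1)
    return record

print(extract(REPORT))
# {'t_stage': '3', 'n_stage': '1', 'm_stage': '0',
#  'nodes': {'positive': 3, 'examined': 15}}
```

In practice, report wording varies far more than this, which is why difficulties and gaps in the data sources had to be identified and addressed iteratively.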

    The Science of Detecting LLM-Generated Texts

The emergence of large language models (LLMs) has resulted in the production of LLM-generated texts that are highly sophisticated and almost indistinguishable from texts written by humans. However, this has also sparked concerns about the potential misuse of such texts, such as spreading misinformation and causing disruptions in the education system. Although many detection approaches have been proposed, a comprehensive understanding of the achievements and challenges is still lacking. This survey aims to provide an overview of existing LLM-generated text detection techniques and to enhance the control and regulation of language generation models. Furthermore, we emphasize crucial considerations for future research, including the development of comprehensive evaluation metrics and the threat posed by open-source LLMs, to drive progress in the area of LLM-generated text detection.
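One widely studied family of detection techniques scores a text by how predictable it is under a reference language model, since LLM-generated text tends to have lower perplexity than human-written text. A minimal sketch of this idea follows; GPT-2 as the scoring model and the fixed threshold are illustrative assumptions, not a recommended detector, and real systems calibrate thresholds on labelled data.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Perplexity-based detection sketch: machine-generated text is often more
# predictable (lower perplexity) under a reference LM than human text.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Exponentiated mean next-token cross-entropy of `text` under the LM."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels == input_ids, the model returns the mean
        # cross-entropy of its next-token predictions.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

THRESHOLD = 40.0  # illustrative placeholder; calibrate on labelled data

def looks_machine_generated(text: str) -> bool:
    return perplexity(text) < THRESHOLD

print(looks_machine_generated("The quick brown fox jumps over the lazy dog."))
```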

High-Fidelity Provenance: Exploring the Intersection of Provenance and Security

In the past 25 years, the World Wide Web has disrupted the way news is disseminated and consumed. However, the euphoria over the democratization of news publishing was soon followed by scepticism, as a new phenomenon emerged: fake news. With no gatekeepers to vouch for it, the veracity of information served over the World Wide Web became a major public concern. The Reuters Digital News Report 2020 cites that in at least half of the EU member countries, 50% or more of the population is concerned about online fake news. To help address the problem of trust in information communicated over the World Wide Web, it has been proposed to also make available the provenance metadata of that information. Similar to artwork provenance, this would include a detailed track of how the information was created, updated and propagated to produce the result we read, as well as which agents, human or software, were involved in the process. However, keeping track of provenance information is a non-trivial task. Current approaches are often of limited scope and may require modifying existing applications to generate provenance information alongside their regular output. This thesis explores how provenance can be automatically tracked in an application-agnostic manner, without having to modify the individual applications. We frame provenance capture as a data flow analysis problem and explore the use of dynamic taint analysis in this context. Our work shows that this approach improves on the quality of provenance captured compared to traditional approaches, yielding what we term high-fidelity provenance. We explore the performance cost of this approach and use deterministic record and replay to bring it down to a more practical level. Furthermore, we create and present the tooling necessary for expanding the use of deterministic record and replay for provenance analysis. The thesis concludes with an application of high-fidelity provenance as a tool for state-of-the-art offensive security analysis, based on the intuition that software, too, can be misguided by "fake news". This demonstrates that the potential uses of high-fidelity provenance for security extend beyond traditional forensic analysis.
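The dynamic taint analysis idea at the heart of the thesis can be illustrated with a toy example: values are tagged with the sources they came from, and the tags propagate through every operation, so any output can be traced back to its origins. The sketch below is a language-level analogy in Python, not the thesis's implementation, which works application-agnostically at a lower level.

```python
# Toy taint tracking for provenance: each value carries the set of sources
# it derives from, and the set propagates through operations.

class Tainted(str):
    """A string that remembers which sources it was derived from."""

    def __new__(cls, value: str, sources: frozenset):
        obj = super().__new__(cls, value)
        obj.sources = sources
        return obj

    def __add__(self, other):
        # Propagation rule: the result derives from both operands' sources.
        other_sources = getattr(other, "sources", frozenset())
        return Tainted(str(self) + str(other), self.sources | other_sources)

def read_source(name: str, content: str) -> Tainted:
    """Taint introduction: mark data where it enters the system."""
    return Tainted(content, frozenset({name}))

headline = read_source("wire-feed", "Quake hits coast. ")
comment = read_source("user-edit", "Casualties unconfirmed.")
article = headline + comment

print(article)          # Quake hits coast. Casualties unconfirmed.
print(article.sources)  # frozenset({'wire-feed', 'user-edit'})
```

High-fidelity provenance applies the same principle below the application layer, so the propagation does not depend on each program cooperating.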

    A Holistic Framework for Complex Big Data Governance

Big data assets are large datasets that can be leveraged by organisations if the capabilities exist, but they also bring considerable challenges. Despite the benefits that can be realised, the lack of proper big data governance is a major barrier, making the processing and control of this data exceptionally difficult to execute correctly. More specifically, organisations reportedly struggle to incorporate big data governance into their existing structures and business models so as to derive value from big data initiatives. Big data governance is an emerging research domain, gaining attention from both Information Systems scholars and the practitioner community. Nonetheless, there has been limited scientific research in the area, and most existing data governance approaches are limited in that they do not address end-to-end aspects of how big data could be governed. Furthermore, no suitable framework was found for handling data governance in the face of big data complexities. Thus, the main contribution of the work presented in this thesis is to address this requirement by advancing research in this field and designing a novel holistic big data governance framework capable of supporting global organisations in effectively managing big data as an asset, thereby obtaining value from their big data initiatives. An extensive systematic literature review was conducted to uncover the published content that reflects the current state of knowledge in big data governance. To facilitate the creation of the proposed framework, a grounded theory methodology was used to analyse openly available parliamentary inquiry data sources, with a particular focus on identifying the core criteria for big data governance. The resulting novel framework provides new knowledge by identifying several big data governance building blocks, classified as strategic goals, execution stages, enablers and 22 core big data governance components, to ensure an effective big data governance programme. Moreover, the thesis findings indicate that big data complexities extend to the ethical side of big data governance, and this is taken into consideration in the framework design: an ‘ethics by design’ component is proposed to influence how ethics can be addressed in a structured way.
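As a reading aid only, the building-block structure described above can be pictured as a simple data model; the example entries below are invented placeholders (except 'ethics by design', which the thesis names) and do not reproduce the framework's actual 22 components.

```python
from dataclasses import dataclass

# Hypothetical data model of the framework's building blocks: strategic
# goals, execution stages, enablers and core components. Entries are
# placeholders, not the thesis's actual lists.

@dataclass
class BigDataGovernanceFramework:
    strategic_goals: list[str]
    execution_stages: list[str]
    enablers: list[str]
    components: list[str]  # the thesis identifies 22 core components

framework = BigDataGovernanceFramework(
    strategic_goals=["treat big data as a governed organisational asset"],
    execution_stages=["plan", "execute", "monitor"],
    enablers=["executive sponsorship", "data literacy"],
    components=["data quality", "metadata management", "ethics by design"],
)
```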