How Privacy Regulations Could Better Tackle Challenges Stemming from Combining Data Sets
Modern information and communication technology practices present novel
threats to privacy. This paper focuses on some shortcomings in current privacy
and data protection regulations' ability to adequately address the
ramifications of some AI-driven data processing practices, in particular where
data sets are combined and processed by AI systems. We draw attention to two
regulatory anomalies related to two fundamental assumptions underlying
traditional privacy and data protection approaches: (1) Only personally
identifiable information (PII)/personal data require privacy protection:
Privacy and data protection regulations are only triggered with respect to
PII/personal data, but not anonymous data. This is not only problematic because
determining whether data falls in the former or latter category is no longer
straightforward, but also because privacy risks associated with data processing
may exist whether or not an individual can be identified. (2) Given sufficient
information provided in a transparent and understandable manner, individuals
are able to adequately assess the privacy implications of their actions and
protect their privacy interests: We show that this assumption corresponds to
the current societal consensus on privacy protection. However, relying on human
privacy expectations fails to address some important privacy threats, because
those expectations are increasingly at odds with the actual privacy
implications of data processing practices, as most people lack the necessary
technical literacy to understand the sophisticated technologies at play, not to
mention correctly assess their privacy implications.
Comment: 18 pages
On the Communication of Scientific Results: The Full-Metadata Format
In this paper, we introduce a scientific format for text-based data files,
which facilitates storing and communicating tabular data sets. The so-called
Full-Metadata Format builds on the widely used INI-standard and is based on
four principles: readable self-documentation, flexible structure, fail-safe
compatibility, and searchability. As a consequence, all metadata required to
interpret the tabular data are stored in the same file, allowing for the
automated generation of publication-ready tables and graphs and the semantic
searchability of data file collections. The Full-Metadata Format is introduced
on the basis of three comprehensive examples. The complete format and syntax are
given in the appendix.
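To illustrate the idea of keeping metadata and tabular data in one INI-style text file, here is a minimal Python sketch. The section names (`reference`, `columns`, `data`), the sample content, and the function `read_self_documenting` are hypothetical and chosen for illustration only; they are not the Full-Metadata Format syntax, which is specified in the paper's appendix.

```python
import configparser

# Hypothetical INI-style file: two metadata sections followed by a data table.
SAMPLE = """\
[reference]
title = Example measurement
creator = A. Researcher

[columns]
time = t [s]
voltage = U [V]

[data]
0.0 1.5
0.1 1.7
0.2 2.0
"""


def read_self_documenting(text):
    """Split the text into INI metadata and whitespace-separated data rows."""
    header, _, body = text.partition("[data]\n")

    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the original case of the keys
    parser.read_string(header)
    metadata = {name: dict(parser[name]) for name in parser.sections()}

    # Column names come from the metadata, so the table documents itself.
    names = list(metadata["columns"])
    rows = [
        dict(zip(names, (float(value) for value in line.split())))
        for line in body.splitlines()
        if line.strip()
    ]
    return metadata, rows


metadata, rows = read_self_documenting(SAMPLE)
```

Because the column labels and units travel with the numbers, a script can build a table header or axis label from `metadata["columns"]` without any external description of the file.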
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
Peer reviewed
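As a rough illustration of one of the baseline families named above, the following Python sketch ranks candidate documents against a seed document by TF-IDF cosine similarity. It is a generic textbook variant with a smoothed inverse document frequency, not the consortium's implementation; the function names and the toy documents are assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Smoothed IDF (an assumption of this sketch) keeps weights positive.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_similarity(seed_index, docs):
    """Indices of all other documents, most similar to the seed first."""
    vectors = tfidf_vectors(docs)
    others = [i for i in range(len(docs)) if i != seed_index]
    return sorted(others, key=lambda i: cosine(vectors[seed_index], vectors[i]),
                  reverse=True)


docs = [
    "protein folding in yeast cells",
    "yeast protein folding kinetics",
    "galaxy cluster dark matter survey",
]
ranking = rank_by_similarity(0, docs)
```

Here the second document shares several terms with the seed and is ranked ahead of the third, which shares none.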
Dissipative solitons in reaction diffusion systems: mechanisms, dynamics, interaction
Dissipative solitons are local excitations of nonlinear continuous systems which emerge due to a flux of energy or matter. Although they are continuous entities, dissipative solitons in reaction diffusion systems behave like particles: They are generated or annihilated as a whole, propagate with a well-defined velocity and interact with each other, which can lead, for example, to the formation of bound states. This book introduces dissipative solitons in the context of pattern formation, discusses experimental findings in chemical and physical systems, deduces a phenomenological model of dissipative solitons from basic principles, analyzes their dynamics and interaction from a theoretical point of view and verifies these findings in an experimental system by means of stochastic data analysis. Finally, the mechanisms of annihilation and generation are explained on the basis of simulations. Theoretical considerations focus on a certain family of reaction diffusion models, so that basic and advanced analytical methods can be introduced from scratch and followed through to computational results.
Erratum: Improving the Traditional Information Management in Natural Sciences [Data Science Journal, Volume 8, 10 January 2009, 18-26]
Control Task for Reinforcement Learning with Known Optimal Solution for Discrete and Continuous Actions
Personalized Machine Learning Approach to Estimating Knee Kinematics Using Only Shank-Mounted IMU
Peer reviewed
Knee kinematics is a valuable measure of knee joint function. However, collecting these data outside the clinic is difficult, especially with a limited number of wearable sensors, as when only an ankle-mounted inertial measurement unit (IMU) is used to estimate knee kinematics. Due to the cyclic nature of gait, it is possible to use machine learning to extract joint angles from only ankle-mounted sensors. This study aimed to use time-series feature extraction and a random forest regressor to generate a person-specific surrogate model for estimating knee joint flexion angles from a single IMU mounted above the ankle. Optical motion capture (OMC) and inertial data from ten healthy participants walking on a treadmill were collected to create ten personalized surrogate models for estimating right knee flexion angles during gait. An additional ten models were created for a leave-one-out analysis to test the generalisability of the models. Temporal cross validation of the personalized models and a leave-one-out analysis were performed on the selected feature set. The personalized models achieved an average root-mean-square error (RMSE) of 2.45 ± 0.65 (R² of 0.98) compared to a gold-standard OMC. The generalized models achieved an average RMSE of 6.77 ± 3.38 (R² of 0.83) in the leave-one-out analysis. Time-series feature-based personalized surrogate models could be used to accurately estimate knee kinematics by using a single ankle-mounted sensor. However, more data are required to train a generalized model using the presented method.
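The two error measures quoted above can be computed as in the following Python sketch. It assumes that R² denotes the coefficient of determination (the abstract does not say whether it is that or a squared correlation), and the sample values are invented for illustration; they are not data from the study.

```python
import math


def rmse(reference, estimate):
    """Root-mean-square error between reference and estimated values."""
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, estimate):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Invented example values: reference angles and model estimates.
reference_angles = [10.0, 20.0, 30.0, 40.0]
estimated_angles = [12.0, 18.0, 33.0, 39.0]

error = rmse(reference_angles, estimated_angles)
fit = r_squared(reference_angles, estimated_angles)
```

For these invented values the squared residuals sum to 18 over four samples, giving an RMSE of about 2.12, and the total sum of squares is 500, giving an R² of 0.964.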