An investigation on the skewness patterns and fractal nature of research productivity distributions at field and discipline level
The paper provides an empirical examination of how research productivity
distributions differ across scientific fields and disciplines. Productivity is
measured using the FSS indicator, which embeds both quantity and impact of
output. The population studied consists of over 31,000 scientists in 180 fields
(10 aggregate disciplines) of a national research system. The Characteristic
Scores and Scale technique is used to investigate the distribution patterns for
the different fields and disciplines. Research productivity distributions are
found to be asymmetrical at the field level, although the degree of skewness
varies substantially among the fields within the aggregate disciplines. We also
examine whether the field productivity distributions show a fractal nature,
which turns out to be the exception rather than the rule. In contrast, for the
disciplines, the partitions of the distributions show skewed patterns that are
highly similar.
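The Characteristic Scores and Scale (CSS) technique mentioned above partitions a productivity distribution at iterated conditional means. As a rough, hypothetical sketch (function names and toy numbers are mine, and class-boundary conventions vary across the CSS literature):

```python
import bisect

def css_thresholds(values, k=3):
    """CSS thresholds: beta1 = mean of all values; beta_{i+1} = mean
    of the values strictly above beta_i (stops early if none remain)."""
    thresholds, subset = [], list(values)
    for _ in range(k):
        if not subset:
            break
        beta = sum(subset) / len(subset)
        thresholds.append(beta)
        subset = [v for v in subset if v > beta]
    return thresholds

def css_class_shares(values, thresholds):
    """Share of observations in each CSS class; class i covers
    [beta_i, beta_{i+1}), class 0 everything below beta_1."""
    counts = [0] * (len(thresholds) + 1)
    for v in values:
        counts[bisect.bisect_right(thresholds, v)] += 1
    return [c / len(values) for c in counts]

# toy productivity values: a skewed distribution
scores = [1, 1, 1, 2, 3, 10]
```

For the toy data the first threshold is the overall mean 3.0 and the second is 10.0 (the mean of the single value above it), illustrating how quickly the classes thin out under skewness.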
Effect of crash pulse shape on seat stroke requirements for limiting loads on occupants of aircraft
An analytical study was made to provide comparative information on various crash pulse shapes that potentially could be used to test seats under the conditions included in Federal Regulations Part 23 Paragraph 23.562(b)(1) for dynamic testing of general aviation seats; to show the effects that crash pulse shape can have on the seat stroke required to maintain a specified limit loading on the seat/occupant during crash pulse loadings; to compare results from certain analytical model pulses with approximations of actual crash pulses; and to compare analytical seat results with experimental airplane crash data. Structural and seat/occupant displacement equations in terms of the maximum deceleration, velocity change, limit seat pan load, and pulse time were derived for five potentially useful pulse shapes; from these, analytical seat stroke data were obtained for the conditions specified in Federal Regulations Part 23 Paragraph 23.562(b)(1).
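For intuition, the rectangular pulse — one candidate shape — admits a simple closed form: if the floor stops from a velocity change Δv at a constant pulse_g while the seat limits the occupant to limit_g < pulse_g, the seat must stroke the difference between the two stopping distances. A minimal sketch under these kinematic assumptions (the numbers below are illustrative, not the regulation's test values, and the paper's derivations for the other four pulse shapes are not reproduced here):

```python
G = 9.81  # gravitational acceleration, m/s^2

def stroke_rectangular(delta_v, pulse_g, limit_g):
    """Seat stroke (m) for a rectangular floor deceleration pulse of
    pulse_g when the seat/occupant load is limited to limit_g (both in
    units of g): occupant stopping distance minus floor stopping distance."""
    if limit_g >= pulse_g:
        return 0.0  # the seat never needs to stroke
    return delta_v**2 / (2 * limit_g * G) - delta_v**2 / (2 * pulse_g * G)
```

With an illustrative 9.4 m/s velocity change, a 20 g rectangular pulse, and a 15 g seat limit, the required stroke works out to roughly 7.5 cm.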
edge2vec: Representation learning using edge semantics for biomedical knowledge discovery
Representation learning provides new and powerful graph analytical approaches
and tools for the highly valued data science challenge of mining knowledge
graphs. Since previous graph analytical methods have mostly focused on
homogeneous graphs, an important current challenge is extending this
methodology for richly heterogeneous graphs and knowledge domains. The
biomedical sciences are such a domain, reflecting the complexity of biology,
with entities such as genes, proteins, drugs, diseases, and phenotypes, and
relationships such as gene co-expression, biochemical regulation, and
biomolecular inhibition or activation. Therefore, the semantics of edges and
nodes are critical for representation learning and knowledge discovery in real
world biomedical problems. In this paper, we propose the edge2vec model, which
represents graphs considering edge semantics. An edge-type transition matrix is
trained by an Expectation-Maximization approach, and a stochastic gradient
descent model is employed to learn node embedding on a heterogeneous graph via
the trained transition matrix. edge2vec is validated on three biomedical domain
tasks: biomedical entity classification, compound-gene bioactivity prediction,
and biomedical information retrieval. Results show that by incorporating
edge types into node embedding learning on heterogeneous graphs, edge2vec
significantly outperforms state-of-the-art models on all three tasks. We
propose this method for its added value relative to existing graph analytical
methodology, and for its applicability in the real-world context of biomedical
knowledge discovery.
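In edge2vec, the edge-type transition matrix is trained by Expectation-Maximization; as a heavily simplified sketch of how such a matrix biases a heterogeneous-graph walk (here the matrix is simply given rather than learned, and the toy graph and all names are hypothetical):

```python
import random

def biased_walk(adj, edge_type, M, start, length, rng):
    """One random walk in which the next edge is weighted by
    M[prev_type][next_type], so walks tend to follow edge-type
    patterns encoded in the transition matrix."""
    walk = [start]
    prev_t = None
    for _ in range(length - 1):
        cur = walk[-1]
        nbrs = adj[cur]
        if not nbrs:
            break
        if prev_t is None:
            nxt = rng.choice(nbrs)  # first hop: uniform
        else:
            weights = [M[prev_t][edge_type[(cur, n)]] for n in nbrs]
            nxt = rng.choices(nbrs, weights=weights, k=1)[0]
        prev_t = edge_type[(cur, nxt)]
        walk.append(nxt)
    return walk

# toy heterogeneous graph: 3 nodes, 2 edge types
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
edge_type = {(0, 1): 0, (1, 0): 0, (0, 2): 1,
             (2, 0): 1, (1, 2): 1, (2, 1): 1}
M = [[0.9, 0.1], [0.1, 0.9]]  # walks tend to stay on one edge type
```

The resulting walks would then feed a skip-gram-style embedding step, as in homogeneous node2vec, with the matrix re-estimated between iterations in the full method.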
Statistical Workflow for Feature Selection in Human Metabolomics Data.
High-throughput metabolomics investigations, when conducted in large human cohorts, represent a potentially powerful tool for elucidating the biochemical diversity underlying human health and disease. Large-scale metabolomics data sources, generated using either targeted or nontargeted platforms, are becoming more common. Appropriate statistical analysis of these complex high-dimensional data will be critical for extracting meaningful results from such large-scale human metabolomics studies. Therefore, we consider the statistical analytical approaches that have been employed in prior human metabolomics studies. Based on the lessons learned and collective experience to date in the field, we offer a step-by-step framework for pursuing statistical analyses of cohort-based human metabolomics data, with a focus on feature selection. We discuss the range of options and approaches that may be employed at each stage of data management, analysis, and interpretation, and offer guidance on the analytical decisions that need to be considered over the course of implementing a data analysis workflow. Certain pervasive analytical challenges facing the field warrant ongoing focused research. Addressing these challenges, particularly those related to analyzing human metabolomics data, will allow for greater standardization of, as well as advances in, how research in the field is practiced. In turn, such major analytical advances will lead to substantial improvements in the overall contributions of human metabolomics investigations.
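Two steps that commonly appear in such a feature-selection workflow — filtering metabolite features by missingness, and controlling the false discovery rate across many univariate tests — can be sketched in pure Python (the Benjamini-Hochberg procedure is standard; the helper names and thresholds are illustrative, not the paper's prescriptions):

```python
def filter_missing(features, max_missing=0.2):
    """Drop metabolite features whose missing-value fraction exceeds
    max_missing (missing values encoded as None)."""
    return {name: vals for name, vals in features.items()
            if sum(v is None for v in vals) / len(vals) <= max_missing}

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (step-up FDR control)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adj = [0.0] * n
    prev = 1.0
    for rank in range(n, 0, -1):  # walk from largest p to smallest
        i = order[rank - 1]
        prev = min(prev, pvals[i] * n / rank)
        adj[i] = prev
    return adj
```

Features passing the missingness filter would then be tested (e.g., regression per metabolite), with the resulting p-values passed through `bh_adjust` before declaring selections.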
Predicting Multi-class Customer Profiles Based on Transactions: a Case Study in Food Sales
Predicting the class of a customer profile is a key task in marketing, which enables businesses to approach the right customer with the right product at the right time through the right channel to satisfy the customer's evolving needs. However, due to costs, privacy and/or data protection, only the business' owned transactional data is typically available for constructing customer profiles. Predicting the class of customer profiles based on such data is challenging, as the data tends to be very large, heavily sparse and highly skewed. We present a new approach that is designed to efficiently and accurately handle the multi-class classification of customer profiles built using sparse and skewed transactional data. Our approach first bins the customer profiles on the basis of the number of items transacted. The discovered bins are then partitioned and prototypes within each of the discovered bins selected to build the multi-class classifier models. The results obtained from using four multi-class classifiers on real-world transactional data from the food sales domain consistently show the critical numbers of items at which the predictive performance of customer profiles can be substantially improved
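A minimal sketch of the binning-plus-prototype idea described above, assuming item-set profiles, right-open count bins, and a Jaccard medoid as the prototype (these are my assumptions — the paper's exact partitioning and prototype-selection rules may differ):

```python
import bisect

def bin_profiles(profiles, edges):
    """Group customer profiles into right-open bins by number of
    distinct items transacted, e.g. edges [5, 20] -> [0,5), [5,20), [20,+)."""
    bins = [[] for _ in range(len(edges) + 1)]
    for pid, items in profiles.items():
        bins[bisect.bisect_right(edges, len(items))].append(pid)
    return bins

def jaccard(a, b):
    """Similarity between two item sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def bin_prototype(bin_members):
    """Pick the member most similar on average to the rest of its bin
    (a medoid) as the bin's prototype."""
    ids = list(bin_members)
    return max(ids, key=lambda i: sum(jaccard(bin_members[i], bin_members[j])
                                      for j in ids if j != i))
```

The prototypes from each bin would then serve as the reduced training set for the multi-class classifier models.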
Beyond statistical testing: individual differences and the content and accuracy of mental representations of space
The article uses data from two experiments on the content and accuracy of mental representations of space by the blind and visually impaired in order to expose some of the shortcomings of typical statistical testing and to propose an individual differences approach to the analysis of data. It begins with a discussion of some of the problems associated with the strict classification and eventual comparison of individuals between groups. The individual differences approach is then presented, and the concepts of ability and present competence are explored along with the importance of detailed participant description. Examples from the two experiments are used to demonstrate how null hypothesis significance testing can be complemented with effect size estimates, box-plots and ranking techniques. Throughout the article we are reminded of the need to adopt mutually supportive techniques to account for the heterogeneity of experience and skills between participants.
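One of the complements to significance testing mentioned above, an effect-size estimate, can be sketched as Cohen's d with a pooled standard deviation (a standard formula; the article itself may use different estimators):

```python
from statistics import mean, stdev

def cohens_d(a, b):
    """Cohen's d: standardized mean difference between two groups,
    using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled = (((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
              / (na + nb - 2)) ** 0.5
    return (mean(a) - mean(b)) / pooled
```

Unlike a p-value, d stays interpretable at any sample size, which is why it pairs well with the box-plots and ranking techniques the article recommends.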