80,039 research outputs found
Ensemble Committees for Stock Return Classification and Prediction
This paper considers a portfolio trading strategy formulated by algorithms in
the field of machine learning. The profitability of the strategy is measured by
the algorithm's capability to consistently and accurately identify stock
indices with positive or negative returns, and to generate a preferred
portfolio allocation on the basis of a learned model. Stocks are characterized
by time series data sets consisting of technical variables that reflect market
conditions in a previous time interval, which are utilized produce binary
classification decisions in subsequent intervals. The learned model is
constructed as a committee of random forest classifiers, a non-linear support
vector machine classifier, a relevance vector machine classifier, and a
constituent ensemble of k-nearest neighbors classifiers. The Global Industry
Classification Standard (GICS) is used to explore the ensemble model's efficacy
within the context of various fields of investment including Energy, Materials,
Financials, and Information Technology. Data from 2006 to 2012, inclusive, are
considered, which are chosen for providing a range of market circumstances for
evaluating the model. The model is observed to achieve an accuracy of
approximately 70% when predicting stock price returns three months in advance.Comment: 15 pages, 4 figures, Neukom Institute Computational Undergraduate
Research prize - second plac
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
The inability to interpret the model prediction in semantically and visually
meaningful ways is a well-known shortcoming of most existing computer-aided
diagnosis methods. In this paper, we propose MDNet to establish a direct
multimodal mapping between medical images and diagnostic reports that can read
images, generate diagnostic reports, retrieve images by symptom descriptions,
and visualize attention, to provide justifications of the network diagnosis
process. MDNet includes an image model and a language model. The image model is
proposed to enhance multi-scale feature ensembles and utilization efficiency.
The language model, integrated with our improved attention mechanism, aims to
read and explore discriminative image feature descriptions from reports to
learn a direct mapping from sentence words to image pixels. The overall network
is trained end-to-end by using our developed optimization strategy. Based on a
pathology bladder cancer images and its diagnostic reports (BCIDR) dataset, we
conduct sufficient experiments to demonstrate that MDNet outperforms
comparative baselines. The proposed image model obtains state-of-the-art
performance on two CIFAR datasets as well.Comment: CVPR2017 Ora
Local feature weighting in nearest prototype classification
The distance metric is the corner stone of nearest neighbor (NN)-based methods, and therefore, of nearest prototype (NP) algorithms. That is because they classify depending on the similarity of the data. When the data is characterized by a set of features which may contribute to the classification task in different levels, feature weighting or selection is required, sometimes in a local sense. However, local weighting is typically restricted to NN approaches. In this paper, we introduce local feature weighting (LFW) in NP classification. LFW provides each prototype its own weight vector, opposite to typical global weighting methods found in the NP literature, where all the prototypes share the same one. Providing each prototype its own weight vector has a novel effect in the borders of the Voronoi regions generated: They become nonlinear. We have integrated LFW with a previously developed evolutionary nearest prototype classifier (ENPC). The experiments performed both in artificial and real data sets demonstrate that the resulting algorithm that we call LFW in nearest prototype classification (LFW-NPC) avoids overfitting on training data in domains where the features may have different contribution to the classification task in different areas of the feature space. This generalization capability is also reflected in automatically obtaining an accurate and reduced set of prototypes.Publicad
Identifying Web Tables - Supporting a Neglected Type of Content on the Web
The abundance of the data in the Internet facilitates the improvement of
extraction and processing tools. The trend in the open data publishing
encourages the adoption of structured formats like CSV and RDF. However, there
is still a plethora of unstructured data on the Web which we assume contain
semantics. For this reason, we propose an approach to derive semantics from web
tables which are still the most popular publishing tool on the Web. The paper
also discusses methods and services of unstructured data extraction and
processing as well as machine learning techniques to enhance such a workflow.
The eventual result is a framework to process, publish and visualize linked
open data. The software enables tables extraction from various open data
sources in the HTML format and an automatic export to the RDF format making the
data linked. The paper also gives the evaluation of machine learning techniques
in conjunction with string similarity functions to be applied in a tables
recognition task.Comment: 9 pages, 4 figure
Recommended from our members
Prediction of Recovery From Severe Hemorrhagic Shock Using Logistic Regression.
This paper implements logistic regression models (LRMs) and feature selection for creating a predictive model for recovery form hemorrhagic shock (HS) with resuscitation using blood in the multiple experimental rat animal protocols. A total of 61 animals were studied across multiple HS experiments, which encompassed two different HS protocols and two resuscitation protocols using blood stored for short periods using five different techniques. Twenty-seven different systemic hemodynamics, cardiac function, and blood gas parameters were measured in each experiment, of which feature selection deemed only 25% of the them as relevant. The reduced feature set was used to train a final logistic regression model. A final test set accuracy is 84% compared to 74% for a baseline classifier using only MAP and HR measurements. Receiver operating characteristics (ROC) curve analysis and Cohens kappa statistics were also used as measures of performance, with the final reduced model outperforming the model, including all parameters. Our results suggest that LRMs trained with a combination of systemic hemodynamics, cardiac function, and blood gas parameters measured at multiple timepoints during HS can successfully classify HS recovery groups. Our results show the predictive ability of traditional and novel hemodynamic and cardiac function features and their combinations, many of which had not previously been taken into consideration, for monitoring HS. Furthermore, we have devised an effective methodology for feature selection and shown ways in which the performance of such predictive models should be assessed in future studies
NEXT LEVEL: A COURSE RECOMMENDER SYSTEM BASED ON CAREER INTERESTS
Skills-based hiring is a talent management approach that empowers employers to align recruitment around business results, rather than around credentials and title. It starts with employers identifying the particular skills required for a role, and then screening and evaluating candidates’ competencies against those requirements. With the recent rise in employers adopting skills-based hiring practices, it has become integral for students to take courses that improve their marketability and support their long-term career success. A 2017 survey of over 32,000 students at 43 randomly selected institutions found that only 34% of students believe they will graduate with the skills and knowledge required to be successful in the job market. Furthermore, the study found that while 96% of chief academic officers believe that their institutions are very or somewhat effective at preparing students for the workforce, only 11% of business leaders strongly agree [11]. An implication of the misalignment is that college graduates lack the skills that companies need and value. Fortunately, the rise of skills-based hiring provides an opportunity for universities and students to establish and follow clearer classroom-to-career pathways. To this end, this paper presents a course recommender system that aims to improve students’ career readiness by suggesting relevant skills and courses based on their unique career interests
- …