11,389 research outputs found
A Two-stage Classification Method for High-dimensional Data and Point Clouds
High-dimensional data classification is a fundamental task in machine
learning and imaging science. In this paper, we propose a two-stage multiphase
semi-supervised classification method for classifying high-dimensional data and
unstructured point clouds. To begin with, a fuzzy classification method such as
the standard support vector machine is used to generate a warm initialization.
We then apply a two-stage approach named SaT (smoothing and thresholding) to
improve the classification. In the first stage, an unconstraint convex
variational model is implemented to purify and smooth the initialization,
followed by the second stage which is to project the smoothed partition
obtained at stage one to a binary partition. These two stages can be repeated,
with the latest result as a new initialization, to keep improving the
classification quality. We show that the convex model of the smoothing stage
has a unique solution and can be solved by a specifically designed primal-dual
algorithm whose convergence is guaranteed. We test our method and compare it
with the state-of-the-art methods on several benchmark data sets. The
experimental results demonstrate clearly that our method is superior in both
the classification accuracy and computation speed for high-dimensional data and
point clouds.Comment: 21 pages, 4 figure
Named Entity Recognition in Electronic Health Records Using Transfer Learning Bootstrapped Neural Networks
Neural networks (NNs) have become the state of the art in many machine
learning applications, especially in image and sound processing [1]. The same,
although to a lesser extent [2,3], could be said in natural language processing
(NLP) tasks, such as named entity recognition. However, the success of NNs
remains dependent on the availability of large labelled datasets, which is a
significant hurdle in many important applications. One such case are electronic
health records (EHRs), which are arguably the largest source of medical data,
most of which lies hidden in natural text [4,5]. Data access is difficult due
to data privacy concerns, and therefore annotated datasets are scarce. With
scarce data, NNs will likely not be able to extract this hidden information
with practical accuracy. In our study, we develop an approach that solves these
problems for named entity recognition, obtaining 94.6 F1 score in I2B2 2009
Medical Extraction Challenge [6], 4.3 above the architecture that won the
competition. Beyond the official I2B2 challenge, we further achieve 82.4 F1 on
extracting relationships between medical terms. To reach this state-of-the-art
accuracy, our approach applies transfer learning to leverage on datasets
annotated for other I2B2 tasks, and designs and trains embeddings that
specially benefit from such transfer.Comment: 11 pages, 4 figures, 8 table
A survey of machine learning techniques applied to self organizing cellular networks
In this paper, a survey of the literature of the past fifteen years involving Machine Learning (ML) algorithms applied to self organizing cellular networks is performed. In order for future networks to overcome the current limitations and address the issues of current cellular systems, it is clear that more intelligence needs to be deployed, so that a fully autonomous and flexible network can be enabled. This paper focuses on the learning perspective of Self Organizing Networks (SON) solutions and provides, not only an overview of the most common ML techniques encountered in cellular networks, but also manages to classify each paper in terms of its learning solution, while also giving some examples. The authors also classify each paper in terms of its self-organizing use-case and discuss how each proposed solution performed. In addition, a comparison between the most commonly found ML algorithms in terms of certain SON metrics is performed and general guidelines on when to choose each ML algorithm for each SON function are proposed. Lastly, this work also provides future research directions and new paradigms that the use of more robust and intelligent algorithms, together with data gathered by operators, can bring to the cellular networks domain and fully enable the concept of SON in the near future
AI and OR in management of operations: history and trends
The last decade has seen a considerable growth in the use of Artificial Intelligence (AI) for operations management with the aim of finding solutions to problems that are increasing in complexity and scale. This paper begins by setting the context for the survey through a historical perspective of OR and AI. An extensive survey of applications of AI techniques for operations management, covering a total of over 1200 papers published from 1995 to 2004 is then presented. The survey utilizes Elsevier's ScienceDirect database as a source. Hence, the survey may not cover all the relevant journals but includes a sufficiently wide range of publications to make it representative of the research in the field. The papers are categorized into four areas of operations management: (a) design, (b) scheduling, (c) process planning and control and (d) quality, maintenance and fault diagnosis. Each of the four areas is categorized in terms of the AI techniques used: genetic algorithms, case-based reasoning, knowledge-based systems, fuzzy logic and hybrid techniques. The trends over the last decade are identified, discussed with respect to expected trends and directions for future work suggested
- …