An Overview on Application of Machine Learning Techniques in Optical Networks
Today's telecommunication networks have become sources of enormous amounts of
widely heterogeneous data. This information can be retrieved from network
traffic traces, network alarms, signal quality indicators, users' behavioral
data, etc. Advanced mathematical tools are required to extract meaningful
information from these data and to make decisions about the proper
functioning of the networks. Among these
mathematical tools, Machine Learning (ML) is regarded as one of the most
promising methodological approaches to perform network-data analysis and enable
automated network self-configuration and fault management. The adoption of ML
techniques in the field of optical communication networks is motivated by the
unprecedented growth of network complexity faced by optical networks in the
last few years. This increase in complexity is due to the introduction of a large
number of adjustable and interdependent system parameters (e.g., routing
configurations, modulation format, symbol rate, coding schemes, etc.) that are
enabled by the usage of coherent transmission/reception technologies, advanced
digital signal processing and compensation of nonlinear effects in optical
fiber propagation. In this paper we provide an overview of the application of
ML to optical communications and networking. We classify and survey relevant
literature dealing with the topic, and we also provide an introductory tutorial
on ML for researchers and practitioners interested in this field. Although a
good number of research papers have recently appeared, the application of ML to
optical networks is still in its infancy: to stimulate further work in this
area, we conclude the paper by proposing possible new research directions.
Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure
As machine learning systems move from computer-science laboratories into the
open world, their accountability becomes a high-priority problem.
Accountability requires deep understanding of system behavior and its failures.
Current evaluation methods such as single-score error metrics and confusion
matrices provide aggregate views of system performance that hide important
shortcomings. Understanding details about failures is important for identifying
pathways for refinement, communicating the reliability of systems in different
settings, and for specifying appropriate human oversight and engagement.
Characterization of failures and shortcomings is particularly complex for
systems composed of multiple machine learned components. For such systems,
existing evaluation methods have limited expressiveness in describing and
explaining the relationship among input content, the internal states of system
components, and final output quality. We present Pandora, a set of hybrid
human-machine methods and tools for describing and explaining system failures.
Pandora leverages both human and system-generated observations to summarize
conditions of system malfunction with respect to the input content and system
architecture. We share results of a case study with a machine learning pipeline
for image captioning that show how detailed performance views can be beneficial
for analysis and debugging.
SQL Query Completion for Data Exploration
Within the big data tsunami, relational databases and SQL are still there and
remain mandatory in most cases for accessing data. On the one hand, SQL is
easy to use for non-specialists and makes it possible to identify pertinent initial data at
the very beginning of the data exploration process. On the other hand, it is
not always so easy to formulate SQL queries: nowadays, it is more and more
frequent to have several databases available for one application domain, some
of them with hundreds of tables and/or attributes. Identifying the pertinent
conditions to select the desired data, or even identifying relevant attributes
is far from trivial. To make it easier to write SQL queries, we propose the
notion of SQL query completion: given a query, it suggests additional
conditions to be added to its WHERE clause. This completion is semantic, as it
relies on the data from the database, unlike current completion tools that are
mostly syntactic. Since the process can be repeated until the data analyst
reaches her data of interest, SQL query completion
facilitates the exploration of databases. SQL query completion has been
implemented in a SQL editor on top of a database management system. For the
evaluation, two questions need to be studied: first, does the completion speed
up the writing of SQL queries? Second, is the completion easily adopted by
users? A thorough experiment has been conducted on a group of 70 computer
science students divided into two groups (one with the completion and the other
without) to answer those questions. The results are positive and very
promising.
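To make the idea of data-driven completion concrete, here is a minimal sketch. It is not the method from the paper: the function name `suggest_conditions`, the `orders` table in the usage note, and the ranking heuristic (prefer `column = value` conditions that keep the largest share of the current result while still narrowing it) are all assumptions chosen for illustration. It only shows how suggestions can come from the data rather than from SQL syntax.

```python
import sqlite3


def suggest_conditions(conn, table, where, columns, top_k=3):
    """Suggest extra `column = value` conditions for an existing WHERE clause.

    Hypothetical heuristic, not the paper's algorithm: candidates are ranked
    by the fraction of the current result set they keep.
    Identifiers are interpolated directly, so this is for trusted input only.
    """
    cur = conn.cursor()
    total = cur.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {where}"
    ).fetchone()[0]
    suggestions = []
    for col in columns:
        rows = cur.execute(
            f"SELECT {col}, COUNT(*) FROM {table} WHERE {where} GROUP BY {col}"
        ).fetchall()
        for value, count in rows:
            # A condition that keeps every row does not narrow the result.
            if 0 < count < total:
                suggestions.append((count / total, f"{col} = {value!r}"))
    # Highest share first; ties broken alphabetically for determinism.
    suggestions.sort(key=lambda s: (-s[0], s[1]))
    return [cond for _, cond in suggestions[:top_k]]
```

For example, on a hypothetical `orders(country, status, amount)` table, calling `suggest_conditions(conn, "orders", "amount >= 80", ["country", "status"])` returns candidate conditions that the analyst could append to the WHERE clause and then refine again in the next round.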
Re-mining item associations: methodology and a case study in apparel retailing
Association mining is the conventional data mining technique for analyzing market basket data, and it reveals the positive and negative associations between items. Although they are an integral part of transaction data, pricing and time information have not been integrated into market basket analysis in earlier studies. This paper proposes a new approach to mine price, time and domain related attributes through re-mining of association mining results. The underlying factors behind positive and negative relationships can be characterized and described through this second data mining stage. The applicability of the methodology is demonstrated through the analysis of data from a large apparel retail chain, and its algorithmic complexity is analyzed in comparison to the existing techniques.
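As an illustration of the first stage only, the sketch below computes the lift of item pairs from a list of baskets. It assumes lift as the association measure (the abstract does not say which measure the paper uses), and the function name `pair_lift` is hypothetical. A lift above 1 is commonly read as a positive association and a lift below 1 as a negative one. The re-mining stage described above would then take such pair-level results, joined with price and time attributes, as input to a second analysis.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(item_a, item_b): lift} for every pair of items seen.

    lift(a, b) = P(a and b) / (P(a) * P(b)), estimated from basket counts.
    Illustrative first-stage computation only, not the paper's pipeline.
    """
    n = len(baskets)
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_count.update(items)
        pair_count.update(combinations(items, 2))
    lifts = {}
    for a, b in combinations(sorted(item_count), 2):
        joint = pair_count[(a, b)] / n
        lifts[(a, b)] = joint / ((item_count[a] / n) * (item_count[b] / n))
    return lifts
```

Pairs are keyed in alphabetical order, so the lift of "shirt" and "tie" is stored under `("shirt", "tie")`.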