Rank-based linkage I: triplet comparisons and oriented simplicial complexes
Rank-based linkage is a new tool for summarizing a collection of objects
according to their relationships. These objects are not mapped to vectors, and
``similarity'' between objects need be neither numerical nor symmetrical. All
an object needs to do is rank nearby objects by similarity to itself, using a
comparator which is transitive, but need not be consistent with any metric on
the whole set. Call this a ranking system on $S$. Rank-based linkage is applied
to the $K$-nearest neighbor digraph derived from a ranking system. Computations
occur on a 2-dimensional abstract oriented simplicial complex whose faces are
among the points, edges, and triangles of the line graph of the undirected
$K$-nearest neighbor graph on $S$. In $|S| K^2$ steps it builds an
edge-weighted linkage graph $(S, \mathcal{L}, \sigma)$ where
$\sigma(\{x, y\})$ is called the in-sway between objects $x$ and $y$. Take $\mathcal{L}_t$ to be
the links whose in-sway is at least $t$, and partition $S$ into components of
the graph $(S, \mathcal{L}_t)$, for varying $t$. Rank-based linkage is a
functor from a category of out-ordered digraphs to a category of partitioned
sets, with the practical consequence that augmenting the set of objects in a
rank-respectful way gives a fresh clustering which does not ``rip apart'' the
previous one. The same holds for single linkage clustering in the metric space
context, but not for typical optimization-based methods. Open combinatorial
problems are presented in the last section. Comment: 37 pages, 12 figures
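The thresholding step described in this abstract (keep the links whose in-sway is at least a threshold, then partition into connected components) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `components_at_threshold`, the dictionary layout of `in_sway`, and the toy values are assumptions made for illustration, and the computation of the in-sway itself is not shown.

```python
from collections import defaultdict

def components_at_threshold(objects, in_sway, t):
    """Partition `objects` into connected components of the graph that
    keeps only links whose in-sway is at least `t`.

    `in_sway` maps an unordered pair (x, y) to a number; this layout is
    a hypothetical choice for the sketch.
    """
    parent = {x: x for x in objects}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (x, y), s in in_sway.items():
        if s >= t:
            parent[find(x)] = find(y)

    groups = defaultdict(set)
    for x in objects:
        groups[find(x)].add(x)
    return sorted(sorted(g) for g in groups.values())

# Toy data (invented): four objects and three weighted links.
objects = ["a", "b", "c", "d"]
in_sway = {("a", "b"): 5, ("b", "c"): 2, ("c", "d"): 4}
partition_t3 = components_at_threshold(objects, in_sway, 3)
partition_t1 = components_at_threshold(objects, in_sway, 1)
```

Lowering the threshold can only merge components, so the partitions obtained for decreasing thresholds are nested.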
Entanglement in the full state vector of boson sampling
The full state vector of boson sampling is generated by passing S single
photons through beam splitters of M modes. The initial Fock state is expressed
with generalized coherent states, and an exact application of the unitary
evolution becomes possible. Due to the favorable polynomial scaling in M, we
can investigate Renyi entanglement entropies for moderate particle and huge
mode numbers. We find (almost) Renyi index independent symmetric Page curves
with maximum entropy at equal partition. Furthermore, the maximum entropy as a
function of mode index saturates as a function of M in the collision-free
subspace case. The asymptotic value of the entropy increases linearly with S.
Furthermore, we show that the build-up of the entanglement leads to a cusp at
subsystem size equal to S in the asymmetric entanglement curve. The maximum
entanglement is reached surprisingly early before the mode population is
distributed over the whole system.
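For reference, the Renyi entropy of index alpha for a reduced density matrix with eigenvalues p_i is ln(sum_i p_i^alpha) / (1 - alpha), with the von Neumann entropy as the limit alpha -> 1. The sketch below only evaluates this standard definition on a given spectrum. It does not reproduce the paper's coherent-state method, and the function name `renyi_entropy` is an assumption.

```python
import math

def renyi_entropy(probs, alpha):
    """Renyi entropy (natural log) of a probability spectrum.

    `probs` are the eigenvalues of a reduced density matrix (non-negative,
    summing to 1). For alpha == 1 the von Neumann limit is returned.
    """
    probs = [p for p in probs if p > 0]
    if math.isclose(alpha, 1.0):
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)

# A maximally mixed spectrum gives ln(dimension) for every Renyi index.
flat = [0.25] * 4
s2_flat = renyi_entropy(flat, 2)
s1_flat = renyi_entropy(flat, 1)
```

For a flat spectrum the entropy does not depend on alpha. This is the simplest case of Renyi-index independence; the abstract reports an almost index-independent behaviour for its Page curves.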
RAPID: Enabling Fast Online Policy Learning in Dynamic Public Cloud Environments
Resource sharing between multiple workloads has become a prominent practice
among cloud service providers, motivated by demand for improved resource
utilization and reduced cost of ownership. Effective resource sharing, however,
remains an open challenge due to the adverse effects that resource contention
can have on high-priority, user-facing workloads with strict Quality of Service
(QoS) requirements. Although recent approaches have demonstrated promising
results, those works remain largely impractical in public cloud environments
since workloads are not known in advance and may only run for a brief period,
thus prohibiting offline learning and significantly hindering online learning.
In this paper, we propose RAPID, a novel framework for fast, fully-online
resource allocation policy learning in highly dynamic operating environments.
RAPID leverages lightweight QoS predictions, enabled by
domain-knowledge-inspired techniques for sample efficiency and bias reduction,
to decouple control from conventional feedback sources and guide policy
learning at a rate orders of magnitude faster than prior work. Evaluation on a
real-world server platform with representative cloud workloads confirms that
RAPID can learn stable resource allocation policies in minutes, as compared
with hours in prior state-of-the-art, while improving QoS by 9.0x and
increasing best-effort workload performance by 19-43%.
Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence
Online Judge (OJ) systems are typically considered within programming-related courses as they yield fast and objective assessments of the code developed by the students. Such an evaluation generally provides a single decision based on a rubric, most commonly whether the submission successfully accomplished the assignment. Nevertheless, since in an educational context such information may be deemed insufficient, it would be beneficial for both the student and the instructor to receive additional feedback about the overall development of the task. This work aims to tackle this limitation by considering the further exploitation of the information gathered by the OJ and automatically inferring feedback for both the student and the instructor. More precisely, we consider the use of learning-based schemes (particularly, Multi-Instance Learning and classical Machine Learning formulations) to model student behaviour. In addition, Explainable Artificial Intelligence is contemplated to provide human-understandable feedback. The proposal has been evaluated considering a case study comprising 2,500 submissions from roughly 90 different students from a programming-related course in a Computer Science degree. The results obtained validate the proposal: the model is capable of significantly predicting the user outcome (either passing or failing the assignment) solely based on the behavioural pattern inferred from the submissions provided to the OJ. Moreover, the proposal is able to identify prone-to-fail student groups and profiles as well as other relevant information, which eventually serves as feedback to both the student and the instructor. This work has been partially funded by the “Programa Redes-I3CE de investigacion en docencia universitaria del Instituto de Ciencias de la Educacion (REDES-I3CE-2020-5069)” of the University of Alicante. The third author is supported by grant APOSTD/2020/256 from “Programa I+D+I de la Generalitat Valenciana”.
Kurcuma: a kitchen utensil recognition collection for unsupervised domain adaptation
The use of deep learning makes it possible to achieve extraordinary results in all kinds of tasks related to computer vision. However, this performance is strongly related to the availability of training data and its relationship with the distribution in the eventual application scenario. This question is of vital importance in areas such as robotics, where the targeted environment data are barely available in advance. In this context, domain adaptation (DA) techniques are especially important for building models that deal with new data for which the corresponding label is not available. To promote further research in DA techniques applied to robotics, this work presents Kurcuma (Kitchen Utensil Recognition Collection for Unsupervised doMain Adaptation), an assortment of seven datasets for the classification of kitchen utensils, a task of relevance in home-assistance robotics and a suitable showcase for DA. Along with the data, we provide a broad description of the main characteristics of the dataset, as well as a baseline using the well-known domain-adversarial training of neural networks approach. The results show the challenge posed by DA on these types of tasks, pointing to the need for new approaches in future work. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work was supported by the I+D+i project TED2021-132103A-I00 (DOREMI), funded by MCIN/AEI/10.13039/501100011033. Some of the computing resources were provided by the Generalitat Valenciana and the European Union through the FEDER funding program (IDIFEDER/2020/003). The second author is supported by grant APOSTD/2020/256 from “Programa I+D+i de la Generalitat Valenciana”.
Qluster: An easy-to-implement generic workflow for robust clustering of health data
The exploration of health data by clustering algorithms makes it possible to better describe the populations of interest by seeking the sub-profiles that compose them. This in turn reinforces medical knowledge, whether about a disease or a targeted population in real life. Nevertheless, contrary to the so-called conventional biostatistical methods, for which numerous guidelines exist, the standardization of data science approaches in clinical research remains a little-discussed subject. This results in significant variability in the execution of data science projects, whether in terms of the algorithms used or the reliability and credibility of the designed approach. Taking the path of parsimonious and judicious choice of both algorithms and implementations at each stage, this article proposes Qluster, a practical workflow for performing clustering tasks. This workflow makes a compromise between (1) genericity of applications (e.g. usable on small or big data, on continuous, categorical or mixed variables, on databases of high dimensionality or not), (2) ease of implementation (need for few packages, few algorithms, few parameters, ...), and (3) robustness (e.g. use of proven algorithms and robust packages, evaluation of the stability of clusters, management of noise and multicollinearity). This workflow can be easily automated and/or routinely applied on a wide range of clustering projects. It can be useful both for data scientists with little experience in the field, to make data clustering easier and more robust, and for more experienced data scientists who are looking for a straightforward and reliable solution to routinely perform preliminary data mining. A synthesis of the literature on data clustering as well as the scientific rationale supporting the proposed workflow is also provided. Finally, a detailed application of the workflow on a concrete use case is provided, along with a practical discussion for data scientists.
An implementation on the Dataiku platform is available upon request to the authors.
Nonparametric Two-Sample Test for Networks Using Joint Graphon Estimation
This paper focuses on the comparison of networks on the basis of statistical
inference. For that purpose, we rely on smooth graphon models as a
nonparametric modeling strategy that is able to capture complex structural
patterns. The graphon itself can be viewed more broadly as density or intensity
function on networks, making the model a natural choice for comparison
purposes. Extending graphon estimation towards modeling multiple networks
simultaneously consequently provides substantial information about the
(dis-)similarity between networks. Fitting such a joint model - which can be
accomplished by applying an EM-type algorithm - provides a joint graphon
estimate plus a corresponding prediction of the node positions for each
network. In particular, it entails a generalized network alignment, where
nearby nodes play similar structural roles in their respective domains. Given
that, we construct a chi-squared test on equivalence of network structures.
Simulation studies and real-world examples support the applicability of our
network comparison strategy. Comment: 25 pages, 6 figures
Vegetation responses to variations in climate: A combined ordinary differential equation and sequential Monte Carlo estimation approach
Vegetation responses to variation in climate are a current research priority in the context of accelerated shifts generated by climate change. However, the interactions between environmental and biological factors still represent one of the largest uncertainties in projections of future scenarios, since the relationship between drivers and ecosystem responses has a complex and nonlinear nature. We aimed to develop a model to study the vegetation’s primary productivity dynamic response to temporal variations in climatic conditions as measured by rainfall, temperature and radiation. Thus, we propose a new way to estimate the vegetation response to climate via a non-autonomous version of a classical growth curve, with a time-varying growth rate and carrying capacity parameters according to climate variables. We use Sequential Monte Carlo estimation to account for complexities in the climate-vegetation relationship while minimizing the number of parameters. The model was applied to six key sites identified in a previous study, consisting of different arid and semiarid rangelands from North Patagonia, Argentina. For each site, we selected the time series of MODIS NDVI, and climate data from ERA5 Copernicus hourly reanalysis from 2000 to 2021. After calculating the time series of the a posteriori distribution of parameters, we analyzed the explanatory capacity of the model in terms of the linear coefficient of determination and
the variation in the parameter distributions. Results showed that most rangelands recorded changes in their sensitivity over time to climatic factors, but vegetation responses were heterogeneous and influenced by different drivers. Differences in this climate-vegetation relationship were recorded among different cases: (1) a marginal and decreasing sensitivity to temperature and radiation, respectively, but a high sensitivity to water availability; (2) high and increasing sensitivity to temperature and water availability, respectively; and (3) a case with an abrupt shift in vegetation dynamics driven by a progressively decreasing sensitivity to water availability, without any
changes in the sensitivity either to temperature or radiation. Finally, we also found that the time scale over which the ecosystem integrated rainfall, expressed as the width of the window function used to convolve the rainfall series into a water availability variable, also varied over time. This approach allows us to estimate the degree of connection between ecosystem productivity and climatic variables. The capacity of the model to identify changes over time in the vegetation-climate relationship might inform decision-makers about ecological transitions and the differential impact of climatic drivers on ecosystems.
Estación Experimental Agropecuaria Bariloche
Fil: Bruzzone, Octavio Augusto. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria Bariloche; Argentina
Fil: Bruzzone, Octavio Augusto. Consejo Nacional de Investigaciones Cientificas y Tecnicas. Instituto de Investigaciones Forestales y Agropecuarias Bariloche; Argentina
Fil: Perri, Daiana Vanesa. Instituto Nacional de Tecnologia Agropecuaria (INTA). Estación Experimental Agropecuaria Bariloche. Área de Recursos Naturales; Argentina
Fil: Perri, Daiana Vanesa. Consejo Nacional de Investigaciones Cientificas y Tecnicas. Instituto de Investigaciones Forestales y Agropecuarias Bariloche; Argentina
Fil: Easdale, Marcos Horacio. Instituto Nacional de Tecnologia Agropecuaria (INTA). Estación Experimental Agropecuaria Bariloche. Área de Recursos Naturales; Argentina
Fil: Easdale, Marcos Horacio. Consejo Nacional de Investigaciones Cientificas y Tecnicas. Instituto de Investigaciones Forestales y Agropecuarias Bariloche; Argentina
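The abstract above does not say which classical growth curve is made non-autonomous. As an illustration only, the sketch below assumes a logistic curve, dN/dt = r(t) N (1 - N / K(t)), and integrates it with a forward Euler step. The function name `simulate_growth`, the step size, and the constant test drivers are all hypothetical, and the Sequential Monte Carlo estimation of the parameters is not shown.

```python
def simulate_growth(n0, rate, capacity, t_end, dt=0.01):
    """Forward-Euler integration of an ASSUMED non-autonomous logistic model.

    rate(t) and capacity(t) are callables standing in for the
    climate-driven growth rate and carrying capacity.
    """
    n, t = n0, 0.0
    for _ in range(int(round(t_end / dt))):
        n += dt * rate(t) * n * (1.0 - n / capacity(t))
        t += dt
    return n

# Sanity check with constant drivers: the state should settle at the
# carrying capacity (here 2.0).
final = simulate_growth(0.1, lambda t: 1.0, lambda t: 2.0, t_end=50.0)
```

In a climate-driven setting, `rate` and `capacity` would be built from rainfall, temperature and radiation series; how the paper maps those variables to the parameters is not stated in the abstract.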
Towards a generic compilation approach for quantum circuits through resynthesis
In this paper, we propose a generic quantum circuit resynthesis approach for
compilation. We use an intermediate representation consisting of Paulistrings
over {Z, I} and {X, I} called a ``mixed ZX-phase polynomial''. From this
universal representation, we generate a completely new circuit such that all
multi-qubit gates (CNOTs) satisfy the constraints of a given quantum architecture.
Moreover, we attempt to minimize the number of generated gates.
The proposed algorithms generate fewer CNOTs than similar previous methods on
different connectivity graphs ranging from 5-20 qubits. In most cases, the CNOT
counts are also lower than Qiskit's. For large circuits, containing >= 100
Paulistrings, our proposed algorithms even generate fewer CNOTs than the TKET
compiler.
Additionally, we give insight into the trade-off between compilation time and
final CNOT count. Comment: 10 pages including references. 2 tables, 1 figure
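The architecture constraint mentioned in this abstract can be checked mechanically. The sketch below assumes that an architecture is given as an undirected list of connected qubit pairs and that a circuit is reduced to its list of (control, target) CNOT pairs. The names `violating_cnots`, `line_architecture` and the example circuits are hypothetical and do not come from the paper or from any compiler API.

```python
def violating_cnots(cnots, edges):
    """Return the CNOTs whose qubits are not adjacent in the architecture.

    `edges` is an undirected connectivity graph given as qubit pairs;
    `cnots` is a list of (control, target) pairs.
    """
    allowed = {frozenset(edge) for edge in edges}
    return [pair for pair in cnots if frozenset(pair) not in allowed]

# Four qubits on a line: 0 - 1 - 2 - 3 (assumed example architecture).
line_architecture = [(0, 1), (1, 2), (2, 3)]
ok_circuit = [(0, 1), (2, 1), (3, 2)]
bad_circuit = [(0, 1), (0, 2), (3, 1)]
ok_violations = violating_cnots(ok_circuit, line_architecture)
bad_violations = violating_cnots(bad_circuit, line_architecture)
```

A circuit with an empty violation list meets the constraint in this simplified model; counting its CNOTs then gives the quantity the abstract compares across compilers.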