Explaining Latent Factor Models for Recommendation with Influence Functions
Latent factor models (LFMs) such as matrix factorization achieve the
state-of-the-art performance among various Collaborative Filtering (CF)
approaches for recommendation. Despite the high recommendation accuracy of
LFMs, a critical issue to be resolved is the lack of explainability. Extensive
efforts have been made in the literature to incorporate explainability into
LFMs. However, they either rely on auxiliary information which may not be
available in practice, or fail to provide easy-to-understand explanations. In
this paper, we propose a fast influence analysis method named FIA, which
provides explicit neighbor-style explanations for LFMs using influence
functions, a technique that stems from robust statistics. We first
describe how to apply influence functions to LFMs to deliver neighbor-style
explanations. Then we develop a novel influence computation algorithm for
matrix factorization with high efficiency. We further extend it to the more
general neural collaborative filtering and introduce an approximation algorithm
to accelerate influence analysis over neural network models. Experimental
results on real datasets demonstrate the correctness, efficiency and usefulness
of our proposed method.
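To make the idea of influence-based, neighbor-style explanations concrete, here is a minimal sketch for matrix factorization. It is not the paper's FIA algorithm: the function name `removal_effects`, the regularization weight `lam`, and the simplification that only the user's embedding is free (item embeddings held fixed) are all assumptions made for illustration. It estimates, to first order, how the prediction for user `u` on item `i` would change if one of that user's training ratings were removed.

```python
import numpy as np

def removal_effects(P, Q, ratings, u, i, lam=0.1):
    """First-order estimate of the change in the prediction p_u . q_i
    if each training rating of user u were removed.

    Assumptions (illustrative only): squared loss with L2 penalty lam on p_u,
    item embeddings Q held fixed, and P[u] already trained.
    ratings is a list of (user, item, rating) triples.
    """
    d = P.shape[1]
    p_u = P[u]
    rated = [(j, r) for (user, j, r) in ratings if user == u]

    # Hessian of user u's regularized squared loss with respect to p_u.
    H = 2.0 * lam * np.eye(d)
    for j, _ in rated:
        H += 2.0 * np.outer(Q[j], Q[j])

    grad_pred = Q[i]  # gradient of the prediction p_u . q_i w.r.t. p_u
    effects = {}
    for j, r in rated:
        # Gradient of the loss on rating (u, j, r) with respect to p_u.
        grad_loss = 2.0 * (p_u @ Q[j] - r) * Q[j]
        # Removing a point shifts p_u by about H^{-1} grad_loss.
        effects[j] = float(grad_pred @ np.linalg.solve(H, grad_loss))
    return effects
```

Ratings with the largest absolute effect would be the natural candidates for an explanation of the form "recommended because you rated item j".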
Towards making NLG a voice for interpretable Machine Learning
I would like to acknowledge the support given to me by the Engineering and Physical Sciences Research Council (EPSRC) DTP grant number EP/N509814/1.
Are contrastive explanations useful?
Funding Information: Supported by EPSRC DTP Grant Number EP/N509814/1. Peer reviewed.
Capacity-achieving Polar-based LDGM Codes
In this paper, we study codes with sparse generator matrices. More
specifically, low-density generator matrix (LDGM) codes with a certain
constraint on the weight of all the columns in the generator matrix are
considered. It is first shown that when a binary-input
memoryless symmetric (BMS) channel and a constant are given, there
exists a polarization kernel such that the corresponding polar code is
capacity-achieving and the column weights of the generator matrices are bounded
from above by .
Then, a general construction based on a concatenation of polar codes and a
rate- code, and a new column-splitting algorithm that guarantees a much
sparser generator matrix are given. More specifically, for any BMS channel and
any , where , the existence of a
sequence of capacity-achieving codes with all the column weights of the
generator matrix upper bounded by is shown.
Furthermore, coding schemes for BEC and BMS channels, based on a second
column-splitting algorithm are devised with low-complexity decoding that uses
successive-cancellation. The second splitting algorithm allows for the use of a
low-complexity decoder by preserving the reliability of the bit-channels
observed by the source bits, and by increasing the code block length. In
particular, for any BEC and any , the
existence of a sequence of capacity-achieving codes where all the column weights
of the generator matrix are bounded from above by and
with decoding complexity is shown. The existence of similar
capacity-achieving LDGM codes with low-complexity decoding is shown for any BMS
channel, and for any .
Comment: arXiv admin note: text overlap with arXiv:2001.1198
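As background for why column weight is the quantity being bounded, the following sketch builds the Kronecker power of a polarization kernel and reports its column weights. It uses the standard 2×2 kernel as an assumption for illustration; it is not the kernel or the column-splitting algorithm from the paper, and the names `kronecker_power` and `column_weights` are hypothetical. A polar code's generator matrix is a subset of the rows of this transform, so its column weights are at most those computed here.

```python
import numpy as np

# Standard 2x2 polarization kernel (assumed here for illustration).
ARIKAN_KERNEL = np.array([[1, 0], [1, 1]], dtype=np.uint8)

def kronecker_power(kernel, n):
    """Return the n-fold Kronecker power of a binary kernel, reduced mod 2."""
    G = np.array([[1]], dtype=np.uint8)
    for _ in range(n):
        G = np.kron(G, kernel) % 2
    return G

def column_weights(G):
    """Number of ones in each column of a binary matrix."""
    return G.sum(axis=0)

G = kronecker_power(ARIKAN_KERNEL, 3)   # 8 x 8 transform
weights = column_weights(G)
print(weights.tolist())                  # [8, 4, 4, 2, 4, 2, 2, 1]
```

With this kernel the heaviest column of the N×N transform has weight N (the first column is all ones), so the full transform is far from low-density.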
High throughput protein-protein interaction data: clues for the architecture of protein complexes
Background: High-throughput techniques are becoming widely used to study protein-protein interactions and protein complexes on a proteome-wide scale. Here we have explored the potential of these techniques to accurately determine the constituent proteins of complexes and their architecture within the complex.
Results: Two-dimensional representations of the 19S and 20S proteasome, mediator, and SAGA complexes were generated and overlaid with high-quality pairwise interaction data, core-module-attachment classifications from affinity purifications of complexes, and predicted domain-domain interactions. Pairwise interaction data could accurately determine the members of each complex, but were unexpectedly poor at deciphering the topology of proteins in complexes. Core and module data from affinity purification studies were less useful for accurately defining the member proteins of these complexes. However, these data gave strong information on the spatial proximity of many proteins. Predicted domain-domain interactions provided some insight into the topology of proteins within complexes, but were affected by a lack of available structural data for the co-activator complexes and by the presence of shared domains in paralogous proteins.
Conclusion: The constituent proteins of complexes are likely to be determined with accuracy by combining data from high-throughput techniques. It should be possible to clearly infer the topology of some proteins in the complexes. Finally, we suggest strategies for using high-throughput interaction data to define the membership and understand the architecture of proteins in novel complexes.