Detecting Sarcasm in Multimodal Social Platforms
Sarcasm is a peculiar form of sentiment expression, where the surface
sentiment differs from the implied sentiment. The detection of sarcasm in
social media platforms has been applied in the past mainly to textual
utterances where lexical indicators (such as interjections and intensifiers),
linguistic markers, and contextual information (such as user profiles, or past
conversations) were used to detect the sarcastic tone. However, modern social
media platforms allow users to create multimodal messages where audiovisual content
is integrated with the text, making the analysis of a mode in isolation
partial. In our work, we first study the relationship between the textual and
visual aspects in multimodal posts from three major social media platforms,
i.e., Instagram, Tumblr and Twitter, and we run a crowdsourcing task to
quantify the extent to which images are perceived as necessary by human
annotators. Moreover, we propose two different computational frameworks to
detect sarcasm that integrate the textual and visual modalities. The first
approach exploits visual semantics trained on an external dataset, and
concatenates the semantics features with state-of-the-art textual features. The
second method adapts a visual neural network initialized with parameters
trained on ImageNet to multimodal sarcastic posts. Results show the positive
effect of combining modalities for the detection of sarcasm across platforms
and methods.
Comment: 10 pages, 3 figures, final version published in the Proceedings of
ACM Multimedia 201
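The first approach in the abstract above, concatenating visual semantic features with textual features, can be illustrated with a minimal sketch. The abstract does not say which classifier or feature extractors are used, so the function names, the toy feature values, and the linear scorer below are all assumptions made for illustration only.

```python
from typing import List, Sequence


def fuse_features(text_vec: Sequence[float], visual_vec: Sequence[float]) -> List[float]:
    """Early fusion: join the two modality vectors into one feature vector."""
    return [float(x) for x in text_vec] + [float(x) for x in visual_vec]


def linear_score(features: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Hypothetical linear scorer over the fused vector (not the paper's model)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights)) + bias


# Toy values, invented for illustration only.
text_features = [1.0, 2.0]    # e.g. counts of lexical indicators
visual_features = [3.0]       # e.g. one visual-semantics activation
fused = fuse_features(text_features, visual_features)
score = linear_score(fused, weights=[1.0, 0.0, 1.0])
```

Any downstream classifier then sees a single vector whose length is the sum of the two modality dimensions.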
String Indexing for Patterns with Wildcards
We consider the problem of indexing a string of length to report the
occurrences of a query pattern containing characters and wildcards.
Let be the number of occurrences of in , and the size of
the alphabet. We obtain the following results.
- A linear space index with query time .
This significantly improves the previously best known linear space index by Lam
et al. [ISAAC 2007], which requires query time in the worst case.
- An index with query time using space , where is the maximum number of wildcards allowed in the pattern.
This is the first non-trivial bound with this query time.
- A time-space trade-off, generalizing the index by Cole et al. [STOC 2004].
We also show that these indexes can be generalized to allow variable length
gaps in the pattern. Our results are obtained using a novel combination of
well-known and new techniques, which could be of independent interest.
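To make the query in the abstract above concrete, the following sketch reports all occurrences of a pattern with single-character wildcards by scanning the string directly. It is only a naive baseline that defines the problem, not any of the indexes the abstract describes. The function name and the choice of `*` as the wildcard symbol are assumptions.

```python
from typing import List


def find_occurrences(text: str, pattern: str, wildcard: str = "*") -> List[int]:
    """Return every start index where pattern matches text.

    A wildcard position in the pattern matches any single character.
    This is a direct scan with no preprocessing of the text.
    """
    n, m = len(text), len(pattern)
    hits = []
    for start in range(n - m + 1):
        # Compare position by position; wildcard positions always match.
        if all(pc == wildcard or pc == text[start + i]
               for i, pc in enumerate(pattern)):
            hits.append(start)
    return hits


occurrences = find_occurrences("abracadabra", "a*a")
```

The scan costs time proportional to the product of the text and pattern lengths on every query, which is the cost an index is meant to avoid.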
Competition and Selection Among Conventions
In many domains, a latent competition among different conventions determines
which one will come to dominate. One sees such effects in the success of
community jargon, of competing frames in political rhetoric, or of terminology
in technical contexts. These effects have become widespread in the online
domain, where the data offers the potential to study competition among
conventions at a fine-grained level.
In analyzing the dynamics of conventions over time, however, even with
detailed on-line data, one encounters two significant challenges. First, as
conventions evolve, the underlying substance of their meaning tends to change
as well; and such substantive changes confound investigations of social
effects. Second, the selection of a convention takes place through the complex
interactions of individuals within a community, and contention between the
users of competing conventions plays a key role in the convention's evolution.
Any analysis must take place in the presence of these two issues.
In this work we study a setting in which we can cleanly track the competition
among conventions. Our analysis is based on the spread of low-level authoring
conventions in the eprint arXiv over 24 years: by tracking the spread of macros
and other author-defined conventions, we are able to study conventions that
vary even as the underlying meaning remains constant. We find that the
interaction among co-authors over time plays a crucial role in their selection;
the distinction between more and less experienced members of the
community, and the distinction between conventions with visible versus
invisible effects, are both central to the underlying processes. Through our
analysis we make predictions at the population level about the ultimate success
of different synonymous conventions over time--and at the individual level
about the outcome of "fights" between people over convention choices.
Comment: To appear in Proceedings of WWW 2017, data at
https://github.com/CornellNLP/Macro
Comparison of Spectra in Unsequenced Species
We introduce a new algorithm for the mass spectrometric identification of proteins. Experimental spectra obtained by tandem MS/MS are directly compared to theoretical spectra generated from proteins of evolutionarily closely related organisms. This work is motivated by the need for a method that allows the identification of proteins of unsequenced species against a database containing proteins of related organisms. The idea is that matching spectra of unknown peptides to very similar MS/MS spectra generated from this database of annotated proteins can lead to the annotation of unknown proteins. This process is similar to ortholog annotation in protein sequence databases. The difficulty with such an approach is that two similar peptides, even with just one modification (i.e. insertion, deletion or substitution of one or several amino acid(s)) between them, usually generate very dissimilar spectra. In this paper, we present a new dynamic programming based algorithm: PacketSpectralAlignment. Our algorithm is tolerant to modifications and fully exploits two important properties that are usually not considered: the notion of inner symmetry, a relation linking pairs of spectrum peaks, and the notion of packet inside each spectrum to keep related peaks together. Our algorithm, PacketSpectralAlignment, is then compared to SpectralAlignment [1] on a dataset of simulated spectra. Our tests show that PacketSpectralAlignment behaves better, in terms of results and execution time.
Proper Alignment of MS/MS Spectra from Unsequenced Species
Correct interpretation of tandem mass spectrometry (MS/MS) data is a critical step in the protein identification process. Comparing experimental spectra against a library of simulated spectra generated from a database is the most common strategy for this interpretation. Unfortunately, problems arise when treating unsequenced species since, in this case, the proteins to be identified are absent from the databanks and experimental spectra can only be compared to theoretical spectra from close and already sequenced organisms. In this context, spectra comparisons become a notoriously difficult problem. In this paper, we deal with this problem by considerably improving PacketSpectralAlignment (PSA), a method we presented in [1]. First, we explain how to take full advantage of PSA by carefully selecting the most promising alignment positions during the algorithm, and how to precisely fix the parameters of PSA. Second, we present a new method, referred to as PSAwEL, which allows a better localisation of modifications. We then propose a new peptide identification framework that integrates these improvements. Finally, we propose a comparison between PSA and the reference, SpectralAlignment [2], which shows that PSA behaves better in terms of: (i) quality of the results; and (ii) execution time. Our tests were conducted on the ISB dataset [3]. We then validate our new framework on Brachypod
Colour reconnection in e+e- -> W+W- at sqrt(s) = 189 - 209 GeV
The effects of the final state interaction phenomenon known as colour
reconnection are investigated at centre-of-mass energies in the range sqrt(s) ~
189-209 GeV using the OPAL detector at LEP. Colour reconnection is expected to
affect observables based on charged particles in hadronic decays of W+W-.
Measurements of inclusive charged particle multiplicities, and of their angular
distribution with respect to the four jet axes of the events, are used to test
models of colour reconnection. The data are found to exclude extreme scenarios
of the Sjostrand-Khoze Type I (SK-I) model and are compatible with other
models, both with and without colour reconnection effects. In the context of
the SK-I model, the best agreement with data is obtained for a reconnection
probability of 37%. Assuming no colour reconnection, the charged particle
multiplicity in hadronically decaying W bosons is measured to be (nqqch) =
19.38+-0.05(stat.)+-0.08 (syst.).
Comment: 30 pages, 9 figures, Submitted to Euro. Phys. J.
W+W- production and triple gauge boson couplings at LEP energies up to 183 GeV
A study of W-pair production in e+e- annihilations at LEP2 is presented,
based on 877 W+W- candidates corresponding to an integrated luminosity of 57
pb-1 at sqrt(s) = 183 GeV. Assuming that the angular distributions of the
W-pair production and decay, as well as their branching fractions, are
described by the Standard Model, the W-pair production cross-section is
measured to be 15.43 +- 0.61 (stat.) +- 0.26 (syst.) pb. Assuming lepton
universality and combining with our results from lower centre-of-mass energies,
the W branching fraction to hadrons is determined to be 67.9 +- 1.2 (stat.) +-
0.5 (syst.)%. The number of W-pair candidates and the angular distributions for
each final state (qqlnu,qqqq,lnulnu) are used to determine the triple gauge
boson couplings. After combining these values with our results from lower
centre-of-mass energies we obtain D(kappa_g)=0.11+0.52-0.37,
D(g^z_1)=0.01+0.13-0.12 and lambda=-0.10+0.13-0.12, where the errors include
both statistical and systematic uncertainties and each coupling is determined
setting the other two couplings to the Standard Model value. The fraction of W
bosons produced with a longitudinal polarisation is measured to be
0.242+-0.091(stat.)+-0.023(syst.). All these measurements are consistent with
the Standard Model expectations.
Comment: 48 pages, LaTeX, including 13 eps or ps figures, submitted to
European Physical Journal
Measurement of the Hadronic Cross-Section for the Scattering of Two Virtual Photons at LEP
The interaction of virtual photons is investigated using the reaction e+e- ->
e+e- hadrons based on data taken by the OPAL experiment at e+e- centre-of-mass
energies sqrt(s_ee)=189-209 GeV, for W>5 GeV and at an average Q^2 of 17.9
GeV^2. The measured cross-sections are compared to predictions of the Quark
Parton Model (QPM), to the Leading Order QCD Monte Carlo model PHOJET, to the
NLO prediction for the reaction e+e- -> e+e-qqbar, and to BFKL calculations.
PHOJET, NLO e+e- -> e+e-qqbar, and QPM describe the data reasonably well,
whereas the cross-section predicted by a Leading Order BFKL calculation is too
large.
Comment: 30 pages, 10 figures, Submitted to Eur.Phys.J.
Bose-Einstein Correlations in e+e- to W+W- at 172 and 183 GeV
Bose-Einstein correlations between like-charge pions are studied in hadronic
final states produced by e+e- annihilations at center-of-mass energies of 172
and 183 GeV. Three event samples are studied, each dominated by one of the
processes W+W- to qqlnu, W+W- to qqqq, or (Z/g)* to qq. After demonstrating the
existence of Bose-Einstein correlations in W decays, an attempt is made to
determine Bose-Einstein correlations for pions originating from the same W
boson and from different W bosons, as well as for pions from (Z/g)* to qq
events. The following results are obtained for the individual chaoticity
parameters lambda assuming a common source radius R: lambda_same = 0.63 +- 0.19
+- 0.14, lambda_diff = 0.22 +- 0.53 +- 0.14, lambda_Z = 0.47 +- 0.11 +- 0.08, R
= 0.92 +- 0.09 +- 0.09. In each case, the first error is statistical and the
second is systematic. At the current level of statistical precision it is not
established whether Bose-Einstein correlations between pions from different W
bosons exist or not.
Comment: 24 pages, LaTeX, including 6 eps figures, submitted to European
Physical Journal
Bose-Einstein Correlations of Three Charged Pions in Hadronic Z^0 Decays
Bose-Einstein Correlations (BEC) of three identical charged pions were
studied in 4 x 10^6 hadronic Z^0 decays recorded with the OPAL detector at LEP.
The genuine three-pion correlations, corrected for the Coulomb effect, were
separated from the known two-pion correlations by a new subtraction procedure.
A significant genuine three-pion BEC enhancement near threshold was observed
having an emitter source radius of r_3 = 0.580 +/- 0.004 (stat.) +/- 0.029
(syst.) fm and a strength of \lambda_3 = 0.504 +/- 0.010 (stat.) +/- 0.041
(syst.). The Coulomb correction was found to increase the \lambda_3 value by
~9% and to reduce r_3 by ~6%. The measured \lambda_3 corresponds to a value of
0.707 +/- 0.014 (stat.) +/- 0.078 (syst.) when one takes into account the
three-pion sample purity. A relation between the two-pion and the three-pion
source parameters is discussed.
Comment: 19 pages, LaTeX, 5 eps figures included, accepted by Eur. Phys. J.
