EPIE Dataset: A Corpus For Possible Idiomatic Expressions
Idiomatic expressions have always been a bottleneck for language
comprehension and natural language understanding, specifically for tasks like
Machine Translation (MT). MT systems predominantly produce literal translations
of idiomatic expressions as they do not exhibit generic and linguistically
deterministic patterns which can be exploited for comprehension of the
non-compositional meaning of the expressions. These expressions occur in
parallel corpora used for training, but due to the comparatively high
occurrences of the constituent words of idiomatic expressions in literal
context, the idiomatic meaning gets overpowered by the compositional meaning of
the expression. State-of-the-art Metaphor Detection Systems are able to detect
non-compositional usage at word level but miss out on idiosyncratic phrasal
idiomatic expressions. This creates a dire need for a dataset with a wider
coverage and higher occurrence of commonly occurring idiomatic expressions, the
spans of which can be used for Metaphor Detection. With this in mind, we
present our English Possible Idiomatic Expressions (EPIE) corpus containing
25206 sentences labelled with lexical instances of 717 idiomatic expressions.
These spans also cover literal usages for the given set of idiomatic
expressions. We also present the utility of our dataset by using it to train a
sequence labelling module and testing on three independent datasets with high
accuracy, precision and recall scores.
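As a rough illustration of the span labelling described in the abstract above, the sketch below tags every lexical occurrence of a candidate idiom in a tokenized sentence with BIO labels. The function name `bio_tags` and the tag names `B-IDIOM`/`I-IDIOM` are assumptions made for this example and are not taken from the EPIE release. The match is purely lexical, so literal and idiomatic usages are tagged alike.

```python
from typing import List


def bio_tags(tokens: List[str], idiom: List[str]) -> List[str]:
    """Tag each exact (case-insensitive) occurrence of `idiom` in `tokens`.

    Hypothetical tag scheme: B-IDIOM opens a span, I-IDIOM continues it,
    O marks every token outside a span.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    target = [t.lower() for t in idiom]
    n = len(target)
    if n == 0:
        return tags
    i = 0
    while i <= len(tokens) - n:
        if lowered[i:i + n] == target:
            tags[i] = "B-IDIOM"
            for j in range(i + 1, i + n):
                tags[j] = "I-IDIOM"
            i += n  # skip past the matched span
        else:
            i += 1
    return tags


sentence = "They will spill the beans tomorrow".split()
labels = bio_tags(sentence, ["spill", "the", "beans"])
for token, label in zip(sentence, labels):
    print(f"{token}\t{label}")
```

A sequence labelling model would then be trained to predict such tags from the tokens. Inflected or discontinuous forms would need more than the exact matching shown here.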
UNRAVELING THE COMPLEXITY OF GAS-PHASE LITHIUM-CATIONIZED CARBOHYDRATE CHEMISTRY AND STRUCTURES
Complete structural elucidation of carbohydrate molecules remains a prominent challenge in analytical chemistry. Much of the structural intricacy of carbohydrates stems from the various isomeric monosaccharide subunits that are linked together. To date, there are few analytical techniques capable of differentiating monosaccharide isomers. Mass spectrometry-based technologies are promising for differentiation of monosaccharides because of their high selectivity and sensitivity, and short analysis times. Mass spectrometry analysis first requires generation of gas-phase ions from solution-phase molecules, which are then separated based on their mass-to-charge ratio (m/z). Monosaccharide isomers cannot be distinguished by mass spectrometry alone because they have identical m/z. Tandem mass spectrometry (MS/MS) techniques rely on gas-phase chemistry to differentiate isomers and stereoisomers within the mass spectrometer. Two examples of gas-phase chemistries that are useful for isomer differentiation are unimolecular dissociation and ion/molecule reactions. The MS/MS response, or gas-phase chemistry, of an ion will depend on ion structure and the charge carrier (H+/Na+/Li+/etc.), which is affected by the mode of ionization.
Electrospray ionization (ESI) is commonly employed for ionization of carbohydrates. Because monosaccharides have a high metal cation affinity and because sodium is ubiquitous in solvents, ESI of a monosaccharide solution results in sodium-cationized monosaccharides. Alternatively, lithium salts can be added into the ESI solution to generate lithium-cationized monosaccharides. Monosaccharide oxygen atoms form multidentate (bi-/tri-/tetradentate) coordinations with lithium, and multiple potential sites for cation coordination exist on a monosaccharide molecule. To differentiate monosaccharide isomers, the ion distribution that each isomer forms must have measurable differences in gas-phase chemistry.
The most common MS/MS technique is collision-induced dissociation (CID), but, in general, CID response is not disparate enough between lithium-cationized monosaccharide isomers for differentiation. Another MS/MS technique that has been able to differentiate isomeric monosaccharides is the water adduction ion/molecule reaction. Using a combination of computational data and experimental water adduction data, the structures of solution- and gas-phase lithium-cationized monosaccharide ions were explored, and the chemistry and mechanism of the water adduction reaction were investigated. Finally, using CID and water adduction, the gas-phase dissociation chemistry and product ion structures of lithium-cationized hexoses were shown to be more complex than previously postulated.
Doctor of Philosophy
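The statement that monosaccharide isomers share an identical m/z can be checked with a short calculation. The sketch below is only an illustration: the function name `lithiated_mz` is hypothetical, the masses are approximate monoisotopic values rounded to six decimals, and the choice of glucose, galactose and mannose as example hexoses is an assumption not drawn from the thesis.

```python
# Approximate monoisotopic masses in unified atomic mass units (u).
MONOISOTOPIC_MASS = {"C": 12.000000, "H": 1.007825, "O": 15.994915}
LI7_MASS = 7.016003       # most abundant lithium isotope (approximate)
ELECTRON_MASS = 0.000549  # removed to form the singly charged cation


def lithiated_mz(formula):
    """Return the m/z of the singly charged [M + Li]+ ion for a formula dict."""
    neutral = sum(MONOISOTOPIC_MASS[el] * n for el, n in formula.items())
    charge = 1
    return (neutral + LI7_MASS - ELECTRON_MASS) / charge


# Hexose isomers share one elemental formula, C6H12O6 (illustrative choice).
HEXOSE = {"C": 6, "H": 12, "O": 6}
isomers = {"glucose": HEXOSE, "galactose": HEXOSE, "mannose": HEXOSE}
mz_values = {name: lithiated_mz(f) for name, f in isomers.items()}

for name, mz in mz_values.items():
    print(f"{name}: m/z = {mz:.4f}")
```

Because the calculation depends only on elemental composition, every isomer gives the same value (about 187.08), which is why a second stage of gas-phase chemistry is needed to tell them apart.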
Phosphate Tether-Mediated Ring-Closing Metathesis Studies to Complex 1,3-anti-Diol-Containing Subunits
This is the peer reviewed version of the following article: Chegondi, R., Maitra, S., Markley, J. L., & Hanson, P. R. (2013). Phosphate Tether-Mediated Ring-Closing Metathesis Studies to Complex 1,3-anti-Diol-Containing Subunits. Chemistry (Weinheim an Der Bergstrasse, Germany), 19(25), 10.1002/chem.201300913. http://doi.org/10.1002/chem.201300913, which has been published in final form at doi.org/10.1002/chem.201300913. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.
An array of examples of diastereoselective, phosphate tether-mediated ring-closing metathesis reactions, which highlight the importance of product ring size and substrate stereochemical compatibility, as well as complexity, is reported. Studies focus primarily on the formation of bicyclo[n.3.1]phosphates, involving the coupling of C2-symmetric dienediol subunits with a variety of simple, as well as complex, alcohol cross-partners.
TTT-UCDR: Test-time Training for Universal Cross-Domain Retrieval
Image retrieval under generalized test scenarios has gained significant
momentum in the literature, and the recently proposed protocol of Universal
Cross-domain Retrieval is a pioneer in this direction. A common practice in any
such generalized classification or retrieval algorithm is to exploit samples
from multiple domains during training to learn a domain-invariant
representation of data. Such a criterion is often restrictive, and thus in this
work, for the first time, we explore the challenges associated with generalized
retrieval problems under a low-data regime, which is quite relevant in many
real-world scenarios. We attempt to make any retrieval model trained on a small
cross-domain dataset (containing just two training domains) more generalizable
towards any unknown query domain or category by quickly adapting it to the test
data during inference. This form of test-time training or adaptation of the
retrieval model is explored by means of a number of self-supervision-based loss
functions, for example, RotNet, Jigsaw puzzle, Barlow Twins, etc., in this
work. Extensive experiments on multiple large-scale datasets demonstrate the
effectiveness of the proposed approach. Comment: 9 pages, 1 figure, 3 tables
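To make the rotation-based self-supervision mentioned in the abstract above more concrete, the sketch below builds the input side of a rotation-prediction pretext task from a single unlabeled image: four rotated copies paired with the rotation index as a free label. It is a minimal illustration under stated assumptions, not the paper's method. The function name `rotation_batch` is hypothetical, the image is assumed to be a square array of shape (H, W, C), and the model update that test-time training would perform on this batch is omitted.

```python
import numpy as np


def rotation_batch(image):
    """Return (rotated_images, labels) for a rotation-prediction pretext task.

    `image` is assumed to be square with shape (H, W, C). Label k means the
    image was rotated by k * 90 degrees, so no human annotation is needed.
    """
    if image.shape[0] != image.shape[1]:
        raise ValueError("this sketch assumes a square image")
    rotated = [np.rot90(image, k=k, axes=(0, 1)) for k in range(4)]
    labels = np.arange(4)
    return np.stack(rotated), labels


# A tiny stand-in for an unlabeled query image seen at inference time.
query = np.arange(12).reshape(2, 2, 3)
batch, labels = rotation_batch(query)
print(batch.shape, labels.tolist())
```

At test time, a model with an auxiliary rotation-classification head could be updated for a few steps on such batches before retrieval; that adaptation loop depends on the chosen framework and is left out here.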
Phosphate Tether-Mediated Ring-Closing Metathesis for the Generation of Medium to Large, P-Stereogenic Bicyclo[n.3.1]phosphates
A phosphate tether-mediated ring-closing metathesis study towards the synthesis of P-stereogenic bicyclo[6.3.1]-, bicyclo[7.3.1]-, and bicyclo[8.3.1]phosphates is reported. This study demonstrates expanded utility of phosphate tether-mediated desymmetrization of C2-symmetric, 1,3-anti-diol dienes in generating complex medium to large, P-stereogenic bicyclo[n.3.1]phosphates.
A Concise, Phosphate-Mediated Approach to the Total Synthesis of (−)-Tetrahydrolipstatin
An efficient synthesis of (−)-tetrahydrolipstatin (THL) is reported. This method takes advantage of a phosphate tether-mediated, one-pot, sequential RCM/CM/hydrogenation protocol to deliver THL in 8 total steps from a readily prepared (S,S)-triene. The strategy incorporates selective cross metathesis, regioselective hydrogenation, regio- and diastereoselective cuprate addition and Mitsunobu inversion for installation of the C5 formamide ester subunit.