402 research outputs found

    Part-Of-Speech Tagging Of Urdu in Limited Resources Scenario

    We address the problem of part-of-speech (POS) tagging of Urdu. POS tagging is the process of assigning a part-of-speech or lexical class marker to each word in a given text. Tagging for natural languages is similar to tokenization and lexical analysis for computer languages, except that ambiguities arise and must be resolved. It plays a fundamental role in various Natural Language Processing (NLP) applications such as word sense disambiguation, parsing, named entity recognition and chunking. POS tagging plays a particularly important role in processing free-word-order languages because such languages have relatively complex morphological structure. Urdu is a morphologically rich language: forms of the verb, as well as case, gender, and number, are expressed by the morphology. It shares its morphology, phonology and grammatical structures with Hindi, and shares vocabulary with Arabic, Persian, Sanskrit, Turkish and Pashto. Urdu is written using the Perso-Arabic script. POS tagging is a necessary component of most NLP applications for Urdu. The development of an Urdu POS tagger will influence several pipelined modules of a natural language understanding system, including machine translation, partial parsing and word sense disambiguation. Our objective is to develop a robust POS tagger for Urdu. We have worked on the automatic annotation of part-of-speech for Urdu: we defined a tag-set for Urdu and manually annotated a corpus of 10,000 sentences. We used different machine learning methods, namely Hidden Markov Model (HMM), Maximum Entropy Model (ME) and Conditional Random Field (CRF). Further, to deal with the small annotated corpus, we explored semi-supervised learning using an additional un-annotated corpus. We also explored the use of a dictionary that provides all possible POS labels for a given word. Since Urdu is morphologically productive, we augmented the HMM, ME and CRF models with morphological features, word suffixes and POS categories of words to develop a robust POS tagger for Urdu in the limited-resources scenario.

    Cadmium in the purpleback flying squid Sthenoteuthis oualaniensis (Lesson, 1830) along northwest coast of India

    The purpleback flying squid Sthenoteuthis oualaniensis (Lesson, 1830) is landed in small quantities along the northwest coast of India. In view of the possible utilization of this species for domestic and export markets, cadmium (Cd) accumulation in the body tissues, which often causes concern, was studied. The dorsal mantle length of the males and females observed during the study ranged from 34 to 47 cm and 30 to 32 cm respectively. The highest mean concentration of Cd, 435.22 ± 61.27 μg g⁻¹ (mean ± S.E.), was found in the liver. Accumulation of Cd was also prominent in the gut, gills and skin. Moderate concentrations of Cd (1 to 4 μg g⁻¹) were observed in the nidamental gland, accessory nidamental gland, eyes, tentacles and muscle. In the gonads and arms, the concentration was below the acceptability level of 1.0 μg g⁻¹. Higher accumulation was observed in most of the organs and tissues of larger squids. Significantly higher accumulation (p < 0.05) was noticed in the liver of larger specimens, indicating bioaccumulation. As the mean Cd content in the edible part was more than 1.0 μg g⁻¹, this study highlights the need for detailed investigations to understand the bioaccumulation of Cd in Sthenoteuthis oualaniensis.

    Search for a vector-like quark T′ → tH via the diphoton decay mode of the Higgs boson in proton-proton collisions at √s = 13 TeV

    A search for the electroweak production of a vector-like quark T′, decaying to a top quark and a Higgs boson, is presented. The search is based on a sample of proton-proton collision events recorded at the LHC at √s = 13 TeV, corresponding to an integrated luminosity of 138 fb⁻¹. This is the first T′ search that exploits the Higgs boson decay to a pair of photons. For narrow isospin singlet T′ states with masses up to 1.1 TeV, the excellent diphoton invariant mass resolution of 1–2% results in an increased sensitivity compared to previous searches based on the same production mechanism. The electroweak production of a T′ quark with mass up to 960 GeV is excluded at 95% confidence level, assuming a coupling strength κ_T = 0.25 and a relative decay width Γ/M_T′ < 5%.

    Search for high-mass exclusive γγ → WW and γγ → ZZ production in proton-proton collisions at √s = 13 TeV

    Measurement of the Higgs boson inclusive and differential fiducial production cross sections in the diphoton decay channel with pp collisions at √s = 13 TeV

    Measurements of the inclusive and differential fiducial cross sections of the Higgs boson decaying to a pair of photons are presented. The analysis is performed using proton-proton collision data recorded with the CMS detector at the LHC at a centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 137 fb$^{-1}$. The inclusive fiducial cross section is measured to be $\sigma_{\text{fid}} = 73.4^{+5.4}_{-5.3}\,(\text{stat})\,^{+2.4}_{-2.2}\,(\text{syst})$ fb, in agreement with the standard model expectation of 75.4 ± 4.1 fb. The measurements are also performed in fiducial regions targeting different production modes and as a function of several observables describing the diphoton system, the number of additional jets present in the event, and other kinematic observables. Two double differential measurements are performed. No significant deviations from the standard model expectations are observed.

    Search for Higgs Boson Decay to a Charm Quark-Antiquark Pair in Proton-Proton Collisions at √s = 13 TeV

    A search for the standard model Higgs boson decaying to a charm quark-antiquark pair, H → cc̄, produced in association with a leptonically decaying V (W or Z) boson is presented. The search is performed with proton-proton collisions at √s = 13 TeV collected by the CMS experiment, corresponding to an integrated luminosity of 138 fb⁻¹. Novel charm jet identification and analysis methods using machine learning techniques are employed. The analysis is validated by searching for Z → cc̄ in VZ events, leading to its first observation at a hadron collider with a significance of 5.7 standard deviations. The observed (expected) upper limit on σ(VH)B(H → cc̄) is 0.94 ($0.50^{+0.22}_{-0.15}$) pb at 95% confidence level (C.L.), corresponding to 14 ($7.6^{+3.4}_{-2.3}$) times the standard model prediction. For the Higgs-charm Yukawa coupling modifier, κ_c, the observed (expected) 95% C.L. interval is 1.1 < |κ_c| < 5.5 (|κ_c| < 3.4), the most stringent constraint to date.

    Evidence for four-top quark production in proton-proton collisions at √s = 13 TeV

    Search for pair-produced vector-like leptons in final states with third-generation leptons and at least three b quark jets in proton-proton collisions at √s = 13 TeV

    Observation of τ Lepton Pair Production in Ultraperipheral Pb-Pb Collisions at √s_NN = 5.02 TeV
