193 research outputs found
An approach for mistranslation removal from popular dataset for Indic MT Task
The conversion of content from one language to another utilizing a computer
system is known as Machine Translation (MT). Various techniques have come up to
ensure effective translations that retain the contextual and lexical
interpretation of the source language. End-to-end Neural Machine Translation
(NMT) is a popular technique and it is now widely used in real-world MT
systems. Massive amounts of parallel datasets (sentences in one language
alongside translations in another) are required for MT systems. These datasets
are crucial for an MT system to learn linguistic structures and patterns of
both languages during the training phase. One such dataset is Samanantar, the
largest publicly accessible parallel dataset for Indian languages (ILs). Since
the corpus has been gathered from various sources, it contains many incorrect
translations. Hence, MT systems built using this dataset cannot perform to
their full potential. In this paper, we propose an algorithm to remove
mistranslations from the training corpus and evaluate its performance and
efficiency. Two Indic languages (ILs), namely, Hindi (HIN) and Odia (ODI) are
chosen for the experiment. A baseline NMT system is built for these two ILs,
and the effect of different dataset sizes is also investigated. The quality of
the translations in the experiment is evaluated using standard metrics such as
BLEU, METEOR, and RIBES. The results show that removing incorrect
translations from the dataset improves translation quality. It is also
observed that, although the ILs-English and English-ILs systems are
trained using the same corpus, ILs-English works more effectively
across all the evaluation metrics.
Comment: 18 pages
Circular pattern matching with k mismatches
The k-mismatch problem consists in computing the Hamming distance between a pattern P of length m and every length-m substring of a text T of length n, if this distance is no more than k. In many real-world applications, any cyclic shift of P is a relevant pattern, and thus one is interested in computing the minimal distance of every length-m substring of T and any cyclic shift of P. This is the circular pattern m
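The problem statement above can be made concrete with a naive sketch. It is not the algorithm from the paper, only a brute-force illustration of the definition: for every length-m window of T it takes the minimum Hamming distance over all cyclic shifts of P and reports the window if that minimum is at most k. The function names `hamming` and `circular_k_mismatch` are hypothetical, and the running time is roughly O(n · m²), far from what an efficient solution would achieve.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def circular_k_mismatch(text: str, pattern: str, k: int) -> dict[int, int]:
    """Brute-force circular k-mismatch (illustrative only).

    Returns {start position in text: minimal Hamming distance between the
    length-m window at that position and any cyclic shift of pattern},
    keeping only positions whose minimal distance is at most k.
    """
    m = len(pattern)
    if m == 0 or m > len(text):
        return {}
    # All distinct cyclic shifts (rotations) of the pattern.
    rotations = {pattern[i:] + pattern[:i] for i in range(m)}
    result = {}
    for pos in range(len(text) - m + 1):
        window = text[pos:pos + m]
        best = min(hamming(window, rot) for rot in rotations)
        if best <= k:
            result[pos] = best
    return result


if __name__ == "__main__":
    print(circular_k_mismatch("abcab", "bca", 0))
    print(circular_k_mismatch("aaxb", "ab", 1))
```

Using a set for the rotations avoids repeated work when the pattern is periodic, but the sketch is meant only to pin down what is being computed.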
Structure and mechanics of supporting cells in the guinea pig organ of Corti.
The mechanical properties of the mammalian organ of Corti determine its sensitivity to sound frequency and intensity, and the structure of supporting cells changes progressively with frequency along the cochlea. From the apex (low frequency) to the base (high frequency) of the guinea pig cochlea, inner pillar cells decrease in length incrementally from 75 to 55 µm whilst the number of axial microtubules increases from 1,300 to 2,100. The respective values for outer pillar cells are 120 to 65 µm and 1,500 to 3,000. This correlates with a progressive decrease in the length of the outer hair cells from >100 µm to 20 µm. Deiters' cell bodies vary from 60 to 50 µm long with relatively little change in microtubule number. Their phalangeal processes reflect the lengths of outer hair cells but their microtubule numbers do not change systematically. Correlations between cell length, microtubule number and cochlear location are poor below 1 kHz. Cell stiffness was estimated from direct mechanical measurements made previously from isolated inner and outer pillar cells. We estimate that between 200 Hz and 20 kHz axial stiffness, bending stiffness and buckling limits increase, respectively, ~3, 6 and 4 fold for outer pillar cells, ~2, 3 and 2.5 fold for inner pillar cells and ~7, 20 and 24 fold for the phalangeal processes of Deiters' cells. There was little change in the Deiters' cell bodies for any parameter. Compensating for effective cell length, the pillar cells are likely to be considerably stiffer than Deiters' cells, with buckling limits 10-40 times greater. These data show a clear relationship between cell mechanics and frequency. However, measurements from single cells alone are insufficient and they must be combined with more accurate details of how the multicellular architecture influences the mechanical properties of the whole organ.
Superlattices: problems and new opportunities, nanosolids
Superlattices were introduced 40 years ago as man-made solids to enrich the class of materials for electronic and optoelectronic applications. The field metamorphosed to quantum wells and quantum dots, with ever decreasing dimensions dictated by technological advancements in the nanometer regime. In recent years, the field has gone beyond semiconductors to metals and organic solids. A superlattice is simply a way of forming a uniform continuum for whatever purpose is at hand. There are problems with doping, defect-induced random switching, and I/O involving quantum dots. However, new opportunities in component-based nanostructures may lead the field to new heights. The all-important translational symmetry of solids is relaxed, and local symmetry is needed in nanosolids.
STELLAR: fast and exact local alignments
Background: Large-scale comparison of genomic sequences requires reliable tools for the search of local alignments. Practical local aligners are in general fast, but heuristic, and hence sometimes miss significant matches. Results: We present here the local pairwise aligner STELLAR that has full sensitivity for ε-alignments, i.e. guarantees to report all local alignments of a given minimal length and maximal error rate. The aligner is composed of two steps, filtering and verification. We apply the SWIFT algorithm for lossless filtering, and have developed a new verification strategy that we prove to be exact. Our results on simulated and real genomic data confirm and quantify the conjecture that heuristic tools like BLAST or BLAT miss a large percentage of significant local alignments. Conclusions: STELLAR is very practical and fast on very long sequences which makes it a suitable new tool for finding local alignments between genomic sequences under the edit distance model. Binaries are freely available for Linux, Windows, and Mac OS X at http://www.seqan.de/projects/stellar. The source code is freely distributed with the SeqAn C++ library version 1.3 and later at http://www.seqan.de.
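The acceptance criterion mentioned above (a minimal length and a maximal error rate) can be illustrated with a small check on an already computed alignment. This is only a sketch under stated assumptions and is not STELLAR's filtering or verification code: it assumes the alignment is given as two equal-length gapped rows with `-` as the gap character, and that the error rate is the number of mismatch or gap columns divided by the number of alignment columns. The names `error_rate` and `is_epsilon_match` are hypothetical.

```python
def error_rate(row_a: str, row_b: str) -> float:
    """Fraction of alignment columns that are errors (mismatch or gap).

    Assumes both rows are the same length and use '-' for gaps.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    if not row_a:
        raise ValueError("alignment must not be empty")
    # A column is an error when the two symbols differ; a gap against a
    # base therefore counts as one error (edit distance model).
    errors = sum(1 for x, y in zip(row_a, row_b) if x != y)
    return errors / len(row_a)


def is_epsilon_match(row_a: str, row_b: str,
                     min_length: int, max_error_rate: float) -> bool:
    """True if the alignment is long enough and its error rate is small enough."""
    return (len(row_a) >= min_length
            and error_rate(row_a, row_b) <= max_error_rate)


if __name__ == "__main__":
    a = "ACGT-ACGT"
    b = "ACGTTACCT"
    print(error_rate(a, b))
    print(is_epsilon_match(a, b, min_length=8, max_error_rate=0.25))
```

A full-sensitivity aligner has to find every region that satisfies such a criterion, which is the hard part; the check itself is straightforward once an alignment is in hand.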
Chlamydia trachomatis Co-opts the FGF2 Signaling Pathway to Enhance Infection
The molecular details of Chlamydia trachomatis binding, entry, and spread are incompletely understood, but heparan sulfate proteoglycans (HSPGs) play a role in the initial binding steps. As cell surface HSPGs facilitate the interactions of many growth factors with their receptors, we investigated the role of HSPG-dependent growth factors in C. trachomatis infection. Here, we report a novel finding that Fibroblast Growth Factor 2 (FGF2) is necessary and sufficient to enhance C. trachomatis binding to host cells in an HSPG-dependent manner. FGF2 binds directly to elementary bodies (EBs) where it may function as a bridging molecule to facilitate interactions of EBs with the FGF receptor (FGFR) on the cell surface. Upon EB binding, FGFR is activated locally and contributes to bacterial uptake into non-phagocytic cells. We further show that C. trachomatis infection stimulates fgf2 transcription and enhances production and release of FGF2 through a pathway that requires bacterial protein synthesis and activation of the Erk1/2 signaling pathway but that is independent of FGFR activation. Intracellular replication of the bacteria results in host proteasome-mediated degradation of the high molecular weight (HMW) isoforms of FGF2 and increased amounts of the low molecular weight (LMW) isoforms, which are released upon host cell death. Finally, we demonstrate the in vivo relevance of these findings by showing that conditioned medium from C. trachomatis infected cells is enriched for LMW FGF2, accounting for its ability to enhance C. trachomatis infectivity in additional rounds of infection. Together, these results demonstrate that C. trachomatis utilizes multiple mechanisms to co-opt the host cell FGF2 pathway to enhance bacterial infection and spread.
Phyllanthus spp. Induces Selective Growth Inhibition of PC-3 and MeWo Human Cancer Cells through Modulation of Cell Cycle and Induction of Apoptosis
BACKGROUND: Phyllanthus is a traditional medicinal plant that has been used in the treatment of many diseases including hepatitis and diabetes. The main aim of the present work was to investigate the potential cytotoxic effects of aqueous and methanolic extracts of four Phyllanthus species (P. amarus, P. niruri, P. urinaria and P. watsonii) against skin melanoma and prostate cancer cells. METHODOLOGY/PRINCIPAL FINDINGS: Phyllanthus plant appears to possess cytotoxic properties, with half-maximal inhibitory concentration (IC50) values of 150-300 µg/ml for aqueous extract and 50-150 µg/ml for methanolic extract, as determined using the MTS reduction assay. In comparison, the plant extracts did not show any significant cytotoxicity on normal human skin (CCD-1127Sk) and prostate (RWPE-1) cells. The extracts produced a clear "ladder" fragmentation of apoptotic DNA on agarose gel and TUNEL-positive cells, with an elevation of caspase-3 and -7 activities. The Lactate Dehydrogenase (LDH) level was lower than 15% in Phyllanthus-treated cancer cells. These findings indicate that Phyllanthus extracts have the ability to induce apoptosis with minimal necrotic effects. Furthermore, cell cycle analysis revealed that Phyllanthus induced a G0/G1-phase arrest in PC-3 cells and an S-phase arrest in MeWo cells, and these were accompanied by accumulation of cells in the Sub-G1 (apoptosis) phase. The cytotoxic properties may be due to the presence of polyphenol compounds such as ellagitannins, gallotannins, flavonoids and phenolic acids found in both the water and methanol extracts of the plants. CONCLUSIONS/SIGNIFICANCE: Phyllanthus plant exerts its growth inhibition effect in a selective manner towards cancer cells through the modulation of cell cycle and induction of apoptosis via caspase activation in melanoma and prostate cancer cells. Hence, Phyllanthus may be sourced for the development of a potent apoptosis-inducing anticancer agent.
The importance of considering community-level effects when selecting insecticidal malaria vector products
BACKGROUND

Insecticide treatment of nets, curtains or walls and ceilings of houses represents the primary means for malaria prevention worldwide. Direct personal protection of individuals and households arises from deterrent and insecticidal activities which divert or kill mosquitoes before they can feed. However, at high coverage, community-level reductions of mosquito density and survival prevent more transmission exposure than the personal protection acquired by using a net or living in a sprayed house.

METHODS

A process-explicit simulation of malaria transmission was applied to results of 4 recent Phase II experimental hut trials comparing a new mosaic long-lasting insecticidal net (LLIN) which combines deltamethrin and piperonyl butoxide with another LLIN product by the same manufacturer relying on deltamethrin alone.

RESULTS

Direct estimates of mean personal protection against insecticide-resistant vectors in Vietnam, Cameroon, Burkina Faso and Benin revealed no clear advantage for combination LLINs over deltamethrin-only LLINs (P = 0.973) unless both types of nets were extensively washed (Relative mean entomologic inoculation rate (EIR) ± standard error of the mean (SEM) for users of combination nets compared to users of deltamethrin only nets = 0.853 ± 0.056, P = 0.008). However, simulations of impact at high coverage (80% use) predicted consistently better impact for the combination net across all four sites (Relative mean EIR ± SEM in communities with combination nets, compared with those using deltamethrin only nets = 0.613 ± 0.076, P < 0.001), regardless of whether the nets were washed or not (P = 0.467). Nevertheless, the degree of advantage obtained with the combination varied substantially between sites and their associated resistant vector populations.

CONCLUSION

Process-explicit simulations of community-level protection, parameterized using locally-relevant experimental hut studies, should be explicitly considered when choosing vector control products for large-scale epidemiological trials or public health programme procurement, particularly as growing insecticide resistance necessitates the use of multiple active ingredients.