Boilerplate Removal using a Neural Sequence Labeling Model
The extraction of main content from web pages is an important task for
numerous applications, ranging from usability aspects, like reader views for
news articles in web browsers, to information retrieval or natural language
processing. Existing approaches are lacking as they rely on large amounts of
hand-crafted features for classification. This results in models that are
tailored to a specific distribution of web pages, e.g. from a certain time
frame, but lack generalization power. We propose a neural sequence labeling
model that does not rely on any hand-crafted features but takes only the HTML
tags and words that appear in a web page as input. This allows us to present a
browser extension which highlights the content of arbitrary web pages directly
within the browser using our model. In addition, we create a new, more current
dataset to show that our model is able to adapt to changes in the structure of
web pages and outperform the state-of-the-art model. Comment: WWW20 Demo paper
Measuring gravitational waves from binary black hole coalescences: II. the waves' information and its extraction, with and without templates
We discuss the extraction of information from detected binary black hole
(BBH) coalescence gravitational waves, focusing on the merger phase that occurs
after the gradual inspiral and before the ringdown. Our results are: (1) If
numerical relativity simulations have not produced template merger waveforms
before BBH detections by LIGO/VIRGO, one can band-pass filter the merger waves.
For BBHs smaller than about 40 solar masses detected via their inspiral waves,
the band-pass filtering signal-to-noise ratio indicates that the merger waves
should typically be just barely visible in the noise for initial and advanced
LIGO interferometers. (2) We derive an optimized (maximum likelihood) method
for extracting a best-fit merger waveform from the noisy detector output; one
"perpendicularly projects" this output onto a function space (specified using
wavelets) that incorporates our prior knowledge of the waveforms. An extension
of the method allows one to extract the BBH's two independent waveforms from
outputs of several interferometers. (3) If numerical relativists produce codes
for generating merger templates but running the codes is too expensive to allow
an extensive survey of the merger parameter space, then a coarse survey of this
parameter space, to determine the ranges of the several key parameters and to
explore several qualitative issues which we describe, would be useful for data
analysis purposes. (4) A complete set of templates could be used to test the
nonlinear dynamics of general relativity and to measure some of the binary
parameters. We estimate the number of bits of information obtainable from the
merger waves (about 10 to 60 for LIGO/VIRGO, up to 200 for LISA), estimate the
information loss due to template numerical errors or sparseness in the template
grid, and infer approximate requirements on template accuracy and spacing. Comment: 33 pages, Revtex 3.1 macros, no figures, submitted to Phys Rev
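The "perpendicular projection" in result (2) can be illustrated with a minimal sketch. It makes simplifying assumptions that are not in the abstract: the noise is white, so the inner product reduces to a plain dot product, and the function space is spanned by two orthonormal Haar wavelets of length 4. The names `dot`, `project`, and `HAAR_COARSE` are hypothetical and chosen for illustration only.

```python
import math

def dot(u, v):
    """Plain Euclidean inner product (assumes white noise, an illustrative simplification)."""
    return sum(a * b for a, b in zip(u, v))

def project(data, basis):
    """Orthogonally project `data` onto the span of an ORTHONORMAL `basis`.

    For Gaussian white noise this projection is the maximum-likelihood
    estimate of a signal known to lie in that span.
    """
    coeffs = [dot(data, b) for b in basis]
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(len(data))]

# Two coarse, orthonormal Haar wavelets of length 4 (hypothetical "prior knowledge":
# the signal is assumed to vary only on the coarsest scale).
HAAR_COARSE = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
]

noisy_output = [1.0, 3.0, 5.0, 7.0]
best_fit = project(noisy_output, HAAR_COARSE)
residual = [d - f for d, f in zip(noisy_output, best_fit)]
```

The residual is orthogonal to every basis function, which is what makes the projection "perpendicular". A realistic analysis would replace the dot product with a noise-weighted inner product.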
Implementation and Validation of the Roche Light Cycler 480 96-Well Plate Platform as a Real-Time PCR Assay for the Quantitative Detection of Cytomegalovirus (CMV) in Clinical Specimens Using the Luminex MultiCode ASRs System.
Allogeneic stem-cell therapies benefit patients in the treatment of multiple diseases; however, the side effects of stem-cell therapies (SCT), which derive from the concomitant use of immunosuppressive agents, often include triggering infectious diseases. Thus, analysis is required to improve the detection of pathogen infections in SCT. We develop a polymerase chain reaction (PCR)-based methodology for the qualitative real-time DNA detection of cytomegalovirus (CMV), with reference to herpes simplex virus type 1 (HSV-1), Epstein-Barr virus (EBV), and varicella-zoster virus (VZV) in blood, urine, solid tissues, and cerebrospinal fluid. This real-time PCR in a 96-well plate format provides a rapid framework as required by the Food and Drug Administration (FDA) for clinical settings, including the processing of specimens, reagent handling, special safety precautions, quality control criteria and analytical accuracy, precision, reportable range (analytical measurement range), reference range, limit of detection (LOD), analytical specificity established by interference study, and analyte stability. Specifically, we determined the reportable range (analytical measurement range) with the following criteria: for CMV ≥200 copies/mL, report the copies/mL value; for CMV ≤199 copies/mL, report detected but below the quantitative range; for CMV = 0 copies, report <200 copies/mL. That is, with the reference range, copy numbers (CN) per milliliter (mL) of the LOD were determined by standard curves that correlated Ct values with calibrated standard DNA panels. The three repeats determined that the measuring range was 1E2 to 1E6 copies/mL. The standard curves show that the slopes were within the range -2.99 to -3.65 with R2 ≥ 0.98. (1) High-copy (HC) controls were within 0.17-0.18 log differences of DNA copy numbers; (2) low-copy (LC) controls were within 0.17-0.18 log differences; (3) the LOD was within 0.14-0.15 log differences.
As such, we set up a fast, simple, inexpensive, sensitive, and reliable molecular approach for the qualitative detection of CMV pathogens. Conclusion: This real-time PCR in the 96-well plate format provides a rapid framework as required by the FDA for clinical settings.
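The reporting criteria and the standard-curve slope window described in this abstract can be sketched in a few lines. This is an illustrative sketch, not the authors' software: the function names are hypothetical, the handling of non-integer values between 199 and 200 copies/mL is an assumption, and the efficiency formula is the standard qPCR relation, which the abstract does not state.

```python
def report_cmv(copies_per_ml: float) -> str:
    """Map a measured CMV load to the reported result, per the abstract's criteria."""
    if copies_per_ml < 0:
        raise ValueError("viral load cannot be negative")
    if copies_per_ml >= 200:
        return f"{copies_per_ml:.0f} copies/mL"
    if copies_per_ml > 0:
        # Assumption: anything above 0 and below 200 counts as "<=199 copies/mL".
        return "detected, below quantitative range"
    return "<200 copies/mL"

def slope_acceptable(slope: float) -> bool:
    """Check a standard-curve slope against the reported window of -2.99 to -3.65."""
    return -3.65 <= slope <= -2.99

def amplification_efficiency(slope: float) -> float:
    """Standard qPCR relation (not from the abstract): E = 10^(-1/slope) - 1."""
    return 10 ** (-1.0 / slope) - 1.0
```

For example, `report_cmv(1500)` yields a numeric report, while `report_cmv(150)` falls into the detected-but-unquantified category. A slope near -3.32 corresponds to roughly 100% amplification efficiency.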
DNA crosslinking and biological activity of a hairpin polyamide–chlorambucil conjugate
A prototype of a novel class of DNA alkylating agents, which combines the DNA crosslinking moiety chlorambucil (Chl) with a sequence-selective hairpin pyrrole-imidazole polyamide ImPy-β-ImPy-γ-ImPy-β-Dp (polyamide 1), was evaluated for its ability to damage DNA and induce biological responses. Polyamide 1-Chl conjugate (1-Chl) alkylates and interstrand crosslinks DNA in cell-free systems. The alkylation occurs predominantly at the 5'-AGCTGCA-3' sequence, which represents the polyamide binding site. Conjugate-induced lesions were first detected on DNA treated for 1 h with 0.1 µM 1-Chl, indicating that the conjugate is at least 100-fold more potent than Chl. Prolonged incubation allowed for DNA damage detection even at 0.01 µM concentration. Treatment with 1-Chl decreased DNA template activity in simian virus 40 (SV40) in vitro replication assays. 1-Chl inhibited mammalian cell growth, genomic DNA replication and cell cycle progression, and arrested cells in the G2/M phase. Moreover, cellular effects were observed at 1-Chl concentrations similar to those needed for DNA damage in cell-free systems. Neither of the parent compounds, unconjugated Chl or polyamide 1, demonstrated any cellular activity in the same concentration range. The conjugate molecule 1-Chl possesses the sequence-selectivity of a polyamide and the enhanced DNA reactivity of Chl.
Automatic detection of change in address blocks for reply forms processing
In this paper, an automatic method to detect the presence of on-line erasures/scribbles/corrections/over-writing in the address block of various types of subscription and utility payment forms is presented. The proposed approach employs bottom-up segmentation of the address block. Heuristic rules based on structural features are used to automate the detection process. The algorithm is applied to a large dataset of 5,780 real-world document forms at a resolution of 200 dots per inch. The proposed algorithm performs well, with an average processing time of 108 milliseconds per document and a detection accuracy of 98.96%.
MicroRNA-like RNAs from the same miRNA precursors play a role in cassava chilling responses
MicroRNAs (miRNAs) are known to play important roles in various cellular processes and stress responses. MiRNAs can be identified by analyzing reads from high-throughput deep sequencing. Reads that align to miRNA precursors but are not canonical miRNAs were initially considered sequencing noise and excluded from further analysis. Here we report a small-RNA species of phased and half-phased miRNA-like RNAs, different from canonical miRNAs, from cassava miRNA precursors detected under four distinct chilling conditions. They can form abundant multiple small RNAs arranged along precursors in a tandem and phased or half-phased fashion. Some of these miRNA-like RNAs were experimentally confirmed by re-amplification and re-sequencing, and have a qRT-PCR detection ratio similar to that of their cognate canonical miRNAs. The target genes of those phased and half-phased miRNA-like RNAs function in the processes of cell growth and metabolism and play roles in protein kinase activity. Half-phased miR171d.3 was confirmed to have cleavage activity on its target gene P-glycoprotein 11, a broad-substrate efflux pump across cellular membranes, which is thought to provide protection for tropical cassava during a sharp temperature decrease. Our results show that the RNAs from miRNA precursors are miRNA-like small RNAs that are viable negative gene regulators and may have potential functions in cassava chilling responses.