Identifying Mislabeled Training Data
This paper presents a new approach to identifying and eliminating mislabeled
training instances for supervised learning. The goal of this approach is to
improve classification accuracies produced by learning algorithms by improving
the quality of the training data. Our approach uses a set of learning
algorithms to create classifiers that serve as noise filters for the training
data. We evaluate single algorithm, majority vote and consensus filters on five
datasets that are prone to labeling errors. Our experiments illustrate that
filtering significantly improves classification accuracy for noise levels up to
30 percent. An analytical and empirical evaluation of the precision of our
approach shows that consensus filters are conservative at throwing away good
data at the expense of retaining bad data and that majority filters are better
at detecting bad data at the expense of throwing away good data. This suggests
that for situations in which there is a paucity of data, consensus filters are
preferable, whereas majority vote filters are preferable for situations with an
abundance of data.
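The difference between the majority vote and consensus rules can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the three toy learners (nearest centroid, 1-NN, 3-NN), the function name `flag_mislabeled`, the fold count and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def nearest_centroid(train):
    """Toy learner: predict the class whose mean feature vector is closest."""
    sums, counts = {}, Counter()
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}
    return lambda x: min(centroids, key=lambda y: sq_dist(x, centroids[y]))

def knn(k):
    """Toy learner factory: k-nearest-neighbour vote."""
    def fit(train):
        def predict(x):
            nearest = sorted(train, key=lambda t: sq_dist(x, t[0]))[:k]
            return Counter(y for _, y in nearest).most_common(1)[0][0]
        return predict
    return fit

def flag_mislabeled(data, learners, rule="majority", n_folds=5, seed=0):
    """Return indices flagged as mislabeled by cross-validated filtering.

    Each instance is classified by models trained on the other folds.
    'consensus' flags it only if every model disagrees with its label;
    'majority' flags it if more than half of the models disagree.
    """
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)
    flagged = set()
    for f in range(n_folds):
        held_out = order[f::n_folds]
        held = set(held_out)
        train = [data[i] for i in order if i not in held]
        models = [fit(train) for fit in learners]
        needed = len(models) if rule == "consensus" else len(models) // 2 + 1
        for i in held_out:
            x, y = data[i]
            errors = sum(model(x) != y for model in models)
            if errors >= needed:
                flagged.add(i)
    return flagged

# Hypothetical toy data: two well-separated 1-D clusters, two flipped labels.
data = [((0.1 * i,), 0) for i in range(20)] + [((10 + 0.1 * i,), 1) for i in range(20)]
data[3] = (data[3][0], 1)    # injected label error
data[27] = (data[27][0], 0)  # injected label error
learners = [nearest_centroid, knn(1), knn(3)]
majority = flag_mislabeled(data, learners, "majority")
consensus = flag_mislabeled(data, learners, "consensus")
print(sorted(majority), sorted(consensus))
```

By construction, anything flagged by the consensus rule is also flagged by the majority rule, which is one way to see why consensus filtering discards less data.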
3-manifold groups are virtually residually p
Given a prime p, a group is called residually p if the intersection of
its p-power index normal subgroups is trivial. A group is called virtually
residually p if it has a finite index subgroup which is residually p. It is
well-known that finitely generated linear groups over fields of characteristic
zero are virtually residually p for all but finitely many p. In particular,
fundamental groups of hyperbolic 3-manifolds are virtually residually p. It
is also well-known that fundamental groups of 3-manifolds are residually
finite. In this paper we prove a common generalization of these results: every
3-manifold group is virtually residually p for all but finitely many p.
This gives evidence for the conjecture (Thurston) that fundamental groups of
3-manifolds are linear groups.
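As a small illustration of the definition (an example added here, not taken from the paper), the infinite cyclic group is residually p for every prime p: the subgroups of multiples of p^k are normal of index p^k, and a nonzero integer n lies outside them as soon as p^k exceeds |n|.

```latex
\bigcap_{k \ge 1} p^{k}\mathbb{Z} = \{0\}
\qquad\text{with}\qquad
[\mathbb{Z} : p^{k}\mathbb{Z}] = p^{k}.
```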
Hidden Translation and Translating Coset in Quantum Computing
We give efficient quantum algorithms for the problems of Hidden Translation
and Hidden Subgroup in a large class of non-abelian solvable groups including
solvable groups of constant exponent and of constant length derived series. Our
algorithms are recursive. For the base case, we solve efficiently Hidden
Translation in Z_p^n, whenever p is a fixed prime. For the induction
step, we introduce the problem Translating Coset generalizing both Hidden
Translation and Hidden Subgroup, and prove a powerful self-reducibility result:
Translating Coset in a finite solvable group G is reducible to instances of
Translating Coset in G/N and N, for appropriate normal subgroups N of
G. Our self-reducibility framework combined with Kuperberg's subexponential
quantum algorithm for solving Hidden Translation in any abelian group, leads to
subexponential quantum algorithms for Hidden Translation and Hidden Subgroup in
any solvable group. Comment: Journal version: change of title and several minor updates.
Time Resolution of a Few Nanoseconds in Silicon Strip Detectors Using the APV25 Chip
The APV25 front-end chip for the CMS Silicon Tracker has a peaking time of 50 ns, but confines the signal to a single clock period (= bunch crossing) with its internal “deconvolution” filter. This method requires a beam-synchronous clock and thus cannot be applied to a (quasi-)continuous beam. Nevertheless, using the multi-peak mode of the APV25, where 3 (or 6, 9, 12, ...) consecutive shaper output samples are read out, the peak time can be reconstructed externally with high precision. Thus, off-time hits can be discarded, which results in a significant occupancy reduction. We will describe this method, results from beam tests and the intended implementation in an upgrade of the BELLE Silicon Vertex Detector.
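The idea of recovering the peak time from a few consecutive samples can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which reconstruction method is used, so a simple three-point parabola fit stands in for it here; the idealized CR-RC pulse shape, the 25 ns sample spacing, and the names `shaper`, `peak_time` and `is_on_time` are all hypothetical. Only the 50 ns peaking time comes from the text.

```python
import math

def shaper(t, t0, tau=50.0):
    """Idealized CR-RC pulse (assumed shape) starting at t0, peaking at t0 + tau ns."""
    s = (t - t0) / tau
    return s * math.exp(1.0 - s) if s > 0 else 0.0

def peak_time(samples, dt):
    """Estimate the peak time from equally spaced samples.

    Fits a parabola through the largest interior sample and its two
    neighbours and returns the time of the parabola's vertex.
    """
    k = max(range(1, len(samples) - 1), key=lambda i: samples[i])
    a, b, c = samples[k - 1], samples[k], samples[k + 1]
    denom = a - 2.0 * b + c
    shift = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return (k + shift) * dt

def is_on_time(t_peak, expected, window):
    """Keep a hit only if its peak time is within +-window of the expected time."""
    return abs(t_peak - expected) <= window

DT = 25.0  # assumed sample spacing in ns
samples = [shaper(i * DT, t0=10.0) for i in range(6)]  # true peak at 60 ns
estimate = peak_time(samples, DT)
print(round(estimate, 1), is_on_time(estimate, expected=60.0, window=5.0))
```

A parabola is slightly biased on an asymmetric pulse, so a real system would more likely calibrate the estimate against the known pulse shape; the sketch is only meant to show that a handful of samples carries timing information well below the sample spacing.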
Construction and Performance of a Double-Sided Silicon Detector Module Using the Origami Concept
The APV25 front-end chip with short shaping time will be used in the Belle II Silicon Vertex Detector (SVD) in order to achieve low occupancy. Since fast amplifiers are more susceptible to noise caused by their capacitive input load, they have to be placed as close to the sensor as possible. On the other hand, the material budget inside the active volume has to be kept low in order to constrain multiple scattering. We built a low-mass sensor module with double-sided readout, where thinned APV25 chips are placed on a single flexible circuit glued onto one side of the sensor. The interconnection to the other side is done by Kapton fanouts, which are wrapped around the edge of the sensor, hence the name Origami. Since all front-end chips are aligned in a row on the top side of the module, cooling can be done by a single aluminum pipe. The performance of the Origami module was evaluated in a beam test at CERN in August 2009, of which first results are presented here.
Readout and Data Processing Electronics for the Belle-II Silicon Vertex Detector
A prototype readout system has been developed for the future Belle-II Silicon Vertex Detector at the Super-KEK-B factory in Tsukuba, Japan. It will receive raw data from double-sided sensors with a total of approximately 240,000 strips read out by APV25 chips at a trigger rate of up to 30 kHz and perform strip reordering, pedestal subtraction, a two-pass common mode correction and zero suppression in FPGA firmware. Moreover, the APV25 will be operated in multi-peak mode, where (typically) six samples along the shaped waveform are used for precise hit-time reconstruction, which will also be implemented in FPGAs using look-up tables.
Mapping Crop Cycles in China Using MODIS-EVI Time Series
As the Earth’s population continues to grow and demand for food increases, the need for improved and timely information related to the properties and dynamics of global agricultural systems is becoming increasingly important. Global land cover maps derived from satellite data provide indispensable information regarding the geographic distribution and areal extent of global croplands. However, land use information, such as cropping intensity (defined here as the number of cropping cycles per year), is not routinely available over large areas because mapping this information from remote sensing is challenging. In this study, we present a simple but efficient algorithm for automated mapping of cropping intensity based on data from the MODerate Resolution Imaging Spectroradiometer (MODIS) of the National Aeronautics and Space Administration (NASA). The proposed algorithm first applies an adaptive Savitzky-Golay filter to smooth Enhanced Vegetation Index (EVI) time series derived from MODIS surface reflectance data. It then uses an iterative moving-window methodology to identify cropping cycles from the smoothed EVI time series. Comparison of results from our algorithm with national survey data at both the provincial and prefectural level in China shows that the algorithm provides estimates of gross sown area that agree well with inventory data. Accuracy assessment comparing visually interpreted time series with algorithm results for a random sample of agricultural areas in China indicates an overall accuracy of 91.0% for three classes defined based on the number of cycles observed in EVI time series. The algorithm therefore appears to provide a straightforward and efficient method for mapping cropping intensity from MODIS time series data.
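The two steps described above (smooth the EVI series, then look for cycles with a moving window) can be sketched roughly as follows. This is a simplified stand-in, not the paper's algorithm: it uses a fixed 5-point quadratic Savitzky-Golay filter rather than an adaptive one, a single-pass local-maximum search rather than the iterative procedure, and the window size, EVI threshold, 23-sample year and synthetic series are all assumptions.

```python
import math

# Standard 5-point quadratic Savitzky-Golay smoothing weights (sum to 35).
SG5 = (-3, 12, 17, 12, -3)

def smooth(series):
    """Smooth interior points with the 5-point filter; edge points are left unchanged."""
    out = list(series)
    for i in range(2, len(series) - 2):
        out[i] = sum(w * series[i + j - 2] for j, w in enumerate(SG5)) / 35.0
    return out

def count_cycles(series, half_window=3, min_evi=0.35):
    """Count cropping cycles as strict local maxima of the smoothed series.

    A sample counts as a cycle peak if it exceeds min_evi (assumed threshold)
    and every other sample within +-half_window positions.
    """
    s = smooth(series)
    cycles = 0
    for i, v in enumerate(s):
        lo, hi = max(0, i - half_window), min(len(s), i + half_window + 1)
        neighbours = s[lo:i] + s[i + 1:hi]
        if v > min_evi and all(v > n for n in neighbours):
            cycles += 1
    return cycles

def synthetic_evi(peaks, n=23):
    """Hypothetical one-year EVI series: baseline, alternating noise, Gaussian bumps."""
    return [0.15 + 0.02 * (-1) ** i
            + sum(0.5 * math.exp(-((i - p) ** 2) / 8.0) for p in peaks)
            for i in range(n)]

double_crop = synthetic_evi([6, 16])
single_crop = synthetic_evi([11])
print(count_cycles(double_crop), count_cycles(single_crop))
```

Summing the per-pixel cycle counts weighted by cropland area would give a quantity comparable to gross sown area, which is the kind of figure the abstract compares against survey data.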
Network Model of the CPE
Analysis of fractal systems (i.e. systems described by fractional differential equations) requires an electrical analog model of a crucial subsystem called the Constant Phase Element (CPE). The paper describes a possible realization of such a model, which is quite simple yet makes it possible to simulate the properties of ideal CPEs. The paper also deals with the effect of component tolerances on the resultant responses of the model and describes several typical model applications.
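For orientation, the behaviour that such a network model is meant to reproduce is that of an ideal CPE, whose impedance is commonly written as Z = 1 / (Q (jω)^α) and therefore has a phase of -α·90° at every frequency. The short sketch below evaluates that textbook expression; it does not model the network described in the paper, and the symbols Q and α as well as the numeric values are assumptions for the example.

```python
import cmath
import math

def cpe_impedance(omega, q, alpha):
    """Impedance of an ideal constant phase element, Z = 1 / (q * (j*omega)**alpha)."""
    return 1.0 / (q * (1j * omega) ** alpha)

def phase_deg(z):
    """Phase angle of a complex impedance in degrees."""
    return math.degrees(cmath.phase(z))

Q, ALPHA = 1e-6, 0.5  # hypothetical element parameters
for omega in (1.0, 1e1, 1e3, 1e5):
    z = cpe_impedance(omega, Q, ALPHA)
    print(f"omega={omega:g} rad/s  |Z|={abs(z):.3e} ohm  phase={phase_deg(z):.2f} deg")
```

With α = 1 the expression reduces to an ideal capacitor (phase -90°) and with α = 0 to a resistor (phase 0°), which is a quick sanity check for any approximating network.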
Computer simulation of pulsed field gel runs allows the quantitation of radiation-induced double-strand breaks in yeast
A procedure for the quantification of double-strand breaks in yeast is presented that utilizes pulsed field gel electrophoresis (PFGE) and a comparison of the observed DNA mass distribution in the gel lanes with calculated distributions. Calculation of profiles is performed as follows. If double-strand breaks are produced by sparsely ionizing radiation, one can assume that they are distributed randomly in the genome, and the resulting DNA mass distribution in molecular length can be predicted by means of a random breakage model. The input data for the computation of molecular length profiles are the breakage frequency per unit length, , as an adjustable parameter, and the molecular lengths of the intact chromosomes. The obtained DNA mass distributions in molecular length must then be transformed into distributions of DNA mass in migration distance. This requires a calibration of molecular length vs. migration distance that is specific for the gel lane in question. The computed profiles are then folded with a Lorentz distribution with adjusted spread parameter to account for and broadening. The DNA profiles are calculated for different breakage frequencies and for different values of , and the parameters resulting in the best fit of the calculated to the observed profile are determined.
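The random breakage step can be illustrated with a small Monte Carlo sketch. It is not the computation used in the paper (which may well be analytical): breaks are placed as a Poisson process along each chromosome and the resulting fragment masses are binned by length. The chromosome lengths, breakage rate, bin width and all function names are hypothetical, and the later steps (conversion to migration distance, folding with a Lorentz distribution) are not covered.

```python
import random

def random_breakage(length, rate, rng):
    """Fragment lengths of one chromosome under random breakage.

    Breaks form a Poisson process with `rate` breaks per unit length,
    generated by accumulating exponentially distributed gaps.
    """
    fragments, start = [], 0.0
    if rate > 0:
        pos = rng.expovariate(rate)
        while pos < length:
            fragments.append(pos - start)
            start = pos
            pos += rng.expovariate(rate)
    fragments.append(length - start)
    return fragments

def mass_profile(chromosomes, rate, bin_width, n_cells=1000, seed=1):
    """Fraction of total DNA mass per fragment-length bin, averaged over n_cells genomes."""
    rng = random.Random(seed)
    profile = {}
    for _ in range(n_cells):
        for length in chromosomes:
            for frag in random_breakage(length, rate, rng):
                b = int(frag // bin_width)
                profile[b] = profile.get(b, 0.0) + frag  # mass is proportional to length
    total = sum(profile.values())
    return {b: m / total for b, m in sorted(profile.items())}

# Hypothetical chromosome lengths (arbitrary units) and breakage rate.
chromosomes = [250.0, 600.0, 1500.0]
profile = mass_profile(chromosomes, rate=0.002, bin_width=100.0)
for b, frac in profile.items():
    print(f"{b * 100:5.0f}-{(b + 1) * 100:5.0f}: {frac:.3f}")
```

Repeating the calculation for a range of breakage rates and comparing each profile with the measured one mirrors the fitting idea described in the abstract.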