Approximately Minwise Independence with Twisted Tabulation
A random hash function $h$ is $\varepsilon$-minwise if for any set $S$,
$|S| = n$, and element $x \in S$, $\Pr[h(x) = \min h(S)] = (1 \pm \varepsilon)/n$.
Minwise hash functions with low bias $\varepsilon$ have widespread applications
within similarity estimation.
Hashing from a universe $[u]$, the twisted tabulation hashing of
Pătrașcu and Thorup [SODA'13] makes $c = O(1)$ lookups in tables of size
$u^{1/c}$. Twisted tabulation was invented to get good concentration for
hashing based sampling. Here we show that twisted tabulation yields $\tilde{O}(1/u^{1/c})$-minwise hashing.
In the classic independence paradigm of Wegman and Carter [FOCS'79], $\tilde{O}(1/u^{1/c})$-minwise hashing requires $\Omega(\log u)$-independence [Indyk
SODA'99]. Pătrașcu and Thorup [STOC'11] had shown that simple
tabulation, using the same space and lookups, yields $\tilde{O}(1/n^{1/c})$-minwise
independence, which is good for large sets, but useless for small sets. Our
analysis uses some of the same methods, but is much cleaner, bypassing a
complicated induction argument. Comment: To appear in Proceedings of SWAT 201
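To make the similarity-estimation use of minwise hashing concrete, the following is a minimal Python sketch, not the construction from the paper. It assumes integer set elements and uses a plain linear hash modulo a prime (only 2-independent) as a stand-in for twisted tabulation; the names `make_linear_hash`, `minhash_signature`, and `estimate_jaccard` are hypothetical. The fraction of hash functions whose minimum agrees on two sets approximates their Jaccard similarity, and the bias of the hash family limits how accurate that estimate can be.

```python
import random

PRIME = (1 << 61) - 1  # a Mersenne prime; elements are assumed to be ints in [0, PRIME)

def make_linear_hash(seed):
    """Return x -> (a*x + b) mod PRIME with a != 0 (an illustrative stand-in hash)."""
    rng = random.Random(seed)
    a = rng.randrange(1, PRIME)
    b = rng.randrange(0, PRIME)
    return lambda x: (a * x + b) % PRIME

def minhash_signature(items, hash_funcs):
    """For each hash function, keep the minimum hash value over the set."""
    return [min(h(x) for x in items) for h in hash_funcs]

def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where the two signatures share the same minimum."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)

if __name__ == "__main__":
    funcs = [make_linear_hash(seed) for seed in range(128)]
    a = set(range(0, 100))
    b = set(range(50, 150))  # true Jaccard similarity is 50/150
    print(estimate_jaccard(minhash_signature(a, funcs),
                           minhash_signature(b, funcs)))
```

More hash functions reduce the variance of the estimate, but not the bias that comes from the hash family itself, which is the quantity the abstract is about.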
The use of FEP Teflon in solar cell cover technology
FEP plastic film was used as a cover and as an adhesive to bond cover glasses to silicon solar cells. Various anti-reflective coatings were applied to cells and subsequently covered with FEP. Short circuit currents were measured before and after application of the coating and of the FEP. FEP was bonded to seven of the nine differently coated cells, with no change in the total short circuit current in four cases
Flexible, low-cost silicon solar cell arrays
Silicon solar cell arrays are pressure-bonded to flexible backing and protected by fluorinated ethylene propylene cover in one mechanized operation. Arrays packaged by this method are flexible, lightweight, insulated, breakage resistant and less expensive
Method of making silicon solar cell array
A heat sealable transparent plastic film, such as a fluorinated ethylene propylene copolymer, is used both as a cover material and as an adhesive for mounting a solar cell array to a flexible substrate
Accelerated growth in outgoing links in evolving networks: deterministic vs. stochastic picture
In several real-world networks like the Internet, WWW etc., the number of
links grows in time in a non-linear fashion. We consider growing networks in
which the number of outgoing links is a non-linear function of time but new
links between older nodes are forbidden. The attachments are made using a
preferential attachment scheme. In the deterministic picture, the number of
outgoing links at any time is taken as where is
the number of nodes present at that time. The continuum theory predicts a power
law decay of the degree distribution: , while the degree of the node introduced at time is given by
when the
network is evolved till time . Numerical results show a growth in the degree
distribution for small values at any non-zero . In the stochastic
picture, is a random variable. As long as is time-dependent, e.g.,
when follows a distribution . The behaviour
of changes significantly as is varied: for , the
network has a scale-free distribution belonging to the BA class as predicted by
the mean field theory, for smaller values of it shows different
behaviour. Characteristic features of the clustering coefficients in both
models have also been discussed. Comment: Revised text, references added, to be published in PR
FLASH: Randomized Algorithms Accelerated over CPU-GPU for Ultra-High Dimensional Similarity Search
We present FLASH (Fast LSH Algorithm for
Similarity search accelerated with HPC), a similarity search
system for ultra-high dimensional datasets on a single machine, that does not
require similarity computations and is tailored for high-performance computing
platforms. By leveraging an LSH-style randomized indexing procedure and
combining it with several principled techniques, such as reservoir sampling,
recent advances in one-pass minwise hashing, and count-based estimations, we
reduce the computational and parallelization costs of similarity search, while
retaining sound theoretical guarantees.
We evaluate FLASH on several real, high-dimensional datasets from different
domains, including text, malicious URL, click-through prediction, social
networks, etc. Our experiments shed new light on the difficulties associated
with datasets having several million dimensions. Current state-of-the-art
implementations either fail on the presented scale or are orders of magnitude
slower than FLASH. FLASH is capable of computing an approximate k-NN graph,
from scratch, over the full webspam dataset (1.3 billion nonzeros) in less than
10 seconds. Computing a full k-NN graph in less than 10 seconds on the webspam
dataset, using brute-force (), will require at least 20 teraflops. We
provide CPU and GPU implementations of FLASH for replicability of our results
Quality Assessment of Linked Datasets using Probabilistic Approximation
With the increasing application of Linked Open Data, assessing the quality of
datasets by computing quality metrics becomes an issue of crucial importance.
For large and evolving datasets, an exact, deterministic computation of the
quality metrics is too time consuming or expensive. We employ probabilistic
techniques such as Reservoir Sampling, Bloom Filters and Clustering Coefficient
estimation for implementing a broad set of data quality metrics in an
approximate but sufficiently accurate way. Our implementation is integrated in
the comprehensive data quality assessment framework Luzzu. We evaluated its
performance and accuracy on Linked Open Datasets of broad relevance. Comment: 15 pages, 2 figures, To appear in ESWC 2015 proceeding
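As an illustration of one of the probabilistic techniques named above, here is a minimal Python sketch of reservoir sampling (the classic "Algorithm R"). It is not taken from Luzzu; the function name `reservoir_sample` and its parameters are assumptions made for this example. It keeps a uniform random sample of `k` items from a stream of unknown length in a single pass, which is what makes it suitable for large or evolving datasets.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Return up to k items drawn uniformly at random from an iterable, in one pass."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i survives with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir

if __name__ == "__main__":
    sample = reservoir_sample(range(1_000_000), 5, random.Random(42))
    print(sample)
```

A quality metric can then be computed on the sample instead of the full dataset, trading exactness for bounded memory.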
Ionized dopant concentrations at the heavily doped surface of a silicon solar cell
Data are combined with concentrations obtained by a bulk measurement method using successive layer removal with measurements of Hall effect and resistivity. From the MOS (metal-oxide-semiconductor) measurements it is found that the ionized dopant concentration N has the value $(1.4 \pm 0.1) \times 10^{20}$ cm$^{-3}$ at distances between 100 and 220 nm from the n(+) surface. The bulk measurement technique yields average values of N over layers whose thickness is 2000 nm. Results show that, at the higher concentrations encountered at the n(+) surface, the MOS C-V technique, when combined with a bulk measurement method, can be used to evaluate the effects of materials preparation methodologies on the surface and near surface concentrations of silicon cells
Evaluating the social acceptability of voice based smartwatch search
There has been a recent increase in the number of wearable devices available (e.g. smartwatches and interactive glasses). Coupled with this, there has been a surge in the number of searches that occur on mobile devices. Given these trends, it is inevitable that search will become a part of wearable interaction. Given the form factor and display capabilities of wearables, this will probably require a different type of search interaction from what is currently used in mobile search. This paper presents the results of a user study focusing on users’ perceptions of the use of smartwatches for search. We pay particular attention to social acceptability of different search scenarios, focussing on input method, device form and information need. Our findings indicate that audience and location heavily influence whether people will perform a voice-based search. The results will help search system developers to support search on smartwatches
Distribution of sizes of erased loops for loop-erased random walks
We study the distribution of sizes of erased loops for loop-erased random
walks on regular and fractal lattices. We show that for arbitrary graphs the
probability of generating a loop of perimeter is expressible in
terms of the probability of forming a loop of perimeter when a
bond is added to a random spanning tree on the same graph by the simple
relation . On -dimensional hypercubical lattices,
varies as for large , where for , where
z is the fractal dimension of the loop-erased walks on the graph. On
recursively constructed fractals with this relation is modified
to , where is the Hausdorff and
is the spectral dimension of the fractal. Comment: 4 pages, RevTex, 3 figure
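The quantity studied above can be explored numerically. The following is a minimal Python sketch, assuming a simple random walk on the two-dimensional square lattice with chronological loop erasure; it is an illustration, not the authors' method, and the function name `loop_erased_walk` is hypothetical. Each time the walk returns to a site on its current path, the perimeter of the erased loop is recorded, so a histogram of `loop_sizes` gives an empirical loop-size distribution.

```python
import random

def loop_erased_walk(steps, rng=None):
    """Run a random walk on Z^2, erasing loops as they form.

    Returns the final self-avoiding path and the perimeters of all erased loops.
    """
    rng = rng or random.Random()
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    path = [(0, 0)]
    index = {(0, 0): 0}  # site -> position in the current path
    loop_sizes = []
    for _ in range(steps):
        dx, dy = rng.choice(moves)
        x, y = path[-1]
        nxt = (x + dx, y + dy)
        if nxt in index:
            # The walk closed a loop: its perimeter is the number of
            # path edges after nxt plus the closing step.
            i = index[nxt]
            loop_sizes.append(len(path) - i)
            for site in path[i + 1:]:
                del index[site]
            del path[i + 1:]
        else:
            index[nxt] = len(path)
            path.append(nxt)
    return path, loop_sizes

if __name__ == "__main__":
    path, loops = loop_erased_walk(100_000, random.Random(7))
    print(len(path), len(loops))
```

On the square lattice every erased loop has an even perimeter of at least 2, which offers a quick sanity check on the output.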