Analyze Large Multidimensional Datasets Using Algebraic Topology
This paper presents an efficient algorithm for extracting knowledge from high-dimensionality, high-complexity datasets using algebraic topology, namely simplicial complexes. Based on the concept of isomorphism of relations, our method turns a relational table into a geometric object (a simplicial complex is a polyhedron), so that, conceptually, association rule searching becomes a geometric traversal problem. By leveraging the core concepts behind simplicial complexes, we use a technique new to computer science that improves performance over existing methods and uses far less memory. It was designed and developed with a strong emphasis on scalability, reliability, and extensibility. This paper also investigates the possibility of Hadoop integration and the challenges that come with that framework.
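The abstract gives no code, so the following is only a minimal sketch of the underlying idea: each row of a relational table is treated as a maximal simplex, every subset of a row is a face, and mining frequent itemsets (the basis of association rules) becomes an enumeration of faces. The function name, example data, and support threshold are all hypothetical, and a real implementation would prune the exponential face enumeration.

```python
from itertools import combinations
from collections import Counter

def frequent_faces(transactions, min_support):
    """Treat each transaction as a maximal simplex; every subset is a face.
    Counting face occurrences turns frequent-itemset mining into a
    traversal of simplicial faces. (Naive: real methods prune heavily.)"""
    face_counts = Counter()
    for simplex in transactions:
        items = sorted(simplex)
        for k in range(1, len(items) + 1):
            for face in combinations(items, k):
                face_counts[face] += 1
    n = len(transactions)
    return {face: c / n for face, c in face_counts.items()
            if c / n >= min_support}

# Example: rows of a relational table, viewed as simplices.
rows = [{"milk", "bread"}, {"milk", "bread", "eggs"}, {"bread", "eggs"}]
print(frequent_faces(rows, min_support=2 / 3))
```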
Computationally designed peptides for zika virus detection: An incremental construction approach
Herein, and in contrast to current production of anti-Zika virus antibodies, we propose a semi-combinatorial virtual strategy to select short peptides as biomimetic antibodies/binding agents for the detection of intact Zika virus (ZIKV) particles. The virtual approach was based on generating different docking cycles of tetra-, penta-, hexa-, and heptapeptide libraries by maximizing the discrimination between the amino acid motifs at the glycosylation site of the ZIKV and dengue virus (DENV) envelope proteins. Eight peptides, two for each length (tetra-, penta-, hexa-, and heptapeptide), were then synthesized and tested against intact ZIKV particles using a direct enzyme-linked immunosorbent assay (ELISA). As a reference, we employed a well-established anti-ZIKV antibody, 4G2. Three peptide-based assays had good detection limits, with a dynamic range starting from 10^5 copies/mL of intact ZIKV particles; this was one order of magnitude lower than for the other peptides or the antibody. These three peptides showed slight cross-reactivity against three serotypes of DENV (DENV-1, -2, and -3) at a concentration of 10^6 copies/mL of intact virus particles, but the discrimination between DENV and ZIKV was lost when the coating concentration was increased to 10^7 copies/mL of virus. The sensitivity of the peptides was tested in the presence of two biological matrices, serum and urine, diluted 1:10 and 1:1, respectively. The detection limits rose by about one order of magnitude for ZIKV detection in serum or urine, although two of the three peptides tested still gave a distinct analytical signal starting from 10^6 copies/mL, the concentration of ZIKV in acute infection.
Numerical Evaluation of Algorithmic Complexity for Short Strings: A Glance into the Innermost Structure of Randomness
We describe an alternative method (to compression) that combines several theoretical and experimental results to numerically approximate the algorithmic (Kolmogorov-Chaitin) complexity of all bit strings up to 8 bits long, and of some between 9 and 16 bits long. This is done by an exhaustive execution of all deterministic 2-symbol Turing machines with up to 4 states, for which the halting times are known thanks to the Busy Beaver problem; that is, 11,019,960,576 machines. An output frequency distribution is then computed, from which the algorithmic probability is calculated and the algorithmic complexity evaluated by way of the (Levin-Zvonkin-Chaitin) coding theorem.
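As a rough illustration of the final step only (not the authors' full pipeline), the sketch below assumes a precomputed output frequency table from the enumerated Turing machines and applies the coding-theorem relation K(s) ≈ -log2 m(s), where m(s) is the empirical algorithmic probability of string s. The frequency values here are invented.

```python
import math

# Hypothetical output frequencies: how many of the enumerated halting
# Turing machines produced each bit string. Values are illustrative only.
output_counts = {"0": 1200, "1": 1180, "00": 340, "01": 310, "10": 305, "11": 330}

total = sum(output_counts.values())

# Empirical algorithmic probability m(s), then the coding-theorem
# estimate K(s) ~ -log2 m(s) of Kolmogorov-Chaitin complexity.
for s, count in sorted(output_counts.items()):
    m = count / total
    k = -math.log2(m)
    print(f"{s!r}: m(s) = {m:.4f}, K(s) ≈ {k:.2f} bits")
```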
Serial-batch scheduling – the special case of laser-cutting machines
The dissertation deals with a problem in the field of short-term production planning, namely the scheduling of laser-cutting machines. The decisions concern the grouping of production orders into batches (batching) and the sequencing of these order groups on one or more machines (scheduling). This problem is known in the literature as the "batch scheduling problem" and, owing to the interdependencies between the batching and scheduling decisions, belongs to the class of combinatorial optimization problems. The concepts and methods used come mainly from production planning, operations research, and machine learning.
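To make the batching/sequencing interplay concrete, here is a minimal heuristic sketch, not taken from the dissertation: jobs sharing a material are greedily grouped into capacity-limited batches (serial batching, so a batch's processing time is the sum of its jobs' times), and the resulting batches are sequenced on a single machine by shortest processing time. All names, fields, and parameters are hypothetical.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Job:
    name: str
    material: str      # batches are formed per material (setup compatibility)
    proc_time: float   # cutting time the job contributes to its batch

def build_batches(jobs, capacity=3):
    """Greedy batching: group jobs by material, then split each group
    into batches of at most `capacity` jobs."""
    by_material = defaultdict(list)
    for job in jobs:
        by_material[job.material].append(job)
    batches = []
    for group in by_material.values():
        for i in range(0, len(group), capacity):
            batch = group[i:i + capacity]
            batches.append((batch, sum(j.proc_time for j in batch)))
    return batches

def schedule_spt(batches):
    """Sequence batches by shortest processing time to reduce total
    completion time on a single machine."""
    t, schedule = 0.0, []
    for batch, length in sorted(batches, key=lambda b: b[1]):
        t += length
        schedule.append(([j.name for j in batch], t))
    return schedule

jobs = [Job("A", "steel", 4), Job("B", "steel", 2),
        Job("C", "alu", 3), Job("D", "alu", 1), Job("E", "steel", 5)]
for names, completion in schedule_spt(build_batches(jobs)):
    print(names, "completes at", completion)
```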
A Pre-Programming Approach to Algorithmic Thinking in High School Mathematics
Given the impact of computers and computing on almost every aspect of society, the ability to develop, analyze, and implement algorithms is gaining more focus. Algorithms are increasingly important in theoretical mathematics, in applications of mathematics, in computer science, and in many areas outside of mathematics. In high school, however, algorithms are usually restricted to computer science courses, and as a result the important relationship between mathematics and computer science is often overlooked (Henderson, 1997). The mathematical ideas behind the design, construction, and analysis of algorithms are important for students' mathematical education. In addition, exploring algorithms can help students see mathematics as a meaningful and creative subject.
This study provides a review of the history of algorithms and algorithmic complexity, as well as a technical monograph that illustrates the mathematical aspects of algorithmic complexity in a form accessible to mathematics instructors at the high school level. The historical component of this study is broken down into two parts. The first part covers the history of algorithms, with an emphasis on how the concept has evolved from 3000 BC through the Middle Ages to the present day. The second part focuses on the history of algorithmic complexity, dating back to the text of Ibn al-Majdi, a fourteenth-century Egyptian astronomer, and continuing through the 20th century. In particular, it highlights the contributions of a group of mathematicians including Alan Turing, Michael Rabin, Juris Hartmanis, Richard Stearns, and Alan Cobham, whose work in computability theory and complexity measures was critical to the development of the field of algorithmic complexity.
The technical monograph which follows describes how the complexity of an algorithm can be measured and analyzes different types of algorithms, including divide-and-conquer algorithms, search and sort algorithms, greedy algorithms, algorithms for matching, and geometric algorithms. The complexity of these algorithms is analyzed without the use of a programming language, in order to focus on the mathematical aspects of the algorithms and to provide knowledge and skills that are independent of specific computers or programming languages.
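As one worked example of the kind of pre-programming analysis described above (illustrative, not taken from the monograph), the number of comparisons made by binary search on a sorted list of n items satisfies a recurrence that students can solve by hand, with no programming language required:

```latex
% Comparisons made by binary search on a sorted list of n items:
T(n) = T\!\left(\lfloor n/2 \rfloor\right) + 1, \qquad T(1) = 1.
% Unrolling for n = 2^k:
T(n) = T(n/2) + 1 = T(n/4) + 2 = \cdots = T(1) + k = \log_2 n + 1.
```

So binary search needs on the order of log2(n) comparisons, which students can check directly by halving a 1,024-item list ten times.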
In addition, the study assesses the appropriateness of these topics for use by high school teachers by submitting the monograph for independent review to a panel of experts. The panel, consisting of mathematics and computer science faculty at high schools and colleges around the United States, found the material interesting and felt that a pre-programming approach to teaching algorithmic complexity has a great deal of merit. There was some concern, however, that portions of the material may be too advanced for high school mathematics instructors, and that the material would appeal only to the strongest students. Per the reviewers' suggestions, the monograph was revised to its current form.
Overlapping Community Detection in Networks: the State of the Art and Comparative Study
This paper reviews the state of the art in overlapping community detection algorithms, quality measures, and benchmarks. A thorough comparison of different algorithms (a total of fourteen) is provided. In addition to community-level evaluation, we propose a framework for evaluating algorithms' ability to detect overlapping nodes, which helps to assess over-detection and under-detection. After considering community-level detection performance measured by Normalized Mutual Information and the Omega index, and node-level detection performance measured by F-score, we reached the following conclusions. For networks with low overlapping density, SLPA, OSLOM, Game, and COPRA offer better performance than the other tested algorithms. For networks with high overlapping density and high overlapping diversity, both SLPA and Game provide relatively stable performance; however, the test results also suggest that detection in such networks is not yet fully resolved. A common feature observed across algorithms in real-world networks is the relatively small fraction of overlapping nodes (typically less than 30%), each of which belongs to only 2 or 3 communities.
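As a hedged illustration of the node-level evaluation mentioned above (the full framework is the paper's own), the sketch below computes precision, recall, and F-score for a detected set of overlapping nodes against a ground-truth set; the node sets are invented.

```python
def overlap_node_f_score(detected, ground_truth):
    """F-score for overlapping-node detection: precision penalizes
    over-detection, recall penalizes under-detection."""
    tp = len(detected & ground_truth)
    if tp == 0:
        return 0.0
    precision = tp / len(detected)
    recall = tp / len(ground_truth)
    return 2 * precision * recall / (precision + recall)

# Nodes each side labels as belonging to more than one community.
truth = {1, 2, 5, 8}
found = {2, 5, 8, 9, 10}
print(f"F-score: {overlap_node_f_score(found, truth):.3f}")
```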
Element-centric clustering comparison unifies overlaps and hierarchy
Clustering is one of the most universal approaches for understanding complex data. A pivotal aspect of clustering analysis is quantitatively comparing clusterings; clustering comparison is the basis for many tasks such as clustering evaluation, consensus clustering, and tracking the temporal evolution of clusters. In particular, the extrinsic evaluation of clustering methods requires comparing the uncovered clusterings to planted clusterings or known metadata. Yet, as we demonstrate, existing clustering comparison measures have critical biases which undermine their usefulness, and no measure accommodates both overlapping and hierarchical clusterings. Here we unify the comparison of disjoint, overlapping, and hierarchically structured clusterings by proposing a new element-centric framework: elements are compared based on the relationships induced by the cluster structure, as opposed to the traditional cluster-centric philosophy. We demonstrate that, in contrast to standard clustering similarity measures, our framework does not suffer from critical biases and naturally provides unique insights into how the clusterings differ. We illustrate the strengths of our framework by revealing new insights into the organization of clusters in two applications: the improved classification of schizophrenia based on the overlapping and hierarchical community structure of fMRI brain networks, and the disentanglement of various social homophily factors in Facebook social networks. The universality of clustering suggests far-reaching impact of our framework throughout all areas of science.
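To give a flavor of the element-centric idea, here is a simplified, disjoint-case stand-in (not the authors' measure, which builds on personalized-PageRank affinities): each element is represented by the set of elements it shares a cluster with, and two clusterings are compared by averaging the per-element Jaccard similarity of these co-membership sets. Cluster labels and data are invented.

```python
def co_members(clustering):
    """Map each element to the set of elements sharing its cluster
    (including itself)."""
    return {e: set(cluster) for cluster in clustering for e in cluster}

def element_centric_similarity(c1, c2):
    """Average per-element Jaccard similarity of co-membership sets.
    A simplified, disjoint-case stand-in for element-centric comparison."""
    m1, m2 = co_members(c1), co_members(c2)
    elements = m1.keys() & m2.keys()
    score = sum(len(m1[e] & m2[e]) / len(m1[e] | m2[e]) for e in elements)
    return score / len(elements)

a = [{"u", "v", "w"}, {"x", "y"}]
b = [{"u", "v"}, {"w", "x", "y"}]
print(f"similarity: {element_centric_similarity(a, b):.3f}")
```

Being element-centric, the per-element scores also localize *where* two clusterings disagree (here, element "w"), which is the kind of insight the abstract highlights.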