
    A new mixed-integer programming model for irregular strip packing based on vertical slices with a reproducible survey

    The irregular strip-packing problem, also known as nesting or marker making, is defined as the automatic computation of a non-overlapping placement of a set of non-convex polygons onto a rectangular strip of fixed width and unbounded length, such that the strip length is minimized. Nesting methods based on heuristics are a mature technology and currently the only practical solution to this problem. However, recent performance gains of Mixed-Integer Programming (MIP) solvers, together with the known limitations of heuristic methods, have encouraged the exploration of exact optimization models for nesting during the last decade. Despite this research effort, the current family of exact MIP models for nesting cannot efficiently solve either large problem instances or instances containing polygons with complex geometries. In order to improve the efficiency of the current MIP models, this work introduces a new family of continuous MIP models based on a novel formulation of the No-Fit-Polygon Covering Model (NFP-CM), called NFP-CM based on Vertical Slices (NFP-CM-VS). Our new family of MIP models is based on a new convex decomposition of the feasible space of relative placements between pieces into vertical slices, together with a new family of valid inequalities, symmetry-breaking constraints, and variable eliminations derived from this decomposition. Our experiments show that our new NFP-CM-VS models outperform the current state-of-the-art MIP models. Finally, we provide a detailed reproducibility protocol and dataset, based on our Java software library, as supplementary material to allow the exact replication of our models, experiments, and results.
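
The core idea of a vertical-slice decomposition can be illustrated with a small sketch: the feasible space of relative placements between two pieces is covered by vertical slices, each a convex region described by an x-interval and linear lower/upper boundaries, and the MIP selects one active slice per pair of pieces via binary variables. The names and the toy two-slice region below are invented for illustration only; this is not the authors' NFP-CM-VS formulation.

```python
from dataclasses import dataclass

@dataclass
class Slice:
    # One convex vertical slice: x0 <= x <= x1, and
    # lo[0]*x + lo[1] <= y <= hi[0]*x + hi[1] (linear bounds).
    x0: float
    x1: float
    lo: tuple
    hi: tuple

def in_slice(s, x, y, eps=1e-9):
    return (s.x0 - eps <= x <= s.x1 + eps
            and s.lo[0] * x + s.lo[1] - eps <= y
            and y <= s.hi[0] * x + s.hi[1] + eps)

def placement_feasible(slices, x, y):
    # In the MIP, a binary variable per slice selects the active
    # disjunct of the convex decomposition; here we only test
    # membership in the union of slices.
    return any(in_slice(s, x, y) for s in slices)

# Toy feasible region: everything left or right of a unit-square
# no-fit polygon occupying 0 <= x <= 1 (illustration only).
left = Slice(-10.0, 0.0, (0.0, -10.0), (0.0, 10.0))
right = Slice(1.0, 10.0, (0.0, -10.0), (0.0, 10.0))

print(placement_feasible([left, right], -2.0, 0.5))  # non-overlapping
print(placement_feasible([left, right], 0.5, 0.5))   # overlap: infeasible
```

In the actual models, each slice contributes linear constraints activated by its binary variable, which is what makes the disjunction expressible in a continuous MIP.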

    HESML: A scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset

    This work is a detailed companion reproducibility paper for the methods and experiments proposed by Lastra-Díaz and García-Serrano in (2015, 2016) [56–58], and introduces the following contributions: (1) a new and efficient representation model for taxonomies, called PosetHERep, which is an adaptation of the half-edge data structure commonly used to represent discrete manifolds and planar graphs; (2) a new Java software library called the Half-Edge Semantic Measures Library (HESML), based on PosetHERep, which implements most ontology-based semantic similarity measures and Information Content (IC) models reported in the literature; (3) a set of reproducible experiments on word similarity, based on HESML and ReproZip, with the aim of exactly reproducing the experimental surveys in the three aforementioned works; (4) a replication framework and dataset, called WNSimRep v1, whose aim is to assist the exact replication of most methods reported in the literature; and finally, (5) a set of scalability and performance benchmarks for semantic measures libraries. PosetHERep and HESML are motivated by several drawbacks of the current semantic measures libraries, especially their performance and scalability, as well as the difficulty of evaluating new methods and replicating most previous ones. The reproducible experiments introduced herein are encouraged by the lack of a set of large, self-contained and easily reproducible experiments aimed at replicating and confirming previously reported results. Likewise, the WNSimRep v1 dataset is motivated by the discovery of several contradictory results and difficulties in reproducing previously reported methods and experiments. PosetHERep proposes a memory-efficient representation for taxonomies which scales linearly with the size of the taxonomy and provides an efficient implementation of most taxonomy-based algorithms used by the semantic measures and IC models, whilst HESML provides an open framework to aid research in the area through a simpler and more efficient software architecture than the current software libraries. Finally, we show that HESML outperforms the state-of-the-art libraries, and that their performance and scalability can be significantly improved, without caching, by using PosetHERep.
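
The half-edge idea behind PosetHERep can be sketched, under heavy simplification, as storing each child-parent link as a pair of oriented half-edges, so that both upward and downward traversals are available and memory grows linearly with the number of edges. The class below is a hypothetical illustration of that principle only, not the actual PosetHERep data layout used by HESML.

```python
class Taxonomy:
    """Toy taxonomy storing each child-parent link as two oriented
    half-edges (child->parent and its parent->child twin); a sketch
    of the linear-memory idea, not the PosetHERep layout."""

    def __init__(self):
        self.up = {}     # node -> parents (one half-edge direction)
        self.down = {}   # node -> children (the twin half-edges)

    def add_node(self, node):
        self.up.setdefault(node, [])
        self.down.setdefault(node, [])

    def add_edge(self, child, parent):
        self.add_node(child)
        self.add_node(parent)
        self.up[child].append(parent)
        self.down[parent].append(child)

    def ancestors(self, node):
        # Iterative upward traversal; visits each ancestor once.
        seen, stack = set(), [node]
        while stack:
            for p in self.up[stack.pop()]:
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen

t = Taxonomy()
for child, parent in [("dog", "animal"), ("cat", "animal"),
                      ("animal", "entity")]:
    t.add_edge(child, parent)
print(sorted(t.ancestors("dog")))  # ['animal', 'entity']
```

Ancestor sets like this one are the basic primitive behind many taxonomy-based similarity measures and IC models.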

    Traditional knowledge of wild edible plants used in the northwest of the Iberian Peninsula (Spain and Portugal): a comparative study

    Background: We compare traditional knowledge and use of wild edible plants in six rural regions of the northwest of the Iberian Peninsula: Campoo, Picos de Europa, Piloña, Sanabria and Caurel in Spain, and Parque Natural de Montesinho in Portugal. Methods: Data on the use of 97 species were collected through semi-structured interviews with local informants, conducted under informed consent. A semi-quantitative approach was used to document the relative importance of each species and to indicate differences in the selection criteria for consuming wild food species in the regions studied. Results and discussion: The most significant species include many wild berries and nuts (e.g. Castanea sativa, Rubus ulmifolius, Fragaria vesca) and the most popular species in each food category (e.g. fruits or herbs used to prepare liqueurs, such as Prunus spinosa; vegetables, such as Rumex acetosa; condiments, such as Origanum vulgare; or plants used to prepare herbal teas, such as Chamaemelum nobile). The most important species in the study area as a whole are consumed at five or all six of the survey sites. Conclusion: Social, economic and cultural factors, such as poor communications, fads and direct contact with nature in everyday life, should be taken into account in determining why some wild foods and traditional vegetables have remained in use while others have not; these factors may be even more important than biological factors such as the richness and abundance of the wild edible flora. Although most are no longer consumed, demand is growing for those regarded as local specialties that reflect regional identity.

    Protocol for a reproducible experimental survey on biomedical sentence similarity.

    Measuring semantic similarity between sentences is a significant task in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and biomedical text mining. For this reason, the proposal of sentence similarity methods for the biomedical domain has attracted a lot of attention in recent years. However, most sentence similarity methods and experimental results reported in the biomedical domain cannot be reproduced, for reasons that include the copying of previous results without confirmation, the lack of source code and data to replicate both methods and experiments, and the lack of a detailed definition of the experimental setup, among others. As a consequence of this reproducibility gap, the state of the problem can neither be elucidated nor new lines of research be soundly established. On the other hand, there are other significant gaps in the literature on biomedical sentence similarity: (1) the evaluation of several unexplored sentence similarity methods that deserve to be studied; (2) the evaluation of an unexplored benchmark on biomedical sentence similarity, called Corpus-Transcriptional-Regulation (CTR); (3) a study of the impact of the pre-processing stage and Named Entity Recognition (NER) tools on the performance of sentence similarity methods; and finally, (4) the lack of software and data resources for the reproducibility of methods and experiments in this line of research. Having identified these open problems, this registered report introduces a detailed experimental setup, together with a categorization of the literature, to develop the largest, most up-to-date, and, for the first time, reproducible experimental survey on biomedical sentence similarity.
    Our experimental survey will be based on our own software replication and on the evaluation of all the methods being studied on the same software platform, which will be specially developed for this work and will become the first publicly available software library for biomedical sentence similarity. Finally, we will provide a very detailed reproducibility protocol and dataset as supplementary material to allow the exact replication of all our experiments and results.
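
As a concrete, deliberately simple example of the kind of method such a survey evaluates, the sketch below implements a generic bag-of-words similarity based on the normalized L1 (city-block) distance. It is an illustrative stand-in for a string-based measure, not a replication of any specific method from the survey, and the trivial tokenizer takes the place of the pre-processing and NER stages discussed above.

```python
import re
from collections import Counter

def tokens(sentence):
    # Minimal pre-processing: lowercasing + alphanumeric tokens.  A
    # biomedical pipeline would typically add NER-based normalization.
    return re.findall(r"[a-z0-9]+", sentence.lower())

def block_similarity(s1, s2):
    # 1 - normalized L1 (city-block) distance between bag-of-words
    # vectors; returns a score in [0, 1].
    c1, c2 = Counter(tokens(s1)), Counter(tokens(s2))
    l1 = sum(abs(c1[t] - c2[t]) for t in set(c1) | set(c2))
    total = sum(c1.values()) + sum(c2.values())
    return 1.0 - l1 / total if total else 1.0

print(block_similarity("BRCA1 regulates DNA repair",
                       "DNA repair is regulated by BRCA1"))  # 0.6
```

Because every evaluated method reduces to a function from two sentences to a score, running all methods on one shared platform, as proposed above, is what makes their results directly comparable.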

    Pearson (r) and Spearman (ρ) correlation values, harmonic score (h), and harmonic average (AVG) score obtained by the LiBlock method in combination with each NER tool, using the best pre-processing configuration detailed in Table 7.

    In addition, the last column (p-val) shows the p-values for the comparison of the LiBlock method with cTAKES against the remaining NER combinations.

    The statistical significance results.

    We provide a series of tables reporting the p-values for each pair of methods evaluated in this work as supplementary material. (PDF)

    Detailed setup for the ontology-based sentence similarity measures evaluated in this work.

    The evaluation of the methods using the Rada [69], coswJ&C [46], and Cai [68] word similarity measures uses a reformulation of the original path-based measures based on the new Ancestors-based Shortest-Path Length (AncSPL) algorithm [42].
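
The path-based idea behind such a reformulation can be sketched as follows: in a tree-shaped taxonomy, the shortest path between two concepts passes through one of their common ancestors, so it can be computed on the two ancestor sets alone. The function names below are invented, and the real AncSPL algorithm [42] differs in its details; this is only an illustration of the underlying idea.

```python
from collections import deque

def up_distances(parents, node):
    # BFS upward from `node`: minimal edge counts to every ancestor
    # (the node itself included at distance 0).
    dist = {node: 0}
    queue = deque([node])
    while queue:
        n = queue.popleft()
        for p in parents.get(n, []):
            if p not in dist:
                dist[p] = dist[n] + 1
                queue.append(p)
    return dist

def ancestral_path_length(parents, a, b):
    # Shortest a-b path routed through a common ancestor, computed
    # only on the two ancestor sets (in the spirit of AncSPL).
    da, db = up_distances(parents, a), up_distances(parents, b)
    return min(da[c] + db[c] for c in set(da) & set(db))

taxonomy = {"dog": ["animal"], "cat": ["animal"],
            "animal": ["root"], "plant": ["root"], "root": []}
print(ancestral_path_length(taxonomy, "dog", "cat"))    # 2
print(ancestral_path_length(taxonomy, "dog", "plant"))  # 3
```

Restricting the search to ancestor sets is what lets path-based measures like Rada's be evaluated without a shortest-path search over the whole taxonomy graph.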

    Pearson (r), Spearman (ρ) and harmonic (h) values obtained in our experiments from the evaluation of the ontology-based similarity methods detailed below on the MedSTSfull [52] dataset for each NER tool.


    Detail of the pre-processing configurations that are evaluated in this work.

    (*) WordPieceTokenizer [91] is used only for BERT-based methods [30, 31, 34, 62, 91–94, 99].

    Detailed setup for the sentence similarity methods based on pre-trained character, word (WE) and sentence (SE) embedding models evaluated herein.
