    Text based classification of companies in CrunchBase

    This paper introduces two fuzzy fingerprint based text classification techniques that were successfully applied to automatically label companies from CrunchBase, based purely on their unstructured textual description. This is a real and very challenging problem due to the large set of possible labels (more than 40) and also to the fact that the textual descriptions do not have to abide by any criteria and are, therefore, extremely heterogeneous. Fuzzy fingerprints are a recently introduced technique that can be used for performing fast classification. They perform well in the presence of unbalanced datasets and can cope with a very large number of classes. In the paper, a comparison is performed against some of the best text classification techniques commonly used to address similar problems. When applied to the CrunchBase dataset, the fuzzy fingerprint based approach outperformed the other techniques.
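
    The abstract does not describe how a fuzzy fingerprint is built, so the following is only a simplified sketch of the general idea: each class is summarised by its top-k most frequent tokens, each token receives a membership value that decays with its rank, and a text is assigned to the class whose fingerprint it overlaps most. The linear membership function, the value of `k`, the whitespace tokenisation, and the toy training texts are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def build_fingerprint(texts, k=5):
    """Keep the k most frequent tokens of a class; membership decays linearly with rank."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    top = [tok for tok, _ in counts.most_common(k)]
    # Rank 0 gets membership 1.0; each later rank loses 1/k (assumed function).
    return {tok: 1.0 - i / k for i, tok in enumerate(top)}

def score(text, fingerprint):
    """Share of the fingerprint's total membership covered by the text's tokens."""
    tokens = set(text.lower().split())
    total = sum(fingerprint.values())
    return sum(mu for tok, mu in fingerprint.items() if tok in tokens) / total

def classify(text, fingerprints):
    """Return the label whose fingerprint gives the highest score."""
    return max(fingerprints, key=lambda label: score(text, fingerprints[label]))

# Hypothetical toy data, not from CrunchBase.
training = {
    "software": ["cloud software platform for developers",
                 "mobile software and cloud apps"],
    "health": ["medical devices for hospitals",
               "digital health and medical records"],
}
fingerprints = {label: build_fingerprint(texts) for label, texts in training.items()}
predicted = classify("a cloud platform for mobile apps", fingerprints)
print(predicted)
```

    Because each class is reduced to a short ranked list, scoring a new text only requires set lookups, which is consistent with the fast classification the abstract mentions. This sketch does no stop-word removal, so a real pipeline would need more careful preprocessing.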

    Creating classification models from textual descriptions of companies using crunchbase

    This paper compares different models for multilabel text classification, using information collected from Crunchbase, a large database that holds information about more than 600,000 companies. Each company is labeled with one or more categories, from a subset of 46 possible categories, and the proposed models predict the categories based solely on the company's textual description. A number of natural language processing strategies have been tested for feature extraction, including stemming, lemmatization, and part-of-speech tags. This is a highly unbalanced dataset, where the frequency of each category ranges from 0.7% to 28%. Our findings reveal that the description text of each company contains features that allow its area of activity, expressed by its corresponding categories, to be predicted with about 70% precision and 42% recall. In a second set of experiments, a multiclass problem that attempts to find the most probable category, we obtained about 67% accuracy using SVM and Fuzzy Fingerprints. The resulting models may constitute an important asset for the automatic classification of texts, not only company descriptions but also other texts, such as web pages, blogs, and news pages.
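
    To make the precision and recall figures concrete, the sketch below computes both for a multilabel setting, where each company has a set of true categories and a set of predicted ones. The abstract does not say how the scores were averaged; micro-averaging over all (company, category) pairs is assumed here, and the example labels are hypothetical.

```python
def micro_precision_recall(true_sets, pred_sets):
    """Micro-averaged precision and recall over sets of labels."""
    # A true positive is a predicted label that is also a true label.
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    n_pred = sum(len(p) for p in pred_sets)
    n_true = sum(len(t) for t in true_sets)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    return precision, recall

# Hypothetical labels for three companies.
true_sets = [{"software", "mobile"}, {"health"}, {"finance", "software"}]
pred_sets = [{"software"}, {"health", "biotech"}, {"finance"}]

precision, recall = micro_precision_recall(true_sets, pred_sets)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

    A model that predicts few categories per company, as in this toy example, tends to score higher on precision than on recall, which is the same pattern the abstract reports.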

    Quality control and productivity in oak timber - from forest to the primary processing

    Oak timber is valued for its beauty, good mechanical properties and natural durability, and may have multiple uses. An understanding of the factors that affect oak timber quality is essential. Quality control of the physical, mechanical and technological characteristics of the wood is important in order to define the best primary processing and end use. Silviculture may significantly affect wood quality and final stand value. Specific prescriptions will depend on species, site conditions, desired end product and management options. Appropriate silviculture with optimized technological operations allows good use of wood, even from small diameters. Adequate wood classification is required in order to optimize industrial processes and improve product quality. Quality criteria and procedures for round timber and sawn timber are referenced.

    Neotectonic studies on the Carcavai fault, eastern Algarve

    Studies carried out in the Carcavai fault zone revealed deformation (fracturing and clastic dykes) in Plio-Quaternary sediments, indicating neotectonic activity. Most of the fractures appear to correspond to the secondary surface expression of activity in that fault zone. The acquired data point to a complex left-lateral strike-slip fault zone, with a reverse vertical component, active since the end of the Mesozoic or the beginning of the Cenozoic. The clastic dykes were interpreted as structures resulting from seismically induced liquefaction.

    Looking for earthquake sources in the Lisbon area

    Lisbon and its surrounding areas have suffered the effects of historical earthquakes that caused significant damage and loss of life. Some of these earthquake sources are local, but they are still poorly known. Knowledge of these sources is important for seismic hazard studies. Geophysical methods are required in the area because geological outcrops are difficult to find and because low slip rates and erosion/sedimentation processes erase surface ruptures. Furthermore, most earthquakes occur at great depth, which reinforces the need for geophysical methods. In this paper we present a revised structural interpretation of the area using newly reprocessed and reinterpreted seismic reflection and potential-field data, relocated epicentres, geological outcrop and well data. This interpretation differs in some aspects from previous ones. Well-known active fault zones, such as the Azambuja fault and the Pinhal Novo-Setúbal fault, have new interpretations, while previously unknown structures, such as the Ota-V. F. de Xira-Lisbon-Sesimbra fault zone, have been interpreted. These studies, together with shallow geophysical data that have been and will be acquired over targets selected from this work, will constitute an improvement to the seismic hazard evaluation of the area.

    Excess Thermodynamics of Mixtures Involving Xenon and Light Linear Alkanes by Computer Simulation

    Excess molar enthalpies and excess molar volumes as a function of composition for liquid mixtures of xenon + ethane (at 161.40 K), xenon + propane (at 161.40 K) and xenon + n-butane (at 182.34 K) have been obtained by Monte Carlo computer simulations and compared with available experimental data. Simulation conditions were chosen to closely match those of the corresponding experimental results. The TraPPE-UA force field was selected among other force fields to model all the alkanes studied, whereas the one-center Lennard−Jones potential from Bohn et al. was used for xenon. The calculated excess molar enthalpies and excess molar volumes are negative for all systems, increasing in magnitude as the alkane chain length increases. The results for these systems were compared with experimental data and with other theoretical calculations using the SAFT approach. Excellent agreement between simulation and experimental results was found for the xenon + ethane system, whereas for the remaining two systems some deviations were observed that become progressively more significant as the alkane chain length increases.
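
    For reference, the excess properties discussed above follow the standard thermodynamic definition: the property of the real mixture minus the mole-fraction-weighted sum of the pure-component values at the same temperature and pressure. The symbols below ($x_i$ for mole fractions, $V_i^{*}$ and $H_i^{*}$ for pure-component molar properties, $V_m$ and $H_m$ for the mixture) are notation assumed here, since the abstract gives none.

```latex
\begin{align}
  V^{E} &= V_m - \left( x_1 V_1^{*} + x_2 V_2^{*} \right), \\
  H^{E} &= H_m - \left( x_1 H_1^{*} + x_2 H_2^{*} \right), \qquad x_1 + x_2 = 1 .
\end{align}
```

    Negative values of both quantities, as reported for all three mixtures, mean that the mixture occupies less volume and has lower enthalpy than the composition-weighted average of the pure liquids.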

    Potassium Ferrite for Biomedical Applications

    Ferrites have been widely studied for their use in the biomedical area, mostly due to their magnetic properties, which give them the potential to be used in diagnostics, drug delivery, and treatment with magnetic hyperthermia, for example. In this work, KFeO2 particles were synthesized with a proteic sol-gel method using powdered coconut water as a precursor; this method is based on the principles of green chemistry. To improve its properties, the base powder obtained was subjected to multiple heat treatments at temperatures between 350 and 1300 °C. The samples obtained underwent structural, morphological, biocompatibility, and magnetic characterization. The results show that upon raising the heat treatment temperature, not only the desired phase but also secondary phases are detected. To overcome these secondary phases, several different heat treatments were carried out. Using scanning electron microscopy, grains in the micrometric range were observed. Saturation magnetizations between 15.5 and 24.1 emu/g were observed for the samples containing KFeO2 with an applied field of 50 kOe at 300 K. In cellular compatibility (cytotoxicity) assays, for concentrations up to 5 mg/mL, only the samples treated at 350 °C were cytotoxic. However, the samples containing KFeO2, while being biocompatible, had low specific absorption rates (1.55–5.76 W/g).

    Application of a multi-criteria method for the acoustic evaluation of public libraries

    This work presents the quantification of the acoustic quality of libraries through an algorithm based on a multi-criteria methodology. The criteria and their relative importance were determined with the support of a questionnaire given to library employees. The acoustic parameters that make up the algorithm are: reverberation time, RASTI, sound absorption of the library entrance, sound insulation between the entrance hall and the reading room, weighted standardized level difference between the reading room and contiguous rooms, standardized sound level of the equipment noise, weighted standardized level difference of the façade, and weighted standardized impact sound pressure level (LnT,w). The algorithm was tested on a large sample of public libraries in Portugal to obtain an Index of Acoustic Quality for Libraries (IQAB).
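
    The abstract does not state how the criteria are combined into the index. As a minimal sketch of one common multi-criteria aggregation, the example below assumes that each criterion is first converted to a score between 0 and 1 and that the index is the weighted mean of those scores. The function name, the criterion names, the weights, and the scores are all hypothetical and do not come from the paper.

```python
def quality_index(scores, weights):
    """Weighted mean of per-criterion scores, each assumed to lie in [0, 1]."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    total_weight = sum(weights.values())
    return sum(scores[c] * weights[c] for c in scores) / total_weight

# Hypothetical weights (e.g. derived from a questionnaire) and scores for one library.
weights = {"reverberation_time": 3.0, "rasti": 2.0, "equipment_noise": 1.0}
scores = {"reverberation_time": 0.5, "rasti": 1.0, "equipment_noise": 0.0}

index = quality_index(scores, weights)
print(f"index={index:.3f}")
```

    Dividing by the total weight keeps the index in the same 0 to 1 range as the individual scores, so libraries can be compared regardless of how the weights are scaled.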