48,116 research outputs found

    Comparing the writing style of real and artificial papers

    Full text link
    Recent years have witnessed the increase of competition in science. While promoting the quality of research in many cases, an intense competition among scientists can also trigger unethical scientific behaviors. To increase the total number of published papers, some authors even resort to software tools that are able to produce grammatical, but meaningless scientific manuscripts. Because automatically generated papers can be misunderstood as real papers, it becomes of paramount importance to develop means to identify these scientific frauds. In this paper, I devise a methodology to distinguish real manuscripts from those generated with SCIGen, an automatic paper generator. Upon modeling texts as complex networks (CN), it was possible to discriminate real from fake papers with at least 89\% of accuracy. A systematic analysis of features relevance revealed that the accessibility and betweenness were useful in particular cases, even though the relevance depended upon the dataset. The successful application of the methods described here show, as a proof of principle, that network features can be used to identify scientific gibberish papers. In addition, the CN-based approach can be combined in a straightforward fashion with traditional statistical language processing methods to improve the performance in identifying artificially generated papers.Comment: To appear in Scientometrics (2015

    Automatic Classification and Speaker Identification of African Elephant (\u3cem\u3eLoxodonta africana\u3c/em\u3e) Vocalizations

    Get PDF
    A hidden Markov model (HMM) system is presented for automatically classifying African elephant vocalizations. The development of the system is motivated by successful models from human speech analysis and recognition. Classification features include frequency-shifted Mel-frequency cepstral coefficients (MFCCs) and log energy, spectrally motivated features which are commonly used in human speech processing. Experiments, including vocalization type classification and speaker identification, are performed on vocalizations collected from captive elephants in a naturalistic environment. The system classified vocalizations with accuracies of 94.3% and 82.5% for type classification and speaker identification classification experiments, respectively. Classification accuracy, statistical significance tests on the model parameters, and qualitative analysis support the effectiveness and robustness of this approach for vocalization analysis in nonhuman species

    Neuroimaging of structural pathology and connectomics in traumatic brain injury: Toward personalized outcome prediction.

    Get PDF
    Recent contributions to the body of knowledge on traumatic brain injury (TBI) favor the view that multimodal neuroimaging using structural and functional magnetic resonance imaging (MRI and fMRI, respectively) as well as diffusion tensor imaging (DTI) has excellent potential to identify novel biomarkers and predictors of TBI outcome. This is particularly the case when such methods are appropriately combined with volumetric/morphometric analysis of brain structures and with the exploration of TBI-related changes in brain network properties at the level of the connectome. In this context, our present review summarizes recent developments on the roles of these two techniques in the search for novel structural neuroimaging biomarkers that have TBI outcome prognostication value. The themes being explored cover notable trends in this area of research, including (1) the role of advanced MRI processing methods in the analysis of structural pathology, (2) the use of brain connectomics and network analysis to identify outcome biomarkers, and (3) the application of multivariate statistics to predict outcome using neuroimaging metrics. The goal of the review is to draw the community's attention to these recent advances on TBI outcome prediction methods and to encourage the development of new methodologies whereby structural neuroimaging can be used to identify biomarkers of TBI outcome

    Logopenic and nonfluent variants of primary progressive aphasia are differentiated by acoustic measures of speech production

    Get PDF
    Differentiation of logopenic (lvPPA) and nonfluent/agrammatic (nfvPPA) variants of Primary Progressive Aphasia is important yet remains challenging since it hinges on expert based evaluation of speech and language production. In this study acoustic measures of speech in conjunction with voxel-based morphometry were used to determine the success of the measures as an adjunct to diagnosis and to explore the neural basis of apraxia of speech in nfvPPA. Forty-one patients (21 lvPPA, 20 nfvPPA) were recruited from a consecutive sample with suspected frontotemporal dementia. Patients were diagnosed using the current gold-standard of expert perceptual judgment, based on presence/absence of particular speech features during speaking tasks. Seventeen healthy age-matched adults served as controls. MRI scans were available for 11 control and 37 PPA cases; 23 of the PPA cases underwent amyloid ligand PET imaging. Measures, corresponding to perceptual features of apraxia of speech, were periods of silence during reading and relative vowel duration and intensity in polysyllable word repetition. Discriminant function analyses revealed that a measure of relative vowel duration differentiated nfvPPA cases from both control and lvPPA cases (r2 = 0.47) with 88% agreement with expert judgment of presence of apraxia of speech in nfvPPA cases. VBM analysis showed that relative vowel duration covaried with grey matter intensity in areas critical for speech motor planning and programming: precentral gyrus, supplementary motor area and inferior frontal gyrus bilaterally, only affected in the nfvPPA group. This bilateral involvement of frontal speech networks in nfvPPA potentially affects access to compensatory mechanisms involving right hemisphere homologues. Measures of silences during reading also discriminated the PPA and control groups, but did not increase predictive accuracy. Findings suggest that a measure of relative vowel duration from of a polysyllable word repetition task may be sufficient for detecting most cases of apraxia of speech and distinguishing between nfvPPA and lvPPA

    Updates in metabolomics tools and resources: 2014-2015

    Get PDF
    Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platforms (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources—in the form of tools, software, and databases—is currently lacking. Thus, here we provide an overview of freely-available, and open-source, tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of the recent developments in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and researches for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most in this review described tools are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described including their analytical and computational platform dependencies are summarized in an overview Table

    Speaker segmentation and clustering

    Get PDF
    This survey focuses on two challenging speech processing topics, namely: speaker segmentation and speaker clustering. Speaker segmentation aims at finding speaker change points in an audio stream, whereas speaker clustering aims at grouping speech segments based on speaker characteristics. Model-based, metric-based, and hybrid speaker segmentation algorithms are reviewed. Concerning speaker clustering, deterministic and probabilistic algorithms are examined. A comparative assessment of the reviewed algorithms is undertaken, the algorithm advantages and disadvantages are indicated, insight to the algorithms is offered, and deductions as well as recommendations are given. Rich transcription and movie analysis are candidate applications that benefit from combined speaker segmentation and clustering. © 2007 Elsevier B.V. All rights reserved

    Computationally Efficient and Robust BIC-Based Speaker Segmentation

    Get PDF
    An algorithm for automatic speaker segmentation based on the Bayesian information criterion (BIC) is presented. BIC tests are not performed for every window shift, as previously, but when a speaker change is most probable to occur. This is done by estimating the next probable change point thanks to a model of utterance durations. It is found that the inverse Gaussian fits best the distribution of utterance durations. As a result, less BIC tests are needed, making the proposed system less computationally demanding in time and memory, and considerably more efficient with respect to missed speaker change points. A feature selection algorithm based on branch and bound search strategy is applied in order to identify the most efficient features for speaker segmentation. Furthermore, a new theoretical formulation of BIC is derived by applying centering and simultaneous diagonalization. This formulation is considerably more computationally efficient than the standard BIC, when the covariance matrices are estimated by other estimators than the usual maximum-likelihood ones. Two commonly used pairs of figures of merit are employed and their relationship is established. Computational efficiency is achieved through the speaker utterance modeling, whereas robustness is achieved by feature selection and application of BIC tests at appropriately selected time instants. Experimental results indicate that the proposed modifications yield a superior performance compared to existing approaches
    corecore