159 research outputs found
A Speaker De-Identification System Based on Sound Processing
In products that employ speech recognition, where the speech signal is sent from the device to centralized servers for processing, or in products that simply store data on servers, privacy for audio data is as important as it is for other types of data. Ignoring privacy has consequences for both speakers (information leaks) and server administrators (legal issues). In this paper, we propose a speaker de-identification solution based on sound processing that alters voice characteristics, along with an API. Our solution, consisting of pitch shift and noise mix (the latter is an optional augmentation method), achieves strong speaker de-identification performance without a significant loss in word intelligibility. It is worth mentioning that recordings are sometimes hard to understand even in their initial (i.e., not de-identified) form, due to the speaker’s pronunciation, talking speed, and other related factors.
A Large‐Scale E‐voting System Based on Blockchain
E-voting systems are increasingly used because of the facilities they offer, such as casting and counting votes in real time. Current voting systems are a target of attempted fraud, a major global problem that remains unsolved to this day. In the field of computer science, these e-voting platforms need to provide integrated security, thus enhancing the scalability and performance of the blockchain‐based e‐voting system. Our aim is to develop a secure internet-based voting system that maximizes user participation by allowing users to vote from anywhere. This paper proposes a system architecture based on blockchain technology, along with a web interface that securely authenticates voters on the platform. These two components can be used together or separately, depending on the application’s needs.
What Makes Your Writing Style Unique? Significant Differences Between Two Famous Romanian Orators
This paper introduces a novel, in-depth approach to analyzing the
differences in writing style between two famous Romanian orators, based on
automated textual complexity indices for the Romanian language. The authors
considered are: (a) Mihai Eminescu, Romania’s national poet and a remarkable
journalist of his time, and (b) Ion C. Brătianu, one of the most important
Romanian politicians from the middle of the 19th century. Both orators have a
common journalistic interest consisting in their desire to spread the word about
political issues in Romania via the printing press, the most important public
voice at that time. In addition, both authors exhibit writing style particularities,
and our aim is to explore these differences through our ReaderBench framework
that computes a wide range of lexical and semantic textual complexity indices
for Romanian and other languages. The corpus contains two collections of
speeches for each orator that cover the period 1857–1880. The results of this
study highlight the lexical and cohesive textual complexity indices that reflect
very well the differences in writing style, measures relying on Latent Semantic
Analysis (LSA) and Latent Dirichlet Allocation (LDA) semantic models. This study is part of the RAGE project. The RAGE project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 644187. This publication reflects only the author's view. The European Commission is not responsible for any use that may be made of the information it contains.
Opinion and Sentiment Analysis of Italian print press
As is known, the success of a newspaper article with public opinion can be measured by the degree to which the journalist is able to report and modify (if needed) attitudes, opinions, feelings and political beliefs. We present a symbolic system for Italian, derived from GETARUNS, which integrates a range of natural language processing tools with the intent to characterise the print press discourse. The system is multilingual and can produce deep text understanding. This has been done on some 500K words of text, extracted from three Italian newspapers in order to characterize their stance on a deep political crisis situation. We tried two different approaches: a lexicon-based approach for semantic polarity using off-the-shelf dictionaries with the addition of manually supervised domain-related concepts; and a feature-based semantic and pragmatic approach, which computes propositional-level analysis with the intent to better characterize important components such as factuality and subjectivity. Results are quite revealing and confirm the otherwise common knowledge about the political stance of each newspaper on such topics as the change of government that took place at the end of last year, 2011.
Opinion and Factivity Analysis of Italian political discourse
The success of a newspaper article with public opinion can be measured by the degree to which the journalist is able to report and modify (if needed) attitudes, opinions, feelings and political beliefs. We present a symbolic system for Italian, derived from GETARUNS, which integrates a range of natural language processing tools with the intent to characterise the print press discourse from a semantic and pragmatic point of view. This has been done on some 500K words of text, extracted from three Italian newspapers in order to characterize their stance on a deep political crisis situation. We tried two different approaches: a lexicon-based approach for semantic polarity using off-the-shelf dictionaries with the addition of manually supervised domain-related concepts; and a feature-based semantic and pragmatic approach, which computes propositional-level analysis with the intent to better characterize important components such as factuality and subjectivity. Results are quite revealing and confirm the otherwise common knowledge about the political stance of each newspaper on such topics as the change of government that took place at the end of last year, 2011.
A survey of guidelines and best practices for the generation, interlinking, publication, and validation of linguistic linked data
This article discusses a survey carried out within the NexusLinguarum COST Action which aimed to give an overview of existing guidelines (GLs) and best practices (BPs) in linguistic linked data (LLD). In particular, it focused on four core tasks in the production/publication of linked data: generation, interlinking, publication, and validation. We discuss the importance of GLs and BPs for LLD before describing the survey and its results in full. Finally, we offer a number of directions for future work in order to address the findings of the survey.
The Effect of Clay Type on the Physicochemical Properties of New Hydrogel Clay Nanocomposites
This study investigates the effect of clay type on the final properties of semi-interpenetrated Salecan/poly(methacrylic acid)/clay hydrogel nanocomposites. Previous studies have indicated that the presence of clay in polymer composites leads to better swelling capacity and mechanical properties, depending on the clay type. Salecan, a water-soluble extracellular polysaccharide, has in turn been shown to give hydrogels greater flexibility. These properties recommend clay and Salecan for the preparation of semi-interpenetrated hydrogels with specific applications in biomedicine. The purpose was to determine the most suitable type of clay, as well as the influence of Salecan, for developing semi-interpenetrating polymer network (SIPN) nanocomposites with the desired water retention/delivery ability and enhanced mechanical properties. For our investigations, we chose commercially available montmorillonite (ClNa) and several commercial organomodified clays (Cl30B, Cl20A and Cl15A). The results of several analyses (FTIR, TGA, DMA, XRD, microscopy and swelling studies) demonstrated that both the presence of Salecan and the clay type influenced the structure and properties of the final nanocomposites.
LLODIA: A Linguistic Linked Open Data Model for Diachronic Analysis
This article proposes a linguistic linked open data model for diachronic analysis (LLODIA) that combines data derived from diachronic analysis of multilingual corpora with dictionary-based evidence. A humanities use case was devised as a proof of concept that includes examples in five languages (French, Hebrew, Latin, Lithuanian and Romanian) related to various meanings of the term “revolution” considered at different time intervals. The examples were compiled through diachronic word embedding and dictionary alignment.