30 research outputs found
Recognition of pen-based music notation with finite-state machines
This work presents a statistical model to recognize pen-based music compositions using stroke recognition algorithms and finite-state machines. The series of strokes received as input is mapped onto a stochastic representation, which is combined with a formal language that describes musical symbols in terms of stroke primitives. Then, a Probabilistic Finite-State Automaton is obtained, which defines probabilities over the set of musical sequences. This model is eventually crossed with a semantic language to avoid sequences that do not make musical sense. Finally, a decoding strategy is applied in order to output a hypothesis about the musical sequence actually written. Comprehensive experiments with several decoding algorithms, stroke similarity measures and probability density estimators are carried out and evaluated following different metrics of interest. The results show the goodness of the proposed model, which obtains competitive performance in all metrics and scenarios considered. This work was supported by the Spanish Ministerio de Educación, Cultura y Deporte through a FPU Fellowship (Ref. AP2012–0939) and the Spanish Ministerio de Economía y Competitividad through the TIMuL Project (No. TIN2013-48152-C2-1-R, supported by UE FEDER funds).
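The decoding step described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a hypothetical lexicon (`LEXICON`) that spells each symbol as a fixed sequence of stroke primitives, takes per-stroke primitive probabilities as given, and uses a simple Viterbi-style dynamic program with no transition probabilities and no semantic language.

```python
import math

# Hypothetical lexicon (an assumption for illustration): each musical symbol
# is spelled as a fixed sequence of stroke primitives.
LEXICON = {
    "quarter_note": ("dot", "stem"),
    "half_note": ("circle", "stem"),
    "whole_note": ("circle",),
}


def decode(stroke_probs, lexicon=LEXICON):
    """Return (symbols, log_probability) for the most probable segmentation.

    stroke_probs[i] maps a primitive name to P(primitive | stroke i).
    """
    n = len(stroke_probs)
    # best[i] holds the best (log-probability, symbols) covering the first i strokes.
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for symbol, primitives in lexicon.items():
            start = end - len(primitives)
            if start < 0 or best[start][0] == -math.inf:
                continue
            score = best[start][0]
            # Multiply (add in log space) the probability of each required primitive.
            for probs, primitive in zip(stroke_probs[start:end], primitives):
                p = probs.get(primitive, 0.0)
                score += math.log(p) if p > 0 else -math.inf
            if score > best[end][0]:
                best[end] = (score, best[start][1] + [symbol])
    return best[n][1], best[n][0]


strokes = [
    {"circle": 0.7, "dot": 0.3},
    {"stem": 0.9, "circle": 0.1},
    {"circle": 0.8, "dot": 0.2},
]
symbols, log_prob = decode(strokes)
```

With these made-up probabilities, the sketch prefers reading the first two strokes as one symbol and the third as another, because that segmentation has the highest product of primitive probabilities.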
FEATURE EXTRACTION AND CLASSIFICATION THROUGH ENTROPY MEASURES
Entropy is a universal concept that represents the uncertainty of a series of random
events. The notion of "entropy" is understood differently in different disciplines. In physics,
it represents the thermodynamical state variable; in statistics it measures the degree of
disorder. In computer science, on the other hand, it is used as a powerful tool for measuring
the regularity (or complexity) of signals or time series. In this work, we have
studied entropy-based features in the context of signal processing.
The purpose of feature extraction is to select the relevant features from an entity. The
type of features depends on the signal characteristics and the purpose of the classification. Many
real-world signals are nonlinear and nonstationary, and they contain information that
cannot be described by time- and frequency-domain parameters but might instead be
described well by entropy.
However, in practice, the estimation of entropy suffers from some limitations and is
highly dependent on series length. To reduce this dependence, we have proposed parametric
estimation of various entropy indices and have derived analytical expressions
(when possible) as well. We have then studied the feasibility of parametric estimation
of entropy measures on both synthetic and real signals. Finally, the entropy-based features
have been employed for classification problems related to clinical applications,
activity recognition, and handwritten character recognition. Thus, from a methodological
point of view, our study deals with feature extraction, machine learning, and classification
methods.
Different versions of entropy measures are found in the literature for signal analysis.
Among them, approximate entropy (ApEn), sample entropy (SampEn), and
corrected conditional entropy (CcEn) are the most used for physiological signal analysis.
Recently, entropy features have also been used for image segmentation. A related measure
is Lempel-Ziv complexity (LZC), which measures the complexity of a
time series, signal, or sequence. The estimation of LZC also relies on the series length.
In particular, in this study, analytical expressions have been derived for the ApEn, SampEn,
and CcEn of auto-regressive (AR) models. It should be mentioned that AR
models have been employed for maximum entropy spectral estimation for many
years. The feasibility of parametric estimates of these entropy measures has been
studied on both synthetic series and real data. In the feasibility study, the agreement between
numerical estimates of entropy and estimates obtained through a certain number
of realizations of the AR model using Monte Carlo simulations has been observed. This
agreement or disagreement provides information about the nonlinearity, nonstationarity,
or non-Gaussianity present in the series. In some classification problems, the probability
of agreement or disagreement has proved to be one of the most relevant
features.
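As a point of reference for the numerical estimates mentioned above, the following is a minimal sketch of a direct sample entropy estimator. It follows the commonly used SampEn(m, r) definition, assuming Chebyshev distance, no self-matches, and an absolute tolerance `r` (many applications instead scale `r` by the standard deviation of the series). It is not the parametric AR-based estimator proposed in the thesis, and returning infinity when no matches are found is a convention chosen here.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Direct estimate of SampEn(m, r) for a sequence of numbers.

    Assumptions of this sketch: r is an absolute tolerance, templates are
    compared with the Chebyshev (maximum) distance, and self-matches are excluded.
    """
    n = len(series)

    def count_matches(length):
        # Count template pairs (i < j) whose first `length` points all lie within r.
        # The same n - m starting points are used for lengths m and m + 1.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(series[i + k] - series[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return math.inf  # convention of this sketch: no matches, estimate undefined
    return -math.log(a / b)


regular = sample_entropy([0.0, 1.0] * 10, m=2, r=0.1)
irregular = sample_entropy([0, 0, 0, 1], m=1, r=0.1)
```

A perfectly periodic series yields an estimate of zero, while the short irregular series yields a positive value. The quadratic loop over template pairs also makes the dependence on series length easy to see: short series give few matches and therefore unstable estimates.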
After the feasibility study of the parametric entropy estimates, the entropy and related
measures have been applied in heart rate and arterial blood pressure variability analysis.
The use of entropy and related features has also proved relevant in developing
sleep classification, handwritten character recognition, and physical activity
recognition systems.
The novel methods for feature extraction researched in this thesis give good classification
or recognition accuracy, in many cases superior to that of the features reported in the
literature of the application domains concerned, and at a lower computational cost.
A New Species of Frog (Anura: Dicroglossidae) Discovered from the Mega City of Dhaka
We describe a new species of frog of the genus Zakerana discovered from the urban core of Dhaka, Bangladesh, one of the most densely populated cities in the world. Although the new species is morphologically similar to the geographically proximate congeners in the Bangladeshi cricket frog group, we show that it can be distinguished from all congeners on the basis of morphological characters, advertisement calls and variation in two mitochondrial DNA genes (12S rRNA and 16S rRNA). Apart from several diagnostic differences in body proportions, the new species differs from other Zakerana species in having a flattened snout (from ventral view) projecting over the lower jaw, and diagnostic trapezoid-shaped red markings on the vocal sac in males. Molecular genetic analyses show that the new species is highly divergent (3.1-20.1% sequence divergence) from all congeneric species, and forms a well-supported clade with its sister species, Zakerana asmati. The discovery of a new amphibian species from the urban core of Dhaka, together with several recent descriptions of new amphibian species from Bangladesh, may indicate that more amphibian species remain to be discovered from this country. Peer reviewed
Citizen-driven humanitarianism and the Bangladesh liberation war : Australian aid during the 1971 refugee crisis
This open access book presents an international history of humanitarianism during the Bangladesh Liberation War in 1971. Examining the motivations, actions and competing interests of multiple humanitarian actors such as the Red Cross, Oxfam, grassroots NGOs and individuals, it analyses the impact of humanitarianism on refugees in the camps. With western governments indifferent or slow to respond to India's pleas for assistance, Stevens shows how international aid to Bangladeshi refugees during the 1971 crisis was citizen-driven. Focusing on the actions of individuals and NGOs in Australia, Stevens shows how they rallied community support, fundraised at record levels and effectively lobbied the Australian government to increase aid and recognise Bangladesh's independence. Using archival materials from Australia, the UK, Switzerland and the US, Citizen-driven Humanitarianism and the Bangladeshi Liberation War provides an account of how civil society was galvanized, even radicalized, in its pursuit to remedy systemic problems such as ethnic persecution, militarism and poverty. Documenting the myriad forces at play during the refugee crisis of 1971, it shows how broader social and cultural developments coalesced to create the citizen-driven humanitarianism of the late 20th century.
Satellite Workshop On Language, Artificial Intelligence and Computer Science for Natural Language Processing Applications (LAICS-NLP): Discovery of Meaning from Text
This paper proposes a novel method to disambiguate important words from a collection of documents. The
hypothesis that underlies this approach is that there is a
minimal set of senses that are significant in characterizing a context. We extend Yarowsky’s one-sense-per-discourse
hypothesis [13] to a collection of related
documents rather than a single document. We perform
distributed clustering on a set of features representing
each of the top ten categories of documents in the
Reuters-21578 dataset. Groups of terms that have a
similar term distributional pattern across documents were
identified. WordNet-based similarity measurement was
then computed for terms within each cluster. An
aggregation of the associations in WordNet that was
employed to ascertain term similarity within clusters has
provided a means of identifying clusters’ root senses.
Sentiment analysis in a resource scarce language: Hindi
A common human behavior is to seek others' opinions before making a decision. With the tremendous availability of documents that express opinions on different issues, the challenge arises to analyze them and produce useful knowledge from them. Many works in the area of Sentiment Analysis are available for the English language. Over the last few years, opinion-rich resources have been booming in other languages, and hence there is a need to perform Sentiment Analysis in those languages. In this paper, a Sentiment Analysis in Hindi Languag
STIXnet: entity and relation extraction from unstructured CTI reports
The increased frequency of cyber attacks against organizations and their potentially devastating effects have raised awareness of the severity of these threats. In order to proactively harden their defences, organizations have started to invest in Cyber Threat Intelligence (CTI), the field of Cybersecurity that deals with the collection, analysis and organization of intelligence on the attackers and their techniques. By being able to profile the activity of a particular threat actor, thus knowing the types of organizations that it targets and the kind of vulnerabilities that it exploits, it is possible not only to mitigate its attacks, but also to prevent them.
Although the sharing of this type of intelligence is facilitated by several standards such as STIX (Structured Threat Information eXpression), most of the data still consists of reports written in natural language. This particular format can be highly time-consuming for Cyber Threat Intelligence analysts, who may need to read the entire report and label entities and relations in order to generate an interconnected graph from which the intel can be extracted.
In this thesis, done in collaboration with Leonardo S.p.A., we provide a modular and extensible system called STIXnet for the extraction of entities and relations from natural language CTI reports. The tool is embedded in a larger platform, developed by Leonardo, called Cyber Threat Intelligence System (CTIS) and therefore inherits some of its features, such as an extensible knowledge base which also acts as a database for the entities to extract.
STIXnet uses techniques from Natural Language Processing (NLP), the branch of computer science that studies the ability of a computer program to process and analyze natural language data. This field of study has recently been revolutionized by the increasing popularity of Machine Learning, which allows for more efficient algorithms and better results. After looking for known entities retrieved from the knowledge base, STIXnet analyzes the semantic structure of the sentences in order to extract new possible entities and predict the Tactics, Techniques, and Procedures (TTPs) used by the attacker. Finally, an NLP model extracts relations between these entities and converts them to be compliant with the STIX 2.1 standard, thus generating an interconnected graph which can be exported and shared. STIXnet can also be constantly and automatically improved with feedback from a human analyst, who, by highlighting false positives and false negatives in the processing of the report, can trigger a fine-tuning process that will increase the tool's overall accuracy and precision.
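To make the final conversion step more concrete, here is a minimal sketch of how one extracted relation could be written as a STIX 2.1 bundle using only the Python standard library. It is not STIXnet's code: the function names (`stix_id`, `make_bundle`) and the entity names are hypothetical, only the required common properties are filled in, and no validation against the specification is performed.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_id(object_type):
    """Build a STIX identifier of the form '<type>--<UUID>'."""
    return f"{object_type}--{uuid.uuid4()}"


def make_bundle(actor_name, technique_name):
    """Return a bundle with a threat actor, an attack pattern, and a 'uses' relation."""
    # STIX timestamps are UTC with a trailing 'Z'; millisecond precision is used here.
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds").replace("+00:00", "Z")
    common = {"spec_version": "2.1", "created": now, "modified": now}

    actor = {"type": "threat-actor", "id": stix_id("threat-actor"), "name": actor_name, **common}
    technique = {"type": "attack-pattern", "id": stix_id("attack-pattern"), "name": technique_name, **common}
    # The relationship object is the graph edge linking the two entities.
    relation = {
        "type": "relationship",
        "id": stix_id("relationship"),
        "relationship_type": "uses",
        "source_ref": actor["id"],
        "target_ref": technique["id"],
        **common,
    }
    return {"type": "bundle", "id": stix_id("bundle"), "objects": [actor, technique, relation]}


# Hypothetical entity names, used only for illustration.
bundle = make_bundle("Example Actor", "Example Technique")
serialized = json.dumps(bundle, indent=2)
```

Each relationship object acts as an edge between two entity objects, which is how a set of extracted entities and relations becomes the interconnected graph the abstract refers to.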
This framework can help defenders to immediately know at a glance all the gathered intelligence on a particular threat actor and thus deploy effective threat detection, perform attack simulations and strengthen their defenses, and together with the Cyber Threat Intelligence System platform organizations can be always one step ahead of the attacker and be secure against Advanced Persistent Threats (APTs).
Global information society watch 2007
This publication, the first in a series of reports covering the state of the information society on an annual basis, focuses on the theme of participation. The report has three interrelated goals: surveying the state of the field of ICT policy at the local and global levels; encouraging critical debate; and strengthening networking and advocacy for a just, inclusive information society. It discusses the WSIS process and a range of international institutions, regulatory agencies and monitoring instruments from the perspective of civil society and stakeholders in the global South. The..
Asian Yearbook of International Law, Volume 23 (2017)
The Yearbook aims to promote research, studies and writings in the field of international law in Asia, as well as to provide an intellectual platform for the discussion and dissemination of Asian views and practices on contemporary international legal issues. Readership: All interested in International Law and Asian Law