Vision systems with the human in the loop
The emerging cognitive vision paradigm deals with vision systems that apply machine learning and automatic reasoning in order to learn from what they perceive. Cognitive vision systems can rate the relevance and consistency of newly acquired knowledge and can adapt to their environment, and thus exhibit high robustness. This contribution presents three vision systems that aim at flexibility and robustness. One is tailored for content-based image retrieval; the others are cognitive vision systems that constitute prototypes of visual active memories which evaluate, gather, and integrate contextual knowledge for visual analysis. All three systems are designed to interact with human users. After discussing adaptive content-based image retrieval and object and action recognition in an office environment, we raise the issue of assessing cognitive systems. Experiences from psychologically evaluated human-machine interactions are reported, and the promising potential of psychologically based usability experiments is stressed.
Data Analysis Methods for Software Systems
Using statistics, econometrics, machine learning, and functional data analysis methods, we evaluate the consequences of the lockdown during the COVID-19 pandemic for wage inequality and unemployment. We deduce that these two indicators mostly reacted to the first lockdown, from March to June 2020. In analysing wage inequality, we also conduct the analysis separately for males and females and for different age groups. We found that young females were affected most by the lockdown. Nevertheless, all groups reacted to the lockdown to some degree.
Preface
DAMSS-2018 is the jubilee 10th international workshop on data analysis methods for software systems, organized in Druskininkai, Lithuania, at the end of the year, in the same place and at the same time every year.
Ten years have passed since the first workshop. The history of the workshop starts in 2009 with 16 presentations. The idea of such a workshop came up at the Institute of Mathematics and Informatics. The Lithuanian Academy of Sciences and the Lithuanian Computer Society supported the idea, which won approval both in the Lithuanian research community and abroad. This year there are 81 presentations and 113 registered participants from 13 countries.
In 2010, the Institute of Mathematics and Informatics became a member of Vilnius University, the largest university in Lithuania. In 2017, the institute changed its name to the Institute of Data Science and Digital Technologies. This name reflects the recent activities of the institute. The renewed institute has eight research groups: Cognitive Computing, Image and Signal Analysis, Cyber-Social Systems Engineering, Statistics and Probability, Global Optimization, Intelligent Technologies, Education Systems, and Blockchain Technologies.
The main goal of the workshop is to introduce the research undertaken at Lithuanian and foreign universities in the fields of data science and software engineering. Holding the workshop annually allows a fast exchange of new ideas among the research community.
As many as 11 companies supported the workshop this year, which shows that its topics are relevant to business, too. Topics of the workshop cover big data, bioinformatics, data science, blockchain technologies, deep learning, digital technologies, high-performance computing, visualization methods for multidimensional data, machine learning, medical informatics, ontological engineering, optimization in data science, business rules, and software engineering. To facilitate relations between science and business, a special session and panel discussion is organized this year on topical business problems that may be solved together with the research community.
This book gives an overview of all presentations of DAMSS-2018.
Text-based Sentiment Analysis and Music Emotion Recognition
Nowadays, with the expansion of social media, large amounts of user-generated texts like tweets, blog posts or product reviews are shared online. Sentiment polarity analysis of such texts has become highly attractive and is utilized in recommender systems, market predictions, business intelligence and more. We also witness deep learning techniques becoming top performers on those types of tasks. There are, however, several problems that need to be solved for efficient use of deep neural networks on text mining and text polarity analysis.
First of all, deep neural networks are data hungry. They need to be fed with datasets that are big in size, cleaned and preprocessed as well as properly labeled. Second, the modern natural language processing concept of word embeddings as a dense and distributed text feature representation solves sparsity and dimensionality problems of the traditional bag-of-words model. Still, there are various uncertainties regarding the use of word vectors: should they be generated from the same dataset that is used to train the model, or is it better to source them from big and popular collections that work as generic text feature representations? Third, it is not easy for practitioners to find a simple and highly effective deep learning setup for various document lengths and types. Recurrent neural networks are weak with longer texts and optimal convolution-pooling combinations are not easily conceived. It is thus convenient to have generic neural network architectures that are effective and can adapt to various texts, encapsulating much of the design complexity.
This thesis addresses the above problems to provide methodological and practical insights for utilizing neural networks on sentiment analysis of texts and achieving state-of-the-art results. Regarding the first problem, the effectiveness of various crowdsourcing alternatives is explored and two medium-sized and emotion-labeled song datasets are created utilizing social tags. One of the research interests of Telecom Italia was the exploration of relations between music emotional stimulation and driving style. Consequently, a context-aware music recommender system that aims to enhance driving comfort and safety was also designed. To address the second problem, a series of experiments with large text collections of various contents and domains were conducted. Word embeddings with different parameters were evaluated and results revealed that their quality is influenced (mostly but not only) by the size of the texts they were created from. When working with small text datasets, it is thus important to source word features from popular and generic word embedding collections. Regarding the third problem, a series of experiments involving convolutional and max-pooling neural layers were conducted. Various patterns relating text properties and network parameters with optimal classification accuracy were observed. Combining convolutions of words, bigrams, and trigrams with regional max-pooling layers in a couple of stacks produced the best results. The derived architecture achieves competitive performance on sentiment polarity analysis of movie, business and product reviews.
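The combination of word, bigram, and trigram convolutions with regional max-pooling can be illustrated with a minimal pure-Python sketch. It is not the architecture from the thesis: the embedding size, the random filter weights, the pooling region of 4, the single stack, and all function names are assumptions chosen only to show the mechanics.

```python
import random

def conv1d(embedded, kernel, width):
    """Slide one filter spanning `width` words over the text, applying ReLU per window."""
    out = []
    for start in range(len(embedded) - width + 1):
        # Flatten the word vectors inside the current window.
        window = [x for vec in embedded[start:start + width] for x in vec]
        out.append(max(0.0, sum(w * x for w, x in zip(kernel, window))))
    return out

def regional_max_pool(feature_map, region):
    """Keep the maximum of each consecutive, non-overlapping region."""
    return [max(feature_map[i:i + region])
            for i in range(0, len(feature_map), region)]

def ngram_features(embedded, kernels, region):
    """Concatenate pooled feature maps for filters of width 1, 2 and 3."""
    features = []
    for width, kernel in kernels.items():
        features.extend(regional_max_pool(conv1d(embedded, kernel, width), region))
    return features

rng = random.Random(0)
dim = 4  # toy embedding size (assumption)
# Ten "words", each a random vector standing in for a real word embedding.
text = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(10)]
# One random filter per n-gram width; a trained model would learn these weights.
kernels = {w: [rng.uniform(-1, 1) for _ in range(w * dim)] for w in (1, 2, 3)}
features = ngram_features(text, kernels, region=4)
print(len(features))
```

Unlike global max-pooling, which keeps one value per filter, regional pooling keeps one value per region, so some positional information survives. Stacking would repeat the convolution and pooling steps on the pooled output.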
Given that labeled data are becoming the bottleneck of the current deep learning systems, a future research direction could be the exploration of various data programming possibilities for constructing even bigger labeled datasets. Investigation of feature-level or decision-level ensemble techniques in the context of deep neural networks could also be fruitful. Different feature types usually represent complementary characteristics of data. Combining word embedding and traditional text features, or utilizing recurrent networks on document splits and then aggregating the predictions, could further increase the prediction accuracy of such models.
A DATA DRIVEN APPROACH TO IDENTIFY JOURNALISTIC 5WS FROM TEXT DOCUMENTS
Textual understanding is the process of automatically extracting accurate high-quality information from text. The amount of textual data available from different sources such as news, blogs and social media is growing exponentially. These data encode significant latent information which if extracted accurately can be valuable in a variety of applications such as medical report analyses, news understanding and societal studies. Natural language processing techniques are often employed to develop customized algorithms to extract such latent information from text.
Journalistic 5Ws refer to the basic information in news articles that describes an event and include where, when, who, what and why. Extracting them accurately may facilitate better understanding of many social processes including social unrest, human rights violations, propaganda spread, and population migration. Furthermore, the 5Ws information can be combined with socio-economic and demographic data to analyze state and trajectory of these processes.
In this thesis, a data driven pipeline has been developed to extract the 5Ws from text using syntactic and semantic cues in the text. First, a classifier is developed to identify articles specifically related to social unrest. The classifier has been trained with a dataset of over 80K news articles. We then use NLP algorithms to generate a set of candidates for the 5Ws. Next, a series of algorithms to extract the 5Ws is developed. These heuristic-based algorithms leverage specific words and parts of speech customized for individual Ws to compute their scores. The heuristics are based on the syntactic structure of the document as well as syntactic and semantic representations of individual words and sentences. These scores are then combined and ranked to obtain the best answers to the journalistic 5Ws. The classification accuracy of the algorithms is validated using a manually annotated dataset of news articles.
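The score-combine-rank step described above can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis's algorithms: the `Candidate` fields, the cue words in `CUES`, the entity-type tags, and the score weights are all hypothetical, and only three of the five Ws are covered.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    sentence_index: int  # index of the sentence the phrase came from
    entity_type: str     # hypothetical tag, e.g. "PERSON", "LOCATION", "DATE"
    preceded_by: str     # word immediately before the phrase

# Hypothetical cues: the entity type and preceding words that suggest each W.
CUES = {
    "who":   {"type": "PERSON",   "words": {"by", "said"}},
    "where": {"type": "LOCATION", "words": {"in", "at", "near"}},
    "when":  {"type": "DATE",     "words": {"on", "since", "during"}},
}

def score(candidate, w):
    """Combine three simple heuristic scores for one candidate and one W."""
    cue = CUES[w]
    type_score = 1.0 if candidate.entity_type == cue["type"] else 0.0
    cue_score = 0.5 if candidate.preceded_by.lower() in cue["words"] else 0.0
    # Earlier sentences in a news article are assumed to be more informative.
    position_score = 1.0 / (1 + candidate.sentence_index)
    return type_score + cue_score + position_score

def best_answers(candidates):
    """Rank the candidates for each W and keep the top-scoring phrase."""
    return {w: max(candidates, key=lambda c: score(c, w)).text for w in CUES}

candidates = [
    Candidate("union leaders", 0, "PERSON", "by"),
    Candidate("Cairo", 0, "LOCATION", "in"),
    Candidate("Friday", 1, "DATE", "on"),
    Candidate("the ministry", 3, "ORGANIZATION", "at"),
]
answers = best_answers(candidates)
print(answers)
```

Keeping a separate cue set per W mirrors the idea of heuristics customized for individual Ws. A real pipeline would draw its candidates and tags from an NLP toolkit rather than from hand-written records.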
Constructing an understanding of mind: the development of children's social understanding within social interaction
Theories of children's developing understanding of mind tend to emphasize either individualistic processes of theory formation, maturation, or introspection, or the process of enculturation. However, such theories must be able to account for the accumulating evidence of the role of social interaction in the development of social understanding. We propose an alternative account, according to which the development of children's social understanding occurs within triadic interaction involving the child's experience of the world as well as communicative interaction with others about their experience and beliefs (Chapman 1991; 1999). It is through such triadic interaction that children gradually construct knowledge of the world as well as knowledge of other people. We contend that the extent and nature of the social interaction children experience will influence the development of children's social understanding. Increased opportunity to engage in cooperative social interaction and exposure to talk about mental states should facilitate the development of social understanding. We review evidence suggesting that children's understanding of mind develops gradually in the context of social interaction. Therefore, we need a theory of development in this area that accords a fundamental role to social interaction, yet does not assume that children simply adopt socially available knowledge but rather that children construct an understanding of mind within social interaction.