Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation
We present a probabilistic model that uses both prosodic and lexical cues for
the automatic segmentation of speech into topically coherent units. We propose
two methods for combining lexical and prosodic information using hidden Markov
models and decision trees. Lexical information is obtained from a speech
recognizer, and prosodic features are extracted automatically from speech
waveforms. We evaluate our approach on the Broadcast News corpus, using the
DARPA-TDT evaluation metrics. Results show that the prosodic model alone is
competitive with word-based segmentation methods. Furthermore, we achieve a
significant reduction in error by combining the prosodic and word-based
knowledge sources.
Comment: 27 pages, 8 figures
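The idea of combining two knowledge sources inside a hidden Markov model can be illustrated with a minimal sketch. This is not the authors' model: the two-state layout (`same` / `boundary`), the interpolation `weight`, and the function name `combine_and_decode` are all assumptions made for illustration. Each position carries a lexical and a prosodic log-score per state, the two are mixed linearly, and Viterbi decoding picks the most likely boundary sequence.

```python
import math

# Hypothetical two-state layout: each position either continues the
# current topic ("same") or starts a new one ("boundary").
STATES = ("same", "boundary")

def combine_and_decode(lex_scores, pros_scores, log_trans, log_init, weight=0.5):
    """Viterbi decoding over linearly interpolated log-scores.

    lex_scores, pros_scores: one dict per position, state -> log-score.
    log_trans: dict prev_state -> dict next_state -> log-probability.
    log_init: dict state -> log-probability of starting in that state.
    weight: share given to the prosodic score (an assumed tuning knob).
    """
    if not lex_scores:
        return []
    n = len(lex_scores)
    # Combined emission score for every position and state.
    emit = [
        {s: (1 - weight) * lex_scores[t][s] + weight * pros_scores[t][s]
         for s in STATES}
        for t in range(n)
    ]
    best = [{s: log_init[s] + emit[0][s] for s in STATES}]
    back = []
    for t in range(1, n):
        row, ptr = {}, {}
        for s in STATES:
            # Best predecessor state for landing in s at position t.
            prev = max(STATES, key=lambda p: best[-1][p] + log_trans[p][s])
            row[s] = best[-1][prev] + log_trans[prev][s] + emit[t][s]
            ptr[s] = prev
        best.append(row)
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

Setting `weight` to 0 or 1 reduces the sketch to a lexical-only or prosodic-only decoder, which is one simple way to compare the single-source and combined settings the abstract describes.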
Mining Hidden Markov Models in Sequences of Characters Using Recurrent Neural Networks
Restoring damaged historical manuscripts and making them available to the general public has been of great interest to humanities researchers since long before computers provided assistance for this task. Current technologies and models make this process easier, more accurate, and capable of discovering parts that were previously unknown. I use Recurrent Neural Networks to uncover hidden Markov models in sequences of characters from historic manuscripts. Such manuscripts are typically written in an archaic language, which makes the underlying machine learning problem inherently difficult, as little training data is generally available. I use bidirectional, hierarchical models for sequences of one or more characters, trained on the existing manuscript data. I tested my model and present experimental results using an Old English manuscript.
Automated Monitoring of Online News Streams: Topic Detection and Tracking Considerations
This paper describes the term frequency patterns found in online news summaries published over a seven-week period. The patterns are analyzed qualitatively and quantitatively to facilitate the refinement of algorithms used for the automatic detection and tracking of important topics appearing in streams of text. It is shown that a term's importance cannot be measured by raw frequency counts or significant increases in volume alone. The impact of these findings on existing algorithms is discussed, and new approaches for automated story detection and presentation are considered.
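The two naive signals named in the abstract above, raw frequency and increase in volume, can be computed with a minimal sketch like the following. It is only an illustration of those baseline measures, not the paper's analysis: the whitespace tokenizer, the add-one smoothing in the growth ratio, and the function names are assumptions.

```python
from collections import Counter

def window_counts(windows):
    """Count lowercase whitespace tokens in each time window.

    windows: list of time windows, each a list of document strings.
    """
    return [
        Counter(tok.lower() for doc in docs for tok in doc.split())
        for docs in windows
    ]

def naive_signals(windows, term):
    """Return (raw counts per window, growth ratio between adjacent windows).

    The growth ratio uses add-one smoothing (an assumption) so that a
    term absent from the previous window does not cause division by zero.
    """
    counts = [c[term] for c in window_counts(windows)]
    growth = [
        (counts[i] + 1) / (counts[i - 1] + 1)
        for i in range(1, len(counts))
    ]
    return counts, growth
```

A function word such as "the" tends to score high on raw counts with a flat growth ratio, while a rare term can show a large ratio from only a handful of mentions, which is consistent with the abstract's point that neither signal is sufficient alone.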
Understanding musical genre preference evolution within a social network
Dissertation presented as partial requirement for obtaining the Master’s degree in Information Management, specialization in Knowledge Management and Business Intelligence.
Music is a field that simply cannot be dissociated from the social aspects of life.
Throughout human history, popular music has always been a reflection of the different
aspects of society. As such, a number of studies are available that showcase
this reflection and draw multiple types of insights.
In this thesis, we will look to contribute to this field by assessing the evolution of
musical genre preferences over time throughout a social network. Using data obtained
through a social evolution experiment with around 80 individuals, we will make an
initial assessment of our existing data. This evaluation is then taken into consideration in the
next phase of our work where we define the principles necessary to represent and analyse
the existing social network. Afterwards, we will showcase a representation of this network,
as well as analyse it using various metrics and sub-structures commonly applied in Social
Network Analysis. After this, we will evaluate the homogenisation of a network as time goes
on. In other words, we will assess the evolution of differences in preferences between
individuals that were connected in the social network, in order to understand if there is a
trend of these differences diminishing over time.
A sequence-based algorithm, more specifically a Hidden Markov Model, is used to
predict the change in musical genre preferences. This was done by considering each
individual’s own preferences as well as the preferences of their connections within the social
network with the ultimate goal of assessing how influential the network is in the evolution of
a person’s musical genre preferences. To tackle the same research question and provide an
alternative approach, as well as a comparison model, we used a Support Vector Machine
model.
Finally, we discuss the results and limitations that led to our model definition. Overall,
this thesis seeks to build upon previous work regarding musical genre preferences by
assessing them within the context of a network and taking into account their evolution
over time.
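The homogenisation assessment described in this abstract, tracking whether preference differences between connected individuals shrink over time, can be sketched as follows. This is a minimal illustration and not the thesis's method: the L1 distance, the representation of preferences as equal-length numeric vectors per genre, and the function names are assumptions.

```python
def mean_edge_distance(edges, prefs):
    """Average L1 distance between the preference vectors of connected people.

    edges: list of (person_a, person_b) pairs in the social network.
    prefs: dict person -> list of genre preference scores (same length).
    """
    if not edges:
        return 0.0
    total = 0.0
    for a, b in edges:
        total += sum(abs(x - y) for x, y in zip(prefs[a], prefs[b]))
    return total / len(edges)

def homogenisation_trend(edges, prefs_by_time):
    """Mean edge distance at each time step; a falling series suggests
    that connected individuals are becoming more alike."""
    return [mean_edge_distance(edges, prefs) for prefs in prefs_by_time]
```

For simplicity the sketch keeps the edge list fixed across time steps; a network whose ties change over time would need one edge list per step.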
Sistema autonômico para detecção de mudanças em eventos a partir de notícias
Topic Detection and Tracking (TDT) has been the subject of much research since it was defined in the late 1990s and early 2000s; its main goal is to identify real-world events from unstructured information. Autonomic Computing has likewise been growing since the early 2000s; the term designates systems that are capable of measuring their own performance automatically, and it is used in the latest technologies. Much work has been done on both topics, but only a few studies unite these two important concepts to minimize the human intervention needed to analyze unstructured information. The present work aims to create an autonomic system for change detection in events from news articles.
A Framework for Logical Structure Extraction from Software Requirements Documents
General-purpose rich-text editors, such as MS Word, are often used to author software requirements specifications. These requirements specifications contain many different logical structures, such as use cases, business rules and functional requirements. Automated recognition and extraction of these logical structures is necessary to provide useful automated requirements management features, such as automated traceability, template conformance checking, guided editing and interoperability with sophisticated requirements management tools like Requisite Pro. The variability among instances of these logical structures and their attributes poses many challenges for their accurate recognition and extraction. The thesis provides a framework for the extraction of logical structures from software requirements documents. The framework models information about style, structure, and attributes of the logical structures and uses the defined meta-model to extract instances of logical structures. The meta-model also incorporates information about the variability present in the instances. The framework includes an extraction tool, ET, that reads the meta-model and extracts instances of modelled logical structures from the documents. The framework is evaluated on a collection of real-world software requirements documents. Using the framework, different logical structures can be extracted with high precision and recall, each close to 100%. The performance of the extraction tool is acceptable for fast extraction of logical structures from documents, with extraction times ranging from a few milliseconds to a few seconds.