Data mining allows the exploration of sequences of phenomena, whereas one
usually tends to focus on isolated phenomena or on the relation between two
phenomena. It offers invaluable tools for theoretical analyses and exploration
of the structure of sentences, texts, dialogues, and speech. We report here the
results of an attempt at using it for inspecting sequences of verbs from French
accounts of road accidents. This analysis comes from an original approach of
unsupervised training allowing the discovery of the structure of sequential
data. The entries of the analyzer were only made of the verbs appearing in the
sentences. It provided a classification of the links between two successive
verbs into four distinct clusters, allowing thus text segmentation. We give
here an interpretation of these clusters by applying a statistical analysis to
independent semantic annotations

Bennani, Younès

Recanati, Catherine

Rogovschi, Nicoleta

English

arXiv

International audienceData mining allows the exploration of sequences of phenomena, whereas one usually tends to focus on isolated phenomena or on the relation between two phenomena. It offers invaluable tools for theoretical analyses and exploration of the structure of sentences, texts, dialogues, and speech. We report here the results of an attempt at using it for inspecting sequences of verbs from French accounts of road accidents. This analysis comes from an original approach of unsupervised training allowing the discovery of the structure of sequential data. The entries of the analyzer were only made of the verbs appearing in the sentences. It provided a classification of the links between two successive verbs into four distinct clusters, allowing thus text segmentation. We give here an interpretation of these clusters by applying a statistical analysis to independent semantic annotations

The structure of verbal sequences analyzed with unsupervised learning techniques

Abstract

Similar works

Full text

Available Versions

HAL-Paris 13