29,449 research outputs found
Towards the Automatic Classification of Documents in User-generated Classifications
There is a huge amount of information scattered on the World Wide Web. As the information flow occurs at a high speed in the WWW, there is a need to organize it in the right manner so that a user can access it very easily. Previously the organization of information was generally done manually, by matching the document contents to some pre-defined categories. There are two approaches for this text-based categorization: manual and automatic. In the manual approach, a human expert performs the classification task, and in the second case supervised classifiers are used to automatically classify resources. In a supervised classification, manual interaction is required to create some training data before the automatic classification task takes place. In our new approach, we intend to propose automatic classification of documents through semantic keywords and building the formulas generation by these keywords. Thus we can reduce this human participation by combining the knowledge of a given classification and the knowledge extracted from the data. The main focus of this PhD thesis, supervised by Prof. Fausto Giunchiglia, is the automatic classification of documents into user-generated classifications. The key benefits foreseen from this automatic document classification is not only related to search engines, but also to many other fields like, document organization, text filtering, semantic index managing
Survey of data mining approaches to user modeling for adaptive hypermedia
The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the applicatio
Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features
Satirical news is considered to be entertainment, but it is potentially
deceptive and harmful. Despite the embedded genre in the article, not everyone
can recognize the satirical cues and therefore believe the news as true news.
We observe that satirical cues are often reflected in certain paragraphs rather
than the whole document. Existing works only consider document-level features
to detect the satire, which could be limited. We consider paragraph-level
linguistic features to unveil the satire by incorporating neural network and
attention mechanism. We investigate the difference between paragraph-level
features and document-level features, and analyze them on a large satirical
news dataset. The evaluation shows that the proposed model detects satirical
news effectively and reveals what features are important at which level.Comment: EMNLP 2017, 11 page
- …