    Concept-based knowledge discovery in texts

    The proposal of this work is to apply Knowledge Discovery techniques to features extracted from texts. Instead of applying the techniques to terms (as in [LIN 98]) or to keywords associated with the texts (as in [FEL 98]), the proposal is to identify concepts present in the texts and then apply the discovery techniques to these concepts. This would make it possible to reduce the vocabulary problem and enable discoveries at the level of concepts rather than of words or attribute values

    Exploring Terms and Taxonomies Relating to the Cyber International Relations Research Field: or are "Cyberspace" and "Cyber Space" the same?

    This project has at least two facets to it: (1) advancing the algorithms in the sub-field of bibliometrics often referred to as "text mining" whereby hundreds of thousands of documents (such as journal articles) are scanned and relationships amongst words and phrases are established and (2) applying these tools in support of the Explorations in Cyber International Relations (ECIR) research effort. In international relations, it is important that all the parties understand each other. Although dictionaries, glossaries, and other sources tell you what words/phrases are supposed to mean (somewhat complicated by the fact that they often contradict each other), they do not tell you how people are actually using them. As an example, when we started, we assumed that "cyberspace" and "cyber space" were essentially the same word with just a minor variation in punctuation (i.e., the space, or lack thereof, between "cyber" and "space") and that the choice of the punctuation was a rather random occurrence. With that assumption in mind, we would expect that the taxonomies that would be constructed by our algorithms using "cyberspace" and "cyber space" as seed terms would be basically the same. As it turned out, they were quite different, both in overall shape and groupings within the taxonomy. Since the overall field of cyber international relations is so new, understanding the field and how people think about it (as evidenced by their actual usage of terminology, and how usage changes over time) is an important goal as part of the overall ECIR project

    The use of user-generated content for business intelligence in tourism: insights from an analysis of Croatian hotels

    Web-based peer review sites are gaining importance in travellers’ decision-making and provide information for destinations’ management. Textual reviews are especially important, but very extensive and hard to process. This article discusses the benefits of recent developments in computational linguistics and shows how they can be used, based on a study of 18,000 reviews of Croatian hotels. Results show that numerical evaluation rarely provides sufficient information, while textual reviews reveal details about facilities’ competitive (dis)advantages. Being very extensive, the reviews are difficult to use. By applying computational linguistics the study illustrates how the information can be summarised and used in decision-making. The study extends the application of computational linguistics methodology to tourism literature and provides the first extensive analysis of TripAdvisor data for Croatia

    Text Mining and Algorithm Patterns in Solving Information Problems (A Review)

    Abstract: Text mining aims to find valuable hidden information in both structured and unstructured information sources. The Web is the main source of the textual information available to us, and the amount of text keeps growing over time. Text mining is the discovery of new, previously unknown information, extracted automatically from various written resources. The key element is linking the extracted pieces of information together so that they form new facts or new hypotheses for further exploration. This paper focuses only on reviewing the basic patterns and concepts of the various text mining techniques widely used for solving information problems, as published by several authors. In addition, the author presents a review of text mining and its main themes from around the 1980s to the 2010s, as the field moved from information science into information systems and spread into technology management, architecture, and social ecology. Keywords: Text Mining, Trends, Patterns, Algorithms, Information

    Mobile Operating Systems’ Impact on Customer Value: IOS vs. Android

    Amidst the growing focus on media engagement and customer value in retail marketing literature, mobile commerce (MC) research has gained prominence. This research explores how customers employ mobile operating systems to engage with retailers and extract value within the context of Fast Moving Consumer Goods (FMCG) in retail. In Study 1, a survey involving 398 users uncovered that the customer handset OS moderates the effects of social media, traditional media engagement, and retail “place” on customer value. In Study 2, leveraging data from a foreign FMCG brand deeply immersed in social media platforms, we scrutinize how such engagement dynamics affect the influence of “place” on product sales across e-commerce and conventional retail channels. Our findings make significant theoretical contributions to comprehending customer value in MC, with practical implications for marketers, emphasizing the potential of customer mobile operating system as a valuable tool for effective marketing strategies

    SWEETS: an Expert Recommendation System applied to a Knowledge Management platform

    Organizations, in order to become more competitive in the market, are constantly seeking new ways to improve productivity and the quality of the products they develop, as well as to reduce costs, which is directly related to increasing net revenue. To achieve these goals it is essential to make the most of the potential of their collaborators and of the relationships these collaborators have with one another, that is, to find and share tacit knowledge. Because tacit knowledge resides in people's minds, it is difficult to formalize and document, so the ideal would be to identify and recommend the person who holds the knowledge. Accordingly, this dissertation presents the Expert Recommendation System SWEETS and its deployment in the a.m.i.g.o.s. environment, a knowledge management platform based on concepts drawn from social networks. SWEETS was developed in two versions, 1.0 and 2.0. Version 1.0 proactively brings together people with specialties in common, either through their knowledge (writing profile) or through their interests (reading profile). Version 2.0 of SWEETS does not act proactively, that is, a user must request an expert in a given area, and it relies on folksonomy to extract an ontology, which is fundamental for identifying people's specialties more effectively. This ontology is reflected by the co-occurrence of tags (concepts) in relation to items (instances) and is domain-independent, which is the main contribution of this dissertation. The deployment of SWEETS in a.m.i.g.o.s. aims to bring benefits such as minimizing communication problems in the corporation, providing an incentive for social knowledge, and sharing knowledge, thereby giving the company more effective use of its collaborators' knowledge. Keywords: Recommender Systems, Web Social Networks, folksonomy, and ontology
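
The abstract above says the ontology is reflected by the co-occurrence of tags across items. The sketch below shows one simple way such co-occurrence could be counted. It is not the SWEETS implementation: the function names, the `min_count` threshold, and the sample items and tags are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(items):
    """Count how often each unordered pair of tags labels the same item."""
    counts = Counter()
    for tags in items.values():
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts


def related_tags(counts, tag, min_count=2):
    """Return tags that co-occur with `tag` at least `min_count` times."""
    related = set()
    for (a, b), n in counts.items():
        if n >= min_count:
            if a == tag:
                related.add(b)
            elif b == tag:
                related.add(a)
    return related


# Hypothetical items and tags, for illustration only.
items = {
    "doc1": ["python", "web", "django"],
    "doc2": ["python", "django"],
    "doc3": ["python", "data"],
}
counts = tag_cooccurrence(items)
print(related_tags(counts, "python"))
```

Pairs that pass the threshold could then serve as candidate relations between concepts; how SWEETS actually turns co-occurrence into an ontology is not described in the abstract.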

    A Visual Enhancement for Metadata Generation Tools: A Semi-Automatic Approach via KWIC and Highlighting

    This paper reports on a study that examined a visual enhancement for NC Health Info, an online health information portal for NC residents. The research goal was to improve the Health Topic assignment with a semi-automatic approach via KWIC and highlighting. The study had three components: a contextual inquiry investigating improvable areas; a prototype developed according to the contextual inquiry findings; and a comparative user study evaluating the effects of the proposed approach on the assignment of Health Topics and users' perceptions of two systems. The experiment results proved that the prototype significantly reduced the cataloging time and may potentially improve metadata quality. Additionally, measured users' perceptions of the proposed system were positive. This approach is expected not only to improve NC Health Info services but further enhance metadata generation tools in the future

    Text mining with exploitation of user's background knowledge : discovering novel association rules from text

    The goal of text mining is to find interesting and non-trivial patterns or knowledge from unstructured documents. Both objective and subjective measures have been proposed in the literature to evaluate the interestingness of discovered patterns. However, objective measures alone are insufficient because such measures do not consider knowledge and interests of the users. Subjective measures require explicit input of user expectations which is difficult or even impossible to obtain in text mining environments. This study proposes a user-oriented text-mining framework and applies it to the problem of discovering novel association rules from documents. The developed system, uMining, consists of two major components: a background knowledge developer and a novel association rules miner. The background knowledge developer learns a user's background knowledge by extracting keywords from documents already known to the user (background documents) and developing a concept hierarchy to organize popular keywords. The novel association rule miner discovers association rules among noun phrases extracted from relevant documents (target documents) and compares the rules with the background knowledge to predict the rule novelty to the particular user (user-oriented novelty). The user-oriented novelty measure is defined as the semantic distance between the antecedent and the consequent of a rule in the background knowledge. It consists of two components: occurrence distance and connection distance. The former considers the co-occurrences of two keywords in the background documents: the more co-occurrences, the shorter the distance. The latter considers the common connections of the two keywords with others in the concept hierarchy. It is defined as the length of the path connecting the two keywords in the concept hierarchy: the longer the path, the longer the distance. The user-oriented novelty measure is evaluated from two perspectives: novelty prediction accuracy and usefulness indication power. The results show that the user-oriented novelty measure outperforms the WordNet novelty measure and the compared objective measures in terms of predicting novel rules and identifying useful rules
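
The two distance components described above can be illustrated with a minimal sketch. It is not the uMining formulation: the abstract gives no formulas, so the `1 / (1 + n)` form of the occurrence distance, the parent-map representation of the concept hierarchy, and all sample keywords are assumptions made for illustration.

```python
from collections import deque


def occurrence_distance(kw_a, kw_b, background_docs):
    """Shrinks as the two keywords co-occur in more background documents.

    The 1 / (1 + n) form is an assumption, not the paper's formula.
    """
    n = sum(1 for doc in background_docs if kw_a in doc and kw_b in doc)
    return 1.0 / (1.0 + n)


def connection_distance(kw_a, kw_b, hierarchy):
    """Number of edges on the shortest path between two keywords.

    `hierarchy` maps each node to its parent; edges are treated as
    undirected. Returns None when no path exists.
    """
    neighbours = {}
    for child, parent in hierarchy.items():
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)
    queue = deque([(kw_a, 0)])
    seen = {kw_a}
    while queue:
        node, dist = queue.popleft()
        if node == kw_b:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical background documents (as keyword sets) and concept hierarchy.
background_docs = [
    {"gene", "protein"},
    {"gene", "protein", "cell"},
    {"cell", "tissue"},
]
hierarchy = {
    "gene": "biology",
    "protein": "biology",
    "cell": "biology",
    "biology": "science",
    "market": "economics",
    "economics": "science",
}
print(occurrence_distance("gene", "protein", background_docs))
print(connection_distance("gene", "market", hierarchy))
```

Keywords that often appear together or sit close in the hierarchy get small distances, so a rule linking them would count as less novel to this user. How the two components are combined into one score is not stated in the abstract and is left out here.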