6 research outputs found

    Knowledge Discovery and Management within Service Centers

    These days, most enterprise service centers deploy Knowledge Discovery and Management (KDM) systems to resolve service requests promptly and effectively while making efficient use of large volumes of data. These KDM systems support prompt responses to critical service requests and, where possible, try to prevent service requests from being triggered in the first place. In most cases, however, the information required to resolve a request is dispersed and buried under a mountain of irrelevant information on the Internet, in unstructured and heterogeneous formats. These heterogeneous data sources and formats complicate access to reusable knowledge and increase the time required to reach a resolution. Moreover, state-of-the-art methods neither support effective integration of domain knowledge with KDM systems nor promote the assimilation of reusable knowledge, or Intellectual Capital (IC). With the goal of providing improved service request resolution in the shortest possible time, this research proposes an IC Management System. The proposed tool uses domain knowledge, in the form of semantic web technology, to extract the most valuable information from raw unstructured data, and uses that knowledge to formulate a service resolution model as a combination of efficient data search, classification, clustering, and recommendation methods. The proposed solution also handles the technology categorization of a service request, which is crucial to the request resolution process. The system has been extensively evaluated in several experiments and has been used in a real enterprise customer service center.

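    Characterization of Key Sentences in Summaries Using Internal Clustering Quality Measures (original title: CARACTERIZACIÓN DE ORACIONES CLAVE DE RESÚMENES MEDIANTE MEDIDAS DE CALIDAD DE AGRUPACIÓN INTERNA)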

    The sharp growth of digital information shared over the Internet and other media has made it necessary to build systems that generate automatic summaries, with the aim of presenting users with the most relevant information in a text or document and thereby reducing the time needed to search for and obtain information. Summaries can be generated by various methods, which are generally classified into two groups: abstractive methods and extractive methods. This work uses the latter. Extractive summarization techniques differ in how they build the summary. Some select sentences similar to the document title; others rely on the position of phrases or sentences in the text, or assign weights to sentences. These techniques are generally language- or domain-dependent. For this reason, language- and domain-independent summarization techniques have been developed, which also differ in how they generate the summary. This work studies extractive summarization by clustering, since there is considerable uncertainty about the relationship between the quality of the generated clusters and the quality of the resulting summary. Because these summaries are generated by clustering, they acquire characteristics of the clusters themselves, such as compactness, separation, distribution, and density. Some clustering algorithms, however, are unable to evaluate these cluster characteristics. For this reason, this work uses internal clustering quality measures, which are independent of the algorithm employed. These measures are used to evaluate the relationship between the quality of the clusters and the quality of the summaries obtained.
In addition, this work studies how cluster characteristics affect clustering quality. The experiments show that two internal clustering quality measures can correctly evaluate the relationship between the quality of the generated clusters and the quality of the summaries used, as well as the cluster characteristics: separation, compactness, noise, density, and distribution. These measures are the Silhouette index and the Davies-Bouldin index.
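The two internal measures named in this abstract can be illustrated with a minimal pure-Python sketch. It is not the implementation used in that work: the Euclidean distance, the helper names (`dist`, `group`, `centroid`), and the toy 2-D "sentence vectors" are all assumptions made for illustration. A higher Silhouette index and a lower Davies-Bouldin index both indicate more compact, better-separated clusters.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def group(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters

def centroid(cluster):
    """Component-wise mean of the points in one cluster."""
    return [sum(coords) / len(cluster) for coords in zip(*cluster)]

def silhouette_index(points, labels):
    """Mean silhouette over all points (needs at least two clusters)."""
    clusters = group(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist(point, q) for q in other) / len(other)
                for key, other in clusters.items() if key != label)
        largest = max(a, b)
        scores.append((b - a) / largest if largest > 0 else 0.0)
    return sum(scores) / len(scores)

def davies_bouldin_index(points, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid distance."""
    clusters = group(points, labels)
    cents = {k: centroid(c) for k, c in clusters.items()}
    # scatter: mean distance of a cluster's points to its own centroid
    scatter = {k: sum(dist(p, cents[k]) for p in c) / len(c)
               for k, c in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)

# Hypothetical 2-D sentence vectors forming two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible structure
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

for name, labels in (("good", good_labels), ("bad", bad_labels)):
    print(name, silhouette_index(points, labels), davies_bouldin_index(points, labels))
```

Because both functions take only the points and their labels, they can compare partitions produced by any clustering algorithm, which is the algorithm independence the abstract relies on.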

    Semantically enhanced document clustering

    This thesis advocates the view that traditional document clustering could be significantly improved by representing documents at different levels of abstraction at which the similarity between documents is considered. The improvement concerns the alignment of the clustering solutions with human judgement. The proposed methodology employs semantics to measure the conceptual similarity between documents. The goal is to design algorithms that implement the methodology in order to solve the following research problems: (i) how to obtain multiple deterministic clustering solutions; (ii) how to produce coherent large-scale clustering solutions across domains, regardless of the number of clusters; (iii) how to obtain clustering solutions which align well with human judgement; and (iv) how to produce specific clustering solutions from the perspective of the user's understanding of the domain of interest. The developed clustering methodology enhances separation between clusters and improves coherence within clusters generated across several domains by using levels of abstraction. The methodology employs a semantically enhanced text stemmer, developed for the purpose of producing coherent clustering, and a concept index that provides a generic document representation with reduced dimensionality. These characteristics of the methodology make it possible to address the limitations of traditional text document clustering by employing computationally expensive similarity measures such as Earth Mover's Distance (EMD), which theoretically aligns the clustering solutions closer to human judgement. A threshold for similarity between documents that employs many-to-many similarity matching is proposed and shown experimentally to help traditional clustering algorithms produce clustering solutions aligned closer to human judgement.
The experimental validation demonstrates the scalability of the semantically enhanced document clustering methodology and supports the contributions: (i) multiple deterministic clustering solutions and different viewpoints on a document collection are obtained; (ii) the use of concept indexing as a document representation technique in the domain of document clustering is beneficial for producing coherent clusters across domains; (iii) the SETS algorithm provides improved text normalisation by using external knowledge; (iv) a method for measuring similarity between documents on a large scale by using many-to-many matching; (v) a semantically enhanced methodology that employs levels of abstraction that correspond to a user's background, understanding and motivation. The achieved results will benefit the research community working in the areas of document management, information retrieval, data mining and knowledge management.

    Document Clustering with Semantic Analysis
