    ExpFinder: An Ensemble Expert Finding Model Integrating N-gram Vector Space Model and μCO-HITS

    Finding an expert plays a crucial role in driving successful collaborations and speeding up high-quality research development and innovation. However, the rapid growth of scientific publications and digital expertise data makes identifying the right experts a challenging problem. Existing approaches for finding experts given a topic can be categorised into information retrieval techniques based on vector space models, document language models, and graph-based models. In this paper, we propose ExpFinder, a new ensemble model for expert finding that integrates a novel N-gram vector space model, denoted nVSM, and a graph-based model, denoted μCO-HITS, which is a proposed variation of the CO-HITS algorithm. The key idea of nVSM is to exploit a recent inverse document frequency weighting method for N-gram words, and ExpFinder incorporates nVSM into μCO-HITS to achieve expert finding. We comprehensively evaluate ExpFinder on four different datasets from academic domains in comparison with six different expert finding models. The evaluation results show that ExpFinder is a highly effective model for expert finding, substantially outperforming all the compared models by 19% to 160.2%.
    Comment: 15 pages, 18 figures; source code on GitHub at https://github.com/Yongbinkang/ExpFinder; submitted to IEEE Transactions on Knowledge and Data Engineering.
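
The general idea of combining an N-gram vector space weight with a CO-HITS-style propagation can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that): the smoothed IDF, the averaging of document weights into initial expert scores, the parameter `lam`, and all function names and toy data are assumptions made purely for illustration. The update rule follows the generic CO-HITS form, in which each score mixes its initial value with scores propagated across the expert-document bipartite graph.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def topic_weights(docs, topic):
    """TF-IDF-style weight of a topic phrase in each document.

    The smoothed IDF is a common textbook variant (an assumption here),
    not the specific weighting method used by the paper.
    """
    n = len(topic.split())
    counts = [Counter(ngrams(d.lower().split(), n)) for d in docs]
    df = sum(1 for c in counts if c[topic] > 0)
    idf = math.log((1 + len(docs)) / (1 + df)) + 1.0
    return [c[topic] * idf for c in counts]


def normalise(values):
    """Scale values to sum to 1 (all zeros stay zeros)."""
    total = sum(values)
    return [v / total if total else 0.0 for v in values]


def co_hits(authorship, doc_init, lam=0.5, iters=20):
    """Generic CO-HITS-style propagation on an expert-document graph.

    authorship maps each expert to a non-empty list of document indices.
    """
    experts = list(authorship)
    n_docs = len(doc_init)
    authors_of = [[e for e in experts if d in authorship[e]] for d in range(n_docs)]
    y0 = normalise(doc_init)
    # Initial expert score: mean weight of the expert's documents (assumption).
    x0 = normalise([sum(doc_init[d] for d in authorship[e]) / len(authorship[e])
                    for e in experts])
    x, y = dict(zip(experts, x0)), list(y0)
    for _ in range(iters):
        # Experts receive score from their documents, split among co-authors.
        x = {e: (1 - lam) * x0[i]
                + lam * sum(y[d] / len(authors_of[d]) for d in authorship[e])
             for i, e in enumerate(experts)}
        # Documents receive score from their authors, split among their papers.
        y = [(1 - lam) * y0[d]
             + lam * sum(x[e] / len(authorship[e]) for e in authors_of[d])
             for d in range(n_docs)]
    return x


docs = [
    "graph neural network models for expert finding",
    "expert finding with language models",
    "rock slope stability analysis",
]
authorship = {"alice": [0, 1], "bob": [1], "carol": [2]}
scores = co_hits(authorship, topic_weights(docs, "expert finding"))
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy example the expert with two on-topic documents ranks first, and the expert whose only document never mentions the topic keeps a score of zero.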

    Reliability assessment of rock slopes by evidence theory

    This research project aims to develop a methodology for reliability analysis of rock slope stability that accounts for aleatory and epistemic uncertainty when information on the input geomechanical parameters is limited. In rock mechanics, deterministic and probabilistic approaches are widely used in the decision-making process. However, the former does not consider uncertainty, and the latter has limitations in representing epistemic uncertainty and requires assumptions about the probability distributions of the input variables when robust data sets are not available. Therefore, we resorted to Evidence Theory as a tool to describe the epistemic and aleatory uncertainty of the input geomechanical variables and to propagate it through limit equilibrium models in which the geometry is controlled by the joint orientations. To better describe the variability of the rock mass properties, the project used a short-range photogrammetry system, which provided robust and reliable data sets on joint geometry, modeled as Kent-distributed random variables. In addition, we developed a procedure to update the reliability analysis that takes into account the probability distribution of the joint orientations. Applying the methodology to a rock slope in a sandstone mine showed its suitability for actual engineering projects.
Consequently, the main contribution of this project is an evidence-theory-based framework for rock slope reliability assessment that combines robust data sets on joint orientation with limited information on geomechanical parameters, and that can be updated as new information becomes available.
Colciencias. Quantitative Risk Analysis in Mine Slopes (Analisis Cuantitativo de Riesgo en Taludes Mineros). Research line: Geotechnics and Geo-environmental Risks. Doctorad
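
To make the idea of propagating evidence-theory uncertainty through a limit equilibrium model concrete, the following is a minimal sketch and not the thesis's procedure. It assumes a dry planar sliding block with the textbook factor of safety FS = (cA + W cos ψ tan φ) / (W sin ψ), interval-valued focal elements for cohesion and friction angle, independence between the two variables, and vertex evaluation of each interval box (valid here because FS increases with both inputs). All numerical values and names are hypothetical.

```python
import math
from itertools import product


def factor_of_safety(c, phi_deg, area=10.0, weight=500.0, dip_deg=35.0):
    """Dry planar sliding: resisting force over driving force.

    c in kPa, area in m^2, weight in kN (hypothetical values).
    """
    psi = math.radians(dip_deg)
    resisting = c * area + weight * math.cos(psi) * math.tan(math.radians(phi_deg))
    driving = weight * math.sin(psi)
    return resisting / driving


def failure_bounds(cohesion_bpa, friction_bpa):
    """Return (belief, plausibility) of the event FS < 1.

    Each argument is a list of ((low, high), mass) focal elements.
    """
    belief = plausibility = 0.0
    for ((c_lo, c_hi), m_c), ((p_lo, p_hi), m_p) in product(cohesion_bpa, friction_bpa):
        mass = m_c * m_p  # joint mass under the independence assumption
        fs_min = factor_of_safety(c_lo, p_lo)  # FS is increasing in c and phi
        fs_max = factor_of_safety(c_hi, p_hi)
        if fs_max < 1.0:   # the whole box fails: counts towards belief
            belief += mass
        if fs_min < 1.0:   # part of the box fails: counts towards plausibility
            plausibility += mass
    return belief, plausibility


cohesion_bpa = [((0.0, 4.0), 0.6), ((8.0, 15.0), 0.4)]    # kPa
friction_bpa = [((25.0, 30.0), 0.5), ((30.0, 40.0), 0.5)]  # degrees
bel, pl = failure_bounds(cohesion_bpa, friction_bpa)
print(f"Bel(failure) = {bel:.2f}, Pl(failure) = {pl:.2f}")
```

The gap between belief and plausibility reflects the epistemic uncertainty carried by the interval data; it narrows as focal elements are refined with new information.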

    Modeling Scholar Profile in Expert Recommendation based on Multi-Layered Bibliographic Graph

    An expert recommendation system requires profiles of researchers, called Scholar Profiles here, to make suggestions based on expertise. This dissertation contributes a model of an unbiased scholar profile that provides more objective evidence of expertise by considering interest changes and relying less on citations. Interest changes lead to diverse topics and cause expertise levels to differ across topics. The scholar profile is expected to capture expertise in terms of productivity, which is often indicated by the volume of publications and citations. We include researchers' publishing behavior to avoid misleading citations. The expertise levels of researchers on topics are therefore influenced by interest evolution, productivity, dynamicity, and behavior extracted from the bibliographic data of published scholarly articles. As the output of this dissertation, the scholar profile model is employed within a recommendation system that recommends productive researchers who can provide academic guidance. The scholar profile is generated from multiple layers of bibliographic data, such as author and topic layers and the relations between them, to represent an academic social network. In a cold-start situation no predefined topic information is available, so topic mapping procedures are necessary. Features of researchers' productivity, dynamicity, and behavior within those layers are then extracted over a number of observed years to accommodate the behavior aspect. We experimented with the AMiner dataset, which is often used in studies on bibliographic data, to empirically investigate: (a) topic mapping strategies to obtain the interests of researchers, (b) a feature extraction model for the productivity, dynamicity, and behavior aspects based on the mapped topics, and (c) an expertise ranking derived from the scholar profile that considers interest changes and relies less on citations. To ensure the validity of the results, our experiments used the standard expert list of AMiner researchers.
We selected the Natural Language Processing and Information Extraction (NLP-IE) domains because they are familiar and interrelated, which makes it easier to introduce cases of interest changes. Using the mapped topics, we also made minor contributions on transformation procedures for visualizing researchers on maps of Scopus subjects and for investigating possible conflicts of interest.