63 research outputs found
Decision Fusion in Multi-label Classification
Classification is the part of machine learning that focuses on understanding patterns through the representation and generalization of data. Choosing the best classification prediction becomes a problem when several inputs come from different methods in a heterogeneous data environment. Decision fusion can be used to derive a recommendation from the outputs of several classification methods. We chose voting and meta-learning as the decision fusion approaches. The study has two phases: a phase in which heterogeneous classification methods build predictions, and a phase in which the recommendations of those methods are fused into a single answer. The classification setting of interest is multi-label classification. Binary Relevance (BR), Classifier Chains (CC), Hierarchy of Multi-label Classifiers (HOMER), and Multi-label k Nearest Neighbors (MLkNN) are the classification methods used to supply prediction recommendations through different approaches. In the decision fusion phase, the Ignore method is proposed as the meta-learning approach. Ignore fuses decisions by learning the input patterns produced by the learners. To compare the performance of Ignore, a consensus method is used as the voting approach. The final results show that Ignore gives the best results for recall. Ignore predicts fewer false negatives than the consensus method at 0.5 and 0.75. The results of this study show that Ignore can be used as a meta-learner, although its performance must be improved so that it can adapt to heterogeneous data.
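The consensus voting baseline mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not define the rule, so the code assumes that a label is kept when the fraction of classifiers predicting it is at least the consensus level (0.5 or 0.75). The function names and the toy predictions are hypothetical, and the Ignore meta-learner is not reproduced here.

```python
def consensus_vote(predictions, threshold):
    """Fuse binary label vectors from several classifiers for one instance.

    Assumption: a label is kept when the share of classifiers voting for it
    is >= threshold. The abstract does not state the exact rule.
    """
    n_classifiers = len(predictions)
    n_labels = len(predictions[0])
    fused = []
    for j in range(n_labels):
        votes = sum(p[j] for p in predictions)
        fused.append(1 if votes / n_classifiers >= threshold else 0)
    return fused


def recall(y_true, y_pred):
    """Recall = TP / (TP + FN) over the labels of one instance."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0


# Hypothetical outputs of four base classifiers for one instance, three labels.
base_predictions = [
    [1, 0, 1],  # e.g. BR
    [1, 1, 0],  # e.g. CC
    [1, 0, 1],  # e.g. HOMER
    [0, 0, 0],  # e.g. MLkNN
]
y_true = [1, 1, 1]

fused_050 = consensus_vote(base_predictions, 0.5)
fused_075 = consensus_vote(base_predictions, 0.75)
print(fused_050, recall(y_true, fused_050))
print(fused_075, recall(y_true, fused_075))
```

On this toy input the stricter consensus level drops a label that only half of the classifiers predicted, which adds a false negative and lowers recall. That is the kind of behaviour the recall comparison in the abstract is concerned with.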
Web Services Discovery and Recommendation Based on Information Extraction and Symbolic Reputation
This paper shows that the problem of web service representation is crucial
and analyzes the various factors that influence it. It presents the
traditional representation of web services, which relies on textual
descriptions based on the information contained in WSDL files. Unfortunately,
textual web service descriptions are noisy and need significant cleaning to
keep only useful information. To deal with this problem, we introduce a
rule-based text tagging method, which filters web service descriptions to
keep only significant information. A new representation based on such filtered
data is then introduced. Because many web services have empty descriptions, we
also consider representations based on the WSDL file structure (types,
attributes, etc.). Alternatively, we introduce a new representation called
symbolic reputation, which is computed from relationships between web services.
The impact of these representations on web service discovery and
recommendation is studied and discussed in experiments using real-world
web services.
Query Expansion Based on Patterns and Word Embeddings to Improve Microblog Retrieval
Social microblogging services play an especially significant role in our society. Twitter is one of the most popular microblogging sites used by people to find relevant information (e.g., breaking news, popular trends, information about people of interest). In this context, retrieving information from such data has recently gained growing attention and opened new challenges. However, both the documents and the queries are usually short, which may affect the search results. Query expansion (QE) can improve retrieval in this setting. Words can have several meanings, of which only one applies in a given context. In this paper, we propose a QE method that takes the meaning of the context into account. We use patterns and word embeddings to expand users' queries. We evaluate the proposed method on the TREC dataset. The results show the effectiveness of the approach and the benefit of combining patterns with word embeddings for improved microblog retrieval.
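To make the embedding side of query expansion concrete, here is a minimal sketch, assuming each query term is expanded with its nearest neighbours by cosine similarity. The toy vectors, the function names and the parameter `k` are hypothetical. The pattern-based component is omitted because the abstract does not describe it.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_query(query_terms, embeddings, k=2):
    """Append the k nearest embedding neighbours of each known query term."""
    expanded = list(query_terms)
    for term in query_terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms are kept but not expanded
        scored = [
            (cosine(embeddings[term], vec), word)
            for word, vec in embeddings.items()
            if word != term
        ]
        scored.sort(reverse=True)
        for _, word in scored[:k]:
            if word not in expanded:
                expanded.append(word)
    return expanded


# Toy 3-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "quake": [0.9, 0.1, 0.0],
    "earthquake": [0.85, 0.15, 0.05],
    "tremor": [0.8, 0.2, 0.1],
    "football": [0.0, 0.1, 0.9],
    "match": [0.1, 0.2, 0.85],
}

expanded = expand_query(["quake"], toy_embeddings, k=2)
print(expanded)
```

In practice the vectors would come from a trained embedding model, and a context-aware method would also have to choose among the senses of an ambiguous term, which this sketch does not attempt.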
OLGA SÁNCHEZ RODRÍGUEZ [Graphic material]
FAMILY ALBUM, CASA DE COLÓN. Digital copy. Madrid: Ministerio de Educación, Cultura y Deporte. Subdirección General de Coordinación Bibliotecaria, 201
An Efficient Architecture for Information Retrieval in P2P Context Using Hypergraph
Peer-to-peer (P2P) data-sharing systems now generate a significant portion of
Internet traffic. P2P systems have emerged as an accepted way to share enormous
volumes of data. Needs for widely distributed information systems supporting
virtual organizations have given rise to a new category of P2P systems called
schema-based. In such systems each peer is a database management system in
itself, exposing its own schema. In such a setting, the main objective is the
efficient search across peer databases by processing each incoming query
without overly consuming bandwidth. The usability of these systems depends on
successful techniques to find and retrieve data; however, efficient and
effective routing of content-based queries is an emerging problem in P2P
networks. This work is intended to motivate the use of mining
algorithms in the P2P context, which may significantly improve the efficiency of
such methods. Our proposed method is based on a combination of
clustering and hypergraphs. We use ECCLAT to build an approximate clustering and
discover meaningful clusters with slight overlap. We use the MTMINER
algorithm to extract all minimal transversals of a hypergraph (clusters) for
query routing. The set of clusters improves the robustness of the query routing
mechanism and the scalability of the P2P network. We compare the performance of our
method with that of the baseline on the query routing problem. Our
experimental results show that the proposed method achieves high levels
of performance and scalability with respect to important criteria such as
response time, precision and recall.
Comment: 20 pages, 8 figures
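For readers unfamiliar with minimal transversals, the following sketch enumerates them by brute force for a tiny hypergraph. It is not MTMINER, which the abstract names but does not describe. It only illustrates the definition: a transversal is a vertex set that intersects every hyperedge, and it is minimal if no proper subset is also a transversal. The function name and the example hyperedges are hypothetical.

```python
from itertools import combinations


def minimal_transversals(hyperedges):
    """Enumerate all minimal transversals (minimal hitting sets) by brute force.

    Candidates are visited by increasing size, so any hitting set that does
    not contain a previously found transversal is itself minimal.
    Exponential in the number of vertices: suitable for toy inputs only.
    """
    edges = [frozenset(e) for e in hyperedges]
    vertices = sorted(set().union(*edges))
    found = []
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            cand = frozenset(candidate)
            if any(t <= cand for t in found):
                continue  # contains a smaller transversal, so not minimal
            if all(cand & e for e in edges):
                found.append(cand)
    return found


# Toy hypergraph: each hyperedge stands for one (hypothetical) cluster.
clusters = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
transversals = minimal_transversals(clusters)
print([sorted(t) for t in transversals])
```

Each returned set touches every cluster at least once, which is the sense in which a minimal transversal gives a compact set of entry points covering all clusters.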
Conférence Internationale Francophone sur la Science des Données (CIFSD): Proceedings of the 9th Edition
The proceedings of the 9th edition of the Conférence Internationale Francophone sur la Science des Données (CIFSD, https://cifsd-2021.sciencesconf.org) gather all the contributions presented at the conference between 9 and 11 June 2021. This edition was organized by Aix-Marseille Université and the Laboratoire d'Informatique et Systèmes (LIS UMR 7020). Owing to the public health situation, it was held remotely from Marseille (France). The theme highlighted for this edition was data science for health.