Fragment-based Pretraining and Finetuning on Molecular Graphs

Author: Luong Kha-Dinh
Singh Ambuj
Publication date: 27/10/2023

Property prediction on molecular graphs is an important application of Graph Neural Networks (GNNs). Recently, unlabeled molecular data has become abundant, which facilitates the rapid development of self-supervised learning for GNNs in the chemical domain. In this work, we propose pretraining GNNs at the fragment level, a promising middle ground that overcomes the limitations of node-level and graph-level pretraining. Borrowing techniques from recent work on principal subgraph mining, we obtain a compact vocabulary of prevalent fragments from a large pretraining dataset. From the extracted vocabulary, we introduce several fragment-based contrastive and predictive pretraining tasks. The contrastive learning task jointly pretrains two different GNNs: one on molecular graphs and the other on fragment graphs, which represent higher-order connectivity within molecules. By enforcing consistency between the fragment embedding and the aggregated embedding of the corresponding atoms from the molecular graphs, we ensure that the embeddings capture structural information at multiple resolutions. The structural information of fragment graphs is further exploited to extract auxiliary labels for graph-level predictive pretraining. We employ both the pretrained molecular-based and fragment-based GNNs for downstream prediction, thus utilizing the fragment information during finetuning. Our graph fragment-based pretraining (GraphFP) advances performance on 5 out of 8 common molecular benchmarks and improves performance on long-range biological benchmarks by at least 11.5%. Code is available at: https://github.com/lvkd84/GraphFP. Comment: 18 pages, 4 figures, published in NeurIPS 202
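
The consistency idea in the abstract (a fragment's embedding should agree with the aggregated embeddings of its atoms) can be sketched as a small contrastive loss. This is only an illustration under stated assumptions: mean pooling as the aggregation, cosine similarity, an InfoNCE-style loss, and the names `mean_pool`, `info_nce`, and `temperature` are all hypothetical choices made here, not taken from the GraphFP code.

```python
import math

def mean_pool(atom_emb, atom_to_frag, num_frags):
    """Average atom embeddings per fragment (assumed aggregation)."""
    dim = len(atom_emb[0])
    sums = [[0.0] * dim for _ in range(num_frags)]
    counts = [0] * num_frags
    for vec, frag in zip(atom_emb, atom_to_frag):
        counts[frag] += 1
        for d in range(dim):
            sums[frag][d] += vec[d]
    return [[s / counts[f] for s in sums[f]] for f in range(num_frags)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def info_nce(frag_emb, pooled_emb, temperature=0.1):
    """InfoNCE-style loss: fragment i should match pooled atoms of fragment i."""
    total = 0.0
    for i, f in enumerate(frag_emb):
        logits = [cosine(f, p) / temperature for p in pooled_emb]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        total += log_denominator - logits[i]
    return total / len(frag_emb)

# Toy data: four atoms, two fragments (atoms 0-1 and atoms 2-3).
atom_emb = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
pooled = mean_pool(atom_emb, [0, 0, 1, 1], num_frags=2)

aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], pooled)
swapped_loss = info_nce([[0.0, 1.0], [1.0, 0.0]], pooled)
```

In this toy setting the loss is small when each fragment embedding matches its pooled atoms and large when the fragment embeddings are swapped, which is the behaviour a consistency objective of this kind is meant to reward.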

arXiv.org e-Print Archive

Intersection theorems on structures

Author: Abbott
de Bruijn
Deza
Deza
Erdös
Erdös Chao Ko
Fisher
Frankl
Katona
Lubell
Ray-Chaudhuri
Simonovits
Sperner
Sós
Publication venue: 'Elsevier BV'
Publication date: 01/01/1980

All the graphs considered are simple, i.e., without loops or multiple edges. The intersection of two graphs is just the graph formed by the edges common to both of them. Let K be a family of graphs and n a positive integer. Then f(n,K) denotes the maximum number of distinct (labeled) graphs on n vertices such that the intersection of any two is in K. The authors first investigated this function in a previous paper [Combinatorics, II (Keszthely, 1976), pp. 1017–1030, North-Holland, Amsterdam, 1978; MR0519324 (80i:05062b)]. In particular, they proved that if K consists of all the subdivisions of K for given finite graphs, then f(n,K) is polynomially bounded. The present paper reviews the main results and reports on further progress, but contains no proofs. Let us mention a very recent open problem, due to the second author: Let T be the class of all graphs which contain a triangle. Prove or disprove $f(n,T)=2^{\binom{n}{2}-3}$. Apart from graphs, the authors consider intervals and arithmetic progressions. Those problems were introduced by R. L. Graham and the authors [J. Combin. Theory Ser. A 28 (1980), no. 1, 106–110]. Among other results, they have proved that one cannot do better than taking all the subsets of size ≤ 3 of the integers $\{1,\dots,n\}$, if the pairwise intersections have to be arithmetic progressions. The authors prove that if the intersection has to be an arithmetic progression of length at least k, k ≥ 2, then the optimal bound is $(\pi^2/24+o(1))n^2$. The case k=1 is still open.
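
The value in the open problem matches a simple construction: fix one triangle and take every graph on n vertices that contains it. There are 2 to the power (number of vertex pairs minus 3) such graphs, and any two of them share the fixed triangle. The sketch below checks this lower-bound construction by brute force for small n; the function names are illustrative, and it says nothing about whether larger families exist.

```python
from itertools import combinations

def all_edges(n):
    """All vertex pairs (a, b) with a < b on vertices 0..n-1."""
    return list(combinations(range(n), 2))

def has_triangle(edges):
    """True if the edge set contains three mutually adjacent vertices."""
    edge_set = set(edges)
    vertices = sorted({v for e in edge_set for v in e})
    return any(
        (a, b) in edge_set and (a, c) in edge_set and (b, c) in edge_set
        for a, b, c in combinations(vertices, 3)
    )

def graphs_containing_fixed_triangle(n):
    """Every labeled graph on n vertices that contains the triangle 0-1-2."""
    triangle = {(0, 1), (0, 2), (1, 2)}
    free_edges = [e for e in all_edges(n) if e not in triangle]
    family = []
    for r in range(len(free_edges) + 1):
        for extra in combinations(free_edges, r):
            family.append(frozenset(triangle | set(extra)))
    return family

n = 4
family = graphs_containing_fixed_triangle(n)
num_pairs = n * (n - 1) // 2
# Intersection of two graphs = edges common to both.
pairwise_ok = all(has_triangle(g & h) for g, h in combinations(family, 2))
```

For n = 4 the family has 2^(6-3) = 8 members, and every pairwise intersection still contains a triangle.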

Crossref

Repository of the Academy's Library

Robust state estimation methods for robotics applications

Author: Das Shounak
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2023

State estimation is an integral component of any autonomous robotic system. Finding the correct position, velocity, and orientation of an agent in its environment enables it to perform other tasks such as mapping, interacting with the environment, and collaborating with other agents. State estimation is achieved by fusing data from multiple sensors in a probabilistic framework. These data include inertial measurements from an Inertial Measurement Unit (IMU), images from cameras, range data from lidars, and positioning data from Global Navigation Satellite Systems (GNSS) receivers. The main challenge in sensor-based state estimation is the presence of noisy or erroneous data, or even a lack of informative data. Common examples include wrong feature matches between images or point clouds, false loop closures due to perceptual aliasing (different places that look similar can confuse the robot), dynamic objects in the environment (odometry algorithms assume a static environment), multipath errors for GNSS (satellite signals reflecting off tall structures such as buildings before reaching receivers), and more. This work studies existing and new ways in which standard estimation algorithms such as the Kalman filter and factor graphs can be made robust to such adverse conditions without losing performance in ideal outlier-free conditions. The first part of this work demonstrates the importance of robust Kalman filters for wheel-inertial odometry on high-slip terrain. Next, inertial data is integrated into GNSS factor graphs to improve their accuracy and robustness. Lastly, a combined framework for improving the robustness of non-linear least squares and estimating the inlier noise threshold is proposed and tested with point cloud registration and lidar-inertial odometry algorithms, followed by an algorithmic analysis of optimizing generalized robust cost functions with factor graphs for the GNSS positioning problem.
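
To make the notion of a robust cost function concrete, here is a minimal sketch of iteratively reweighted least squares with a Huber cost, applied to estimating a single scalar from measurements that contain one gross outlier. The Huber cost, the threshold `delta`, and the function names are assumptions chosen for illustration; the abstract does not say which robust costs the thesis uses, and this is not its method.

```python
from statistics import median

def huber_weight(residual, delta):
    """Weight 1 inside the inlier band, down-weighted as delta/|r| outside it."""
    r = abs(residual)
    return 1.0 if r <= delta else delta / r

def huber_location(samples, delta=1.0, iterations=50):
    """Robust location estimate via iteratively reweighted least squares."""
    estimate = median(samples)  # outlier-resistant starting point
    for _ in range(iterations):
        weights = [huber_weight(x - estimate, delta) for x in samples]
        estimate = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return estimate

# Five consistent measurements near 5.0 plus one gross outlier.
measurements = [4.9, 5.1, 5.0, 4.8, 5.2, 50.0]
plain_mean = sum(measurements) / len(measurements)
robust = huber_location(measurements)
```

The plain mean is dragged to 12.5 by the outlier, while the Huber estimate settles near 5.2, because the outlier's influence is capped at `delta` instead of growing with its residual.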

The Research Repository @ WVU (West Virginia University)

Blind identification of an unknown interleaved convolutional code

Author: Tixier Audrey
Publication date: 15/01/2015

We give an efficient method for reconstructing the block interleaver and recovering the convolutional code when several noisy interleaved codewords are given. The block interleaver is reconstructed without any assumption on its structure. Experimental tests show that the method remains efficient even with moderate noise.
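
For readers unfamiliar with the setting, the sketch below shows the forward direction only: a convolutional encoder followed by a block interleaver, which is what produces the interleaved codewords an observer would see. The rate-1/2 code with generators (111, 101), the permutation, and all function names are assumptions picked for illustration; the reconstruction method itself is not described in the abstract and is not attempted here.

```python
def conv_encode(bits, generators=((1, 1, 1), (1, 0, 1))):
    """Rate-1/len(generators) convolutional encoder, starting from the zero state."""
    memory = len(generators[0]) - 1
    state = [0] * memory
    out = []
    for u in bits:
        window = [u] + state  # current bit followed by previous bits
        for g in generators:
            out.append(sum(gi & wi for gi, wi in zip(g, window)) % 2)
        state = [u] + state[:-1]  # shift the register
    return out

def interleave(symbols, permutation):
    """Apply the same permutation to each consecutive block of symbols."""
    size = len(permutation)
    if len(symbols) % size != 0:
        raise ValueError("length must be a multiple of the block size")
    out = []
    for start in range(0, len(symbols), size):
        block = symbols[start:start + size]
        out.extend(block[p] for p in permutation)
    return out

def deinterleave(symbols, permutation):
    """Undo interleave() by applying the inverse permutation."""
    inverse = [0] * len(permutation)
    for i, p in enumerate(permutation):
        inverse[p] = i
    return interleave(symbols, inverse)

message = [1, 0, 1, 1]
permutation = [2, 0, 3, 1]  # hypothetical interleaver, unknown to the observer
codeword = conv_encode(message)
observed = interleave(codeword, permutation)
```

In the blind setting, only noisy versions of sequences like `observed` are available, and both `permutation` and the generators have to be recovered.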

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Turán numbers of Multiple Paths and Equibipartite Trees

Author: Bollobás
Kopylov
Nathan Kettle
Neal Bushaw
Simonovits
Turán
Turán
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 21/08/2011

The Turán number of a graph H, ex(n;H), is the maximum number of edges in any graph on n vertices which does not contain H as a subgraph. Let P_l denote a path on l vertices, and kP_l denote k vertex-disjoint copies of P_l. We determine ex(n, kP_3) for n appropriately large, answering in the positive a conjecture of Gorgol. Further, we determine ex(n, kP_l) for arbitrary l, and n appropriately large relative to k and l. We provide some background on the famous Erdős-Sós conjecture, and conditional on its truth we determine ex(n;H) when H is an equibipartite forest, for appropriately large n. Comment: 17 pages, 13 figures; Updated to incorporate referee's suggestions; minor structural change
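
The definition of ex(n, kP_3) can be checked directly for very small n by enumerating every graph. The sketch below is a brute-force illustration with hypothetical function names; it is exponential in the number of vertex pairs, so it is only usable for tiny n, and it is unrelated to the paper's results, which concern appropriately large n.

```python
from itertools import combinations

def contains_k_p3(adj, available, k):
    """True if the graph has k vertex-disjoint paths on 3 vertices."""
    if k == 0:
        return True
    for v in available:  # v is the middle vertex of a candidate path
        nbrs = [u for u in adj[v] if u in available]
        for a, b in combinations(nbrs, 2):
            if contains_k_p3(adj, available - {v, a, b}, k - 1):
                return True
    return False

def turan_number_kp3(n, k):
    """Brute-force ex(n, kP_3): the most edges in a graph with no kP_3."""
    pairs = list(combinations(range(n), 2))
    best = 0
    for mask in range(1 << len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if mask >> i & 1]
        if len(edges) <= best:
            continue  # cannot improve on the current best
        adj = {v: set() for v in range(n)}
        for a, b in edges:
            adj[a].add(b)
            adj[b].add(a)
        if not contains_k_p3(adj, frozenset(range(n)), k):
            best = len(edges)
    return best

ex_4_p3 = turan_number_kp3(4, 1)
ex_5_2p3 = turan_number_kp3(5, 2)
```

A graph avoids a single P_3 exactly when no vertex has degree two or more, so it is a matching and ex(4, P_3) = 2. On five vertices there is no room for two disjoint copies of P_3, so the complete graph qualifies and ex(5, 2P_3) = 10.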

arXiv.org e-Print Archive

Crossref