Constraints for Semistructured Data and XML
Integrity constraints play a fundamental role in database design. We review initial work on the expression of integrity constraints for semistructured data and XML.
A Visual Language for Web Querying and Reasoning
As XML is increasingly being used to represent information on the Web, query and reasoning languages for such data are needed. This article argues that, in contrast to the navigational approach taken in particular by XPath and XQuery, a positional approach as used in the language Xcerpt is better suited for a straightforward visual representation. The constructs of the pattern- and rule-based query language Xcerpt are introduced, and it is shown how the visual representation visXcerpt renders these constructs to form a visual query language for XML.
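The positional, pattern-based flavour of querying that Xcerpt advocates can be loosely illustrated with a toy matcher. This is plain Python, not Xcerpt syntax; the sample data, the `?variable` convention, and the `match` function are invented for illustration only:

```python
# Toy illustration of pattern-based (positional) querying in the spirit
# of Xcerpt, contrasted with navigational path access. NOT Xcerpt syntax.

def match(pattern, data, bindings=None):
    """Match a nested dict pattern against semistructured data.

    Strings starting with '?' are variables that bind to whatever data
    occupies the corresponding position. Returns the bindings, or None
    if the pattern does not fit the data.
    """
    if bindings is None:
        bindings = {}
    if isinstance(pattern, str) and pattern.startswith("?"):
        bindings[pattern[1:]] = data
        return bindings
    if isinstance(pattern, dict) and isinstance(data, dict):
        for key, sub in pattern.items():
            if key not in data:
                return None
            bindings = match(sub, data[key], bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == data else None

book = {"book": {"title": "Data on the Web", "author": "Abiteboul"}}

# Positional approach: one pattern mirrors the shape of the document.
result = match({"book": {"title": "?t", "author": "?a"}}, book)
print(result)  # {'t': 'Data on the Web', 'a': 'Abiteboul'}

# Navigational approach (XPath-like): explicit step-by-step traversal.
print(book["book"]["title"])  # Data on the Web
```

The pattern visually resembles the queried document, which is the property the article argues makes the positional approach easy to render graphically.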
Web and Semantic Web Query Languages
A number of techniques have been developed to facilitate
powerful data retrieval on the Web and Semantic Web. Three categories
of Web query languages can be distinguished, according to the format
of the data they can retrieve: XML, RDF and Topic Maps. This article
introduces the spectrum of languages falling into these categories
and summarises their salient aspects. The languages are introduced using
common sample data and query types. Key aspects of the query
languages considered are stressed in a conclusion.
FAIR data representation in times of eScience: a comparison of instance-based and class-based semantic representations of empirical data using phenotype descriptions as example
Background: The size, velocity, and heterogeneity of Big Data outstrips conventional data management tools and requires data and metadata to be fully machine-actionable (i.e., eScience-compliant) and thus findable, accessible, interoperable, and reusable (FAIR). This can be achieved by using ontologies and through representing them as semantic graphs. Here, we discuss two different semantic graph approaches to representing empirical data and metadata in a knowledge graph, with phenotype descriptions as an example. Almost all phenotype descriptions are still being published as unstructured natural language texts, with far-reaching consequences for their FAIRness, substantially impeding their overall usability within the life sciences. However, with an increasing number of anatomy ontologies becoming available and semantic applications emerging, a solution to this problem becomes available. Researchers are starting to document and communicate phenotype descriptions through the Web in the form of highly formalized and structured semantic graphs that use ontology terms and Uniform Resource Identifiers (URIs) to circumvent the problems connected with unstructured texts. Results: Using phenotype descriptions as an example, we compare and evaluate two basic representations of empirical data and their accompanying metadata in the form of semantic graphs: the class-based TBox semantic graph approach called Semantic Phenotype and the instance-based ABox semantic graph approach called Phenotype Knowledge Graph. Their main difference is that only the ABox approach allows for identifying every individual part and property mentioned in the description in a knowledge graph. This technical difference results in substantial practical consequences that significantly affect the overall usability of empirical data.
The consequences affect findability, accessibility, and explorability of empirical data as well as their comparability, expandability, universal usability and reusability, and overall machine-actionability. Moreover, TBox semantic graphs often require querying under entailment regimes, which is computationally more complex. Conclusions: We conclude that, from a conceptual point of view, the advantages of the instance-based ABox semantic graph approach outweigh its shortcomings and outweigh the advantages of the class-based TBox semantic graph approach. Therefore, we recommend the instance-based ABox approach as a FAIR approach for documenting and communicating empirical data and metadata in a knowledge graph.
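The practical difference between the two styles can be sketched with plain (subject, predicate, object) triples in Python. All URIs and term names below are invented for illustration; a real Phenotype Knowledge Graph would use ontology IRIs and an RDF store:

```python
# Instance-based (ABox) style: every part and quality of the described
# specimen is its own individual, so each can be found and queried.
abox = {
    ("ex:specimen1", "rdf:type",     "uberon:head"),
    ("ex:specimen1", "ex:hasPart",   "ex:eye1"),
    ("ex:eye1",      "rdf:type",     "uberon:eye"),
    ("ex:eye1",      "ex:hasQuality","ex:red1"),
    ("ex:red1",      "rdf:type",     "pato:red"),
}

# Class-based (TBox) style: the phenotype is a single class expression;
# the parts exist only inside the (here flattened) class description,
# not as addressable individuals.
tbox = {
    ("ex:RedEyedHeadPhenotype", "rdfs:subClassOf",
     "uberon:head and (hasPart some (uberon:eye "
     "and hasQuality some pato:red))"),
}

def individuals_of(graph, cls):
    """Return all individuals asserted to be instances of a class."""
    return {s for (s, p, o) in graph if p == "rdf:type" and o == cls}

# In the ABox graph the eye is directly findable...
print(individuals_of(abox, "uberon:eye"))  # {'ex:eye1'}
# ...while in the TBox graph no individual eye exists to retrieve;
# extracting it would require reasoning over the class expression.
print(individuals_of(tbox, "uberon:eye"))  # set()
```

The empty result for the TBox graph is the crux of the comparison: retrieving parts from class expressions requires querying under entailment regimes, whereas the ABox parts are reachable with a plain triple-pattern lookup.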
Contrastive Grouping with Transformer for Referring Image Segmentation
Referring image segmentation aims to segment the target referent in an image
conditioning on a natural language expression. Existing one-stage methods
employ per-pixel classification frameworks, which attempt straightforwardly to
align vision and language at the pixel level, thus failing to capture critical
object-level information. In this paper, we propose a mask classification
framework, Contrastive Grouping with Transformer network (CGFormer), which
explicitly captures object-level information via token-based querying and
grouping strategy. Specifically, CGFormer first introduces learnable query
tokens to represent objects and then alternately queries linguistic features
and groups visual features into the query tokens for object-aware cross-modal
reasoning. In addition, CGFormer achieves cross-level interaction by jointly
updating the query tokens and decoding masks in every two consecutive layers.
Finally, CGFormer couples contrastive learning with the grouping strategy to
identify the token and its mask corresponding to the referent. Experimental
results demonstrate that CGFormer consistently and significantly outperforms
state-of-the-art methods in both segmentation and generalization settings.
Comment: Accepted by CVPR 202
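A rough numpy sketch of the alternating query-and-group idea described above; the dimensions, update rules, and single-head attention are invented for illustration (the actual CGFormer uses learned multi-head attention inside a Transformer):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                # feature dim (illustrative)
tokens = rng.normal(size=(4, d))     # learnable query tokens (4 objects)
lang   = rng.normal(size=(6, d))     # linguistic word features
visual = rng.normal(size=(50, d))    # flattened visual features

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, keys_values):
    """Single-head attention: queries gather from keys/values."""
    attn = softmax(queries @ keys_values.T / np.sqrt(d), axis=-1)
    return attn @ keys_values

for layer in range(2):
    # (1) query linguistic features: tokens absorb language context
    tokens = tokens + attend(tokens, lang)
    # (2) grouping: each visual feature is softly assigned to a token
    assign = softmax(visual @ tokens.T / np.sqrt(d), axis=-1)  # (50, 4)
    tokens = tokens + assign.T @ visual / assign.sum(0, keepdims=True).T

# Each row of `assign` sums to 1: a soft partition of visual features
# over the object tokens, giving object-level (not per-pixel) grouping.
print(assign.shape)
```

Contrastive learning (not sketched here) would then pull the referent's token toward the expression embedding and push the other tokens away, so the winning token's assignment map becomes the predicted mask.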
Context Disentangling and Prototype Inheriting for Robust Visual Grounding
Visual grounding (VG) aims to locate a specific target in an image based on a
given language query. The discriminative information from context is important
for distinguishing the target from other objects, particularly for the targets
that have the same category as others. However, most previous methods
underestimate such information. Moreover, they are usually designed for the
standard scene (without any novel object), which limits their generalization to
the open-vocabulary scene. In this paper, we propose a novel framework with
context disentangling and prototype inheriting for robust visual grounding to
handle both scenes. Specifically, the context disentangling disentangles the
referent and context features, which achieves better discrimination between
them. The prototype inheriting inherits the prototypes discovered from the
disentangled visual features by a prototype bank to fully utilize the seen
data, especially for the open-vocabulary scene. The fused features, obtained
by applying the Hadamard product to the disentangled linguistic and visual
prototype features, avoiding a sharp reweighting of the two feature types,
are then attached to a special token and fed to a vision Transformer encoder
for bounding box regression. Extensive experiments are
conducted on both standard and open-vocabulary scenes. The performance
comparisons indicate that our method outperforms the state-of-the-art methods
in both scenarios. The code is available at
https://github.com/WayneTomas/TransCP
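The Hadamard-product fusion described above can be sketched as follows; the dimensions, the zero-initialized special token, and the stacked sequence are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16                                # feature dim (illustrative)
ling_proto = rng.normal(size=(d,))    # disentangled linguistic prototype
vis_proto  = rng.normal(size=(d,))    # disentangled visual prototype

# Hadamard (element-wise) product fuses the two modalities symmetrically,
# without an explicit weighting that could sharply favour one modality.
fused = ling_proto * vis_proto

# Attach a special token and form the encoder input sequence; the vision
# Transformer encoder that regresses the box from this token is omitted.
special_token = np.zeros(d)           # hypothetical regression token
sequence = np.stack([special_token, fused])
print(sequence.shape)  # (2, 16)
```

The element-wise product keeps the fused vector in the same space as both inputs, which is why no extra learned gate is needed to balance the two feature types.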
One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning
Referring Expression Comprehension (REC) is one of the most important tasks
in visual reasoning that requires a model to detect the target object referred
by a natural language expression. Among the proposed pipelines, the one-stage
Referring Expression Comprehension (OSREC) has become the dominant trend since
it merges the region proposal and selection stages. Many state-of-the-art OSREC
models adopt a multi-hop reasoning strategy because a single expression
frequently mentions a sequence of objects, which requires multi-hop reasoning
to analyze the semantic relations. However, one unsolved issue of these models is
that the number of reasoning steps needs to be pre-defined and fixed before
inference, ignoring the varying complexity of expressions. In this paper, we
propose a Dynamic Multi-step Reasoning Network, which allows the reasoning
steps to be dynamically adjusted based on the reasoning state and expression
complexity. Specifically, we adopt a Transformer module to memorize and
process the reasoning state and a reinforcement learning strategy to
dynamically infer the number of reasoning steps. The work achieves
state-of-the-art performance or significant improvements on several REC
datasets, ranging from RefCOCO (+, g) with short expressions to
Ref-Reasoning, a dataset with long and complex compositional expressions.
Comment: 27 pages, 6 figures
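The idea of dynamically adjusted reasoning steps can be sketched with a halting loop; the state update and halting policy below are stubs standing in for the paper's Transformer module and RL-trained policy:

```python
import numpy as np

rng = np.random.default_rng(2)

def reasoning_step(state):
    """One reasoning hop: stub for the Transformer state update."""
    return np.tanh(state + rng.normal(scale=0.1, size=state.shape))

def halt_probability(state):
    """Stub for the learned halting policy (RL-trained in the paper)."""
    return 1.0 / (1.0 + np.exp(-state.mean()))   # sigmoid of mean activation

state = rng.normal(size=(8,))   # reasoning state for one expression
max_steps, steps = 6, 0
while steps < max_steps:
    state = reasoning_step(state)
    steps += 1
    if halt_probability(state) > 0.5:   # policy decides to stop early
        break

print("reasoning steps used:", steps)
```

The point of the loop is that `steps` is decided per input at inference time, unlike prior OSREC models where the hop count is fixed before inference regardless of expression complexity.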