Web-scale web table to knowledge base matching
Millions of relational HTML tables are found on the World Wide Web. In contrast
to unstructured text, relational web tables provide a compact representation of entities
described by attributes. The data within these tables covers a broad topical
range. Web table data is used for question answering, augmentation of search results,
and knowledge base completion. Until a few years ago, only search engine
companies like Google and Microsoft owned large web crawls from which web
tables are extracted. Thus, researchers outside these companies have not been able to
work with web tables.
In this thesis, the first publicly available web table corpus containing millions of
web tables is introduced. The corpus enables interested researchers to experiment
with web tables. A profile of the corpus is created to give insight into its characteristics
and topics. Further, the potential of web tables for augmenting cross-domain
knowledge bases is investigated. For the use case of knowledge base augmentation,
it is necessary to understand the web table content. For this reason, web
tables are matched to a knowledge base. The matching comprises three matching
tasks: instance, property, and class matching. Existing web table to knowledge
base matching systems either focus on a subset of these matching tasks or are evaluated
using gold standards which also only cover a subset of the challenges that
arise when matching web tables to knowledge bases.
This thesis systematically evaluates the utility of a wide range of different features
for the web table to knowledge base matching task using a single gold standard.
The results of the evaluation are used afterwards to design a holistic matching
method which covers all matching tasks and outperforms state-of-the-art web table
to knowledge base matching systems. In order to achieve these goals, we first
propose the T2K Match algorithm which addresses all three matching tasks in an
integrated fashion. In addition, we introduce the T2D gold standard which covers
a wide variety of challenges. By evaluating T2K Match against the T2D gold standard,
we find that considering only the table content is insufficient. Hence, we
include features of three categories: features found in the table, in the table context
like the page title, and features based on external resources like a synonym dictionary.
We analyze the utility of the features for each matching task. The analysis
shows that certain problems cannot be overcome by matching each table in isolation
to the knowledge base. In addition, relying on the features is not enough for the
property matching task. Based on these findings, we extend T2K Match into T2K
Match++ which exploits indirect matches to web tables about the same topic and
uses knowledge derived from the knowledge base. We show that T2K Match++
outperforms all state-of-the-art web table to knowledge base matching approaches
on the T2D and Limaye gold standards. Most systems show good results on one
matching task, but T2K Match++ is the only system that achieves F-measure scores
above 0.8 for all tasks. Compared to the results of the best-performing system, TableMiner+,
the F-measure for the difficult property matching task is increased by 0.08,
and for the class and instance matching tasks by 0.05 and 0.03, respectively.
Data Enrichment in Discovery Systems using Linked Data: Talk at the Annual Conference of the Deutsche Gesellschaft für Klassifikation (GfKl)
Redesigning a Student Success Course for Sustained Impact: Early Outcomes Findings
Many community colleges offer a "student success" course (also known as College 101 or Introduction to College) as a means to help incoming students transition to college and become successful. The typical course is meant to provide key information and address important noncognitive skills and behavioral expectations with the goal of familiarizing students with the college environment and giving them the tools they need to build important competencies, persist in college, and earn a credential. This paper examines the efforts of Bronx Community College in implementing a redesigned student success course called First Year Seminar (FYS), which is intended to better support students than a typical student success course by incorporating academic content, skill-building exercises, and applied teaching pedagogies, among other features, into the course.
Based on both qualitative and quantitative analysis, this study finds that FYS participation is associated with positive student outcomes that appear to be sustained for longer than is typically found for students taking a traditional student success course. The focus of FYS on student-centered pedagogy and on integrated course content appears to be beneficial. The findings also suggest that when students have the opportunity to practice student success and basic academic skills within the context of an improved student success course, they are likely to apply those skills in future courses, potentially increasing their long-term educational attainment.
Extending Tables with Data from over a Million Websites
Abstract. This Big Data Track submission demonstrates how the BTC 2014 dataset, Microdata annotations from thousands of websites, as well as millions of HTML tables are used to extend local tables with additional columns. Table extension is a useful operation within a wide range of application scenarios: Imagine you are an analyst having a local table describing companies and you want to extend this table with the headquarters of each company. Or imagine you are a film enthusiast and want to extend a table describing films with attributes like director, genre, and release date of each film. The Mannheim Search Joins Engine automatically performs such table extension operations based on a large data corpus gathered from over a million websites that publish structured data in various formats. Given a local table, the Mannheim Search Joins Engine searches the corpus for additional data describing the entities of the input table. The discovered data are then joined with the local table and their content is consolidated using schema matching and data fusion methods. As a result, the user is presented with an extended table and given the opportunity to examine the provenance o
Automated scoring of pre-REM sleep in mice with deep learning
Reliable automation of the labor-intensive manual task of scoring animal
sleep can facilitate the analysis of long-term sleep studies. In recent years,
deep-learning-based systems, which learn optimal features from the data,
increased scoring accuracies for the classical sleep stages of Wake, REM, and
Non-REM. Meanwhile, it has been recognized that the statistics of transitional
stages such as pre-REM, found between Non-REM and REM, may hold additional
insight into the physiology of sleep and are now under active investigation. We
propose a classification system based on a simple neural network architecture
that scores the classical stages as well as pre-REM sleep in mice. When
restricted to the classical stages, the optimized network showed
state-of-the-art classification performance with an out-of-sample F1 score of
0.95 in male C57BL/6J mice. When unrestricted, the network showed lower F1
scores on pre-REM (0.5) compared to the classical stages. The result is
comparable to previous attempts to score transitional stages in other species
such as transition sleep in rats or N1 sleep in humans. Nevertheless, we
observed that the sequence of predictions including pre-REM typically
transitioned from Non-REM to REM reflecting sleep dynamics observed by human
scorers. Our findings provide further evidence for the difficulty of scoring
transitional sleep stages, likely because such stages of sleep are
under-represented in typical data sets or show large inter-scorer variability.
We further provide our source code and an online platform to run predictions
with our trained network.
Comment: 14 pages, 5 figures
Role of alkali cations for the excited state dynamics of liquid water near the surface
Time-resolved liquid jet photoelectron spectroscopy was used to explore the excited state dynamics at the liquid water surface in the presence of alkali cations. The data were evaluated with the help of ab initio calculations on alkali-water clusters and an extension of these results on the basis of the dielectric continuum model: 160 nm, sub-20 fs vacuum ultraviolet pulses excite water molecules in the solvent shell of Na+ or K+ cations and evolve into a transient hydrated complex of alkali-ion and electron. The vertical ionization energy of this transient is about 2.5 eV, significantly smaller than that of the solvated electron. © 2012 American Institute of Physics.
The Effects of 5-Hydroxytryptophan in Combination with Different Fatty Acids on Gastrointestinal Functions: A Pilot Experiment
Background. Fat affects gastric emptying (GE). 5-Hydroxytryptophan (5-HTP) is involved in central and peripheral satiety mechanisms. The influence of 5-HTP in addition to saturated or monounsaturated fatty acids (FA) on GE and hormone release was investigated. Subjects/Methods. 24 healthy individuals (12f : 12m, 22-29 years, BMI 19-25.7 kg/m²) were tested on 4 days with either 5-HTP + short-chain saturated FA (butter), placebo + butter, 5-HTP + monounsaturated FA (olive oil), or placebo + olive oil in double-blinded randomized order. Two hours after FA/5-HTP or placebo intake, a C-13 octanoic acid test was conducted. Cortisol, serotonin, cholecystokinin (CCK), and ghrelin were measured, as were mood and GE. Results. GE was delayed with butter and was normal with olive oil (P < 0.05) but not affected by 5-HTP. 5-HTP supplementation did not affect serotonin levels. Food intake increased plasma CCK (F = 6.136; P < 0.05) irrespective of the FA. Ghrelin levels significantly decreased with oil/5-HTP (F = 9.166; P < 0.001). The diurnal cortisol profile was unaffected by FA or 5-HTP, as were ratings of mood, hunger, and stool urgency. Conclusion. Diverse FAs have different effects on GE and secretion of orexigenic and anorexigenic hormones. Supplementation of 5-HTP had no effect on plasma serotonin and central functions. Further studies are needed to explain the complex interplay
Serotonin Receptor Type 3 Antagonists Improve Obesity-Associated Fatty Liver Disease in Mice
Roux-En-Y Gastric Bypass (RYGB) Surgery during High Liquid Sucrose Diet Leads to Gut Microbiota-Related Systematic Alterations
Roux-en-Y gastric bypass (RYGB) surgery has been proven successful in weight loss and improvement of co-morbidities associated with obesity. Chronic complications such as malabsorption of micronutrients in up to 50% of patients underline the need for additional therapeutic approaches. We investigated systemic RYGB surgery effects in a liquid sucrose diet-induced rat obesity model. After consuming a diet supplemented with high liquid sucrose for eight weeks, rats underwent RYGB or control sham surgery. RYGB, sham pair-fed, and sham ad libitum-fed groups further continued on the diet after recovery. Notable alterations were revealed in microbiota composition, inflammatory markers, feces, liver, and plasma metabolites, as well as in brain neuronal activity post-surgery. Higher fecal 4-aminobutyrate (GABA) correlated with higher Bacteroidota and Enterococcus abundances in RYGB animals, pointing towards the altered enteric nervous system (ENS) and gut signaling. Favorable C-reactive protein (CRP), serine, glycine, and 3-hydroxybutyrate plasma profiles in RYGB rats were suggestive of reverted obesity risk. The impact of liquid sucrose diet and caloric restriction mainly manifested in fatty acid changes in the liver. Our multi-modal approach reveals complex systemic changes after RYGB surgery and points towards potential therapeutic targets in the gut-brain system to mimic the surgery mode of action