Many applications rely on Web data and extraction systems to accomplish
knowledge-driven tasks. Web information is not curated, so many sources provide
inaccurate or conflicting information. Moreover, extraction systems introduce
additional noise to the data. We wish to automatically distinguish correct data
from erroneous data in order to create a cleaner set of integrated data. Previous work
has shown that a na\"ive voting strategy that trusts data provided by the
majority or at least a certain number of sources may not work well in the
presence of copying between the sources. However, correlation between sources
can be much broader than copying: sources may provide data from complementary
domains (\emph{negative correlation}), extractors may focus on different types
of information (\emph{negative correlation}), and extractors may apply common
rules in extraction (\emph{positive correlation, without copying}). In this
paper we present novel techniques that model correlations between sources and
apply them in truth finding.

Comment: Sigmod'201
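
To make the weakness of na\"ive voting concrete, the following minimal sketch accepts a value whenever at least a fixed number of sources provide it. It is an illustration only, not the technique proposed in this paper; the function name `naive_vote`, the source names, and the toy claims are all hypothetical. Three of the four sources are assumed to share one wrong value through copying, so the vote accepts the wrong value and rejects the correct one.

```python
from collections import defaultdict


def naive_vote(claims, min_sources=2):
    """Accept every (item, value) pair provided by at least `min_sources` sources.

    `claims` is an iterable of (source, item, value) triples.
    Hypothetical helper for illustration; it ignores source correlation.
    """
    support = defaultdict(set)
    for source, item, value in claims:
        support[(item, value)].add(source)
    return {pair for pair, sources in support.items() if len(sources) >= min_sources}


# Toy data (assumed): s3 and s4 copy the wrong value from s2.
claims = [
    ("s1", "capital_of_australia", "Canberra"),
    ("s2", "capital_of_australia", "Sydney"),
    ("s3", "capital_of_australia", "Sydney"),
    ("s4", "capital_of_australia", "Sydney"),
]

accepted = naive_vote(claims, min_sources=2)
print(accepted)
```

Because the three agreeing sources are not independent, their combined support overstates the evidence for the wrong value, which is the failure mode that motivates modeling correlations explicitly.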