Discovering Domain Orders through Order Dependencies
Much real-world data comes with explicitly defined domain orders; e.g.,
lexicographic order for strings, numeric for integers, and chronological for
time. Our goal is to discover implicit domain orders that we do not already
know; for instance, that the order of months in the Lunar calendar is Corner <
Apricot < Peach, and so on. To do so, we enhance data profiling methods by
discovering implicit domain orders in data through order dependencies (ODs). We
first identify tractable special cases and then proceed towards the most
general case, which we prove is NP-complete. Nevertheless, we show that the
general case can be effectively handled by a SAT solver. We also propose an
interestingness measure to rank the discovered implicit domain orders. Finally,
we report on the results of an experimental evaluation using real-world
datasets.
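
To make the idea of an implicit domain order concrete, the following is a minimal sketch of one simple case: a column with a known order (here a made-up numeric day index) is used to infer an order over a categorical column whose order is not known in advance. It is an illustration under stated assumptions, not the algorithm from the paper. The function name `derive_order`, the sample rows, and the day values are all hypothetical, and the general NP-complete case that the abstract hands to a SAT solver is not covered.

```python
from graphlib import TopologicalSorter, CycleError


def derive_order(rows):
    """Infer an order over categories from (known_value, category) pairs.

    Hypothetical helper for illustration only. Rows are sorted by the
    column whose order is already known; each change of category between
    neighbouring rows yields a constraint "earlier category < later
    category". Returns the categories in the implied order, or None if
    the constraints contradict each other.
    """
    ordered = sorted(rows, key=lambda r: r[0])

    # Map each category to the set of categories that must precede it.
    preds = {category: set() for _, category in ordered}
    for (v1, c1), (v2, c2) in zip(ordered, ordered[1:]):
        # Ties on the known value give no ordering information.
        if c1 != c2 and v1 < v2:
            preds[c2].add(c1)

    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        # Interleaved categories: no order is consistent with the data.
        return None


# Made-up sample data: (day index, lunar month name).
rows = [(1, "Corner"), (2, "Corner"), (3, "Apricot"), (5, "Peach"), (4, "Apricot")]
consistent = derive_order(rows)

# Interleaved categories produce contradictory constraints.
conflicting = derive_order([(1, "A"), (2, "B"), (3, "A")])
```

With the sample rows, the constraints form the chain Corner < Apricot < Peach, matching the example in the abstract. When the categories interleave along the known column, the constraint graph contains a cycle and the sketch reports that no single order fits.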