498 research outputs found
Similarity Matching Techniques For Fault Diagnosis In Automotive Infotainment Electronics
Fault diagnosis has become an important area of research during the last decade due to the advancement of mechanical and electrical systems in industry. The automobile is a crucial field where fault diagnosis receives special attention. Due to the increasing complexity and newly added features of vehicles, a comprehensive study has to be performed in order to arrive at an appropriate diagnosis model. A diagnosis system identifies the faults of a system by investigating their observable effects (or symptoms). The system categorizes the fault into a diagnosis class and identifies a probable cause based on the supplied fault symptoms. Fault categorization and identification are performed using similarity matching techniques. Diagnosis classes are developed from previous experience, knowledge, or information within an application area; the necessary information may come from several sources of knowledge, such as system analysis. In this paper, similarity matching techniques for fault diagnosis in automotive infotainment applications are discussed.
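The matching step described above can be sketched as a nearest-neighbor comparison between observed symptoms and predefined diagnosis classes. The class names, symptom labels, and the Jaccard measure below are illustrative assumptions, not the paper's specific technique:

```python
# Hedged sketch: classifying a fault by similarity matching of symptom sets.
# The diagnosis classes, symptom names, and the Jaccard metric are
# illustrative assumptions, not the paper's exact method.

def jaccard(a, b):
    """Jaccard similarity between two symptom sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical diagnosis classes built from prior knowledge/system analysis.
DIAGNOSIS_CLASSES = {
    "amplifier_fault": {"no_audio", "display_ok", "power_ok"},
    "display_fault":   {"blank_screen", "audio_ok", "power_ok"},
    "power_fault":     {"no_audio", "blank_screen", "no_power"},
}

def diagnose(observed_symptoms):
    """Return the class whose symptom profile best matches the observation."""
    scores = {cls: jaccard(observed_symptoms, profile)
              for cls, profile in DIAGNOSIS_CLASSES.items()}
    return max(scores, key=scores.get), scores

best, scores = diagnose({"no_audio", "blank_screen", "no_power"})  # best matches "power_fault"
```

The class with the highest similarity score is returned as the probable diagnosis, mirroring the categorize-then-identify flow described in the abstract.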
INDEPENDENT DE-DUPLICATION IN DATA CLEANING
Many organizations collect large amounts of data to support their business and decision-making processes. The data originate from a variety of sources that may have inherent data-quality problems. These problems become more pronounced when heterogeneous data sources are integrated (for example, in data warehouses). A major problem that arises from integrating different databases is the existence of duplicates. The challenge of de-duplication is identifying "equivalent" records within the database. Most published research in de-duplication proposes techniques that rely heavily on domain knowledge. A few others propose solutions that are partially domain-independent. This paper identifies two levels of domain-independence in de-duplication, namely: domain-independence at the attribute level, and domain-independence at the record level. The paper then proposes a positional algorithm that achieves domain-independent de-duplication at the attribute level, and a technique for field weighting by data profiling, which, when used with the positional algorithm, achieves domain-independence at the record level. Experiments show that the proposed techniques achieve more accurate de-duplication than the existing algorithms.
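The two ideas can be sketched as follows; the positional comparison here is a simple stand-in (the paper's positional algorithm is not reproduced), and the field weighting uses distinct-value ratios as a basic form of data profiling:

```python
# Illustrative sketch only: a domain-independent, position-based attribute
# similarity combined with field weights derived from data profiling.
# Both functions are assumptions for illustration, not the paper's algorithm.

def positional_similarity(a, b):
    """Fraction of aligned character positions where the two strings agree
    (a simple stand-in for a positional attribute-level comparison)."""
    if not a and not b:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))

def field_weights(records):
    """Weight each field by its distinct-value ratio: fields with many
    distinct values are more discriminating for duplicate detection."""
    n_fields = len(records[0])
    raw = []
    for i in range(n_fields):
        values = [r[i] for r in records]
        raw.append(len(set(values)) / len(values))
    total = sum(raw)
    return [w / total for w in raw]

def record_similarity(r1, r2, weights):
    """Weighted sum of attribute-level similarities."""
    return sum(w * positional_similarity(a, b)
               for w, (a, b) in zip(weights, zip(r1, r2)))
```

Records whose weighted similarity exceeds a threshold would be flagged as probable duplicates; the weights require no domain knowledge, only the data itself.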
IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting
Although action recognition for procedural tasks has received notable attention, it has a fundamental flaw: no measure of success for the actions is provided. This limits the applicability of such systems, especially within the industrial domain, since the outcome of procedural actions is often significantly more important than their mere execution. To address this limitation, we define the novel task of procedure step recognition (PSR), which focuses on recognizing the correct completion and order of procedural steps. Alongside the new task, we also present the multi-modal IndustReal dataset. Unlike currently available datasets, IndustReal contains procedural errors (such as omissions) as well as execution errors. A significant portion of these errors is exclusively present in the validation and test sets, making IndustReal suitable for evaluating the robustness of algorithms to new, unseen mistakes. Additionally, to encourage reproducibility and allow for scalable approaches trained on synthetic data, the 3D models of all parts are publicly available. Annotations and benchmark performance are provided for action recognition and assembly state detection, as well as for the new PSR task. IndustReal, along with the code and model weights, is available at: https://github.com/TimSchoonbeek/IndustReal
Comment: Accepted for WACV 2024. 15 pages, 9 figures, including supplementary material
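The PSR objective of verifying both completion and order can be illustrated with a small sketch; the step names and the checks below are hypothetical and are not the IndustReal benchmark metric:

```python
# Hedged sketch of a PSR-style check (illustrative, not the IndustReal
# evaluation protocol): given the required step order and the steps observed
# as completed, report omissions and out-of-order completions.

def check_procedure(required, observed):
    """required: ground-truth ordered list of step names.
    observed: step names in the order their completion was recognized."""
    omissions = [s for s in required if s not in observed]
    rank = {s: i for i, s in enumerate(required)}
    out_of_order = []
    last = -1  # highest required-order rank completed so far
    for s in observed:
        if s not in rank:
            continue  # unknown step; ignored in this sketch
        if rank[s] < last:
            out_of_order.append(s)  # completed after a later step
        else:
            last = rank[s]
    return omissions, out_of_order
```

A procedure executed correctly yields empty lists for both omissions and out-of-order completions; procedural errors like those annotated in IndustReal show up in one of the two.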
Pairwise sequence alignment with block and character edit operations
Pairwise sequence comparison is one of the most fundamental problems in string processing. The most common metric to quantify the similarity between sequences S and T is edit distance, d(S,T), which corresponds to the number of characters that need to be substituted, deleted from, or inserted into S to generate T. However, fewer edit operations may be sufficient for some string pairs to transform one string into the other if larger rearrangements are permitted. Block edit distance refers to such changes at the substring level (i.e., blocks), penalizing entire block removals, insertions, copies, and reversals with the same cost as single-character edits (Lopresti & Tomkins, 1997). Most studies to date that calculate block edit distance aimed only to characterize the distance itself, for applications in sequence nearest-neighbor search, without reporting the full alignment details. Although a few tools try to solve block edit distance for genomic sequences, such as GR-Aligner, they have limited functionality and are no longer maintained.
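For reference, the classical single-character edit distance d(S,T) mentioned above can be computed with the standard dynamic program below; this is the textbook Levenshtein algorithm, not SABER's block-edit method:

```python
# Classical single-character edit distance (Levenshtein): the baseline that
# block edit distance extends with block-level operations.

def edit_distance(s, t):
    """Minimum number of substitutions, deletions, and insertions
    transforming s into t, via the standard O(|s|*|t|) dynamic program."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all of s[:i]
    for j in range(n + 1):
        d[0][j] = j  # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[m][n]
```

On a pair like "ABCD" and "CDAB", the character-level distance is 4, although a single block move would suffice; that gap is exactly what block edit distance captures.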
Here, we present SABER, an algorithm to solve block edit distance that supports block deletions, block moves, and block reversals in addition to the classical single-character edit operations. Our algorithm runs in O(m^2 · n · l_range) time for |S| = m, |T| = n, and a permitted block size range of l_range, and can report all breakpoints for the block operations. We also provide an implementation of SABER currently optimized for genomic sequences (i.e., generated by the DNA alphabet), although the algorithm can theoretically be used for any alphabet.
SABER is available at http://github.com/BilkentCompGen/sabe