Tools for knowledge discovery in relational databases typically operate on a single table. Unfortunately, the data needed for knowledge discovery is rarely confined to a single relation; rather, it is spread across several relations. The relevant relations must therefore be joined to create a single relation, called a Universal Relation (UR). From a data mining point of view, however, this can lead to problems such as universal relations of unmanageable size. In this thesis, we consider the problem of knowledge discovery in multi-relation databases. In particular, we examine a knowledge discovery algorithm for multiple databases based on distributed decision tree induction, knowledge discovery algorithms based on primary and foreign keys, peculiar and surprising data, and the foreign set, which allows multi-relation mining without a primary or foreign key. Lastly, we propose extensions of these methods with the foreign set.
Overview