Y is lossless-‐join w.r.t. a set of FDs F if, for every instance r that saBsfies F: πX (r) join πY (r) = r • In general one direcBon πX (r) join πY (r) ⊆ r is always true, but the other may not hold. • DefiniBon extended to decomposiBon into 3 or more relaBons in a straigh]orward way. • It is essen-al that all decomposi-ons used to deal with redundancy be lossless! (Avoids Problem (2).) Lipyeow Lim -‐-‐ University of Hawaii at Manoa 3
into X and Y is lossless-‐ join wrt F if and only if the closure of F contains: – X ∩ Y → X, or – X ∩ Y → Y • In parBcular, the decomposiBon of R into UV and R -‐ V is lossless-‐ join if U → V holds over R. Lipyeow Lim -‐-‐ University of Hawaii at Manoa 4 A B C 1 2 3 4 5 6 7 2 8 A B 1 2 4 5 7 2 B C 2 3 5 6 2 8 A B C 1 2 3 4 5 6 7 2 8 1 2 8 7 2 3
into S1={A,D}, S2={A,C}, S3={B,C,D} • Construct a Tableau – One row for each decomposed relaBon – For each row i, subscript an afribute with i if it does not occur in Si. • FDs: A→B, B →C, CD →A • Rules for “equaBng two rows” using FDs: – If one is unsubscripted, make the other the same – If both are subscripted, make the subscripts the same • Goal: one unsubscripted row Lipyeow Lim -‐-‐ University of Hawaii at Manoa 5 A B C D a b1 c1 d a b2 c d2 a3 b c d S1 S2 S3 A B C D a b1 c1 d a b2 c d2 a3 b c d X b1 c a X X one unsubscripted row imply lossless join
– If R is decomposed into X, Y and Z, and we enforce the FDs that hold on X, on Y and on Z, then all FDs that were given to hold on R must also hold. (Avoids Problem (3).) • Projec-on of set of FDs F: If R is decomposed into X, ... projecBon of F onto X (denoted FX ) is the set of FDs U → V in F+ (closure of F ) such that U, V are in X. Lipyeow Lim -‐-‐ University of Hawaii at Manoa 6 Student Course Instructor Smith OS Mark F= { SC → I, I→C } Student Instructor Smith Mark Course Instructor OS Mark Checking SC → I requires a join!
Y is dependency preserving if (FX union FY ) + = F + • If we consider only dependencies in the closure F + that can be checked in X without considering Y, and in Y without considering X, these imply all dependencies in F +. • Example: ABC decomposed into AB and BC. – F={A → B, B → C, C → A}. – Is this dependency preserving? • Dependency preserving does not imply lossless join: – Eg. ABC, A → B, decomposed into AB and BC. – And vice-‐versa! (Example?) Lipyeow Lim -‐-‐ University of Hawaii at Manoa 7 Important : F +, not F
F. How do we decompose R into a set of small rela-ons that are in BCNF ? • Algorithm: – If X → Y violates BCNF, decompose R into R-‐Y and XY – Repeat unBl all relaBons are in BCNF. • Example: ABCD, B → C, C → D, C → A. • Order in which we deal with the violaBng FD can lead to different relaBons! Lipyeow Lim -‐-‐ University of Hawaii at Manoa 8
of FDs S0 • Output: A decomposiBon of R0 into a collecBon of relaBons, all of which are in BCNF • IniBally R = R0 , S=S0 1. If R is in BCNF, return {R} 2. Let X→Y be a violaBon. a. Compute X+. b. Choose R1 =X+ c. Let R2 = X union (R-‐X+) 3. Compute FD projecBons S1 and S2 for R1 and R2 4. Recursively decompose R1 and R2 and return the union of the results Lipyeow Lim -‐-‐ University of Hawaii at Manoa 9
3.20 is lossless join • BUT in general there may not be a dependency preserving decomposiBon into BCNF – Example: Bookings(Title, City, Theater), with FD’s : Th→C, TiC→Th – Not in BCNF. – Can’t decompose while preserving 2nd FD; • This is the moBvaBon for 3NF. Lipyeow Lim -‐-‐ University of Hawaii at Manoa 10
join decomp into BCNF can be used to obtain a lossless join decomp into 3NF (typically, can stop earlier). • How can we ensure dependency preservaBon ? – If X→Y is not preserved, add relaBon XY. – Problem is that XY may violate 3NF! – Example: JP→C is not preserved, so add JPC. What if FDs also include J→C ? • Refinement: Instead of the given set of FDs F, use a minimal cover for F. Lipyeow Lim -‐-‐ University of Hawaii at Manoa 11
cover or basis G for a set of FDs F: – Closure of F = closure of G. – Right hand side of each FD in G is a single afribute. – If we modify G by deleBng an FD or by deleBng afributes from an FD in G, the closure changes. • IntuiBvely, every FD in G is needed, and ``as small as possible’’ in order to get the same closure as F. • e.g., A →B, ABCD →E, EF→GH, ACDF →EG has the following minimal cover: – A→B, ACD→E, EF→ G and EF→H Lipyeow Lim -‐-‐ University of Hawaii at Manoa 12
the FDs into standard form X → A. RHS is a single afribute. 2. Minimize the LHS of each FD. For each FD, check if we can delete an afribute from LHS while preserving F+. 3. Delete redundant FDs. • Minimal covers are not unique. Different order of computaBon can give different covers. • e.g., A →B, ABCD →E, EF→GH, ACDF →EG has the following minimal cover: – A→B, ACD→E, EF→ G and EF→H Lipyeow Lim -‐-‐ University of Hawaii at Manoa 13
FDs F • Output: A decomposiBon of R into a collecBon of relaBons, all of which are in BCNF 1. Find a minimal basis/cover for F, say G 2. For each FD X → A in G, use XA as one of the decomposed relaBons. 3. If none of the relaBons from Step 2 is a superkey for R, add another relaBon whose schema is a key for R. Lipyeow Lim -‐-‐ University of Hawaii at Manoa 14
in BCNF, it is free of redundancies that can be detected using FDs. Thus, trying to ensure that all relaBons are in BCNF is a good heurisBc • If a relaBon is not in BCNF, we can try to decompose it into a collecBon of BCNF relaBons. – Must consider whether all FDs are preserved. – If a lossless-‐join, dependency preserving decomposiBon into BCNF is not possible (or unsuitable, given typical queries), should consider decomposiBon into 3NF. – DecomposiBons should be carried out and/or re-‐examined while keeping performance requirements in mind. Lipyeow Lim -‐-‐ University of Hawaii at Manoa 15