How Is a Database Normalized? A Quick Look

To discuss database normal forms, the first concept to unpack is functional dependency. Let R(U) be a relation schema over the attribute set U, and let X and Y be subsets of U. If, for every possible relation r of R(U), r cannot contain two tuples that agree on the attributes in X but differ on the attributes in Y, then X is said to functionally determine Y, or Y is said to be functionally dependent on X, written X->Y.

In short: when the value of X determines the value of Y, X is said to determine Y, or Y is said to depend on X.

A simple example:

When designing a student table, the student number determines the student's name, so the student's name depends on the student number.
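The definition can be tried out on a small table. The following is a minimal sketch, not a definitive implementation: the helper `fd_holds` and the sample rows (including the names) are assumptions made up for illustration. Note that checking one instance can only refute a dependency; a functional dependency is a statement about every possible relation of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; a different value later violates the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical sample data: two students happen to share a name.
students = [
    {"student_no": "062201", "name": "Li"},
    {"student_no": "062230", "name": "Wang"},
    {"student_no": "062240", "name": "Li"},
]

print(fd_holds(students, ["student_no"], ["name"]))  # True
print(fd_holds(students, ["name"], ["student_no"]))  # False
```

In this sample the student number determines the name, but the name does not determine the student number, because two rows share the name "Li".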

There are several special kinds of functional dependency: trivial and non-trivial functional dependencies, full and partial functional dependencies, and transitive functional dependencies.

(1) Trivial and non-trivial functional dependencies

Let R(U) be a relation schema over the attribute set U, and let X and Y be subsets of U. If X->Y and Y is not a subset of X, then X->Y is called a non-trivial functional dependency. If X->Y and Y is a subset of X, then X->Y is called a trivial functional dependency.

Trivial dependencies always hold, so the functional dependencies discussed in practice are generally the non-trivial ones.
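Whether a dependency X->Y is trivial (Y contained in X) is a pure set-inclusion test. A minimal sketch, where the function name `is_trivial` and the attribute names are assumptions for illustration:

```python
def is_trivial(lhs, rhs):
    """X -> Y is trivial exactly when Y is a subset of X."""
    return set(rhs) <= set(lhs)

print(is_trivial({"student_no", "name"}, {"name"}))  # True
print(is_trivial({"student_no"}, {"name"}))          # False
```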

(2) Full and partial functional dependencies

In R(U), if X->Y and no proper subset X' of X determines Y, then Y is said to be fully functionally dependent on X, written X-f>Y.

If X->Y but Y is not fully functionally dependent on X, then Y is said to be partially functionally dependent on X, written X-p>Y.

For example, in the relation Student(student number, course number, year, dormitory) there is a partial functional dependency: (student number, course number)-p>dormitory, because student number->dormitory already holds.
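Y is fully dependent on X when X->Y holds and no proper subset of X determines Y on its own. One way to test this from a list of known dependencies is the standard attribute-closure computation. The sketch below is an illustration under assumptions: the helper names `closure` and `is_full_dependency` and the attribute spellings are invented here, and the empty subset of X is ignored.

```python
from itertools import combinations

def closure(attrs, fds):
    """Set of attributes determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_full_dependency(lhs, rhs, fds):
    """True if lhs -> rhs holds and no non-empty proper subset of lhs determines rhs."""
    lhs = list(lhs)
    if not set(rhs) <= closure(lhs, fds):
        return False  # lhs -> rhs does not hold at all
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if set(rhs) <= closure(subset, fds):
                return False  # a proper subset already determines rhs: partial
    return True

# The dependency from the example: student number -> dormitory.
fds = [(("student_no",), ("dormitory",))]

print(is_full_dependency(("student_no", "course_no"), ("dormitory",), fds))  # False
print(is_full_dependency(("student_no",), ("dormitory",), fds))              # True
```

The first call reports a partial dependency because `student_no` alone already determines `dormitory`.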

---

Next, let's look at what a normal form is.

The different standards established for different levels of design requirements in the process of normalizing a relational database are called normal forms.

First normal form (1NF):

Definition: If every attribute of a relation schema R is an indivisible (atomic) data item, then R is said to be in first normal form, denoted R ∈ 1NF.

Not in first normal form:

Student(name, gender-age) (the gender-age column packs two attributes into one)

In first normal form:

Student(name, gender, age) (every attribute of R is indivisible)

Second normal form (2NF):

If a relation schema R is in 1NF and every non-prime attribute is fully functionally dependent on the candidate keys of R, then R is in 2NF. In other words, the main purpose of second normal form is to eliminate partial functional dependencies of non-prime attributes on the key.

So why do partial dependencies need to be eliminated?

Consider the following relation schema:

Student(student number, name, department, residence, course number, grade)

In this relation, student number->name and student number->department.

The following problems exist:

1. Data redundancy

Because name and the other student attributes depend on only part of the key (student number, course number), the department name and residence are stored again for every course a student takes, which wastes space.

2. Update anomaly

Because of the redundant data, the system has to pay a high cost on every update to maintain the integrity of the database; otherwise the data can become inconsistent.

3. Insertion anomaly

If a student has not yet taken any course, the student's information cannot be inserted, because course number is part of the key.

4. Deletion anomaly

If all students in a department graduate, the information about that department is deleted along with the students' records, even though the department still exists.

A relation schema with partial functional dependencies necessarily mixes prime and non-prime attributes that are unrelated to each other, so the problems above are bound to occur. We therefore need to eliminate partial functional dependencies, which is exactly what second normal form requires.

Partial functional dependencies can be eliminated by projection decomposition, which splits the partially dependent attributes out into a relation of their own.

In the decomposed relation schemas, every non-prime attribute should be fully functionally dependent on the key.
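Projection decomposition can be sketched on the Student relation above. This is an illustration under assumptions, not a prescribed procedure: the sample rows, the attribute spellings, the helper `project`, and the assumption that residence depends on student number alone are all made up here.

```python
def project(rows, attrs):
    """Project rows onto attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result

# Hypothetical sample rows; the key is (student_no, course_no).
rows = [
    {"student_no": "062201", "name": "Li", "department": "CS", "residence": "A",
     "course_no": "C1", "grade": 90},
    {"student_no": "062201", "name": "Li", "department": "CS", "residence": "A",
     "course_no": "C2", "grade": 85},
    {"student_no": "062230", "name": "Wang", "department": "Math", "residence": "B",
     "course_no": "C1", "grade": 78},
]

# Attributes that depend on student_no alone go into their own relation.
student_info = project(rows, ["student_no", "name", "department", "residence"])
# Attributes that depend on the whole key stay with it.
enrollment = project(rows, ["student_no", "course_no", "grade"])

print(len(student_info))  # 2: each student is stored once
print(len(enrollment))    # 3: one row per (student, course)

# Joining the two projections on student_no gives back the original rows.
rejoined = [{**s, **e} for e in enrollment for s in student_info
            if s["student_no"] == e["student_no"]]
print(rejoined == rows)   # True
```

After the split, a student who has taken no course can still be stored in `student_info`, and the name, department, and residence are no longer repeated per course.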

Third normal form (3NF):

If a relation schema R contains no key X, attribute group Y, and non-prime attribute Z (Z not a subset of Y) such that X->Y and Y->Z hold while Y does not determine X, then R is said to be in 3NF, denoted R ∈ 3NF.

From this it can be seen that the purpose of third normal form is to eliminate transitive functional dependencies. The reason is as follows. Suppose X->Y and Y->Z, but Z does not depend directly on X, so some operations on X should not need to affect Z. Y both depends on X and is depended upon by Z, which means Y has a degree of independence and can exist on its own; yet operations aimed at Y inevitably drag X and Z along with them. Moreover, since Y is not a subset of X, some operations on X should not need to affect Y, and operations on Y sometimes needlessly affect X. None of these side effects is necessary, so we eliminate transitive functional dependencies.

For example:

| Student Number | Dormitory | Fee |
|----------------|-----------|------|
| 062201         | A         | 900  |
| 062230         | B         | 1200 |
| 062240         | B         | 1200 |

The student number determines the dormitory, and the dormitory determines the fee. The dormitory is not a subset of the student number, and the dormitory does not determine the student number, so the conditions for a transitive functional dependency are satisfied.

So the relation R above has an insertion anomaly (dormitory C has been built but nobody lives in it yet, so it cannot be recorded) and a deletion anomaly (if student 062201 drops out, the information about dormitory A is deleted too).

In short, to prevent these anomalies we need to eliminate the transitive dependency.
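Using the three rows from the example, the transitive dependency can be removed by splitting the relation in two: one mapping student numbers to dormitories and one mapping dormitories to fees. This is only a sketch; the variable names and the fee of 1000 for dormitory C are assumptions for illustration.

```python
# The rows of the example relation: (student number, dormitory, fee).
rows = [
    ("062201", "A", 900),
    ("062230", "B", 1200),
    ("062240", "B", 1200),
]

# Decompose along the chain student number -> dormitory -> fee.
student_dorm = {student: dorm for student, dorm, _ in rows}
dorm_fee = {dorm: fee for _, dorm, fee in rows}

# Insertion: dormitory C can be recorded before anyone lives there.
dorm_fee["C"] = 1000  # hypothetical fee, not from the original table

# Deletion: removing student 062201 no longer removes dormitory A's fee.
del student_dorm["062201"]

print(dorm_fee["A"])         # 900
print(sorted(student_dorm))  # ['062230', '062240']
```

The fee of dormitory B is also stored once instead of once per resident.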

BC normal form (BCNF):

3NF only constrains how non-prime attributes depend on keys; it places no restriction on how prime attributes depend on keys. If a prime attribute has a partial or transitive functional dependency on a key, problems similar to those above are bound to occur. This is why BC normal form is introduced.

Let relation schema R ∈ 1NF. If for every functional dependency X->Y on R in which Y is not a subset of X, X contains a candidate key, then R ∈ BCNF.

Compared with third normal form, BC normal form is stricter. 3NF only requires that R be in 2NF and that no non-key attribute be transitively dependent on a candidate key of R, whereas BCNF imposes the requirement on every attribute of R.

In the relation schema STJ(S, T, J), S denotes the student, T the teacher, and J the course.

Each teacher teaches only one course. Each course is taught by several teachers, and once a student selects a course, the teacher is fixed. A student taking a class with a particular teacher likewise determines which course is taken. The dependencies are therefore: (S, J)->T, (S, T)->J, T->J. The candidate keys are (S, J) and (S, T), so every attribute is prime and STJ is in 3NF; but T->J holds and T does not contain a candidate key, so STJ is not in BCNF.
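The BCNF condition can be checked mechanically for STJ: a non-trivial dependency X->Y is acceptable only if X contains a candidate key, which is the same as saying the closure of X is the whole attribute set. The following is a minimal sketch using the three dependencies (S, J)->T, (S, T)->J, and T->J; the helper names `closure` and `bcnf_violations` are assumptions for illustration.

```python
def closure(attrs, fds):
    """Set of attributes determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependencies never violate BCNF
        if closure(lhs, fds) != set(all_attrs):
            violations.append((lhs, rhs))
    return violations

attrs = {"S", "T", "J"}
fds = [
    (("S", "J"), ("T",)),
    (("S", "T"), ("J",)),
    (("T",), ("J",)),
]

print(bcnf_violations(attrs, fds))  # [(('T',), ('J',))]
```

Only T->J is reported: T determines J but not S, so T is not a superkey of STJ.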

From the definition of BCNF one can conclude that if R is in BCNF, then:

1. Every non-prime attribute is fully functionally dependent on every key.

2. Every prime attribute is also fully functionally dependent on every key that does not contain it.

3. No attribute is fully functionally dependent on any set of attributes that is not a key.

Since R ∈ BCNF by definition rules out any transitive or partial dependency of any attribute on a key, R ∈ 3NF. However, if R ∈ 3NF, R is not necessarily in BCNF.

---

Fourth normal form and multivalued dependencies are not covered here, since I do not yet understand them well. Generally, once a schema reaches BC normal form, data redundancy and insertion and deletion anomalies have been eliminated as far as functional dependencies are concerned.