An example of an entity-relationship data model. Conceptual structure of information


1.5 ER modeling

Data modeling is the first step in database design, it is the transition from real world objects to computer model DB.

The ER model serves to bring together different representations of data at a conceptual level. Based on the ER-model, ER-diagrams are built, which display the three main components of the ER-model: entities, attributes, relationships.

1.5.1 Entities

Since an entity is a real-world object, the words "entity" and "object" often mean the same thing.

At the level of ER modeling, an entity actually means a set of entities (entity set), and not a single entity. In other words, an entity in ER modeling corresponds to a table, not a row in a relational environment, a single row in an ER model is called an entity instance (entity occurrence). An entity is represented by a rectangle containing the name of the entity.

1.5.2 Attributes

Attributes describe the properties of an entity. For example, the STUDENT entity includes the attributes NSTBIL (student ID number), FIO (student name), KURS (course), and so on.

Rice. 1.24. Attributes of the STUDENT entity in the ER model.

Attributes have domains. A domain is a set of possible values ​​for an attribute. For example, the domain for numerical value The average student grade can be written as an interval.

Primary keys in the ER model are underlined. If there are multiple primary keys, all are underlined.

Attributes can be simple or compound. A composite attribute is an attribute that can be further divided into multiple attributes. For example, the attribute ADRESS (address) can be divided into STREET (street), CITY (city), etc.

Attributes can be single-valued or multi-valued. A unique attribute is an attribute that can have only one value. For example, a TIN can have a single value for each person. Single-valued attributes are not necessarily simple. For example, serial number 78-03-06-137846 is a unique attribute, but at the same time it is a compound attribute, because it can be divided into the region in which the product was produced (78), city code (03), manufacturing shift (06), product number (137846).

A multivalued attribute is an attribute that can have multiple values. For example, a person can graduate from several universities, have several phone numbers.

In a relational DBMS, multivalued attributes cannot be used. If there are multi-valued attributes, then it is necessary to create several new attributes within this entity or create a new entity consisting of the components of a multi-valued attribute.

A derived attribute is an attribute that does not need to be stored in the database, it is obtained using some algorithm. For example, the age of an employee can be obtained as an integer value of the difference between the current date and the date of birth.

1.5.3. Connections

Relationships are associations. The entities that participate in a relationship are called participants. A verb or a document can be used as a link name. For example, a department is managed by an employee, goods are received on the basis of a concluded contract, etc.

Relationships between entities in a quantitative ratio can be "one-to-one", "one-to-many". The term "connectivity" is used to designate the types of connections.

Cardinality expresses a certain number of entity instances associated with one associated entity instance. On the ER diagram, the strength of the connection is not indicated, but in application programming, information about the max and min numbers of entity instances can be useful. For example, a group cannot start classes if it has less than 10 students.

Relationships are established between entities. If an entity depends on the existence of one or more other entities, then it is existence dependent. For example, if employees have dependents, then for calculating taxes, you can establish the relationship “employee has dependents”. In this case, the dependent entity depends on the employee entity.

If an entity can exist outside of other entities, then it is existence-independent. For example, the entity "part" may exist independently of the entity "supplier".

If one entity is independent of the existence of another entity, the relationship between them is called a weak relationship or a non-identifying relationship. Weak ties occur when the primary key of the related entity does not contain the primary components of the parent entity. For example, there are two entities COURSE (course) and CLASS (group), described as

COURSE ( CRS CODE, DEPT_CODE,…)

CLASS( CLASS CODE, CRS_CODE,…)

There is a weak connection between these entities, because the CLASS_CODE attribute is the primary key of the CLASS entity, while the CRS_CODE attribute of the CLASS entity is the foreign key. primary key the CLASS entity does not inherit the primary key component from the COURSE entity. A weak link is depicted on the ER diagram by a dashed line.

A strong relationship, also called an identifying relationship, occurs when related entities are dependent on existence. A strong relationship between two entities occurs when the primary key of the related entity contains the primary key component of the parent entity. For example, entities

COURSE ( CRS CODE, DEPT_CODE,…)

CLASS( CRS_CODE, CLASS-SECTION,…)

They have a strong connection, because the composite key of the CLASS entity includes the primary COURSE entity. On the ER diagram, strong ties are shown as a solid line.

Keep in mind that the order in which tables are created and loaded is significant. For data, for example, it is impossible for the foreign key of the CLASS table to refer to a COURSE table that does not yet exist. The problem after the table creation sequence in some DBMS does not occur until the data is loaded. In a 1:M relationship, the "1" side must be loaded, regardless of whether it is strong or weak, to avoid link-level integrity violations.

An entity's participation in a relationship can be required or optional. Entity participation is optional if one entity instance does not require a corresponding entity instance in a separate relationship. For example, in the connection on the course (COURSE), groups (CLASS) are created by at least, some courses may not create groups. Those. entity instance (row) in the COURSE table does not require mandatory availability the corresponding entity instance in the CLASS table. Therefore, the CLASS entity is considered optional with respect to the COURSE entity. An optional relationship on an ER diagram is shown as a small circle on the side of the optional entity. The existence of an optional indicates that for an optional entity min value link power is 0.

An entity's participation in a relationship is mandatory if one entity instance necessarily requires a corresponding entity instance in a separate relationship. If no additional symbol is displayed near the entity, then this means that given entity participates in a mandatory relationship with the related entity. Min cardinality for a required entity is 1.

a) The CLASS entity is optional for the COURSE entity

b) COURE and CLASS entities in a mandatory relationship.

Fig.1.25. Depiction of mandatory and optional relationships in the ER model.

In database design terms, the existence of a strong relationship between a parent entity and its associated entity or entities is associated with weak entities.

A weak entity is an entity that satisfies two conditions:

condition of dependence on existence, i.e. it cannot exist without the entity with which it is associated;

its primary key is partially or entirely derived from the parent entity of this relationship.

In the ER model, weak entities are represented by small segments at each of the four corners of the entity rectangle.

Rice. 1.26. Weak entity in ER diagrams.

A weak entity inherits all parts of the primary key of its strong relationship partner. It is the database designer who decides whether or not an entity should be declared weak.

The relationship degree indicates the number of associated entities. A unary relationship exists when an association is maintained within a single entity. A binary relationship exists when two entities are associated. A ternary relationship occurs when three entities are connected. Although there are higher degrees of connection, they are quite rare and do not have special names.

If an entity has links with itself, then such a link is called recursive.

Rice. 1.27. ER representation of a recursive relationship

The generalization hierarchy displays parent-child relationships. In the context of relational databases, a generic view hierarchy displays relationships between top-level entity supertypes and lower-level entity subtypes. Those. the supertype contains shared attributes while the subtype contains unique attributes.

Rice. 1.28. Hierarchy of generalized representations.

Relationships are inherited, i.e. An entity subtype inherits attributes and relationships from an entity supertype. For example, all pilots, mechanics, and accountants have personnel numbers, full names, home addresses, etc., but they may have attributes unique to their specialty. In other words, an entity set supertype is usually associated with several unique and non-overlapping entity set subtypes. Such non-overlapping links are denoted by the letter ‘G’.

The supertype and subtype(s) maintain a 1:1 relationship. For example, the structure of the EMPLOYEE table can be replaced by two tables, one representing the supertype EMPLOYEE and the other a subtype PILOT.

Some supertypes contain overlapping subtypes. For example, some employee may be a teacher, but at the same time, an administrator.

Intersecting links are shown as 'Gs'.

Rice. 1.29. A hierarchy of generic representations with overlapping subtypes.

The conceptual model of a database is

A conceptual model is a kind of visual diagram drawn in the accepted notation and showing in detail the relationship between objects and their characteristics. A conceptual model is created for further database design and its translation, for example, into a relational database. Relationships between data objects and their characteristics are written on the conceptual model in a visually convenient way.

Accepted definitions in the conceptual database

For the uniformity of database programming, the following concepts have been introduced for conceptual databases:

  1. Object or entity. This is the actual thing or object (for people) that the user (customer) wants to observe. For example, Ivanov Ivan Ivanovich;
  2. Attribute it is a characteristic of an object corresponding to its essence. For example. We ask ourselves the question: What information should be kept about Ivanov Ivan Ivanovich? The answers to this question will be the attributes of the object Ivanov Ivan Ivanovich;
  3. The third concept in conceptual database design is connection or relationship between objects.

Lexically, it is more correct to say the relationship between CBD objects and the relationship between CBD entities (conceptual database), but you can meet the most various combinations essence, object, connections and relationships (translation flaws).


Conceptual database model convention

Conceptual database model: accepted graphic symbols

An entity/relationship diagram (object/relationship) is called an ER diagram or EDR (entity-relationship diagram). The entity-relationship model itself was proposed by Professor Peter Pin-Shen Chen in 1976. The writing rules and conventions of an ER diagram are called notation. There are two main notations for ER diagrams that are common:

  • Notation by Peter Chen;
  • Notation Gordon Everest (Gordon Everst). Under the name Crow's Foot or Fork (fork).

ER diagram notation by Peter Chen

Chen proposed and adopted the following conventions for ER diagrams:

  • An entity or object is denoted by a rectangle;
  • Relationships are denoted by a diamond;
  • Attributes of objects are indicated by an oval;
  • If an entity is associated with a relationship, then their relationship is indicated by a straight line with an arrow. An optional relationship is indicated by a dotted line. A strong connection is indicated by a double line.

Each attribute can be associated with one object (entity).

Gordon Everest notation

Gordon Everest introduced a new designation of connections, which were called a fork or a crow's paw. He also introduced that an object should be denoted by a rectangle with the name of the object type as a noun inside the rectangle. Moreover, this name must be unique within the created database.

Attributes are not separated into a separate figure, but fit into the rectangle of the object with a noun name with a qualifying word.

The relationship between objects is indicated by a straight line. Multiple links are indicated by a fork at the end. The link itself is signed with a verb, such as "Includes" or "Belongs".


ERD Fork database conceptual model

Add-ons

Attributes in an ER diagram can have their own (composite) attribute.

It is quite easy to draw a simple ER diagram. Another thing is a rich, voluminous ER diagram. Below are some tips to help you build effective ER schemas:

  • Identify all objects in a given system and define relationships between these objects;
  • An object must appear only once at a particular location in the schema;
  • Define an exact and appropriate name for each object, attribute, and relationship in the diagram. Choose simple and understandable words. Terms that are simple and familiar always win out over vague, technical-sounding words. For objects, nouns are nouns, for links, verbs (possible with explanations). Do not forget about the uniqueness of object names;
  • Remove implicit, redundant or unnecessary relationships between objects;
  • Never connect relationships to other relationships;
  • Use colors to classify objects of the same type or highlight key areas in a diagram.

ER database model

The entity-relationship model (ERM, ER-model) allows you to describe conceptual schemes subject area.

The ER model is used in high-level database design. With its help, you can highlight the key entities and designate the relationships that can be established between these entities. ER model is formal construction, not defining graphic tools her presentations. As a standard graphical representation of the ER-model, the entity-relationship diagram ER-diagram (Entity Relationship Diagram - ER - diagram) was developed. When designing databases, the ER model is transformed into a specific database schema.

As you know, the basic concept of relational databases is a table (relationship). The table is used to structure and store information. In relational databases, each table cell contains one value. In addition, within the same database, there are relationships between tables, each of which specifies the sharing of table data.

The ER diagram graphically represents the data structure of the database being designed. Entities are displayed using rectangles, tables containing the name of the entity - database tables. Entity relationships are displayed as lines connecting individual entities.

A relationship indicates that the data of one entity references or is related to the data of another.

The extent of the end of the connection is indicated graphically, the multiplicity of the connection is depicted as a "fork" at the end of the connection. The modality of the connection is also depicted graphically - the optional connection is marked with a circle at the end of the connection. The naming of the connection is expressed by one verb (Fig. 13).

Fig.13.

Attributes entities are written inside a rectangle depicting an entity and are expressed as a singular noun.

The entity key stands out among the attributes. When creating an entity, it is necessary to select a group of attributes that can be a primary key, then select attributes to include in the primary key:

· The primary key must be selected in such a way that the values ​​of the attributes included in it can accurately identify an entity instance.

· None of the primary key attributes must be null.

· The values ​​of the primary key attributes must not change. If the value has changed, then this is a different instance of the entity.

When choosing a primary key, as a rule, an additional attribute is introduced into the entity, which becomes the key.

So, to determine the primary key, unique numbers are often used, which can be automatically generated by the system when an entity instance is added to the database. The use of unique numbers facilitates the process of indexing and searching in the database.

The first step in creating logical model The database is the construction of an ER diagram, consisting of three parts: entities, attributes and relationships.

Many tools developed visual creation ER - diagrams for various platforms. In this project, the development was used MySQL Workbench.

This tool simplifies database design and maintenance, automates time-consuming and error-prone tasks, and improves communication among development teams and database architects. It allows data architects to visualize requirements, communicate with stakeholders on design issues before investments are made in the project. It provides model-driven database design, which is the most efficient methodology for creating real and well thought out databases. ER - remote testing system database diagram ( control_tests) is shown in the figure (Fig. 14). Two vertical dashes on one side and a "trident" on the other indicate, respectively, a one-to-many relationship.

Rice. 14. ER - control_test database diagram of a remote testing system


Database normalization

In relational database theory, there are no universal prescriptions, laws for designing a reliable database designed once and for all. Developers choose their database design decisions based largely on experience, intuition, and various design tools and methods.

Nevertheless, some canons, rules still exist. These rules include normalization rules, i.e. bringing the relationship to normal form.

Introduction

The central ideas of modern information technology are based on the concept that data must be formed into databases in order to reflect the changing real world and meet the information needs of users. These databases are formed and operate under the control of special software systems(sets of programming languages ​​and software tools), called database management systems (DBMS). The database itself is the repository for a large number systematized data with which you can perform certain actions: adding, deleting, changing, copying, ordering.

Increasing the amount of stored data, expanding the circle of users information systems led to the widespread use of the most convenient and relatively easy to understand relational (tabular) DBMS. To ensure simultaneous access to data by many users, often located quite far from each other and from the place where the databases are stored, network multi-user versions of databases based on relational structure. They solve specific problems in one way or another. parallel processes, integrity (correctness) and security of data, as well as authorization of access.

Over the past few years, there has been a trend towards more complex data structures. Simple types of information represented in the form of numbers and text strings, without losing their significance, are supplemented today by numerous multimedia documents, graphic images, chronological series, procedural or active data and a myriad of other complex information forms. In this regard, a whole galaxy of DBMSs have emerged that support new data collections and are able to realize the advantages of modern hardware. One such DBMS is MS Access 2003 (2007), which is included in the software package Microsoft office, and is one of the popular relational DBMS for local computers.

Popularity of DBMS Microsoft Access due to the following reasons:

    accessibility in study and understandability make it possible to be one of the the best systems rapid creation of database management applications;

    completely Russified;

    opportunity using OLE technologies;

    support of WWW-ideology;

    visual technology allows you to constantly see the results of your actions and correct them, in addition, working with the form designer can greatly facilitate the further study of such programming systems as Visual Basic or Delphi;

    the help system is widely and clearly presented;

    the presence of a large set of "masters" for the development of objects.

Currently, automated information systems (AIS) are widely used in various large organizations.

Target this project: application in practice of the knowledge gained in the process of studying the course "Databases" and the acquisition of practical skills in the design and creation of information systems (IS) based on databases.

Basic concepts of the ER model: entity, entity instance, attribute, key, relationship, relationship types

The main concepts of the ER-model are entity, relationship and attribute.

An entity is a real or imagined object about which information must be stored and made available. In ER model diagrams, an entity is represented as a rectangle containing the name of the entity. In this case, the name of the entity is the name of the type, and not some specific instance of this type. For more expression and better understanding the entity name may be followed by examples of specific entities of that type.

Each instance of an entity must be distinguishable from every other instance of the same entity (this requirement is somewhat analogous to the requirement that there be no duplicate tuples in relational tables).

An attribute of an entity is any detail that serves to clarify, identify, classify, quantify, or express the state of an entity. Attribute names are placed in the entity box below the entity name and are shown in small letters, possibly with examples.

An entity key is a non-redundant set of attributes whose values ​​together are unique for each entity instance. Non-redundancy is that removing any attribute from a key breaks its uniqueness. An entity can have several different keys. Key attributes are shown underlined in the diagram.

A relationship is a graphical association established between two entities. This association is always binary and can exist between two different entities or between an entity and itself (recursive association). In any connection, two ends are distinguished (in accordance with the existing pair of connected entities), each of which indicates the name of the end of the connection, the degree of the end of the connection (how many instances of this entity are connected), the binding of the connection (i.e., whether any instance of this entity should participate in this connection).

A connection is represented as a line connecting two entities or leading from an entity to itself. At the same time, at the place of “docking” of the connection with the entity, a three-point entry into the entity rectangle is used if many (many) instances of the entity can be used for this entity in the connection, and a single-point entry if only one instance of the entity can participate in the connection. A mandatory link end is shown with a solid line, and an optional link end is shown with a broken line.

Each link has two ends and one or two names. The name is usually expressed in an indefinite verbal form: "to have", "belong", etc. Each of the names refers to its end of the connection. Sometimes the names are not written due to their obviousness. Each link can have one of the following link types:

Oneto one

Oneto many

many to many

Rice. 1 - Link types

A one-to-one relationship means that one instance of the first entity (left) is associated with one instance of the second entity (right). The one-to-one relationship most often indicates that in fact we have only one entity, incorrectly divided into two.

A one-to-many relationship means that one instance of the first entity (left) is associated with multiple instances of the second entity (right). This is the most commonly used type of connection. The left entity (from the “one” side) is called the parent entity, the right one (from the “many” side) is called the child entity.

A many-to-many relationship means that each instance of the first entity can be associated with multiple instances of the second entity, and each instance of the second entity can be associated with multiple instances of the first entity. The many-to-many relationship type is a temporary relationship type that is valid in the early stages of model development. In the future, this type of relationship should be replaced by two one-to-many relationships by creating an intermediate entity.

Each link can have one of two link modalities:

Maybe

Must

Rice. 2 – Communication modality

The modality "may" means that an instance of one entity may be associated with one or more instances of another entity, or may not be associated with any instance.

The “must” modality means that an instance of one entity must be associated with at least one instance of another entity.

Converting an ER Model to a Relational Data Model

Consider the transformation of an ER-model into a relational data model using an abstract example. Suppose we are given an ER model. It is necessary to get a set of tables (relations) of the type Table (Key, Attribute1, Attribute2, ..., AttributeN). The key can be compound. It is convenient to make attribute names unique on the scale of the ER model, then when building a relational model, they (almost never) will have to be renamed.

    Transformation of sets of entities (for brevity - entities)

    1. Regular Entity Transformation

Rice. 3 - Transformation of an ordinary entity

An ordinary entity is converted into a separate table, the fields of the table will be all the attributes of the entity: Entity (Key, Attribute1, Attribute2)

    1. Weak Entity Conversion

A weak entity is converted into a separate table, the fields of the table will be all the attributes of the entity plus the key attributes of all entities with which this weak entity is identified.

The key fields of all parent tables will be included in the key of the child table. For the child table, they will be called the foreign key: Entity1 (Key1, Key2, Attribute1, Attribute2).

Key 1

Attribute 1

Attribute2

Essence 1

Essence2

Key2

Rice. 4 - Weak Entity Transformation

    1. Entity Subtype Conversion

1 way. One table is created in which all attributes are placed. In order to indicate which subtype the object belongs to, you have to enter an additional field-attribute: Entity1(Key, Attribute1, Attribute2, Attribute3, Attribute4, Attribute4, Attribute).

The disadvantage of this method is that there are many empty fields in the table: for an object of subtype 1, attributes 4 and 5, and for an object of subtype 2, attributes 2 and 3 will remain empty.

2 way. A separate table is created for each subtype. It includes all the attributes of that subtype and all the attributes of the supertype:

Subtype1(Key, Attribute1, Attribute2, Attribute3)

Subtype2(Key, Attribute1, Attribute4, Attribute5)

The disadvantage of this approach is that the subtypes are no longer related to each other in any way.

3 way. One table is created for the supertype, and one table for each subtype, which includes the key fields of the supertype:

Entity1(Key, Attribute1)

Subtype1(Key, Attribute2, Attribute3)

Subtype2(Key, Attribute4, Attribute5)

The disadvantage of this approach is that the information about each object is now scattered across two tables.

Rice. 5 - Entity Subtype Conversion

    Relationship transformation

    1. Communication M:M

For links - double diamonds, nothing needs to be done, all information is already stored in the weak entity table.

Rice. 6 - Link M:M

A new table is created containing the key fields of each entity participating in the relationship and the relationship's own attributes, if any. The name usually reflects exactly which entities are linked, or the new table is called the name of the link.

Entity1Entity2(Key1, Key2, Attribute1).

If the link's own attribute is a key attribute, then in fact it is no longer a binary link, but a link of greater arity, i.e. this attribute can be replaced with a new entity towards which there is a "to one" relationship. In this case, you need to apply the rules for ternary and other relationships

    1. Link 1:M

Rice. 7 - Link 1:M

1 way. In the same way as in the case of M:M, a new table is created containing the key fields of each entity participating in the relationship. The name usually reflects exactly which entities are linked, or the new table is called the name of the link. The key will be the key of the second entity.

Entity1Entity2(Key1, Key2).

2 way. A new table is not created, but the key fields of the parent entity are added to the table of the child entity (they will not be included in the key of the child entity). The key fields of the parent entity are the foreign key for the child entity.

This method is preferable to use if the relationship is an "exactly one" relationship, that is, all entity instances participate in the relationship. In this case, the foreign key field will never be empty.

Child entity table: Entity2(Key2, Attribute1, Key1).

    1. Communication 1:1

Rice. 8 - Link 1:1

1 way. In the same way as in the case of M:M, a new table is created containing the key fields of each entity participating in the relationship. The name usually reflects exactly which entities are linked, or the new table is called the name of the link. The key will be the key of any entity.

This method is preferable to use if the relationship is not "exactly to one", that is, not all entity instances participate in the relationship.

Entity1Entity2(Key1, Key2) or Entity1Entity2(Key1, Key2).

2 way. In the same way as in case 2 1:M, a new table is not created, but the key fields of another entity (we will consider it parent) are added to the table of one of the entities (we will consider it a child).

If the relationship is not an "exactly one" relationship with respect to the parent table, that is, not all entity instances participate in the relationship, the foreign key field in some records may be empty.

Child entity table: Entity1(Key1, Attribute1, Key2),

or Entity2(Key2, Attribute2, Key1).

3 way. Two tables for entities related 1:1 are merged into one. The key of the new table can be a combination of the keys of both tables. If at least in one direction the relationship is "exactly to one", then the key of this entity can be considered the key of the joined table.

Tables: Entity1(Key1, Attribute1) and Entity2(Key2, Attribute2) are replaced by

Entity1Entity2(Key1, Attribute1, Key2, Attribute2)

or possibly Entity1Entity2(Key1, Attribute1, Key2, Attribute2),

or Entity1Entity2(Key1, Attribute1, Key2, Attribute2).

To link an entity to itself, the same rules apply, but since the same entity participates in the link twice, the key fields must appear in the same table twice. Therefore, one of the keys has to be renamed.

Consider a 1:M relationship, method 2. Renamed foreign key.

Rice. 9 - Relationship 1:M, foreign key renamed

Entity1(Key1, Attribute1, AnotherKey1).

For relationships with an arity greater than 2, the same method is usually used as for a binary M:M relationship - a new table is created containing the key fields of all related tables: Entity1Entity2Entity3(Key1, Key2, Key3).

Rules for converting an ER model into a relational one

Rule 1. Each entity is assigned a relation of the relational data model. In this case, the names of the entity and the relationship may be different, because the names of entities may not be subject to additional syntactic restrictions, except for the uniqueness of the name within the model. Relationship names can be limited by the requirements of a particular DBMS, most often these names are identifiers in some base language, they are limited in length and must not contain spaces or some special characters. For example, an entity can be named "Book catalog", and it is desirable to name the relation corresponding to it, for example, BOOKS (without spaces and in Latin letters).

Rule 2. Each attribute of an entity becomes an attribute of the corresponding relation. For each attribute, a specific data type is specified that is allowed in the DBMS and the mandatory or non-mandatory nature of this attribute (that is, the admissibility or inadmissibility NULL values for him).

Rule 3. The primary key of an entity becomes the PRIMARY KEY of the corresponding relation. Attributes that are part of the primary key of a relationship automatically receive the mandatory (NOT NULL) property.

Rule 4. In each relation corresponding to a subordinate entity, a set of attributes of the main entity, which is the primary key of the main entity, is added.

In the relation corresponding to the subordinate entity, this set of attributes becomes the FOREING KEY.

To model an optional link type on physical level the attributes corresponding to the foreign key are set to accept null values ​​(NULL attribute). With a mandatory type of relationship, the attributes receive the property of the absence of null values ​​(the NOT NULL sign).

Normalization theory

Normalization of relations (tables) is one of the fundamental parts of the theory relational databases data. Normalization aims to get rid of redundancy in relations and modify their structure in such a way that the process of working with them is not burdened by various extraneous complexities. If this approach is ignored, the design efficiency is rapidly declining, which, together with other similar liberties, can lead to critical consequences.

First normal form

A relation is in first normal form (abbreviated as 1NF) if all its attributes are atomic, that is, if none of its attributes can be divided into simpler attributes that correspond to some other properties of the described entity.

We will call the original relation the main one, and the value of the non-atomic attribute - the subordinate one.

In order to normalize the original relation, whose attributes are non-atomic, it is necessary to combine the schemas of the main and subordinate relations. In addition, if, for example, the table corresponding to the non-normalized relation is already contained in the database and filled with information, the task is complicated by the fact that the value of a non-atomic attribute may, in turn, contain several tuples.

Consider a relationship that has the attributes "Employee ID", "Name", "Position", "Projects". Obviously, one employee can work on several projects. Suppose a project is described by an ID, a title, and a due date.

Employee code

Full name

Job title

Projects

Ivanov Ivan Ivanovich

Programmer

ID: 123; Name: Steam boiler control system; Delivery date: 30.09.2011
ID: 231; Name: PS for control and notification of exceeding the MPC of various gases in the room; Delivery date: 30.11.2011
ID: 321; Name: Face recognition module for protective system; Delivery date: 01.12.2011

It is easy to see that not all attributes of this relation are atomic (indivisible). In particular, the attribute "Projects" can be divided into three simpler attributes: "Project code", "Name", "Date of delivery", and the value of this attribute for the employee Ivan Ivanovich Ivanov contains several tuples - information about three projects.

Consider an algorithm for normalizing a relation to 1NF.

Create a new relation, the schema of which will be obtained by merging the main and subordinate schemas of the original relation into one.

For each tuple of the original relation, include in the new one as many rows as there are tuples in the child relation of this tuple.

Fill in the attribute values ​​of the new relation corresponding to the attributes of the subordinate relation. Fill in the rows of the new relation with the values ​​of the atomic attributes of the original one.

Let's apply this algorithm to the relation above. The scheme of the new relationship will consist of 6 attributes: "Employee code", "Name", "Position", "Project code", "Name", "Commission date". For one single tuple of the given relation, add three rows to the new one, one for each project (according to the number of tuples in the subordinate relation). Now we can populate the split attribute values ​​with tuples from the sub relation. Then we will transfer the values ​​of atomic attributes to each of these lines: “Employee code”, “Name”, “Position” (all three lines will contain same values these attributes).

The result will look like this:

Employee code

Full name

Job title

Project Code

Name

Delivery date

Ivanov Ivan Ivanovich

Programmer

123

Steam boiler control system

30.09.2011

Ivanov Ivan Ivanovich

Programmer

231

PS for control and notification of exceeding the MPC of various gases in the room

30.11.2011

Ivanov Ivan Ivanovich

Programmer

321

Facial recognition module for security system

01.12.2011

Second normal form

A relation in 1NF can also have redundancy. The second normal form is intended to eliminate it. But before proceeding to its description, we must first identify the shortcomings of the first.

Let the initial relationship contain information about the supply of some goods and their suppliers.

Supplier code

City

City status

Product code

Quantity

Moscow

300

Moscow

400

Moscow

100

Yaroslavl

200

Stavropol

300

Stavropol

400

Pskov

100

It is known in advance that the following functional dependencies are contained in this relation:

( (Supplier Code, Product Code) -> ( Quantity),

(Vendor code) -> (City),

(Vendor code) -> (Status),

(City) -> (Status) )

Primary key in relation: (Supplier ID, Product ID).

Obviously, the relationship has redundancy: it describes two entities - the supply and the supplier. As a result, the following anomalies occur:

Insertion anomaly. You cannot add information about a supplier that has not yet delivered any goods to a relation.

Deletion anomaly. If there was only one delivery from the supplier, then when deleting information about it, all information about the supplier will also be deleted.

update anomaly. If you need to change any information about the supplier (for example, the supplier moved to another city), then you will have to change the attribute values ​​​​in all records of deliveries from him.

physical meaning The redundancy of the original relation lies in the fact that it describes not one entity, but two - the supply and the supplier.

To eliminate these anomalies, it is necessary to split the original relation into projections:

    The first should include the primary key and all non-key attributes explicitly dependent on it;

    In other projections (in this case it alone) will include non-key attributes that implicitly depend on the primary key, along with the part of the primary key on which these attributes explicitly depend.

As a result, two relations will be obtained:

Supplier code

Product code

Quantity

300

400

100

200

300

400

100

The first relationship now has the following functional dependencies: (Supplier Code, Product Code) -> (Quantity)

Supplier code

City

City status

Moscow

Yaroslavl

Stavropol

Pskov

The second relation corresponds to:

( (Vendor Code) -> (City),

(Vendor code) -> (Status),

(City) -> (Status) )

This split eliminated the anomalies described above: you can add information about a supplier that has not yet delivered goods, delete delivery information without deleting information about the supplier, and also easily update information if the supplier has moved to another city.

Definition of Second Normal Form: A relation is in second normal form (abbreviated as 2NF) if and only if it is in first normal form and each of its non-key attributes is irreducibly dependent on the primary key.

Semantic Modeling

The wide distribution of relational DBMS and their use in a wide variety of applications shows that the relational data model is sufficient for modeling subject areas. However, designing a relational database in terms of relationships based on the normalization mechanism we briefly discussed is often a very complex and inconvenient process for the designer.

At the same time, the limitations of the relational data model are manifested in the following aspects:

The model does not provide sufficient means to represent the meaning of the data. The semantics of the real domain must be represented in a model-independent way in the mind of the designer. In particular, this applies to the problem of representing integrity constraints that we mentioned.

For many applications it is difficult to model the subject area based on flat tables. In some cases, at the very initial stage of design, the designer has to force himself to describe the subject area in the form of a single (perhaps even non-normalized) table.

Although the entire design process is based on dependency considerations, the relational model does not provide any means to represent these dependencies.

Despite the fact that the design process begins with the selection of some essential objects of the subject area ("entities") and the identification of relationships between these entities, the relational data model does not offer any apparatus for separating entities and relationships.

The needs of database designers for more convenient and powerful means domain modeling brought to life the direction of semantic data models. While any developed semantic data model, like a relational model, includes structural, manipulation and integral parts, the main purpose of semantic models is to provide the possibility of expressing data semantics.

Most often in practice, semantic modeling is used at the first stage of database design. In this case, in terms of the semantic model, a conceptual schema of the database is produced, which is then manually converted to a relational (or any other) schema. This process is carried out under the control of methodologies in which all stages of such a transformation are clearly specified.

Less often, automated compilation of a conceptual schema into a relational one is implemented. In this case, two approaches are known: based on the explicit representation of the conceptual scheme as initial information for the compiler and the construction of integrated design systems with automated creation conceptual schema based on interviews with subject matter experts. In both cases, the result is a relational database schema in third normal form (it would be more accurate to say that the author is not aware of systems that provide more high level normalization).

Finally, the third possibility, which has not yet gone (or is only going) beyond research and experimental projects, is working with a database in a semantic model, i.e. DBMS based on semantic data models. In this case, two options are again considered: providing user interface based on a semantic data model with automatic display constructs into a relational data model (a task about the same level of complexity as automatically compiling a database conceptual schema into a relational schema) and a direct DBMS implementation based on some kind of semantic data model. Closest to the second approach are modern object-oriented DBMSs, whose data models are close in many respects to semantic models (although they are more powerful in some aspects and weaker in others).

Extended ER Model (EER Model)

The concepts of the usual ER-model are sufficient to represent most database schemas, for example, for most commercial database systems. However, such areas as computer-aided design systems, automated production preparation systems, etc. impose more complex requirements on the database. For their semantic modeling, the ER-model was supplemented with new concepts and transformed into an EER-model (Enhanced ER - model - extended entity-relationship model). Such a model contains all the elements of the ER model plus additional concepts, namely the concepts of generalization, specialization, and aggregation.

Generalization is the union of a set of classes (types) of entities into a more general type of entity - a superclass. If, during the analysis of entities, it is found that two or more classes (types) of entities have common sets of attributes, then such classes can be combined into a superclass.

Specialization is the opposite of generalization. It consists in the fact that some general (aggregated, superclass) class or type is divided into several more private (specialized) subclasses.

Aggregation is the process of combining several independent entities (types) into an aggregated (complex) type.

Conceptual ER Models

In accordance with the main stages of database design, after building a conceptual model, a database management system is selected, with the help of which the database will be organized and work with it. Each DBMS supports certain types and types of data, as well as means of representing relationships between the data that make up the DBMS data model. The second stage of database design consists in representing the conceptual model built at the previous stage by means of the DBMS data model or in mapping the conceptual model into the DBMS data model. This stage is often referred to as the logical database design. The resulting model is often also referred to as a conceptual model or schema, but specified to the concepts of a DBMS data model. In some sources, the resulting model is called the logical data structure or the database data model.

It is possible to characterize the concept of a DBMS data model in different ways. On the one hand, the DBMS data model is a way of structuring data that is considered as some kind of abstraction in isolation from the subject area. On the other hand, the DBMS data model is a tool for representing the conceptual model of the subject area and the dynamics of its change in the form of a database.

Considering both of the above sides, let's define the main structures of the DBMS data models used to represent the conceptual model of the subject area (entities, attributes, relationships).

A data element (field) is the smallest named unit of data. Used to represent the value of an attribute.

Inextricably linked with a data element is the concept of a “data type”, which can take the corresponding field. In different DBMS can be used different types data, the most common of which, used in many DBMS, are the following: numeric (numeric), character (char), date (date), etc.

A record is a named collection of fields. Used to represent a collection of entity attributes (entity records).

A record instance is a record with specific field values.

The primary key is the minimum set of record fields that uniquely identifies a file record instance.

A file is a named collection of record instances of the same type. Used to represent a homogeneous set of entities.

A set of files is a named collection of files processed by the system. Used to represent multiple entity sets.

The concept of "group" is a generalization of the concept of "file" and "record".

A group is a named collection of data elements and other groups.

The most important concept of the conceptual model is the concept of a relationship between entities (entity sets). In DBMS data models, the corresponding concept is reflected by the concept of "group relation".

A group relation is a named binary relation defined on two sets of instances of the groups under consideration. By the nature of binary relations, group relations of the form 1:1, 1:M, M:1, M:N are distinguished. Pairs of numbers are called coefficients of a group relation. In a group relation, one member of the group is designated as the owner of the relation, the other as a member.

Two forms are used to represent a group relation:

a) Graph. Groups are depicted as graph vertices, connections between groups are represented by arcs directed from the owner group to the member group with indication of the relationship name and coefficient.

The types of graphs are:

    hierarchical model (graph without cycles - tree);

    network model (general directed graph).

b) Tabular. The relationship between groups is represented by a table whose columns represent the keys of the corresponding groups. For a formal description of a table, the mathematical (set-theoretic) concept of a relation is used. The corresponding data model is called the relational model.

Examples of ER Models

Example 1

ER-model: ZhEK

Description of the task: ZhEK employees need a database to store information about requests for repairs received from residents. ZhEK serves a group of houses, including several streets. The application comes from the apartment. The application is accepted by the dispatcher, he sets the number and date of receipt of the application, determines the type of application and the deadline for its implementation. The application is carried out by a team of specialists. Each specialist can work in only one brigade, each brigade has a foreman.

Rice. 10 - An example of the ER-model of the ZhEK

Example 2

ER-model of the office "Horns and hooves"

Task Description: The Horns and Hooves office is engaged in commercial activities for the sale of products made from horns and hooves and the provision of magical services.

An employee of the organization has a full name, personnel number, position. Employees are divided into several departments. Each department has a number, name and head. An employee cannot manage more than one department.

The organization works with client enterprises. Each business has a name and address. A company may enter into several contracts.

The contract is characterized by a unique number, date and type. Each contract is supervised by some employee. As the goods and services under the contract are sold to the client, invoices are issued with some frequency.

The invoice is characterized by a unique number, issue date, payment term and amount, as well as a list of goods and services sold, indicating their quantity. Penalties are charged on unpaid invoices. The invoice can be paid in several installments, each payment is characterized by a number, date and amount. The payment number is unique within his account. Prices for goods and services may change over time.

Rice. 11 - An example of the ER-model of the office "Horns and Hooves"

An example of developing a simple ER model

When developing ER models, we must obtain the following information about the subject area:

1. List of entities of the subject area.

2. List of entity attributes.

3. Description of relationships between entities.

ER diagrams are convenient in that the process of extracting entities, attributes, and relationships is iterative. Having developed the first approximate version of the diagrams, we refine them by interviewing subject matter experts. At the same time, the documentation in which the results of the conversations are recorded are the ER diagrams themselves.

Suppose we are faced with the task of developing an information system for a certain wholesale trading company. First of all, we must study the subject area and the processes taking place in it. To do this, we interview company employees, read documentation, study order forms, invoices, etc.

For example, during a conversation with a sales manager, it turned out that he (the manager) believes that the system being designed should perform the following actions:

    Store customer information.

    Print invoices for released goods.

    Monitor the availability of goods in stock.

Let's select all the nouns in these sentences - these will be potential candidates for entities and attributes, and analyze them (we will highlight incomprehensible terms with a question mark):

    The buyer is a clear candidate for an entity.

    An invoice is a clear candidate for an entity.

    Product - a clear candidate for the entity

    (?) Warehouse - in general, how many warehouses does the company have? If multiple, then it will be a candidate for a new entity.

    (?) The presence of a product is most likely an attribute, but an attribute of which entity?

Immediately there is an obvious connection between the entities - "customers can buy many products" and "products can be sold to many buyers." The first diagram looks like this:

Rice. 12 - The first version of the ER diagram

Having asked additional questions to the manager, we found out that the company has several warehouses. Moreover, each product can be stored in several warehouses and be sold from any warehouse.

Where to place the entities "Invoice" and "Warehouse" and what to associate them with? Let's ask ourselves how these entities are related to each other and to the entities "Buyer" and "Product"? Buyers buy goods, while receiving invoices, which contain data on the quantity and price of the purchased goods. Each buyer can receive multiple invoices. Each invoice must be issued for one buyer. Each waybill must contain several goods (there are no empty waybills). Each item, in turn, can be sold to multiple customers via multiple invoices. In addition, each invoice must be issued from a specific warehouse, and many invoices can be issued from any warehouse. Thus, after clarification, the diagram will look like this:

Rice. 12 - The second version of the ER diagram

Now we need to think about entity attributes. It turns out the following:

Each buyer is a legal entity and has a name, address, bank details.

Each product has a name, price, and is also characterized by units of measurement.

Each invoice has unique number, the date of issue, a list of items with quantities and prices, and the total amount of the invoice. The invoice is issued from a specific warehouse and for a specific customer.

Each warehouse has its own name.

Again, we write out all the nouns that will be potential attributes, and analyze them:

Entity- a rhetorical term, we do not work with individuals. We don't pay attention.

The name of the buyer is a clear characteristic of the buyer.

The address is a clear characteristic of the buyer.

Bank details are a clear characteristic of the buyer.

The name of the product is a clear description of the product.

(?) The price of the product - it seems that this is a characteristic of the product. Is this characteristic different from the price in the invoice?

A unit of measurement is an explicit characteristic of a product.

The invoice number is an explicit unique characteristic of the invoice.

The date of the invoice is an explicit characteristic of the invoice.

(?) List of goods on the invoice - the list cannot be an attribute. Probably, you need to separate this list into a separate entity.

(?) The quantity of goods on the invoice is an explicit characteristic, but a characteristic of what? This is not just a characteristic of "goods", but "goods on the waybill".

(?) The price of the goods on the invoice - again, this should be not just a characteristic of the goods, but a characteristic of the goods on the invoice. But the price of the goods has already been met above - is it the same thing?

The amount of the invoice is an explicit characteristic of the invoice. This characteristic is not independent. The amount of the invoice is equal to the sum of the costs of all goods included in the invoice.

The name of the warehouse is a clear description of the warehouse.

Each item has some current price. This is the price at which the product is sold in this moment. Naturally, this price may change over time. The price of the same product in different invoices issued at different times may be different. Thus, there are two prices - the price of the item on the invoice and the current price of the item.

With the emerging concept of "List of goods on the invoice" everything is pretty clear. The "Invoice" and "Product" entities are related to each other by a many-to-many relationship. Such a relationship must be split into two one-to-many relationships. This requires an additional entity. This entity will be the "List of goods on the invoice" entity. Its connection with the entities "Invoice" and "Goods" is characterized by the following phrases - "each invoice must have several entries from the list of goods on the invoice", "each entry from the list of goods on the invoice must be included in exactly one invoice", "each product can be included to several entries from the list of goods on the invoice", "each entry from the list of goods on the invoice must be associated with exactly one product". The attributes "Quantity of goods on the invoice" and "Price of goods on the invoice" are attributes of the "List of goods on the invoice" entity.

Let's do the same with the connection connecting the entities "Warehouse" and "Goods". Let's introduce an additional entity "Goods in stock". The attribute of this entity will be "Quantity of goods in stock". Thus, the goods will be listed in any warehouse and its quantity in each warehouse will be different.

Now you can put all this into a chart:

Rice. 13 - The third (final) version of the ER diagram

Bibliography:

    Information systems in economics: A textbook for students. higher studies, institutions / V.B. Utkin, K.V. Baldin. - M.: Publishing Center "Academy", 2004. - 288 p.

    Development and operation of automated information systems: textbook. allowance / Ed. prof. L.G. Gagarina. - M.: ID "FORUM": INFRA-M, 2007. - 384 p.: ill. - (Professional education).

    Date K. Introduction to database systems. - M .: Dialectics, 2000.

    Pershikov V. I., Savinkov V. M. Explanatory dictionary of informatics. - M.: Finance and statistics, 2001.

    Rayordan R. Fundamentals of relational databases. - M .: Russian edition, 2001.

    Saucap R. Design relational systems databases. - M.: Russian edition, 2006.

    Database and knowledge management systems / Ed. A. N. Naumova. - M.: Finance and statistics, 2005.

    Sportak M., Pappas F. Computer networks and network technologies. - Kyiv: TID "DS" LLC, 2002.

    Utkin V. B. Fundamentals of Automation professional activity. – M.: RDL, 2001.

    Utkin V. B., Sazanovich A. N., Mdicharadze V. G. Computer science. – M.: RDL, 2007.

    MegaPlus: Electronics, computers, communication.2010. – No. 5.

Like any model, the entity-relationship model has several basic concepts, which form the initial bricks from which more than complex objects according to predetermined rules.

This model is most consistent with the concept of object-oriented design, which in currently is undoubtedly the basis for the development of complex software systems, so many concepts may seem familiar to you, and if this is true, then it will be easier for you to master database design technologybased on the ER model.

The ER model is based on the following basic concepts:

  • Essence, with which is used to model a class of objects of the same type. The entity has a name that is unique within the modeled system. Since an entity corresponds to a certain class of objects of the same type, it is assumed that there are many instances of this entity in the system. The object to which the concept of entity corresponds has its own set attributes - characteristics that define the properties of a given class representative. In this case, the set of attributes must be such that it is possible to distinguish between specific instances of the entity. For example, the Employee entity can have the following set of attributes: Personnel number, Last name, First name, Middle name, Date of birth, Number of children, Relatives abroad. A set of attributes that uniquely identifies a particular instance of an entity is called key. For the Employee entity, the Personnel number attribute will be the key one, because for all employees this enterprise payroll numbers will be different. An instance of the Employee entity will be a description of a particular employee of the enterprise. One of the generally accepted graphic symbols entities - a rectangle, in the upper part of which the name of the entity is written, and the attributes are listed below, and the key attributes are marked, for example, with an underline or a special font (Fig. 7.1):

Rice. 7.1.An example of defining an entity in an ER model

Between entities can be established connections- binary associations showing how entities relate or interact with each other. A relationship can exist between two different entities, or between an entity and itself. (recursive link). It shows how entity instances are related to each other. If a relationship is established between two entities, then it defines the relationship between instances of one and the other entity. For example, if we have a relationship between the entity "Student" and the entity "Teacher" and this relationship is the management of diploma projects, then each student has only one supervisor, but the same teacher can supervise many.graduate students. Therefore, it will be a one-to-many (1:M) relationship, one on the Teacher side and many on the Student side (see Figure 7.2).


Rice. 7.2.An example of a one-to-many relationship when linking the entities "Student" and "Teacher"

In different notations, the power of communication is depicted in different ways. In our example, we use the CASE notation of the POWER DESIGNER system, here the multiplicity is depicted by dividing the link by 3. The link has the common name "Graduation Design" and has role names from both entities. On the part of the student, this role is called "Writes a diploma under the guidance", on the part of the teacher, this relationship is called "Guides". Graphical interpretation of the relationship allows you to immediately read the meaning of the relationship between entities, it is clear and easy to interpret. Connections are divided into three types according to multiplicity: one to one(1:1), one and and-to-many(1M), many-to-many(MM). A one-to-one relationship means that an instance of one entity is associated with only one instance of another entity.

Link 1: M means that one instance of the entity located on the left of the link can be associated with several instances of the entity located on the right of the link. A one-to-one relationship (1:1) means that one instance of one entity is associated with only one instance of another entity, and a many-to-many relationship (M:M) means that one instance of the first entity can be associated with multiple instances of the second entity, and vice versa, one instance of the second entity can be associated with multiple instances of the first entity. For example, if we consider a “Studies” relationship between the entities “Student” and “Subject”, then this is a many-to-many (M:M) relationship, because each student can study several disciplines, but each discipline is studied by many students. This connection is shown in Fig. 7.3.

  • Any number of connections with different semantic loads can be set between two entities. For example, between the two entities "Student" and "Teacher" you can establish two semantic relationships, one - the previously considered "Graduation Design", and the second can be conditionally called "Lectures", and it determines which lectures of which teachers this student listens to and which This teacher gives lectures to students. It is clear that this is a connection of the type many-to-many. An example of these connections is shown in fig. 7.3.

Rice. 7.3.Many-to-Many Relationship Modeling Example

  • Communication of any of these types can be mandatory if each instance of an entity must participate in a given relationship, optional - unless every instance of the entity is required to participate in the relationship. In this case, the connection can be mandatory on the one hand And optional on the other hand. Compulsory communication is also indicated differently in different notations. We use the POWER DESIGNER notation again. Here, the optionality of the connection is indicated by an empty circle at the end of the connection, and the obligation by a perpendicular line crossing the connection. And this notation has a simple interpretation. A circle means that no instance can participate in this relationship. And the perpendicular is interpreted as the fact that at least one instance of the entity participates in this connection.

For this, let's consider the previously given example of the connection "Degree design". In our figure, this relationship is interpreted as optional on both sides. But after all, in fact, every student who writes a diploma should have his own supervisor of graduation design, but, on the other hand, not every teacher should conduct graduation design. Therefore, in this semantic formulation, the image of this connection will change and will look as shown in Fig. 7.4.

Rice. 7.4.An example of a required and optional relationship between entities

In addition, the principle of entity categorization is allowed in the ER model. This means that, as in object-oriented programming languages, the concept of an entity subtype is introduced, that is, an entity can be represented as two or more of its subtypes - entities, each of which may have common attributes and relationships and/or attributes and relationships that are defined once at the top level and inherited at the bottom level. All subtypes of one entity are considered mutually exclusive, and when separating an entity into subtypes, it must be represented as complete set mutually exclusive subtypes. If at the level of analysis it is not possible to identify the complete List of subtypes, then a special subtype is introduced, conditionally called OTHER, which can be further refined. IN real systems sometimes it is enough to introduce subtyping at two or three levels.

The entity from which subtypes are built is called supertype. Any instance of a supertype must be of a particular subtype. For graphic image principle of categorization or typing of an entity, a special graphic element, called discriminator node, in the POWER DESIGNER notation, it is depicted as a semicircle, with its convex side facing the super-entity. This side is connected by a directed arrow to the super-entity, and subtypes of this entity are connected to the diameter of this circle by arrows (see Fig. 7.5).

Rice. 7.5.Entity subtype diagram TEST

This diagram can be deciphered as follows. Each test in some testing system is either a test of knowledge of the SQL language, or some analytical task that is performed using pre-written Java applets, or a test in some area of ​​knowledge, consisting of a set of questions and a set of answers offered for each question.

As a result of constructing a domain model in the form of a set of entities and relationships, we obtain a connected graph. In the resulting graph, it is necessary to avoid cyclic connections - they reveal the incorrectness of the model.

As an example, let's design a mythological model of a system designed to store information about books and areas of knowledge presented in a library. The description of the subject area was given earlier. Let's start developing the model by highlighting the main entities.

First of all, there is the essence of the "Book", each book has a unique cipher, which is its key, and a number of attributes that are taken from the description of the subject area. The set of entity instances defines the set of books that are stored in the library. Each instance of the “Books” entity does not correspond to a specific book on the shelf, but to a description of some book, which is usually given in the subject catalog of the library. Each book can be present in several copies, and these are just those specific books that are on the shelves of the library.

To reflect this, we must introduce an Instances entity, which will contain descriptions of all book instances that are stored in the library. Each instance of the Instances entity corresponds to a particular book on the shelf. Each copy has a unique accession number that uniquely identifies a particular book. In addition, each copy of the book can be either in the library or in the hands of some reader, and in the latter case, for this copy, the date of taking the book by the reader and the date of the expected return of the book are additionally indicated.

Between the entities "Books" and "Instances" there is a relationship "one-to-many" (1:M), mandatory on both sides. What determines given type connections? We can assume that each book can be present in the library in multiple instances, so the relationship is one-to-many. Moreover, if the library does not have a single copy of the given book, then we will not store its description, therefore, if the book is described in the “Books” entity, then at least one copy of this book is present in the library. This means that from the side of the book, the link is mandatory. As for the “Instances” entity, there cannot be a single copy in the library that does not belong to a specific book, therefore, from the “Instances” side, the connection is also mandatory.

Now we need to determine how the reader will be represented in our system. It is natural to suggest introducing the “Readers” entity for this, each instance of which will correspond to a specific reader. In the library, each reader is assigned a unique library card number, which will uniquely identify our reader. The library card number will be a key attribute of the Readers entity.

In addition, in the "Readers" entity, there must be additional attributes that are required to solve the tasks, these attributes will be: "Last Name First Name", "Reader's Address", "Home Phone" and "Work Phone". Why did we introduce two separate attributes for phones? Because it is necessary to call these numbers at different times in order to catch the reader, so it will be important for the library administration to know what type it belongs to this phone. In the description of our subject area, there is a restriction on the age of our readers, therefore, in the “Readers” entity, we must enter the mandatory attribute “Date of birth”, which will allow us to control the age of our readers.

From the description of the subject area, we know that each reader can hold several copies of books in their hands. To reflect this situation, we need to draw a connection between the entities "Readers" and "Instances". And why not between the entities "Readers" and "Books"? Because the reader takes from the library a specific copy of a specific book, and not just a book. But how do you know which book a given reader has? And this can be found out by an additional relationship between the entities "Instances" and "Books", and this relationship associates one book with each instance, so we can unambiguously determine at any time which books are in the hands of the reader, although we associate with the reader only the inventory numbers of the books taken. A one-to-many relationship has been established between the entities "Readers" and "Instances", and at the same time it is not mandatory on both sides. The reader at the moment may not hold a single book in his hands, and on the other hand, this copy of the book may not be in the possession of any reader, but simply stand on a shelf in the library.

Now we need to reflect the last entity that is associated with the system directory. The system catalog contains a list of all areas of knowledge, information on which is contained in library books. We can remember the system directory in the library, from which we usually start the search for the books we need if we do not know their authors and titles. The name of the knowledge area can be long and consist of several words, therefore, to model the system catalog, we will introduce the entity "System catalog" with two attributes: "Knowledge area code" and "Knowledge area name". The "Knowledge Area Code" attribute will be the key attribute of the entity.

From the description of the subject area, we know that each book can contain information from several areas of knowledge, and on the other hand, it is known from practice that the library can contain many books related to the same area of ​​knowledge, so we need to establish between the entities "System catalog" and "Books" relationship "myo-to-many", binding on both sides. Indeed, in the system catalog there should not be such a field of knowledge, information on which is not presented in any book of our library, otherwise it would be meaningless. And vice versa, each book should be assigned to one or more areas of knowledge in order for the reader to find it faster.

The mythological model of the subject area "Library" is shown in fig. 7.6.

Rice. 7.6.Mythological model "Library"

The mythological model "Library" was developed by us for the tasks that were listed earlier. In these tasks, we did not set the condition for storing the history of reading a book, for example, in order to find someone who previously held a book and could harm it or forget it by accident a large amount money. If we set ourselves the task of storing this information as well, then our info-logical model would be different. I will leave this task to your independent creativity.







2023 gtavrl.ru.