1. Trang chủ
  2. » Giáo án - Bài giảng

Database design using entity relationship diagrams

388 0 0
Tài liệu đã được kiểm tra trùng lặp

Đang tải... (xem toàn văn)

Tài liệu hạn chế xem trước, để xem đầy đủ mời bạn chọn Tải xuống

THÔNG TIN TÀI LIỆU

Thông tin cơ bản

Tiêu đề Database Design Using Entity-Relationship Diagrams
Tác giả Sikha Saha Bagui, Richard Walsh Earp
Trường học CRC Press
Chuyên ngành Database Design
Thể loại book
Năm xuất bản 2023
Thành phố Boca Raton
Định dạng
Số trang 388
Dung lượng 5,18 MB

Cấu trúc

  • Chapter 1 Data, Databases, and the Software (0)
    • 1.1 Introduction (32)
    • 1.2 Data (32)
    • 1.3 Building a Database (33)
    • 1.4 What Is the Sof ware Engineering Process? (0)
    • 1.5 Entity-Relationship Diagrams and the (38)
      • 1.5.1 Phase 1: Get the Requirements for (39)
      • 1.5.2 Phase 2: Specify the Database (39)
      • 1.5.3 Phase 3: Design the Database (40)
    • 1.6 Chapter Summary (40)
  • Chapter 1 Exercises (41)
  • Chapter 2 Data and Data Models (0)
    • 2.1 Introduction (42)
    • 2.2 Files, Records, and Data Items (42)
    • 2.3 Moving From 3 × 5 Cards to Computers (45)
    • 2.4 Database Models (50)
      • 2.4.1 T e Hierarchical Model (0)
        • 2.4.1.1 T e Hierarchical Model (0)
        • 2.4.1.2 Relationship Terminology (57)
        • 2.4.1.3 Drawbacks of the (58)
    • 2.5 T e Network Model (0)
    • 2.6 T e Relational Model (0)
    • 2.7 Chapter Summary (63)
  • Chapter 3 The Relational Model and Functional (0)
    • 3.1 Introduction (64)
    • 3.2 Fundamentals of Relational Database (64)
    • 3.3 Relational Database and Sets (67)
    • 3.4 Functional Dependency (69)
    • 3.5 Non-1NF to 1NF (71)
    • 3.6 T e Second Normal Form (0)
      • 3.6.1 Anomalies (79)
      • 3.6.2 Non-2NF to 2NF (80)
    • 3.7 T e T ird Normal Form (0)
    • 3.8 T e Equijoin Operation (0)
    • 3.9 Some Functional Dependency Rules (90)
    • 3.10 T e Boyce–Codd Normal Form (0)
    • 3.11 Chapter Summary (99)
  • Chapter 3 Exercises (99)
  • Chapter 4 The Basic ER Diagram : A Data Modeling Schema (0)
    • 4.1 Introduction (102)
    • 4.2 What Is a Data Modeling Schema? (102)
      • 4.2.1 What Is an Entity-Relationship Diagram? (103)
    • 4.3 Def ning a Database—Some Def nitions: Entity, Relationship, and Attribute (0)
      • 4.3.1 A Beginning Methodology (105)
      • 4.3.2 ER Design Methodology (106)
    • 4.4 A First “Entity-Only” ER Diagram (107)
    • 4.5 More about Attributes (110)
      • 4.5.1 T e Simple or Atomic Attribute (0)
      • 4.5.2 T e Composite Attribute (0)
      • 4.5.3 T e Multivalued Attribute (0)
      • 4.5.4 T e Derived Attribute (0)
      • 4.5.5 Keys (113)
    • 4.6 English Description of the Entity (118)
      • 4.6.1 T e Method (0)
        • 4.6.1.1 T e Entity (0)
        • 4.6.1.2 T e Attributes (0)
        • 4.6.1.3 T e Keys (0)
      • 4.6.2 ER Design Methodology (119)
      • 4.6.3 Examples (119)
        • 4.6.3.1 Figure 4.3 Example (120)
        • 4.6.3.2 Figure 4.4 Example (121)
        • 4.6.3.3 Figure 4.5a Example (122)
        • 4.6.3.4 Figure 4.6 Example (122)
        • 4.6.3.5 Figure 4.7 Example (123)
    • 4.7 Mapping the Entity Diagram to a (125)
    • 4.8 Chapter Summary (130)
  • Chapter 4 Exercises (130)
  • Chapter 5 Beyond the First Entity Diagram (0)
    • 5.1 Introduction (138)
    • 5.2 Examining an Entity: Changing an (139)
    • 5.3 Def ning a Relationship for Our New Entity (0)
      • 5.3.1 ER Design Methodology (142)
    • 5.4 A Preliminary Grammar for ER Diagrams (143)
      • 5.4.1 T e Relationship (0)
    • 5.5 Def ning a Second Entity (0)
    • 5.6 Does a Relationship Exist? (148)
    • 5.7 Attribute or Relationship? (148)
      • 5.7.1 ER Design Methodology (149)
    • 5.8 Chapter Summary (150)
  • Chapter 5 Exercises (150)
    • 6.1 Introduction (160)
    • 6.2 T e Cardinality Ratio of a Relationship (0)
      • 6.2.1 One to One (1:1) (162)
      • 6.2.2 Many to One (M:1) (162)
      • 6.2.3 One to Many (1:M) (163)
      • 6.2.4 Many to Many (M:N) (163)
    • 6.3 Participation: Full/Partial (165)
    • 6.4 English Descriptions (166)
    • 6.5 Tighter English (168)
      • 6.5.1 Pattern 1—x:y::k:1 (170)
      • 6.5.2 Pattern 2—x:y::k:1 (170)
      • 6.5.3 Pattern 3—x:y::k:M (171)
      • 6.5.4 Pattern 4—x:y::k:M (171)
      • 6.5.5 Summary of the Patterns and (172)
        • 6.5.5.1 Pattern 1 (172)
        • 6.5.5.2 Pattern 1 (172)
        • 6.5.5.3 Pattern 2 (173)
        • 6.5.5.4 Pattern 2 (174)
        • 6.5.5.5 Pattern 3 (174)
        • 6.5.5.6 Pattern 3 (174)
        • 6.5.5.7 Pattern 4 (175)
        • 6.5.5.8 Pattern 4 (176)
      • 6.5.6 ER Design Methodology (177)
    • 6.6 Some Examples of Other Relationships (178)
      • 6.6.1 An Example of the One-to-Many (178)
        • 6.6.1.1 Pattern 4–1:M, From the (179)
        • 6.6.1.2 Pattern 2—M(Partial):1, (180)
      • 6.6.2 An Example of the Many-to-One (181)
        • 6.6.2.1 Pattern 1—M:1, From the (181)
      • 6.6.3 An Example of the Many-to-Many (182)
        • 6.6.3.1 Pattern 3—M:N, From the (183)
        • 6.6.3.2 Pattern 4—N:M, From the (183)
    • 6.7 One Final Example (184)
      • 6.7.1 ER Design Methodology (185)
        • 6.7.1.1 T e Entity (0)
        • 6.7.1.2 T e Entity (0)
      • 6.7.2 Pattern 1—M:1, From the M Side, (189)
      • 6.7.3 Pattern 3–1:M, From the 1 Side, (190)
    • 6.8 Mapping Relationships to a (190)
      • 6.8.1 Mapping Binary M:N Relationships (190)
      • 6.8.2 Mapping Binary 1:1 Relationships (192)
      • 6.8.3 Mapping Binary 1:N Relationships (198)
    • 6.9 Chapter Summary (199)
  • Chapter 7 The Weak Entity (0)
    • 7.1 Introduction (210)
    • 7.2 Strong and Weak Entities (210)
    • 7.3 Weak Entities and Structural Constraints (215)
    • 7.4 Weak Entities and the Identifying Owner (215)
      • 7.4.1 Another Example of a Weak (217)
    • 7.5 Weak Entities Connected to Other (217)
    • 7.6 Revisiting the Methodology (220)
    • 7.7 Weak Entity Grammar (221)
      • 7.7.1 T e Keys (0)
    • 7.8 Mapping Weak Entities to a (222)
    • 7.9 Chapter Summary (224)
  • Chapter 7 Exercises (225)
  • Chapter 8 Further Extensions for ER Diagrams (0)
    • 8.1 Introduction (232)
    • 8.2 Attributes of Relationships (232)
      • 8.2.1 T e Attributes (0)
    • 8.3 Relationships Developing into Entities (235)
      • 8.3.1 T e Entity (0)
        • 8.3.1.1 T e Attributes (0)
        • 8.3.1.2 T e Keys (0)
    • 8.4 More Entities and Relationships (237)
      • 8.4.1 More T an Two Entities (0)
        • 8.4.1.1 Pattern 4—x:y::1:M, (238)
        • 8.4.1.2 Pattern 1—x:y::M:1, (238)
      • 8.4.2 Adding More Attributes (240)
    • 8.5 More Evolution of the Database (244)
    • 8.6 Attributes T at Evolve into Entities (0)
    • 8.7 Recursive Relationships (247)
      • 8.7.1 Recursive Relationships and (250)
        • 8.7.1.1 One-to-One Recursive (250)
        • 8.7.1.2 One-to-Many Recursive (251)
    • 8.8 Multiple Relationships (253)
    • 8.9 T e Derived or Redundant Relationship (0)
    • 8.10 Optional: An Alternative ER Notation (259)
    • 8.11 Review of the Methodology (261)
      • 8.11.1 ER Design Methodology (261)
      • 8.11.2 T e Entity (0)
        • 8.11.2.1 T e Attributes (0)
        • 8.11.2.2 T e Keys (0)
    • 8.12 Mapping Rules for Recursive Relationships (263)
    • 8.13 Chapter Summary (265)
  • Chapter 8 Exercises (265)
  • Chapter 9 Ternary and Higher-Order ER Diagrams (0)
    • 9.1 Introduction (270)
    • 9.2 Binary or Ternary Relationship? (271)
    • 9.3 Structural Constraints for Ternary (274)
      • 9.3.1 Many to Many to Many (M1:M2:M3) (274)
    • 9.4 An Example of an n -ary Relationship (276)
    • 9.5 n -ary Relationships Do Not Preclude (277)
    • 9.6 Methodology and Grammar for the (278)
      • 9.6.1 A More Exact Grammar (280)
        • 9.6.1.1 Pattern 3—M:N, From (280)
        • 9.6.1.2 Pattern 3—k:M, from the (280)
        • 9.6.1.3 Pattern 5 ( n -ary)—x:y:z::a:b:c, (281)
    • 9.7 Ternary Relationships From Relationship-Relationship Situations (283)
    • 9.8 n-a ry Relationships T at May Be (0)
    • 9.9 Mapping n -ary Relationships to a (290)
    • 9.10 Review of the Methodology (291)
      • 9.10.1 ER Design Methodology (291)
    • 9.11 Chapter Summary (292)
  • Chapter 9 Exercises (293)
  • Chapter 10 The Enhanced Entity-Relationship (EER) Model (0)
    • 10.1 Introduction (296)
    • 10.2 What Is a Generalization or Specialization? (296)
    • 10.3 Variants (298)
    • 10.4 Examples of Generalizations or (299)
    • 10.5 Methodology and Grammar for Generalization/ (305)
    • 10.6 Mapping Rules for Generalizations (307)
      • 10.6.1 Mapping Rule 15 (308)
      • 10.6.2 Mapping Rule 16 (311)
      • 10.6.3 Mapping Rule 17 (312)
      • 10.6.4 Mapping Rule 18 (313)
    • 10.7 Subclasses of Subclasses (314)
      • 10.7.1 Mapping Rule 19 (316)
    • 10.8 Categories or Union Types (317)
      • 10.8.1 Participation Ratios in (319)
      • 10.8.2 Mapping Categories or Union (322)
    • 10.9 Final ER Design Methodology (323)
      • 10.9.1 ER Design Methodology (324)
    • 10.10 Chapter Summary (325)
  • Chapter 11 Relational Mapping and Reverse (0)
    • 11.1 Introduction (330)
    • 11.2 Steps Used to Map ER/EER Diagrams (330)
    • 11.3 Reverse Engineering (337)
      • 11.3.1 Reverse Engineering Rule 1 (338)
      • 11.3.2 Reverse Engineering Rule 2. Look (339)
      • 11.3.3 Reverse Engineering Rule 2a. Check (342)
      • 11.3.4 Reverse Engineering Rule 3. Look (342)
      • 11.3.5 Reverse Engineering Rule 3a (345)
      • 11.3.6 Reverse Engineering Rule 3b (345)
      • 11.3.7 Reverse Engineering Rule 4. Check (347)
      • 11.3.8 Reverse Engineering Rule 4a (347)
      • 11.3.9 Reverse Engineering Rule 4b (347)
      • 11.3.10 Reverse Engineering Rule 5. Check (349)
      • 11.3.12 Reverse Engineering Rule 5b. Check (351)
      • 11.3.13 Reverse Engineering Rule 5c. Check (352)
      • 11.3.14 Reverse Engineering Rule 6. Check (352)
      • 11.3.15 Reverse Engineering Rule 7. Check (352)
    • 11.4 Chapter Summary (353)
  • Chapter 12 A Brief Overview of the Barker/Oracle-Like (0)
    • 12.1 Introduction (356)
    • 12.2 A First “Entity-Only” ER Diagram (357)
    • 12.3 Attributes in the Barker/Oracle-Like (358)
      • 12.3.1 Optional versus Mandatory (358)
    • 12.4 Relationships in the Barker/Oracle-Like (360)
    • 12.5 Structural Constraints in the Barker/ Oracle-Like Model (360)
    • 12.6 Dealing with the Concept of the Weak (363)
    • 12.7 Dealing with the Concept of Multivalued (364)
    • 12.8 Treatment of Foreign Keys (367)
    • 12.10 Mapping M:N Relationships (367)
    • 12.11 Chapter Summary (371)
  • Chapter 12 Exercises (371)

Nội dung

Tis text concentrates on steps 1 through 3 of the sofware life cycle for databases. A database is a collection of related data. Te concept of related data means a database stores information about one enterprise: a business, an organization, a grouping of related people or processes. For example, a database might contain data about Acme Plumbing and involve customers and service calls. A diferent database might be about the members and activities of a church group in town. It would be inappropriate to have data about the church group and Acme Plumbing in the same database because the two organizations are not related. Again, a database is a collection of related data. To keep a database about each of the above entities is fne, but not in the same database.

Data, Databases, and the Software

Introduction

In this chapter, we present essential concepts related to database design, defining key terms such as data and database We also outline a systematic approach to effectively designing a database, ensuring a comprehensive understanding of its foundational principles.

Data

Data refers to factual information regarding individuals or entities, such as a person's name, address, and gender For instance, specific data points about an individual might include "Mary Jo Davis" and "123 4th St."

A database is defined as a collection of related data, which requires a common characteristic among the entries For example, a list of names, addresses, and genders becomes a database when the individuals share a common trait, such as being golfers, dog owners, or subjects of a street interview Without this defining relationship, a mere set of facts about individuals does not qualify as a database.

To put it another way: Why are these people’s names and addresses being kept in one list?

2 • Database Design Using ER Diagrams

1 A tree is classifed as a “large oak tree about 100 years old.” What are three facts about this tree?

2 Another tree has the following characteristics: pine, small, 15 years old If I write the information about the oak and pine trees on a piece of paper, could you consider this a database? Why or why not?

3 Why is the piece of paper not a database of trees?

Building a Database

Constructing a database, such as one for pantry items, begins with listing the visible items on paper While this approach creates a basic collection of related data, the effectiveness of the database relies on the methodology used in its construction Key considerations include the purpose of the database, the intended users, and the ease of finding information within it A more systematic approach involves planning the database structure before compiling the list, ensuring it meets the needs of its users and facilitates quick access to relevant facts.

A bit of pre-thinking will save time in the long run because you plan how the list is to be used and by whom

When dealing with sofware and computer-related activity like data- bases, there exists a science of “how to” called sofware engineering (SE)

SE is a process of specifying systems and writing sofware To design a good database, we will use some ideas from SE

This chapter provides an overview of Software Engineering (SE) in the context of database planning, emphasizing the importance of a sound relational database design It highlights the prevalence of relational database models among the various types historically used, and aims to present a methodology rooted in SE principles for effectively designing relational databases.

Data, Databases, and Sofware Engineering • 3

You have a set of books on bookshelves in your house Someone asks you to create a list of all the books you have

1 Who is going to use this list?

2 When the list is completed, will it be a database?

3 What questions should be asked before you begin?

4 What is the question-and-answer procedure in question 3 going to accomplish?

Software engineering encompasses the comprehensive process of specifying, designing, coding, delivering, maintaining, and ultimately retiring software Often described as the "software life cycle," this field highlights that software has a defined beginning and end Numerous valuable resources are available on this subject.

SE Some are referenced at the end of this chapter

Some authors use the term sof ware engineering synonymously with

Systems analysis and design are essential for the effective development of any information system, as they encompass a variety of tasks related to information systems A key focus in this process is the specification and design of databases, which involves documenting the intended contents of the database and outlining the overall design strategy.

In software engineering, the fundamental principle is to develop software correctly through a structured life cycle comprising a series of steps or phases These steps emphasize the importance of thoughtful planning, ensuring that the needs are clearly identified before any coding begins Effective communication and understanding among all stakeholders in the software development process are crucial for this "thinking before action" approach A commonly referenced model for this methodology is the "waterfall" model, which illustrates a linear progression in the development process, where decisions made at each stage are difficult to revisit, much like water flowing over a waterfall.

4 • Database Design Using ER Diagrams

The initial stage of the software engineering process focuses on clearly defining the project's objectives This stage can be divided into two key components: first, the clarification of requirements, and second, the consensus on a formal specification document.

The waterfall model in software development suggests that once the specifications are finalized and approved by the user, they remain unchanged and serve as the foundation for the design phase This process can be compared to constructing a house, where the initial discussion clarifies the client's desires, and the specification document acts as a formal record of those requirements.

When you approach a builder to create a three-bedroom, two-bath house, the builder engages in a process called "elucidation," asking questions about your preferences, such as the number of stories and materials This dialogue helps the builder gather and organize your wishes into a final specification that reflects your needs Just like building a house, developing software and databases requires clear communication to avoid wasting time on designs that do not meet your expectations Formalizing this process ensures that both you and the builder are aligned, allowing the project to move forward efficiently.

Once the specifications for a house are finalized, the next step is to design the house according to those specifications During the design phase, only minor alterations to the specifications are acceptable, as a clear agreement must be reached before proceeding with the blueprint This process mirrors software and database development, which also follows a life-cycle approach—where software is created, utilized, maintained, and ultimately retired.

In the software development life cycle, participants are typically categorized into two main groups: users and analysts Analysts design software based on user specifications, aiming to capture what users believe they need For instance, in our example of compiling a list of books for a home library, the user is the individual requesting the list, while the analyst is responsible for creating that list, acting as the software writer, builder, or designer.

Data, Databases, and Sofware Engineering • 5

There is no consensus among software engineers on the precise number of steps or phases in a software development model, as these models differ based on the specific interests of the software engineering researcher in various aspects of the process A concise overview of the software development process will be provided, with the term "software" encompassing databases as well.

Step 1 (or Phase 1): Requirements Find out what the user wants/needs

T e “fnding-out procedure” is ofen called “elucidation.”

Step 2: Specif cation Write out the user’s wants/needs as precisely as possible In this step, the user and analyst document not only what is desired but also how much it will cost and how long it will take to go into use A basic principle of SE is to generate sofware on time and on budget Terefore, in addition to making each other understand what is wanted/needed, a very essential step is to defne a budget and timeline for creating the product

In Step 2a, the specification document undergoes a formal review to ensure that the user agrees with the analyst's interpretation of their needs, and that the analyst confirms the user’s requirements are clearly defined.

In Step 2b, it is essential to revise the specifications as needed and revisit Step 2a until both the analyst and the user reach a mutual understanding and agreement to proceed It's important to keep in mind the principles of the waterfall model, which emphasizes that once the specification phase concludes, revisiting earlier stages is not an option.

Step 3: Design—Sofware or a database is designed to meet the speci- fcation from step 2 As in-house building, now the analyst (the builder) knows what is required, so the plan for the sofware is for- malized—a blueprint is drawn up

Entity-Relationship Diagrams and the

THE SOFTWARE ENGINEERING LIFE CYCLE

This article focuses on the first three steps of the software life cycle for databases, emphasizing that a database is a structured collection of related data specific to a single entity, such as a business or organization For instance, a database may store information about Acme Plumbing, including customer details and service calls, while another database could track the members and activities of a local church group It is crucial to maintain separate databases for unrelated entities to ensure data integrity and relevance, highlighting the importance of organizing data appropriately within distinct databases.

Database systems are typically designed using an entity-relationship (ER) diagram, which serves as a blueprint for creating the actual database This finalized blueprint is produced during the design phase and acts as a crucial tool for analysts to visualize the data that will be stored in the system The initial phase, known as the requirements phase, can be challenging, as analysts must extract the needs and desires of the users, who may have varying levels of clarity regarding their requirements.

8 • Database Design Using ER Diagrams

Many users may be "computer savvy," yet they might not fully understand the capabilities of a software system This often presents a challenge for analysts who struggle to accurately interpret user needs and desires To effectively bridge this gap, it's essential for the analyst to develop a specification that is clear and comprehensible for both the user and the analyst, enabling efficient design and development processes.

In any project, users or user groups must effectively communicate their needs and ideas to analysts or analyst teams It is essential for users to articulate their desires and perceived requirements, while analysts are responsible for gathering, documenting, and transforming these insights into a structured plan that meets the users' expectations.

User descriptions can often appear vague and unstructured, as users typically possess a strong understanding of their business and its model In contrast, computer professionals may lack this business knowledge but are well-versed in technical aspects This disconnect means that a user's explanation of their business can be as unfamiliar to the analyst as computer terminology is to the user To address this challenge, we propose a methodology that clarifies the analyst's language, ensuring users feel comfortable with the database design while equipping analysts with effective tools for direct mapping into the database.

In brief, next we review the early steps in the SE life cycle as it applies to database design

1.5.1 Phase 1: Get the Requirements for the Database

In the first phase, we engage with users to understand the specific data they wish to organize within a database retrieval system This step involves encouraging users to articulate their intended use of the data, which is crucial for the analyst Ultimately, the analyst will develop a process for both loading and retrieving data from the database It's important to note that there may be a learning curve for the analyst as they familiarize themselves with the user's system and requirements.

Phase 2 involves grammatical descriptions and diagrams of what the analyst thinks the user wants Database design is usually accomplished with an ER diagram functioning as the blueprint for the to-be-designed database Since most users are unfamiliar with the notion of an ER dia- gram, our methodology will supplement the ER diagram with grammati- cal descriptions of what the database is supposed to contain and how the

Understanding the interrelation of database components is crucial for effective communication between analysts and users While technical descriptions can often be tedious, translating these concepts into clear, understandable statements fosters better collaboration For instance, when an analyst states, “all employees must generate invoices,” it allows users to confirm, deny, or adjust the assertion to reflect reality This distinction is vital, as the implications of “all employees must generate invoices” differ significantly from “some employees may generate invoices,” highlighting the importance of precise language in database management.

After finalizing the database diagram, the ER diagram serves as the definitive blueprint for constructing the database in phase 3 Transitioning from the ER diagram to the actual database is similar to a builder using a house blueprint to initiate construction.

This book focuses on database design, acknowledging that while there may be additional steps in the software engineering process, these steps are not the primary emphasis of our discussion.

1 Briefy describe the major steps of the SE life cycle as it applies to databases

2 Who are the two main players in the sofware development life cycle?

3 Why is written communication between the parties in the design process important?

4 What is the blueprint from which the actual database is created called?

5 What are the three important phases of designing a database?

Chapter Summary

This chapter provides foundational insights into data, databases, and the software engineering (SE) process, specifically focusing on how the SE process relates to Entity-Relationship (ER) diagrams, which serve as the blueprint for database design.

10 • Database Design Using ER Diagrams

Exercises

David runs a busy golf shop specializing in retail golf equipment and serving numerous customers To efficiently manage his operations, he seeks to transition to a computer-based system for tracking sales and inventory He consults Kaitlin, a computer expert, for guidance on how to implement this technology effectively.

1 In our context, David is a ; Kaitlin is a

2 When David explains to Kaitlin what he wants, Kaitlin begins writ- ing what?

3 When David says, “Kaitlin, this specifcation is all wrong,” what hap- pens next?

4 If David says, “Kaitlin, this specifcation is acceptable,” what happens next?

5 If, during the design, Kaitlin realizes David forgot to tell her about something he wants, what is Kaitlin to do?

6 How does Kaitlin get David’s specifcations in the f rst place?

7 Step 3a from above suggests: “Sofware design is independently checked against the specifcation.” What does this mean? How is it do be done?

To ensure the effectiveness of the database design, it is advisable for David and Kaitlin to seek an independent review Engaging co-workers for this evaluation can provide valuable insights and feedback, helping to refine the design before moving forward with implementation.

Schach, S.R (2011) Object-Oriented and Classical Sof ware Engineering New York:

Data and Data Models

Introduction

This chapter explores the evolution of databases, highlighting the development of the relational database model, which has become the foundation for most databases in use today While various historical methods for data storage and retrieval still exist, our primary goal in this book is to design an effective relational database We will examine how the relational model emerged from earlier approaches to data management, setting the stage for modern database design.

In this chapter, we will introduce functional dependencies to define the characteristics of an effective relational database Our goal is to familiarize you with relational databases while also examining their evolution, which highlights the simplicity and strength of the relational model.

Files, Records, and Data Items

Data must be organized in a file to be useful, reminiscent of a time before computers when businesses relied on paper documents to manage customer and product information For example, a doctor's office tracked patient records, while a sports team maintained player statistics, all stored in filing cabinets These collections of data can be referred to as a "database," which serves as a repository for information about specific entities, such as a simple customer file listing essential details.

Effective database design utilizing ER diagrams focuses on the relationships between entities, such as merchants and their customers Key components include efficient data storage and retrieval methods that enhance the accessibility of information, making it easier to manage business interactions.

A customer file is a collection of individual customer records, each containing essential information such as the customer's name, address, and phone number These records are made up of various data items, also referred to as fields In essence, a customer file encompasses multiple records, each consisting of specific fields of data that provide a comprehensive overview of customer information.

Table 2.1 presents an example of some data (you can imagine each line as a 3 × 5 card, with the three cards [three records] making up a f le).

The T table contains three records, each representing a different customer Each record includes four fields: record number, name, address, and city As new customers are added, their information will be documented in additional records.

3 × 5 card (a new record) and placed in the customer fle Several inter- esting questions and observations arise for the merchant keeping this information:

1 Te merchant may well want to add information, such as a telephone number, in the future Would you add a phone number to all 3 × 5 cards, or would the adding be done “as necessary”? If it were done

In our system, customers may or may not have telephone numbers associated with their records For those without a phone number, we designate their contact information as "null," which signifies that the information is unknown.

2 How will the fle be organized? Imagine not three customers, but 300 or 3,000 Would the 3 × 5 cards be put in alphabetical order? Perhaps, but what happens if you get another A McDonald or S Claud? Or, what if you wanted to fnd addresses more ofen than you wanted to

Record 1 McDonald, A 123 4th St Mobile, AL Record 2 Claud, S 452 Main St Pensacola, FL Record 3 Jarzin, D 92 Adams Lane Elberta, AL

Data and Data Models • 13 fnd names? T e feld on which the fle is organized is called a key

Organizing a file by telephone number or address instead of by name can enhance data retrieval efficiency A unique key is essential for locating information, but challenges arise when duplicate values for a key exist.

3 Suppose the fle were organized by telephone number What if the telephone number for a customer were not recorded because you either didn’t know the number or the customer didn’t have a phone?

In database management, a telephone number serves as a critical identifier for customer records; if this field is null, it undermines the organization of the data A key in a database must always have a value, as null entries would complicate record retrieval Therefore, if the telephone number is designated as the key for a customer file, it is essential for users to know the number in order to efficiently locate the corresponding record.

4 Te format of the fle above is:

The format of a file determines the sequence of fields within each record In this specific record, the first field is the record number, followed by the name, and subsequent fields Although the file design may allow for different field arrangements, once established, the order of these fields remains fixed.

If a telephone number feld were added, then the fle format could be: AaaaaaaaaaaaaaaaaaaaaaaaAaaaaaAaaaaaaaaAaaaaaAaaaaaaaaaaA

The illustrated shorthand format notation is referred to as a file design When the file is configured to locate data by name, the name serves as the key, which is indicated by underlining it, such as in the example: aaaaaaaaaaaaaaaaaaaaaaaAAaaaaaAaaaaaaaaAaaaaaAaaaaaaaaaaA.

5 You might ask, “Why not use the record number to organize the fle?” On one hand, it is unique (which is desirable for a key), but on the other hand, you would have to know the record number to f nd a customer Te example is organized by record number; however, imagine 300 or more customers You want to fnd Claud’s address— you would have to know the record number It makes more sense to organize this fle of 3 × 5 cards by name Taking some of these

14 • Database Design Using ER Diagrams

Sample Data for Enhanced aaaaaaaaAaaaa

Record 1 Adams, A 77 A St Pensacola, FL 555-5847 Record 2 Charles, X 365 Broad St Mobile, AL 555-8214 Record 3 Jarzin, H 92 Adams Lane Elberta, AL 555-1234 Record 4 McDonald, A 22 Pine Forest Pensacola, FL null Record 5 McDonald, J 123 4th St Mobile, AL 555-9978 Record 6 Richard, E 932 Dracena Gulf Breeze, 555-1111

Way FL Record 7 Claud, S 452 Main St Pensacola, FL 555-0003 points into consideration, Table 2.2 presents an enhanced version of the customer fle in which each line represents a 3 × 5 card

1 What does it mean to say a feld has unique values?

2 Why is it desirable to have a key be unique?

3 Why does a fle have to be organized by a key f eld?

FL Record 2 Charles, 365 Broad 555-8214 Mobile,

Lane AL What is wrong here?

Moving From 3 × 5 Cards to Computers

As the merchant's customer base expanded, the need for more comprehensive customer information grew This evolution in data processing involved implementing enhancement techniques that improved the organization of customer records, increased the number of fields on the 3 × 5 cards, and introduced more efficient methods for storing and retrieving individual records.

Customer record management raises several important questions regarding organization and accessibility Key considerations include whether records were maintained in alphabetical order by name, stored by telephone number, or organized by record or customer number Additionally, the challenges of accommodating new fields not present on existing forms or cards, as well as the implications of modifying data on record formats, highlight the data-processing dilemmas faced in the past.

In the early days of business computing, data was primarily stored on magnetic media such as disks and tapes, resembling the concept of virtual 3 × 5 cards This data was not physically tangible and required software and a display device to access and view the information Before 1975, punched cards were the standard method for inputting data into computers, a practice that dates back to the 1930s when sorters could efficiently scan and organize these cards By the 1960s, the use of punched cards for data entry had become widespread, with line printers serving as the typical output or display device.

In the early development of databases, software was created to manage the data stored on computers, which initially mirrored the traditional 3 × 5 card filing system During this initial phase, numerous challenges arose with both computers and databases, marking a significant evolution in data handling techniques from the early to mid-20th century.

In the 1960s, challenges in computing included data input, output display, and file maintenance, which encompassed storing, updating, and managing records To efficiently manage data, individuals could purchase a computer and employ programmers, operators, and data entry staff.

In the early days of computing, the high costs and large sizes of computers made them inaccessible for most small businesses, which often lacked the resources to purchase such technology or employ dedicated computer staff Consequently, data filing and retrieval were primarily the domain of large organizations that could afford these expenses.

When a company invests in a computer and the necessary infrastructure, various departments, such as sales, accounting, and production, seek to store different types of files on the system Each department requires access to customer data to enhance their operations and improve overall efficiency.

Different departments within the organization had varying perspectives on customer information The sales department sought details such as name, address, telephone number, and data related to purchasing tendencies Meanwhile, the accounting department required similar information but focused on tracking billing and payments The production team also needed customer insights, specifically regarding product requirements and production quantities Although each department aimed to gather comparable information, their approaches to the issue differed significantly.

In the early days of computing, departments shared expensive computers but maintained separate programming staff to manage their individual databases While there was an understanding of sharing the hardware, data sharing was largely absent The concept of a unified software package for data storage and retrieval was not yet developed Programmers utilized various languages, including COBOL, RPG, ALGOL, PL/1, and FORTRAN, to manage their data Consequently, each department independently created its own records, storage methods, and data entry teams, leading to fragmented data management practices.

The earliest databases were filing systems created by programmers using computer languages like COBOL Programmers developed COBOL programs to collect input data from punched cards and store it in computer files Subsequently, they wrote additional programs to retrieve and present the data according to user preferences These early computer filing systems were primarily simple sequential files.

Te data on punched cards was read and stored In Table 2.3, we recon- sider the customer fle we introduced previously

If you could look at the data on a disk, it would look like Table 2.4

Record 1 Adams, A 77 A St Pensacola, FL 555-5847 Record 2 Charles, X 365 Broad St Mobile, AL 555-8214 Record 3 Jarzin, H 92 Adams Lane Elberta, AL 555-1234 Record 4 McDonald, A 22 Pine Forest Pensacola, FL null Record 5 McDonald, J 123 4th St Mobile, AL 555-9978 Record 6 Richard, E 932 Dracena Way Gulf Breeze, FL 555-1111 Record 7 Claud, S 452 Main St Pensacola, FL 555-0003

Appearance of Sample Data on Disk

Adams, A 77 A St Pensacola, FL 555-5847 Charles, X 365 Broad St Mobile, AL 555-8214 Jarzin, H 92 Adams Lane Elberta, AL 555-1234 McDonald, A 22 Pine Forest Pensacola, FL null McDonald, J 123 4th St Mobile, AL 555-9978 Richard, E 932 Dracena Gulf Breeze, 555-1111

Claud, S 452 Main St Pensacola, FL 555-0003

Te records as addressed by COBOL had a structure like this:

The file depicted above is known as a "sequential file," which requires sorting to display data by address instead of name When new data is added, it must be placed correctly according to the sequential key, such as the name field Early methods focused on punched cards and involved extensive sorting, both for punched cards and for data stored on disks or tapes.

In the 1960s, to address the limitations of sorting for data retrieval, two main filing systems emerged: indexed and direct access filing systems While tapes were sequential devices that could store large amounts of data, disks offered the advantage of random access, allowing for more efficient non-sequential data retrieval This innovation led to the development of software systems that utilized disk storage effectively By the late 1960s, the first database systems began to appear, revolutionizing data management.

Database systems were purchasable programs to store and retrieve data as well as perform maintenance (adding, deleting, and modifying f elds

Database design has significantly evolved with the introduction of ER diagrams, allowing for more efficient data management without the need for COBOL programming Instead of individual departments managing their own data through separate programming efforts, a centralized database system enables shared access to data and resources This shift allows for a single programming staff and a unified database software package, ensuring that each department can maintain its unique perspective on the data while benefiting from a collaborative environment.

Transitioning from the traditional "my data" approach took several years, as hybrid systems primarily focused on data retrieval This emphasis delayed the shift to a fully relational environment, largely due to the significant investments companies had made in software and programming resources.

Database Models

Before the widespread adoption of relational databases, the primary database models were hierarchical and network, which dominated database software prior to the 1980s While these legacy systems are often viewed as outdated, some applications still rely on them today, highlighting their historical significance and the reasons why the relational model has become the standard in modern database management.

In this section, we present some versions of the hierarchical model for several reasons:

(a) To illustrate how older models were constructed from f le systems (b) To show why these fle-based databases became outdated when rela- tional databases became practical

(c) To see the evolution of f le-based systems

The following discussion highlights various database systems that were developed before the advent of relational databases, emphasizing their strengths and weaknesses This comparison aims to illustrate the significant advancements in database design and functionality that relational databases offer, showcasing their superiority in modern data management.

20 • Database Design Using ER Diagrams

In hierarchical database models, data is organized in a top-down structure where each record can have one or more dependent or child records, with each child record linked to a single parent record The parent-child terminology, while historical, is used to depict the dependency of one type of data on another, rather than implying a familial relationship This model can also be referred to as an owner-object relationship, but the parent-child terms are more widely recognized.

“child” records will be sports played by a “parent” person

In a hierarchical database structure, consider a scenario involving individuals who participate in sports at specific locations For instance, Brenda is a tennis player at city courts and a golfer at municipal links, placing her at the top of the hierarchy The sport locations represent the second tier in this system, establishing a parent-child relationship where each parent (Brenda) can link to multiple child records (sport locations), while each child record is associated with only one parent To effectively manage this hierarchical data, a two-file model can be utilized, consisting of one file for individuals and another for sport locations For these files to be interconnected, references or pointers must be established, allowing each sport location (child) to point back to the corresponding individual (parent).

AaaaaaaaaaaaaaAaaaaaaaaaAaaaaaaaaaAaaAaaaaaaaaAaaaaaaAaaaaaaaA With data: tennis, city courts, 1 golf, municipal links, 1 golf, municipal links, 2 snorkeling, Pensacola Bay, 2 running, UWF track, 2 downhill skiing, Ski Beech, 3

A Hierarchy of Persons and Sports with Parent Pointers

A diagram of this relationship is shown in Figure 2.1

The specific location of records on a disk is not crucial, as the meaning of the data remains intact even if the locations are represented by disk addresses For instance, consider a scenario involving thousands of individuals and various sports, where we define a "record address" as a four-letter alphanumeric string.

With data: golf, municipal links, B259 running, UWF track, B259 downhill skiing, Ski Beech, C333 snorkeling, Pensacola Bay, B259 tennis, city courts, A45C golf, municipal links, A45C

The system establishes a parent-child relationship through a pointer from the child to the parent In the SPORTS file, the "reference to person" corresponds to the primary key in the PERSON file, which is identified as a foreign key.

A foreign key (FK) establishes a crucial link between two files by referencing the primary key of another file, thereby creating a meaningful relationship Without this connection, the files would remain independent and unrelated, resulting in a disjointed system that lacks coherence.

22 • Database Design Using ER Diagrams

In discussing the parent-child relationship within a database, it's important to acknowledge its limitations For effective querying, data must be organized strategically, which necessitates that database designers consider the questions users are likely to ask.

To find out who plays golf at municipal links, begin by searching the SPORTS file for the term "golf at municipal links" and identifying the relevant parent records This process is straightforward since it starts with the specific sport in question However, if your inquiry is about "What sports does Richard play?", the approach changes You must first locate Richard in the PERSON file to obtain his record address, then sift through the SPORTS file to trace any connections to him The SPORTS file's organization does not readily support this search, requiring a thorough review of all sports instances to determine which, if any, are associated with Richard.

To improve the database model, consider referencing each sport within its corresponding parent record, allowing the system to effectively answer queries that originate from the parent level This enhancement will streamline data retrieval and improve overall efficiency in accessing information related to various sports.

Figure 2.2 illustrates the bidirectional relationship between parent and child records, showcasing a multiple-child pointer (MCP) system from the parent's perspective In this database, Richard has three pointers to child records, reflecting his participation in three different sports.

A Hierarchy of Persons and Sports with Parent and Child Pointers

When viewed from the child fle, the backward link is called a parent pointer In the SPORTS fle, if a record is examined like:

106, downhill skiing, Ski Beech, 3 the pointer-reference to person (3) would be called a parent pointer T e 3 points to the parent, Abbie

In this model, the relationship between parent and child records is established through two methods: a foreign key (FK) link from the child to the parent, known as the "reference to person" in the SPORTS field, and a multiple child pointer (MCP) scheme that allows for referencing multiple children in the parent records Although the MCP may seem redundant, it serves two practical purposes in managing these relationships effectively.

(a) It allows questions to be asked of the system that are easier to answer than with just parent pointers

(b) It allows the system to be queried without looking at the child records

For example, if you ask the question—how many sports does Richard play?—you need only look at the person fle and count MCP references

This article explores two methods for establishing relationships between two files: the Foreign Key (FK) parent pointer and the Multi-Column Primary (MCP) system Both techniques are effective for database implementation, but the MCP system is utilized more frequently than the FK system.

What have we learned so far about hierarchical databases?

1 Te pointers in the database dictate the ease with which the database is queried

2 While the child to parent pointers implement the relationship, they fall short of being able to easily answer some questions

3 Te simple child to parent scheme can be enhanced to better answer some questions by adding an MCP system to the plan

24 • Database Design Using ER Diagrams

4 Te notion of one parent for each child is still in place We will look presently at a system that allows us to go from multiple parents to multiple children However, if you think having parent records link- ing to multiple child records and vice versa will be complicated, you are correct It will take a lot of pointers

5 Staying with just the system presented above with both a parent pointer and multiple child pointers, the system is limited in that only so many children can be addressed by an MCP system In an MCP system, each parent must have enough predefned pointer spaces for as many child records there may be In the next section, we will see a system where the parent-to-child pointer scheme is limitless

2.4.1.1 The Hierarchical Model with a Linked List

Chapter Summary

This chapter provides a foundational understanding of database design and essential concepts, exploring historical data models such as hierarchical and network models, leading to the introduction of the relational model It serves as a crucial historical background for the subsequent material in the book.

Codd, E.F (1970) A relational model of data for large, shared data banks Communications of the ACM, 13(6): 377–387.

The Relational Model and Functional

Introduction

Edgar Codd proposed that relational databases should organize data in simple, unordered two-dimensional tables, allowing for straightforward querying while abstracting from the physical storage of data This article explores the concept of the "proper table" and functional dependencies (FDs), emphasizing the importance of normalization to ensure data is in an appropriate format, as Codd recommended, without creating physically linked records.

Fundamentals of Relational Database

The foundational concept of a relational database is the two-dimensional table, where data is organized in rows and columns, with each cell containing a single piece of information An example of this structure can be seen in Table 3.1a, illustrating the arrangement of data effectively.

Two-Dimensional Table of Relational Data

A AaaaaaaaA AaaaaaaaA AaaaaaaaA AaAaAAaA row1 data-cell data-cell data-cell row2 data-cell data-cell data-cell row3 data-cell data-cell data-cell

34 • Database Design Using ER Diagrams

Two-Dimensional View of aaaaaaaa Table

Jones 222-3333 123 4th St Smith 333-2154 55 Main St Adams 555-8888 3145 Euclid Ct

Two-Dimensional Arrangement Where Each Column Does Not Contain Same Type of Data

55 Main St Smith 333–2154 Adams 3145 Euclid Ct 555–8888

Te columns are given titles: name, phone_no, address

Te rows are horizontal arrangements of data cells under each column

Ofen, when discussing a row in a table, it is denoted like this:

Data cells hold factual information, such as customer details like name and address In a two-dimensional table, each column should represent the same type of data with consistent semantics However, Table 3.2 exemplifies a two-dimensional arrangement that fails to adhere to this principle, as it does not maintain uniformity in the data type across its columns.

The data remains present, yet its disorganized arrangement and nonsensical column headings render it inconsistent.

In relational databases, each data cell must be atomic, meaning it should contain only one fact For instance, if an individual, such as Adams, has two phone numbers, entering both in the same row would violate the principles of atomicity To maintain a valid relational database structure, the tables must be designed differently to accommodate such cases.

AA AA AA AA AA AA

Jones 222-3333 123 4th St Smith 333-2154 55 Main St Adams 555-8888, 555-8889 3145 Euclid Ct

In Table 3.3, Adams' phone number is classified as nonatomic, as it contains a repeating group of two phone numbers A data structure is considered a table only if it consists entirely of atomic data cells, which is not the case here Unlike Table 3.2, which contains all atomic cells, the presence of nonatomic cells in this arrangement disqualifies it from being a table According to Codd's database terminology, a data set qualifies as being in the first normal form (1NF) only when it is organized as a two-dimensional arrangement of atomic data.

In relational database theory, populated tables are referred to as "tables," while columns are known as "attributes." For convenience, tables are typically denoted by capital letters such as R, and attributes are represented by letters like A, B, C, and so on.

AaaaaAaaAaaAaaA is in 1NF, whereas

AaaaaAaaAaaAaaaaA is not 1NF as aaa represents a repeating group Te curly brackets mean

1 Is the following arrangement of data a table?

AA AA AA AA AA AA

AA AA AA AA AA AA

AA AA AA AA AA AA

AA AA AA AA AA AA

36 • Database Design Using ER Diagrams

2 Is the following arrangement of data a table?

3 Is the following arrangement of data a table?

4 Is the following arrangement of data a table?

5 Is the following arrangement of data a table?

Relational Database and Sets

In mathematics, we defne sets as an unordered collection of unique objects

Codd viewed and defned a relational database as: All tables contain sets of rows, and all data are atomic in two-dimensional tables Te notion of

“sets of rows” is a powerful one because it implies two things:

(a) Tere are no duplicate rows

(b) Tere is no order among the rows (Rows are not presumed to be sorted in any way)

Mathematical sets are not in sorted order and do not contain duplicates

If I had a set of apparel, it might contain shoes, socks, shirt, pants, and hats Te same set could be written as (pants, shirt, socks, hats, shoes) Te order of the items in the set is not defned An item either is in the set or not in the set As far as the set is concerned, the set (pants, shirt, socks, hat, pants, shirt, shoes, socks, hats) is the same as before Te duplication of items has no sense in sets

When we look at some rows of data in a relational database, we think of the tables as sets of rows Consider the Table 3.4

Set-wise, Table 3.5 shows the same table as Table 3.4

Te order of the rows is unimportant If we had a duplicate row, as shown in Table 3.6, we would not have a valid relational table Since a relational

Rich Carpenter Beryl Dentist Mary Jo Programmer

Table Showing Tat Order of Rows is Unimportant

Beryl Dentist Mary Jo Programmer Rich Carpenter

TABLE 3.6 Table With Duplicate Rows

AaaaaAA AaaaA Rich Carpenter Beryl Dentist Mary Jo Programmer Rich Carpenter

Database design utilizing ER diagrams emphasizes that a database consists of a "set of rows," where any additional duplicate row is unnecessary for information retrieval Practically, having duplicate rows undermines the fundamental principle of uniquely identifying each row through a primary key.

When querying a database for a specific entry, such as , we find that Lindsey is absent from the dataset This illustrates that each row in the database either exists within the set of rows or does not, highlighting the binary nature of data presence.

Functional Dependency

A functional dependency (FD) represents the relationship between attributes in a database table, where one attribute uniquely determines another For instance, a Social Security number (SSN) can identify an individual’s name, indicating that knowing the SSN allows us to ascertain the corresponding name.

In a database containing Social Security Numbers (SSNs) and names, each SSN uniquely corresponds to one name, establishing a functional dependency This means that knowing an individual's SSN allows for the retrieval of their associated name, highlighting the one-to-one relationship between SSNs and names within the database.

Functional Dependency (FD) establishes a relationship where one field serves as a reference point, ensuring a consistent value for another field This concept closely aligns with the primary key, as previously discussed The essence of FDs is to identify primary keys that guarantee all data within a record is solely dependent on that key.

In database design, the designer is tasked with defining data through functional dependencies (FDs), which requires gathering information from the user The user provides crucial insights, such as whether a project is associated with a single location or multiple locations, and whether an individual can possess only one phone number Additionally, it is important to note that analyzing a database without this user input may not yield accurate results.

“prove” some attribute is functionally dependent on another Te idea of a

FD is one of defnition, and it goes with the table design just like the def - nition of a column and data type

In a company where each employee is assigned a unique employee number, it's important to note that while multiple employees may share the same name, each individual is identified by their distinct employee number This system ensures clarity and avoids confusion, as every employee has one unique identifier alongside their name.

The relational model ensures that employee numbers are always unique and distinct, as defined by the company Allowing duplicate employee numbers associated with different names would create inconsistencies within the database.

We write an FD with an arrow like this:

Te expression aaaaaAAaAAaaaa is read “empno defnes name” or “empno implies name.”

1 In the following table, does all the data conform to aaaaaAAaA Aaaaa? aaaaaA aaaaAA

2 Does the fact that the data conforms to the proposed FD prove the FD is in fact true?

3 In the following table, does all the data conform to aaaaaAAaA Aaaaa? aaaaaA aaaaAA

4 In the following table, does all the data conform to aaaaaAAaA AaaaaA?

5 In the following table, does aaaaaAAaAAaaaa?

40 • Database Design Using ER Diagrams

Non-1NF to 1NF

Let us consider the arrangement of data in Table 3.7

The analysis of the data indicates that the identifier "aaaaaaaaaaa" is likely unique, as it appears to be used by the creator of the customer table to distinguish individual customers However, the data itself does not provide definitive proof of the uniqueness of "aaaaaaaaaaa."

This example implies that aaaa aaaaaaaa serves as a unique identifier; however, defining aaaaaaaaaaa as the primary key for this table is ultimately a matter of definition.

For Adams, the data structure is not in first normal form (1NF) due to the presence of a repeating group Specifically, the data cell that holds two phone numbers violates the 1NF definition To comply with 1NF, the arrangement must be reorganized, as the notation R(A, B, {C}, D) indicates that {C} represents this repeating group.

Before addressing the non-1NF issue, it is advisable to define a primary key in the relation R, as it uniquely identifies each row in a table For example, if the primary key is attribute A (represented as aaaaaaaaaAaaA), which is unique for each customer, the resolution of repeating groups can be achieved through a process known as decomposition Consequently, the original table R(A, B, {C}, D) will be divided into two separate tables: aaaAaAaAaaAaaA and aaaaAaaaAaaAaa.

Sample Data for aaaaaaaa customer_no name phone_no address

Going back to the aaaaaaaa example, if we defned AaaaaaaaaaaaA as the primary key, the decomposition would go like this:

AaaaaaaaaAaAaaaaaaaaAaaAaAAaaaaAaAaAaaaaaaaaAaaAAaaaaaaaAaA will become (will decompose into two tables):

AaaaaaaaaaAaAaaaaaaaaAaaAaAAaaaaAaAAaaaaaaaAaA aaaaAaaAA aaaaaaaaaaaAa and aaaaaaaaaAaAaaaaaaaaaaaAaAAaaaaaaaaAaAaaaaAaaAAaaaaaaaaaaaAAaA AaaaaaaaaAaA

Te notation AaaaaaaaaaaaAaAaaaaaaaa is called a concatenation Te new populated tables will look like Tables 3.8 and 3.9

Both tables are in First Normal Form (1NF), ensuring that all data is atomic The primary key from the first table, identified as "aaaaaaaaaaa," is referenced in the second table as a foreign key, linking it to the primary key of another table In the second table, the unique key "Aaaaaaaaaaaa" serves to distinctly identify each row.

TABLE 3.8 Sample Data for aaaaaaaaa customer_no name address

TABLE 3.9 Sample Data for aaaaaaaaa customer_no phone_no

42 • Database Design Using ER Diagrams

Sample Data for Original aaaaaaaa Table name phone_no address

CUSTOMER with no obvious Primary Key name phone_no address

In aaaaaaaaa, there is no one attribute identifying a row; hence, the entire row is considered a primary key Te row consists of the concatena- tion of aaaaaaaaaaa and aaaaaaaa

In scenarios where there is no clear customer number or key, such as in the provided data example, we face the challenge of identifying a reliable row identifier amidst repeating groups To address this, we can combine the values from the repeating groups with all other attributes, effectively designating the entire row as a key While this may not be the optimal key choice, the evaluation of the combined attributes' effectiveness will be explored later in the chapter as we delve into defining other normal forms This method successfully transforms a non-1NF data arrangement into 1NF, ensuring that each column contains atomic data in a two-dimensional table format The resulting decomposition, as illustrated in Table 3.11, highlights the absence of an obvious primary key.

Table 3.11 is made into one table with aaaa and aaaaaaaaA as the key (Table 3.12 ):

Sample Data for aaaaaaaa With aaaaAand aaaaaaaa as Key name phone_no address

Sample Data for aaaaaaaa Resolved to 1NF customer_no name phone_no address city

107 Adams 555-8888 3145 Euclid Ct Two Egg

107 Adams 555-8889 3145 Euclid Ct Two Egg

The updated table now adheres to First Normal Form (1NF), featuring distinct rows and a designated primary key For instance, the entry "aaaaaaaa" can be effectively transformed into 1NF by incorporating a unique identifier, resulting in a structured format that maintains data integrity.

But, since we have a primary key in aaaaaaaaaaa, this decomposition (Table 3.13) is a bit severe As we take up other normal forms, this problem will be resolved

1 What would you suppose is the key of this table? color make year

44 • Database Design Using ER Diagrams

2 Put this arrangement of data into 1NF: name homephone cellphone

To achieve the second normal form (2NF), it is essential to first ensure that the table is in the first normal form (1NF) When evaluating a table for the first time, confirm its status as a valid table (1NF) and examine its design beyond just the data presented.

The data appears to be in First Normal Form (1NF), indicating that it consists of atomic cells and lacks duplicate rows However, the table design is not specified, raising the question of the functional dependencies (FDs) present It is essential to determine whether 'aaaa' and 'aaaaa' are functionally dependent on 'aaaaaaaaaa.' While they seem to be semantically related, the creator of the data should provide a corresponding table design for clarity.

Now, let us consider this table of data with no FD defned as yet ( Table 3.15)

TABLE 3.14 Arrangement of Data license_no make color

TABLE 3.15 Sample Data for aaaaaaaa

The table is valid as it maintains consistency, contains atomic data, and has no duplicate rows It is named "aaaaaaaa" and features two attributes: "aaaaa" and "aaaa." The data types for these attributes are defined as "aaaaaaaaAaAaaaaaAAaaaaaaaaaaaAAaaaaAaaaaaaaaaaaa."

VARCHAR, short for "variable number of characters," is a widely used data type in databases Specifically, VARCHAR(20) indicates that the data stored in the name attribute can range from 0 to 20 characters in length While this definition covers the basics, there may be additional considerations to explore.

We have stated nothing about FDs If no FDs are defned with the table, the only ones we may assume are refexive:

Reflexive functional dependencies (FDs) are always valid and serve the purpose of mathematical completeness In simpler terms, they indicate that if I know a certain attribute (aaaa), I can also determine that same attribute Additionally, combinations of attributes can also form FDs; for instance, knowing a combination of attributes (aaaaa and aaaa) allows me to deduce both attributes (aaaaa and aaaa), illustrating a reflexive FD.

The table indicates the presence of functional dependencies (FDs), specifically showing that "aaaa" relates to "aaaa," but without explicit definition, we can only suggest that this is the case The existence of two individuals named Wallace does not pose an issue for FDs, as it is normal for multiple people to share the same first name However, each employee number (empno) is unique, ensuring that each empno corresponds to a single name value.

A proper defnition for this table would be:

46 • Database Design Using ER Diagrams

In database tables, the primary key, indicated by underlined text, is typically found on the left-hand side of a functional dependency (FD) While primary keys imply functional dependencies and are defined within the table, they do not have to be the first column, although it is conventional to place them there Understanding functional dependencies is crucial, as it helps identify potential primary keys and differentiate them from non-primary keys.

Let us look at another example Suppose you are given this table with aaaaadefned as the primary key: aaaaaaaaaAaAaaaaaAAaaaaaaaaaaaAAaaaAAaaaaaaaaaaaaAAaaaaA aaaaaaaaaaaaA

The table definition indicates that the first column will contain three-digit numbers, while the second and third columns will consist of character strings limited to 20 characters each Additionally, the underlined value in the first column signifies that it serves as a primary key, establishing two functional dependencies within the table structure.

T e Second Normal Form

48 • Database Design Using ER Diagrams

Sample Data for Ref ned aaaa Table cab_no date_driven driver color

TABLE 3.19 aaaa Table With Redundancy cab_no date_driven driver color

Suppose the data now appears as in Table 3.18

The current table appears functional, but it conceals a significant issue The design indicates that 'aaaaaa' and 'aaaaaaaaaaa' serve as the primary key While this holds true for the limited dataset provided, knowing the combination of 'aaaaaa' and 'aaaaaaaaaaa' is necessary to accurately identify the 'aaaaaa' and 'aaaaa' of the cab.

On February 4, 2023, I located the driver (Gen) and the cab (Green), but a potential issue arises For instance, if aaaaaa 102 is painted red, you can update the data in the table accordingly However, it's important to observe the necessary adjustments required, as demonstrated in Table 3.19.

To update the cab color across thousands of rows in a table, you must implement two changes to ensure that each entry reflects the new color This process involves modifying the redundant records of the cab color throughout the dataset.

When such redundancy appears, it is a symptom of a design problem Here, we say that the table is not in the 2NF

Te update we just proposed is called an update anomaly because the change of cab color is not a simple update to a table, but rather it requires

Te Relational Model • 49 multiple updates due to the redundancy Tere are other problems with this table of data—other anomalies Anomalies come in three forms: update, insert, and delete

An insert anomaly can occur in poorly designed tables when attempting to add data without complete information For instance, trying to insert cab and color data without specifying a driver or date would require creating a row with null values, such as This is invalid because a primary key cannot contain null values, as dictated by the entity integrity constraint in relational databases This rule ensures that every part of a primary key is defined, allowing for proper identification of rows and maintaining the integrity of the database.

A delete anomaly occurs when attempting to remove a specific record from a database leads to unintended loss of additional data For instance, if cab aaaaaa 102 is wrecked and needs to be deleted, removing it from the table also eliminates all associated records, such as the entry detailing that Steph drove the cab on 2/5/2023 This highlights how a delete anomaly can result in the accidental deletion of valuable information beyond the intended record.

In our non-2NF table, the functional dependency issue arises because the value of "aaaaa" for cab relies on "aaaaaa" rather than "aaaaaaaaaaa." The accurate functional dependencies for this table are as follows:

To identify the color, only the 'aaaaaa' is necessary When a table has a concatenated key and an attribute depends on only a part of that key, it exhibits a partial dependency Such tables are not in the second normal form (2NF) A table qualifies as being in 2NF when its primary key has no partial dependencies Therefore, for a table to be considered not in 2NF, it must contain a concatenated key with at least one attribute that does not depend on the entire key.

50 • Database Design Using ER Diagrams

In this non-2NF Cab situation, the symbolic discussion of the problem would be like this:

The table "aaAaaA aaA AaaA aa" does not meet the requirements for Second Normal Form (2NF) due to the presence of a partial dependency Specifically, the attribute "a" depends only on a portion of the composite key "aa," rather than the entire key, which violates the principles of 2NF.

To address tables that are not in Second Normal Form (2NF), the solution is decomposition This process involves breaking down a non-2NF table into two separate tables, ensuring that each table contains data that is solely dependent on the primary key For instance, a table with functional dependencies that are not in 2NF will be divided into two tables, both of which will meet the criteria for 2NF by ensuring that all non-key attributes depend exclusively on the primary key In the context of the Cab problem, the original table can be decomposed into two distinct tables, each structured to maintain the integrity of data relationships and comply with 2NF standards.

AaaaaaaAaAAaaaaaaaaaaaAAaAAaaaaaaA with sample data as shown in Table 3.20

TABLE 3.20 Sample Data for aaaaaa Table cab_no date_driven driver

TABLE 3.21 Sample Data for aaa Table cab_no color

TABLE 3.22 New Decomposed aaa Table cab_no color

With sample data as shown in Table 3.21

We have renamed the new aaaa table to aaaaaa to differentiate it from the original Upon reviewing our proposed update, if aaaaaa 102 were marked in red, the only modification in this decomposed table would be the second row of the aaa table, which is now represented as Table 3.22.

The update has resulted in a single change, with one row modified The tables are now structured in their decomposed 2NF form, eliminating redundancy and ensuring that all non-key data is fully dependent on the primary key of each table.

52 • Database Design Using ER Diagrams with FD aaaaaaAaAAaaaaaaaaaaaAAaAAaaaaaaA

When tables are decomposed, other anomalies are eliminated, allowing for the insertion of new entries into the CAB table without affecting existing data A new entry can be added without requiring additional dependencies, ensuring that the integrity of the database remains intact Furthermore, it is possible to delete specific combinations without compromising the information related to other entries, as only the targeted data is removed from the CAB table.

1 Given aaAaAaaa, is aaAaaAaaAAaaAaaAaa in 2NF?

2 Given aAaAaaaa, is aaaaAAaAaAaaAaaAaa in 2NF?

3 Given aaAaAaa and aAaAa, is aaAaaAaaAAaaAaaAaa in 2NF?

4 If a table is in the 1NF and you have no concatenated key, you do not have to worry about 2NF problems (True/False)?

Let us now consider another example of a table with a defned key: aaaaaaaaA aAaaaaaAAaaaaaaaaaaaAAaaaaA aaaaaaaaaaaaA Aaaaaaa aaAAaaaaaaaaaaaAAaaaaaaaaaaAaaaaaaaaaaaa with FD

Table 3.23 shows some sample data of aaaaaaaa

The table is in both First Normal Form (1NF) and Second Normal Form (2NF) as it consists of all atomic attributes and lacks concatenated keys, eliminating the possibility of partial dependencies However, there is an issue of redundancy present in the database, as evidenced by the relationship where multiple entries for aaaaaaaaA= aaaa correspond to aaaaaaaaaa= aaaaaaaaaa.

Redundancy in database design serves as a warning sign of underlying issues, particularly when it comes to primary keys In our example, the primary key, aaaaa, uniquely identifies each row, but the definition of aaaaaaaaaa is more accurately represented by aaaaaaaa, indicating that aaaaaaaaaa is functionally dependent on skill_ID rather than aaaaa This leads to a transitive dependency between aaaaaAAaAAaaaaaaaa and aaaaaaaaAAaAAaaaaaaaaaa, resulting in redundancy that can trigger anomalies such as update, insert, and delete issues For instance, an update anomaly occurs when changing the description of aaaaaaaa from "Prog" to "Programmer/analyst," requiring modifications in two rows of a four-row table Similarly, an insert anomaly arises when attempting to add a new entry, such as aaaaaaaa= Cart and aaaaaaaaaa= Cartographer, highlighting the challenges posed by redundancy in data management.

Sample Data for aaaaaaaa Table empno name skill_ID skill_desc

54 • Database Design Using ER Diagrams

Sample Data for aaaaaaaa Table With Cartographer empno name skill_ID skill_desc

107 Davis Prog Programmer null null Cart Cartographer

Deleting a row in a database can lead to a delete anomaly, violating the entity integrity rule For instance, if we remove the employee Charles, we not only eliminate his record but also lose the crucial information linking the code "aaaaaaaa" to the department "Public Relations." This demonstrates the risk of losing more data than intended during deletions.

Now, let us return to the original EMPLOYEE table and resolve the problem:

Te problem is the aaaaaaaaaa is functionally dependent on aaaaaaaa and not aaaaa Te corrected FDs should be: aaaaaAAaAAaaaaAaAAaaaaaaaaA and

Tis illustrates a transitive dependency as A → C and C → D Te third normal form (3NF) demands no transitive dependencies As with the par- tial dependencies in non-2NF, non-3NF arrangements cause anomalies:

1 Possible multiple changes when you update a aaaaaaaaaa (an update anomaly)

2 Inability to add a row with just a aaaaaaaa and its aaaaaaaaaa (an insert anomaly)

3 Losing information when deleting a row (remember losing a aaaaaaaa when Charles was deleted)—a delete anomaly.

Some Functional Dependency Rules

In this article, we present a set of functional dependency (FD) rules designed to assist in identifying normal forms within a database Our approach prioritizes common sense and logical reasoning over strict mathematical rigor, acknowledging that more comprehensive treatments of the topic exist, such as those by Elmasri and Navathe (2016) To establish normal forms, we first focus on identifying a minimal key within a table, which we denote with letters like "a" or "A." Attributes are represented as a sequence of letters, for instance, a table "a" with three attributes can be abbreviated as "aaa." By examining a problem that consists of a set of attributes and corresponding functional dependencies, we illustrate how these FD rules can effectively help organize the database.

To achieve Third Normal Form (3NF) for the given data, we start by identifying a minimal key Next, we assess whether the relation is in 3NF If it is not, we proceed to decompose the relation until it meets the 3NF criteria.

To effectively manage functional dependencies (FDs), we propose a set of guidelines For instance, if we denote a as aAaAaa, it indicates that the attribute 'a' defines both 'a' and 'a', meaning they are functionally dependent on 'a' For example, let 'a' represent aaaaaaaa.

60 • Database Design Using ER Diagrams a be aaaaaaaaaaaAA a be AaaaaA a be aaaaA a be AaaaaaA

In the context of functional dependencies (FDs), understanding the relationship between attributes is crucial If one is familiar with the specific attributes in a dataset, it becomes possible to derive other related attributes When constructing FDs, it is important to recognize that sets of attributes are unordered and that the presence of duplicate attributes does not contribute additional information Many concepts related to sets can be articulated through FD rules, allowing us to view FDs as relational expressions among attributes, where various formulations can convey the same underlying meaning.

In the specified table, we designate 'a' as the primary key, as it defines the other attributes, resulting in a shorthand representation of aaAaAaAaaAaaAaa, with the primary key underlined This article outlines two key rules: the reflexive rule, which states that if you know 'a', you can deduce 'a', and since we treat attributes as sets, it also implies aaAaAa and aAaAaa The second rule, known as the augmentation rule, indicates that if you have aAaAa, then you can derive aaAaAa, meaning that knowing a value for 'A' allows you to find a corresponding value for 'a'.

B If a isAAaaaaaaaaaaa and a is a Aaaaa, aAaAa says if you give me a value for aaaaaaaaaaa, then I can fnd a aaaa for that aaaaaaaaa aa Te augmentation rule says if I augment the LHS (Lef Hand Side) of the defning expression (A), I can still correctly fnd the RHS (Right Hand Side) Adding information to the LHS really does noth- ing to enhance the FD Suppose we take an example: employee_no 101 implies aaaa is Jones Now, if we add information to the LHS, like aaaaaaaaaaa = 101, Date of Birth = 21 Nov 1958, what is the name? It is Jones, and the added information of birthday on the LHS is superfuous Still, the rule is valid, and it will help us when deriv- ing a key iii Te decomposition rule: If aAaAaaaa then aAaAaaAaAaAaa and aAaAa (Please do not confuse this FD rule with the decomposition of tables

The relational model utilizes several key rules to achieve normal forms, which are essential for database organization The decomposition rule allows us to define a relationship where one attribute can provide multiple pieces of information, such as a person's name and phone number The union rule states that if two relations are valid, their combination is also valid Conversely, the reverse of this rule, known as the decomposition rule, confirms that if one attribute defines another, it can be broken down into its components The transitive rule is useful for identifying keys among a complex set of attributes, allowing us to derive information about one attribute based on another Lastly, the subset rule highlights that if an attribute is part of a larger group, it retains the properties of the group, enabling us to deduce information from the subset These rules collectively enhance the integrity and efficiency of relational databases.

In this article, we address a problem involving a dataset represented in a table format and a set of functional dependencies (FDs) Our objective is to identify a single set of attributes that can uniquely define all other attributes within the table, effectively determining a primary key Once we establish this key, we will proceed to explore the various normal forms to optimize the database structure.

Step 1 Find a LHS (Left Hand Side) (of an FD) that is a minimal key for a

To identify a minimal key from a set of attributes, begin by applying the reflexive property on the left-hand side (LHS) and utilize all available attributes The goal is to discover a subset of these attributes that can define the remaining ones, thereby establishing a minimal key if it exists.

To identify a minimal left-hand side (LHS) in functional dependencies (FDs), start by searching for a concatenated LHS, if available In the absence of a concatenated key, select an attribute that can help define the maximum number of other attributes Although this selection process is somewhat subjective, if the initial choice proves ineffective, simply revisit and adjust your LHS After establishing a suitable LHS, proceed to analyze the results.

FD rules and see if we can define the other attributes from our chosen LHS

62 • Database Design Using ER Diagrams

In this analysis, we examine the structure of aaAaAa to determine if aaAaAaaaaa can be derived, suggesting that aa serves as a minimal key for the original attribute a It is important to begin with the largest concatenated left-hand side (LHS), as we require at least aaA to establish the functional dependencies that follow.

Step 2 Use the rules to find as many RHS attributes as possible with the candidate LHS Since aaAaAa, we know by the reflexive rule aaAaAaa; and then by the union rule, we combine these two:

Union rule: aaAaAaA(given), aaAaAaa (reflexive), then aaAaAaaaa

Use the other given FDs to find other attributes that depend on the chosen LHS from step 1

We have established that aaAaAaaa We were given aAaAaa

Subset rule: aaAaAaaa (derived previously), then aaAaAaaAA

Transitive rule: aaAaAaaAaAaAaa (given), then aaAaAaaa

Union rule: aaAaAaaaaAaaAaAaa, then aaAaAaaaaaa and since we are dealing with sets of attributes, aaAaAaaaa

Step 3 Repeat step 2 as necessary to get all attributes on the RHS from the chosen LHS if you can

We have all attributes on the RHS except a The other FD we are given is aAaAa Using the same line of thought and we have established aaAaAaaaa:

Subset rule: aaAaAaaaa, then aaAaAaaAA

Transitive rule: aaAaAaaAaAaAa (given), then aaAaAaaAA

Union rule: aaAaAaaaaaAaaAaAa, then aaAaAaaaaa

The process of identifying a minimal key using functional dependency (FD) rules is not a rigid algorithm, as individuals may approach the rules in different sequences, such as transitive or union rules first What matters most is establishing a clear and valid path from the left-hand side (LHS) to encompass all attributes leading to the determination of a minimal key.

The minimal key defined by our LHS is aa, leading us to assess the normalization of the database Since there are no repeating groups, the data is in First Normal Form (1NF) with the structure aaAaaAaaAAaaAaaAaa To achieve higher normalization, we should first eliminate transitive dependencies By decomposing aAaA, we can transform the structure from aAaAaaAaaAAaaAaaAaa to two separate components: aaAaAaaAaaAAaaAaa and aaAaAaAaAaa.

When all transitive FDs have been removed, look for partial dependen- cies While aa is the key of a, we have a aaaAaAaAaa (a partial depen- dency) because aa has aaAaAaa

Te Relational Model • 63 aaAaAaaAaaAAaaAaa decomposes to aaAaAaAaAaaAaa and aaAaAaaAaaAAaaAA

The final version of the database is aaAaAaAaAaaAA

To achieve effective decomposition, it is recommended to eliminate transitive dependencies before addressing partial dependencies While partial dependencies can be removed initially, it is crucial to retain the functional dependency associated with transitive dependencies and their left-hand side For example, the sequence aAaAaaAaaAAaaAaaAaa can be simplified to aaAaAaaAAaaAaa by removing partial dependencies, yet this leads to the issue of retaining the necessary relationships, as the combination of a and a is not present in either aa or aa.

You have to keep a and a together and hence decompose like this:

AaAaAaaAaaAAaaAaaAaaAbecomes aaAaAaaAAaaAaaAaa and aaAaAaaAaaAAaaAand then handle the transitive dependency aaAaAaaAAaaAaaAaa decomposes to aaAaAaaAAaaAaa and aaAaAaaAAaaaA

The final version is aaAaAaaAaaAAaaAA

T e Boyce–Codd Normal Form

68 • Database Design Using ER Diagrams

Chapter Summary

This chapter introduces relational databases, which consist of sets of rows organized into relational tables In these tables, the rows have no implied order and must be unique, ensuring no duplicates exist The concept of normal forms is discussed, highlighting their benefits, the issues caused by non-normalized tables, and the process of converting non-normalized tables into normal form tables.

Anomalies in database management arise from improper table maintenance operations, such as deletion, insertion, and updating, often due to redundancy and poor table design The equijoin operation demonstrates the reconstruction of decomposed tables, while functional dependency (FD) calculus aids in identifying keys and data decompositions Additionally, Boyce-Codd Normal Form (BCNF) is essential for eliminating redundancies and anomalies in non-BCNF tables, although this process may obscure functional dependencies.

Exercises

If aAaAa, can you say aAaAa? Why or why not ?

Decompose the following data into 1NF tables:

Khanna, 123 4th St., Columbus, Ohio {Delhi University, Calcutta Uni- versity, Ohio State}

Ray, 4 Moose Lane, Pensacola, Florida {Zambia University, University of West Florida}

Ali, 88 Tiger Circle, Gulf Breeze, Florida {University of South Alabama, University of West Florida}

Sahni, 283 Penny Street, North Canton, Ohio {Wooster College, Mount Union College}

Consider the data in Table 3.30

Exercise 3.3 Table name address city state car color year

In order to achieve third normal form (3NF) for the given data, we need to identify and separate the entities involved The data includes individuals, their addresses, car brands, colors, and model years The unique entries are: Smith, residing at 123 4th St, Pensacola, FL, owns a blue 2019 Mazda and a red 2018 Nissan; Jones, living at 4 Moose Lane, Santa Clive, CA, has a red 2019 Lexus; and Katie, located at 5 Rain Circle, Fort Walton, FL, drives a white 2019 Taurus Additionally, there are two sets of foreign data (FDs) that need to be addressed By organizing this information into distinct tables for individuals, addresses, and vehicles, we ensure that each entity is represented without redundancy, adhering to the principles of database normalization.

Armstrong, W (1974 August 5–10) Dependency structures of data base relationships

Proceedings of the IFIP Congress Stockholm, Sweden

Bagui, S., and Earp, R (2011) Essentials of SQL Using SQL Server 2008 Burlington, MA: Jones and Bartlett

Chen, P.P (1976) Te entity-relationship model—toward a unifed view of data ACM

Codd, E (1970) A relational model for large, shared data banks Communications of the

Codd, E (1972) Further Normalization of the Data Base Relational Model Republished in Randall J Rustin (ed.), Database Systems: Courant Computer Science Symposia Series

Codd, E (1974 August 5–10) Recent investigations in relational database system Proceedings of the IFIP Congress Stockholm, Sweden

Date, C (2003) An Introduction to Database Systems Reading, MA: Addison-Wesley Earp, R., and Bagui, S (2021) A Practical Guide to Using SQL in Oracle, 3rd ed Redding, CA: BVT Publishing

Elmasri, R., and Navathe, S.B (2016) Fundamentals of Database Systems Reading, MA: Addison-Wesley

Maier, D (1983) Te Teory of Relational Databases New York, NY Computer Science Press Norman, R.J (1996) Object-Oriented Systems Analysis and Design Upper Saddle River, NJ: Prentice Hall

Schach, S.R (2011) Object-Oriented and Classical Sofware Engineering New York, NY: McGraw-Hill.

The Basic ER Diagram : A Data Modeling Schema

Introduction

This chapter introduces a data modeling approach and explores entity-relationship (ER) diagrams, focusing on key concepts such as entities, attributes, relationships, and keys It outlines the first three steps of an ER design methodology, starting with the creation of a one-entity diagram in Step 1, followed by the use of structured English to describe the database in Step 2, and concluding with the mapping of the ER diagram to a relational database in Step 3 These concepts—ER diagrams, structured English, and mapping—are developed throughout the book Additionally, the chapter initiates a running case study that will be expanded in subsequent chapters.

What Is a Data Modeling Schema?

A data modeling schema is a technique used to visually represent a database, often through graphic diagrams, although other communication methods can be effective as well For those outside the computer field, such diagrams may be unfamiliar The Entity-Relationship (ER) diagram serves as a key tool for data modeling, classified as a type of semantic model that aims to convey meaning from data arrangements While ER diagrams are widely used and popular, they are not the only semantic modeling tools available.

When we discuss the contents of a database, the data model helps us decide which piece of data goes with which other piece(s) of data on a

Database design using ER diagrams involves understanding various levels of abstraction in discussing databases One fundamental aspect is the ability to analyze and categorize data, such as the filing of "names," at different conceptual levels This approach enhances clarity and organization in database management.

(a) Abstractly, “We will fle names of people we know.”

We will collect the first, middle, and last names of individuals we know, ensuring each name is limited to 20 characters This data will be organized in alphabetical order by last name and formatted into a spreadsheet within package x.

When designing a database, the initial step involves creating an abstraction and refining it further Staying detached from the specifics of logical models, such as relational, hierarchical, and network structures, as well as physical implementations like fields and files, facilitates easier modifications to the model This approach allows for more flexible decisions regarding the eventual physical storage of data It's important to distinguish between physical data, represented by fields and files, and conceptual data during this process.

Mapping is the process of choosing a logical model and then moving to a physical database fle system from a conceptual model (the ER diagram)

To effectively retrieve data from a database, a physical file containing the data is essential Mapping serves as the crucial link between the design concept and its physical implementation This book emphasizes the relational database model, which enables users to overlook the physical storage of data thanks to the capabilities of database software.

4.2.1 What Is an Entity-Relationship Diagram?

The ER diagram serves as a semantic data modeling tool designed to abstractly represent data, resulting in what is known as a conceptual model This conceptual model is essential for effectively understanding and organizing data structures.

A schema represents a fixed and permanent description of data structure Once we confirm that our conceptual model accurately reflects reality, as depicted in our Entity-Relationship (ER) diagram, we can designate it as a schema.

An ER diagram serves as a valuable tool for documenting existing databases through reverse engineering Initially, we emphasize the application of ER diagrams in modeling databases that are yet to be created, while the process of reverse engineering is addressed subsequently.

An entity-relationship diagram (ERD) visually represents data through entities and their relationships Entities refer to objects or concepts that hold data, such as a person, a bank account, or a building This foundational concept was first introduced by Chen, emphasizing the importance of structuring data effectively.

(1976) described an entity as a “thing which can be distinctly identif ed.”

An entity refers to a person, place, object, event, or concept for which data is stored Relationships denote the connections between these entities Attributes represent the names of the data associated with each entity or relationship.

An entity represents a type or class of something and should be named accordingly Te following are some examples of entities:

• Examples of a person entity would be aaaaaaaaaAaaa, orAaaaaaaaaA

• Examples of a place entity would be aaaaa or aaaaaaa

• Examples of an object entity would be aaaaaaaa, aaaa, or aaaaaaa

• Examples of an event entity would be aaaaa, aaaaaaa, or aaaaaaaaaaaa

• Examples of a concept entity would be aaaaaaa or aaaaaaaaaa

When naming a business entity, it's essential to choose a name that is generic enough to accommodate future changes For instance, if starting with a donut shop, a name like "aaaaa" may initially seem fitting However, if the business is likely to expand into a broader range of pastries, a more inclusive name like "aaaaaa" would be more appropriate It's important to avoid overly generic terms like "business," as they do not accurately reflect the specific products, such as donuts or pastries, that the entity will focus on.

In older data-processing circles, we would have referred to an entity as a

The term "record" is often too physical and limiting, as it evokes the image of a tangible object In the context of databases, we typically store collections of data about various entities, referred to as entity sets This terminology moves away from the older concept of "sets of files," which suggests a focus on physical items, allowing us to embrace a more abstract understanding of data organization.

Def ning a Database—Some Def nitions: Entity, Relationship, and Attribute

Database design utilizing ER diagrams involves understanding the entity set (fle-set) and the concept of a record (entity) For instance, in a company with customers, there exists a customer entity set that comprises individual customer entities.

Entities can be categorized as broad, such as a PERSON, or narrowed down for specific applications, like a STUDENT or CUSTOMER Broad entities are known as generalizations, while narrower ones are referred to as specializations At the application level, we focus on entities without considering subgroups or supergroups Capturing data about a specific entity is termed an instance, which represents a single occurrence of that entity For instance, if we create an entity labeled "aaaa" to record data about a screwdriver, that screwdriver record becomes an instance of "aaaa." Each instance must be uniquely identifiable, distinguishing it from other instances of the same entity type, often achieved through a unique identifier known as a key, such as a customer number in a customer entity set.

A relationship refers to the connection or association between entities, typically expressed through verb phrases In this chapter and the next, we will delve deeper into the concept of an entity before revisiting the idea of relationships in Chapter 6, ensuring a solid understanding of entities first.

An attribute is a property or characteristic of an entity For example, an entity, aaaaaaaaaa, has attributes aaaaAaAAaaaaaAaAAaaaaaaaaaa, and so on

Database modeling starts by defining the data to be stored, typically gathered from the user For instance, when Ms Smith from Acme Parts Company requests a database design for her company's parts, she serves as the user while you take on the role of the database designer The information Ms Smith provides about the parts her company sells, manufactures, or purchases will form the foundation of the database description.

To begin creating a database, it is essential to identify a central "primary" entity that serves as the main category for data storage For instance, if the focus of the database is on students, this primary entity will encompass all relevant information pertaining to student data.

The initial step in creating a Basic ER Diagram involves selecting a primary entity, which is always represented in the singular form Once the primary entity is identified, the next phase is to gather relevant information or attributes that need to be recorded about it This process of choosing a primary entity from the data description marks the beginning of the requirements phase in software engineering for database development.

After selecting the primary entity, the next step is to determine the specific information to record about it For example, in the case of a student entity, we might include attributes such as name, student number, major, and address These attributes serve to qualify, identify, classify, or describe the state of the entity It’s essential for users to provide input on what data should be stored to ensure relevance and accuracy.

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity

The requirements definition phase is the initial step in software engineering, where the systems analyst seeks to understand the user's needs After identifying a primary entity and its attributes, the focus shifts to gathering and analyzing the necessary information to create a successful software solution.

(a) Draw a diagram of our frst impression entity (our primary entity) (b) Translate the diagram into English

(c) Present the English (and the diagram) back to the user to see if the database designer and user understand one another If we do, we then progress on

In software engineering, the feedback step is crucial during the requirements and specification phases, as it helps refine the project based on input from both users and analysts This feedback loop is essential for accurately representing the desired outcomes Initially, we will illustrate how to create the entity diagram, followed by guidelines on translating the diagram into clear, understandable English.

76 • Database Design Using ER Diagrams

1 Of the following items, determine which could be an entity and state why: automobile, college class, student, name of student, book title, number of dependents

2 Why are entities not called fles or records?

4 Why do we need entity-relationship diagrams?

5 What are attributes? List attributes of the entities you found in question 1?

A First “Entity-Only” ER Diagram

In our example, we focus on a primary entity from a student information database, specifically a student This highlights the importance of storing relevant information about each individual entity within the system.

In this chapter, we will not concern ourselves with any other entities but rather add them in later chapters

When considering the attributes of a student, we identify key elements such as their name, address, and educational connection, commonly referred to as their school These three attributes—name, address, and school—serve as essential identifiers for the entity known as a student.

We initiate our exploration of ER diagrams with a "Chen-like" model, originally introduced by Chen in 1976 Over the years, Chen and other contributors have enhanced the ER diagram methodology, although a standardized version has yet to be established.

The Chen-like ER diagram model and its variants are widely used in database design due to their intuitive nature, allowing users to grasp the design without needing to understand the underlying logical model In contrast, models such as the Barker/Oracle-like model, discussed in Chapter 12, necessitate a comprehensive understanding of the relational model, as their diagrams are influenced by relational concepts.

To begin, in the Chen-like model, we will do as Chen originally did and put the entities in boxes and show attributes nearby One way to depict

In an ER diagram, attributes can be represented by circles or ovals connected to their respective entity boxes, as illustrated in Figures 4.1a and 4.1b Alternatively, Figure 4.1c showcases a more compact style for depicting attributes, which, while less descriptive, can be beneficial for maintaining clarity in Chen-like diagrams that may otherwise appear cluttered.

Chen-like Model: Entity with Attributes

STUDENT Entity with T ree Attributes

78 • Database Design Using ER Diagrams

The "attribute in an oval" model, commonly known as the Chen-like model, is widely recognized for its clarity and adaptability in representing attributes, as illustrated in Figures 4.2A, 4.2B, and 4.2C While the standard Chen-like model effectively conceptualizes relationships with its ovals and boxes, alternative concise forms, such as those in Figure 4.1C, provide practical benefits for documentation, especially when space is limited These alternate representations can be easily derived from the standard model, enhancing their utility in various contexts.

Second Alternative Model for ER Diagram

Tird Alternative Model for ER Diagram

Fourth Alternative Model for ER Diagram

T e aaaaaaa Entity with Five Attributes

The ER diagram depicted in Figures 4.1B and 4.1C features a single entity, aaaaaaa, which is characterized by three attributes: aaaaaAaaaaaaaAa and aaaaaa Should additional attributes, such as aaaaa and aaaaa, be incorporated into our conceptual model, they would be integrated into the existing entity aaaaaaa, as illustrated in Figure 4.3.

More about Attributes

Attributes are essential characteristics that provide descriptive details about entities They can be categorized into several types, including simple or atomic, composite, multivalued, and derived attributes Each attribute has specific properties, such as its name, description, format, size, and atomicity Additionally, some attributes serve as unique identifiers for entities, with key attributes being particularly important as they uniquely identify an entity.

4.5.1 The Simple or Atomic Attribute

Simple or atomic attributes cannot be broken down further or subdivided— hence the notion “atomic.” One may examine the domain of values 2 of

In database design, attributes can be categorized as simple or atomic, with a prime example being a Social Security number, which is unique and indivisible for each individual.

The determination of whether an attribute is simple or atomic relies on the specific circumstances faced by the database designer and the user's requirements For instance, a phone number can be represented as a straightforward nine-digit number in one database design, while in another context, it may be beneficial to separate the phone number into two components: the area code and the seven-digit number.

The classification of attributes in a database can determine whether they are simple or atomic, as illustrated by the birthdate attribute In a veterinary hospital database, it is beneficial to separate the birthdate into month, day, and year to accurately assess the age of young animals, making it a composite attribute Conversely, in a database for racehorses, where only the birth year is relevant, the birthdate can be considered atomic since it only consists of the year.

If an attribute is nonatomic, it needs to be depicted as such on the

ER diagram Te following sections deal with these more complicated, nonatomic attribute ideas: the composite attribute and the multivalued attribute

A composite attribute, also known as a group attribute, is formed by aggregating related attributes and should be given descriptive and general names While the term "name" suffices for a general description, specifying the components of the attribute can be beneficial Most data-processing applications break down the name into its constituent parts, leading to the classification as a composite attribute or aggregate This is typically composed of sub-attributes, which are detailed components of the composite attribute In ER diagrams, particularly in the Chen-like model, composite attributes are visually represented, highlighting their sub-attributes.

In database terminology, a composite attribute, such as a name, is composed of simpler, atomic, or elementary attributes To prevent confusion with the term "aggregate," which has a different meaning in certain database query languages, we specifically refer to these attributes as composite rather than aggregates.

The determination of whether an attribute is composite relies heavily on the specific needs and preferences of the database user For instance, in certain databases, the precise details of location, such as city, state, or zip code, may not be deemed essential.

PERSON comes from, so an address attribute in that database may not be broken up into its component parts; it simply may be called address

In another database, it may be important to know which city and state a

To enhance the database structure, the address attribute should be divided into distinct components: street address, city, state, and zip code, thereby transforming it into a composite attribute.

Another type of non-simple attribute to be managed is called a multival- ued attribute Te multivalued attribute, as the name implies, may take

Database design can effectively utilize ER diagrams to represent multivalued attributes, such as a person's school attendance For instance, if an individual has attended multiple schools, the attribute "school" can accommodate this complexity, enhancing the database's ability to capture relevant information accurately.

As a counter example, most people go by only one name; hence, the name is not multivalued.

The multivalued attribute "aaaaaa" is represented in the Chen-like model as a double oval, indicating that a database can store information about students who may have attended multiple "aaaaaa." However, this does not imply that "aaaaaa" is always a multivalued attribute, as it can also be single valued in certain databases In some cases, "aaaaaa" may refer to the current or most recent "aaaaaa" rather than all institutions attended If the data subjects can only attend one "aaaaaa" at a time, then "aaaaaa" should be treated as a single valued attribute.

The choice between single-valued and multivalued attributes in database design hinges on the specific needs of the user For instance, if the attribute "aaaaaa" represents "current school," it is advisable to label it as "current school" and treat it as a single-valued attribute Conversely, a multivalued attribute, as depicted in Figure 4.5A, indicates that multiple schools can be associated with each student.

Derived attributes are not explicitly recorded but can be inferred or calculated from existing data within a database For instance, an age can be derived from a stored birth date In the Chen-like model, these derived attributes are represented by a dashed oval, illustrating their calculated nature.

A database serves as a storage system for data that can be easily retrieved In this context, a key is an attribute that helps identify a specific occurrence of an entity While modeling databases using Entity-Relationship (ER) models, certain attributes often appear to function as natural keys.

FIGURE 4.5A aaaaaaa Entity with a Multivalued Attribute

FIGURE 4.5B aaaaaaa Entity with a Derived Attribute: Age

84 • Database Design Using ER Diagrams

A candidate key serves as a unique identifier for an entity, and when selected, it becomes the primary key for that entity.

In database design, a candidate key serves as a unique identifier for an entity, ensuring that each record can be distinctly recognized For instance, while attributes like last names may not guarantee uniqueness—since siblings often share the same surname—assigning a unique student number to each student allows schools to efficiently retrieve individual student records The primary purpose of a key is to provide a reliable means of locating a specific entity instance within a dataset.

English Description of the Entity

To provide users with valuable feedback on the entity, we will create a clear English description that summarizes its attributes Rather than requiring users to analyze the entity diagram, this structured English grammar will convey the essential information in an accessible manner, reflecting the analyst's insights.

The template for structured English regarding single entities is defined as follows: Let ENTITY represent the name of the entity, and att(j) denote its attributes The sequence of these attributes is flexible, with j assigned as 1, 2, and so on, chosen arbitrarily; however, once an order is established, it remains fixed If there are n attributes identified, the generalized English equivalent of our diagram will be presented subsequently.

Tis database records data about Entity For each ENTITY in the database, we record att(1), att(2), att(3), att(n)

For each ENTITY , there always will be one and only one att(j) T e value for att(j) will not be subdivided

For each ENTITY , we will record att(j), which is composed of x, y, z, ( x, y, z ) are the component parts of att(j)

88 • Database Design Using ER Diagrams

For each ENTITY , we will record att(j)’s Tere may be more than one att(j) recorded for each ENTITY

For each ENTITY , there may exist att(j)’s, which will be derived from the database

(a) More than one candidate key (strong entity):

For each ENTITY, we will have the following candidate keys: att(j), att(k), (where j , k are candidate key attributes)

(b) One candidate key (strong entity):

For each ENTITY , we will have the following primary key: att(j)

(c) No candidate keys (weak entity):

For each ENTITY , we do not assume any attribute will be unique enough to identify individual entities without the accompanying reference to owner- ENTITY (i.e., some other entity ) 3

(d) No candidate keys (intersecting entity): Tis is discussed next

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Step 3 Show some sample data

Sample data usually helps describe the database to the user as it is per- ceived by the analyst

We now revisit each of our fgures and add an English description to each

Upon reviewing Figure 4.3, it is evident that there are no multivalued or composite attributes present The attributes are defined as follows: att(1) corresponds to the name, att(2) to aaaaaa, and so forth, with the assignments made arbitrarily The subsequent section will provide an English translation of the entity diagram utilizing the specified templates.

Tis database records data about aaaaaaas For each aaaaaaa in the data- base, we record a aaaa, a aaaaaa, an aaaaaaaAaAaAAaaaaaAaaaaaaAa and a aaaaa

For each aaaaaaa, there will be one and only one aaaa T e value for aaaa will not be subdivided (note that in Figure 4.3 we did not divide name).

For each aaaaaaa, there will be one and only one aaaaa T e value for aaaaa will not be subdivided

For each aaaaaaa, there will be one and only one aaaaaaa T e value for aaaaaaa will not be subdivided

For each aaaaaaa, there will be one and only one aaaaaa T e value for aaaaaa will not be subdivided

For each aaaaaaa, there will be one and only one aaaaa T e value for aaaaa will not be subdivided

For each aaaaaaa, we do not assume any attribute will be unique enough to identify individual entities (Remember we are describing Figure 4.3.)

In addition to these descriptions, some sample data is ofen helpful in showing the user what you have proposed Sample data for Figure 4.3 is shown in Table 4.1

Initial Sample Data for Figure 4.3 name major address school phone

Smith Cosc 123 4th St St Helens 222–2222 Jones Acct 222 2nd St PS 123 333–3333 Saha Eng 284 3rd St Canton 345–3546 Kapoor Math 20 Living Cr High 435–4534

90 • Database Design Using ER Diagrams

While the descriptions may appear repetitive and overly structured, we aim to provide clarity and avoid ambiguity for users who may not want to interpret diagrams.

Now, consider Figure 4.4 T is fgure has a composite attribute, aaaa T e English description of this entity diagram is next

Tis database records data about aaaaaaas For each aaaaaaa in the database, we record a aaaa, a aaaaaa, and an aaaaaaa

For each aaaaaaa, there will be one and only one aaaa T e value for aaaa will be subdivided into AaAaaaAaaaaAaAAaaaaAaaaa, and aaaa aaaaaaaaaaAa

For each aaaaaaa, there will be one and only one aaaaaaa T e value for aaaaaaa will not be subdivided

For each aaaaaaa, there will be one and only one aaaaaa T e value of the aaaaaa will not be subdivided

For each aaaaaaa, we do not assume any attribute will be unique enough to identify individual entities

Sample data for Figure 4.4 is shown in Table 4.2

Initial Sample Data for Figure 4.4 name.fi rst name.last name.mi school address

Richard Earp W U Alabama 222 2nd St

Boris Backer Heidelberg 333 Dreistrasse Helga Hogan H U Hoover 8 8 Half Moon Ave Arpan Bagui K Northern School 33 Bloom Ave Hema Malini South Bend 100 Livingstone

Next consider Figure 4.5A T is fgure has a composite as well as a mul- tivalued attribute Te English translation of this entity diagram is given next

For the entity, this database records data about aaaaaaas For each aaaaaaa in the database, we record a AaaaaAaAAaaaaaaAaa and an aaaaaaaA

For each aaaaaaa, there is a unique aaaa The value for aaaa is categorized into aAaaaAaaaaAaAAaaaaAaaaa and aaaa aaaaaaaaaa To illustrate the division of names, we utilize dot notation to indicate the origins of each attribute component, resulting in aaaaaaAaaaaAaaaaaaaaaaA aaaaaaaAa.

For each aaaaaaa, there will be one and only one aaaaaaa T e value for aaaaaaa will not be subdivided

For each aaaaaaa, we will record aaaaaaAa Tere may be more than one aaaaaa recorded for each student

For each aaaaaaa, we do not assume any attribute will be unique enough to identify individual entities

Sample data for Figure 4.5a is shown in Table 4.3

Consider Figure 4.6 T is fgure has composite, multivalued, and key attri- butes Te English translation of this entity diagram is as follows

Initial Sample Data for Figure 4.5A name.fi rstname.lastname.mischool address

Richard Earp resides at 222 2nd St in Mountain, Alabama, while Boris Backer is located at 333 Dreistrasse in Heidelberg, Volcano Helga Hogan can be found at 88 Half Moon Ave in St Helens, Hoover Arpan Bagui is based at 33 Bloom Ave in Northern School, and Hema Malini lives at 100 Livingstone in South Bend.

92 • Database Design Using ER Diagrams

Tis database records data about aaaaaaas For each aaaaaaa in the database, we record a aaaaAaA AaaaaaaAa, an aaaaaaa, and a aaaaaaaaaaaaaa

For every aaaaaaa, there exists a unique aaaa, with its value divided into aaaaaaAaaaaAaaaaaaaaaaAaaaaaaaA Similarly, each aaaaaaa corresponds to a single aaaaaaa, and its value remains undivided.

For each aaaaaaa, we will record aaaaaaAa Tere may be more than one aaaaaa recorded for each student

For each aaaaaaa, there is an attribute—aaaaaaaaaaaaaa —unique enough to identify individual entities

Finally, consider Figure 4.7 (top f gure) T is fgure shows a strong entity

We combine the grammar to keep the methodology from being overly repetitive Te English translation of this entity diagram follows

Tis database records data about aaaaaaaaaas For each aaaaaaaaaa in the database, we record a aaaa, aaaaaaaaaa, aaaa, aaaaa, and aaaaaaaaaa

Each AUTOMOBILE will have one and only one AaaaaaA aaaaaaaaaaaA aaaaAaAAaaaaaAaAand aaaaaaaaaa None of these attributes will be subdivided

For each aaaaaaaaaa, the attribute, aaaaaaaaaa, will be unique enough to identify individual entities.

Figure 4.7 illustrates a weak entity, which is characterized by the absence of a key that is typically present in a strong entity.

Before leaving this introductory chapter on ER diagrams, we show the other major component of ER diagrams Figure 4.8 shows a relationship

This article discusses an ER Diagram of a database that illustrates the relationship between two entities: an aaaaaaaaaa and a aaaaaaa Chapter 6 provides an in-depth exploration of the concept of relationships within ER Diagrams, highlighting how they add dynamic action to the visual representation For instance, as depicted in Figure 4.8, the relationship may indicate that aaaaaaas drive aaaaaaaaaas.

94 • Database Design Using ER Diagrams

Our ER design methodology has evolved to the following so far:

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity Label keys if appropriate

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Step 3 Show some sample data.

Mapping the Entity Diagram to a

A relational database is a structured collection of two-dimensional tables, known as relations, which consist of rows (or tuples) and columns (attributes) In this system, all attributes must be atomic, and keys cannot be null Importantly, users do not need to be aware of the physical storage location of the data on devices like disks, as the focus is on the logical organization of data within these tables.

The conversion of an ER diagram into a database is known as mapping This article focuses exclusively on the relational model, and as we progress through the chapters, we will establish mapping rules to transform ER diagrams into relational databases effectively.

We start with a rule to map strong entities

When mapping strong entities, create a new table for each strong entity and designate the specified key as the primary key of that table If the entity has multiple candidate keys shown in the ER diagram, select one to serve as the primary key This newly created table will be referred to as TABLE1.

Next, we must map the attributes into the strong entity’s table, TABLE1 Mapping rules are diferent for atomic attributes, composite attributes,

Te Basic ER Diagram • 95 and multivalued attributes First, we present the mapping rule for map- ping atomic attributes

Mapping Rule 2 involves mapping entities with atomic attributes to a table, creating individual columns for each atomic attribute In this context, we will specifically map the atomic attributes linked to TABLE1.

In discussing relational tables, it is common to abbreviate the diagram with a notation like this:

A relational database realization of the entity diagram in Figure 4.3 would look like this:

In the provided example, Table 4.4 illustrates a relation named "aaaaaaa," where the attributes from the entity diagram serve as column headings This table showcases sample data, representing the expected format and type of information within a relational database It is important to note that the order of the columns does not affect the relational database, as consistency in the chosen order is key This example aims to help database analysts effectively communicate their vision of the database structure to users.

Sample Data for Figure 4.3 name phone school address major

Jones 932-5100 U Alabama 123 4th St Chemistry Smith 932-5101 U Mississippi 123 5th St Math

Adams 932-5102 LSU 123 6th St Agriculture Sumon 435-0997 UWF 11000 Univ Cyber Sec Mala 877-0982 Mount Union U Alliance History

96 • Database Design Using ER Diagrams

In relational databases, it is essential that all columns are atomic, which means they should not contain nonatomic attributes To ensure atomicity when dealing with composite attributes, we must break them down and record only their individual components This approach is crucial for effective mapping to the relational database structure.

Mapping rule 3—Mapping composite attributes For entities with composite attributes, map entities to a table and form columns of each elementary (atomic) part of the composite attributes

Refer to Figure 4.4 A relational database, which corresponds to the entity diagram in Figure 4.4, would be: aaaaaaaaAaaaaaaAaaaaAaaaaaaaaaaAaaaaaaaAaAAaaaaaaAaAAaaaaaaaAaA

In relational databases, composite attributes are typically represented using shorthand notation, where the attribute is expressed with dot notation, such as aaaaa.aAaaaA In this context, the term "aaaa" serves as a qualifier for the various components of the composite attribute.

With some sample data for Figure 4.4 in Table 4.5

A multivalued attribute is illustrated in Figure 4.5A, showcasing an entity diagram where the aaaaaaa entity features a composite attribute, aaaa, alongside a multivalued attribute, aaaaaa This indicates that students can have multiple aaaaaa entries associated with their row The data depicted in Figure 4.5A is comparable to the format presented in Table 4.6.

Sample Data for Figure 4.4 name.fi rst name.last name.mi school address

Richard Earp W U Alabama 222 2nd St Boris Backer Heidelberg 333 Dreistrasse Helga Hogan H U Hoover 88 Half Moon Ave Arpan Bagui K Cambridge 33 Bloom Ave Hema Malini Fashion U 100 Livingstone

Sample Data for Figure 4.5A name.first name.last name.mi address school

Richard Earp W 222 2nd St U Alabama, St

Boris Backer 333 Dreistrasse Heidelberg, Manatee

U, UCF, UWF Helga Hogan H 88 Half Moon U Hoover, Mount

Ave Union U, Manatee U Arpan Bagui K 33 Bloom Ave Cambridge, USF,

Harvard Hema Malini 100 Livingstone Fashion U, Milan U

A relational table requires that every attribute be atomic; however, the presence of the aaaaaa attribute disqualifies it from being a relational table To properly map this multivalued attribute in an atomic manner, we adhere to specific mapping rules.

Mapping Rule 4 involves handling multivalued attributes by creating a separate table dedicated to these attributes Each value of the multivalued attribute is recorded in a new row, alongside the key from the original table The key for this new table is formed by concatenating the multivalued attribute with the key of the owner entity Finally, the multivalued attribute is removed from the original table to maintain data integrity.

According to mapping rule 4, a key is necessary to effectively map multivalued attributes To illustrate this, we utilize Figure 4.6, which can be represented through two distinct relations.

Some sample data would be as shown in Tables 4.7A and 4.7B

98 • Database Design Using ER Diagrams

Sample Data for aaaaaaa student_number name.fi rst name.last name.mi address

111–11–2222 Richard Earp W 222 2nd St 222–11–2222 Boris Backer 333 Dreistrasse 234–45–4567 Helga Hogan H 88 Half Moon Ave 888–77–9990 Arpan Bagui K 33 Bloom Ave 123–45–4321 Hema Malini 100 Livingstone

Sample Data for aaaaaaaaaaaaaa student_number school

St Helens Mountain Volcano Heidelberg Manatee U UCF UWF

U Hoover Mount Union U Manatee U Cambridge USF Harvard Fashion U Milan U

In relational databases, each row in a table consists of atomic attributes, ensuring that every row is unique Consequently, the candidate key for any table encompasses all attributes collectively While it is often possible to identify a subset of these attributes as a key, the uniqueness of each row guarantees that the complete set of attributes also serves as a valid candidate key.

1 How do you map multivalued attributes?

2 How do you map composite attributes?

3 What is a unique identifer? Is it a candidate key? Is it “the” pri- mary key? Discuss

Chapter Summary

This chapter primarily focused on the development of the entity concept and the creation of a one-entity diagram using the Chen-like model It also addressed the concept of attributes and detailed how to map a one-entity diagram to a relational database Additionally, the grammar for a one-entity diagram and its attributes was established, with further development planned for subsequent chapters The next chapter will explore the creation of a second entity and its relationship with the primary entity.

Exercises

Note: Te user should clarify the assumptions made when reporting their work

To create a comprehensive database for businesses, each entity will include essential attributes such as the business name, address, contact information, and a list of employee first names An ER diagram using the Chen-like model will visually represent these relationships, showcasing how each business connects to its employees The English description of the diagram will clarify these relationships and highlight any assumptions made during the design process, such as the decision to treat employee names as a separate entity linked to the business Finally, the diagrams will be mapped to a relational database structure, ensuring efficient data organization and retrieval.

In analyzing the database, composite attributes can be identified as those that consist of multiple components, such as a full address encompassing street, city, state, and zip code Multivalued attributes are those that can hold multiple values for a single entity, like a person's phone numbers or email addresses Additionally, derived attributes may exist, which are calculated from other attributes, such as a person's age derived from their birth date Effective keys for the database should include unique identifiers, such as a social security number or a customer ID, to ensure each record can be distinctly recognized.

To create a comprehensive database of the books on your shelf, focus on capturing essential details such as the author's last name and the course number associated with each book Begin by designing an ER diagram using the Chen-like model, which visually represents the relationships between entities in your database Following the diagram, provide a clear English description that outlines the connections and attributes of the entities involved Finally, compare the results of your ER diagram to ensure accuracy and coherence in your database design.

100 • Database Design Using ER Diagrams

English to your diagrams and state any assumptions you made when drawing the diagrams

In the given database, composite attributes may include those that can be divided into smaller sub-attributes, such as a full name consisting of first and last names Multivalued attributes could be represented by fields like phone numbers or email addresses, where multiple entries are possible for a single entity Derived attributes might involve calculations based on existing data, such as age derived from a birth date Suitable keys for the database would be unique identifiers that ensure each record can be distinctly recognized, such as an ID number Finally, mapping the diagram to a relational database would involve structuring these attributes into tables with defined relationships, ensuring data integrity and efficient retrieval.

The recently opened West Florida Mall in Pensacola, Florida, has quickly become a popular destination for shoppers and retailers alike As part of a larger chain of malls, it is essential for West Florida Mall to establish a comprehensive database to effectively manage information related to its stores, owners, and employees To initiate the development of this database, the first step involves designing an Entity-Relationship (ER) diagram that captures the initial user specifications for the mall's management system.

To effectively manage a mall, it is essential to document key details about the mall and its stores, including the mall's name and address Each mall must house one or more stores at any given time, making accurate record-keeping crucial for operational success.

To effectively manage each store, we will collect essential information including a unique store number, the store's name, its location (room number), departments, the owner's name, and the manager's name Each store can have multiple departments, each managed by a department manager, while the store itself will have a single overall manager Additionally, each store is owned by one owner and is situated in only one mall.

A store manager is responsible for overseeing a single store, and it is essential to document specific details about them, including their name, Social Security number, the store they manage, and their salary.

The store owner is an individual responsible for managing the establishment, and we will collect essential information such as their name, address, and office phone number A store owner must possess at least one store, but they may own multiple stores as well.

As per step 1 in designing the ER diagram, we have to select our primary entity and then the attributes for our primary entity (step 1 is shown next):

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity

We will choose aaaa as our primary entity

Our next step is to translate the diagram into English:

Step 2 Use structured English for entities, attributes, and keys to describe the database that has been elicited

Tis database records data about a aaaa

For each aaaa in the database, we record a aaaa, an aaaaaaa, and aaaaaaaaaaAa

For each aaaa, there will be one and only one aaaa Te value for aaaa will not be subdivided

For each aaaa, there will be one and only one aaaaaaa Te value for aaaaaaa will not be subdivided

For each aaaa, record aaaaaaaaaaAa Tere may be more than one aaaaaaaaaa recorded for each aaaa Te value of each aaaaaa aaaa will not be subdivided

For each aaaa, we assume the mall aaaa will be unique

T e aaaa entity is shown in Figure 4.9

In this case study, we have identified a primary entity, aaaa, and outlined its key attributes We utilized structured English to effectively describe both the entity and its attributes Following this, we will map the entity diagram to a relational database to enhance data organization and accessibility.

102 • Database Design Using ER Diagrams

Mapping the Entity to a Relational Database aaaa is a strong entity, so we use mapping rule 1, which states:

Develop a new table (relation) for each strong entity and make the indicated key of the strong entity the primary key of the table

If more than one candidate key is indicated on the ER diagram, choose one for the primary key

We develop a new relation for the entity, aaaa (as shown in Figure 4.9), and aaaa will be our primary key Data represented by Figure 4.9 might look like Table 4.8

The presence of a multivalued attribute, Aaaaaaaaaaa, in aaaa disqualifies the table from being considered relational, as aaaaaaaaaa is not atomic According to mapping rules, multivalued attributes must be handled appropriately to maintain relational integrity.

Form a separate table for the multivalued attribute Record a row for each value of the multivalued attribute together with the

Sample Data for Figure 4.9 name address store_name

West Florida N Davis Hwy, Pensacola, FL Penney’s, Sears,

Cordova Mall 9th Avenue, Pensacola, FL Dillards, Parisian,

Circuit City Navy Mall Navy Blvd, Pensacola, FL Belks, Wards, Pearl

Vision BelAir Mall 10th Avenue, Mobile, AL Dillards, Sears,

In an ER diagram, the key of a new table derived from a multivalued attribute will be formed by concatenating the multivalued attribute with the key of the owner entity, while the original multivalued attribute should be removed from the original table.

By applying this mapping rule, two relational tables will be created, with the key of the first table, which contains the multivalued attribute, highlighted by both 'aaaa' and 'aaaaaaaaaaA.'

And data would look like Table 4.9

And the table with the multivalued attribute, Table 4.10

Sample Data for aaaa name address

West Florida Mall 1234 N Davis Hwy, Pensacola, FL Cordova Mall 613 9th Avenue, Pensacola, FL

Navy Mall 31458 Navy Blvd, Pensacola, FL BelAir Mall 12 10th Avenue, Mobile, AL

Sample Data for aaaaaaaaaa name store_name

West Florida Mall Popcorn Store

104 • Database Design Using ER Diagrams

Sample Data for aaaaaaaaaa name store_name

Tis case study is continued at the end of the next chapter

C J Date (1995) in his book "An Introduction to Database Systems" emphasizes the use of the term "property" over "attribute" due to its broader applicability; however, we opt for "attribute" as it is more widely recognized in common usage.

2 Te domain of values is the set of values that a given attribute may take on T e domain consists of all the possible legal values that are permitted on an attribute

A data type encompasses a range of attributes and defines the permissible operations associated with them In the context of database creation, the emphasis is primarily on storage and retrieval, leading to a focus on the domain of values within database data types.

3 Te details of the weak entity/strong entity-relationship will become clearer as we introduce relationships in Chapter 5.

4 Tese mapping rules are adapted from Elmasri and Navathe (2016)

Batini, C., Ceri, S., and Navathe, S.B (1992) Conceptual Database Design Redwood City,

Chen, P.P (1976) Te entity-relationship model—toward a unifed view of data ACM Transactions on Database Systems, 1(1): 9–37

Chen, P.P (1977) Te entity-relationship model: A basis for the enterprise view of data

Codd, E (1990) Relational Model for Data Management—Version 2 Reading, MA:

Date, C.J (1995) An Introduction to Database Systems, 6th ed Reading, MA: Addison-Wesley

Date, C.J (2003) An Introduction to Database Systems Reading, MA: Addison-Wesley

Earp, R., and Bagui, S (2000) Building an entity-relationship diagram: A sof ware engi- neering approach Database Management Journal, 22–10–41: 1–16

Elmasri, R., and Navathe, S.B (2016) Fundamentals of Database Systems Reading, MA:

Jef ry, A., Hofer, V., and Heikki, T (2011) Modern Database Management Upper Saddle River, NJ: Prentice Hall

Beyond the First Entity Diagram

Introduction

Having established a method for drawing, interpreting, and refining a primary entity, we can now advance to more intricate databases Our next step involves building upon this primary entity by exploring additional information that is related to it.

In this chapter, we utilize a systematic approach by examining our primary entity to determine if any of its attributes should be treated as separate entities Additionally, we seek out other relevant information within our description and incorporate it accordingly.

(a) An existing entity and examine the existing entity-relationship (ER) diagram

After creating new entities, it is essential to identify the relationships that exist between them A database comprises related data, allowing for the addition of new entities as long as they can be connected to existing ones This chapter elaborates on steps 3 through 5 of the ER design methodology outlined in this book Step 3 focuses on examining the attributes of the primary entity, while step 4 addresses the need for additional entities Finally, step 5 explores how to develop the relationships between these entities effectively.

This chapter introduces the concept of relationships but does not present new mapping rules, as these will be clearer after exploring structural constraints on relationships in Chapter 6 The discussion will continue with the case initiated in Chapter 4.

108 • Database Design Using ER Diagrams

Examining an Entity: Changing an

AN ATTRIBUTE TO BE AN ENTITY

In Figure 5.1, we present an entity characterized by various attributes, including a composite attribute, an atomic key attribute, a multivalued attribute, and another atomic attribute During our initial session with the user, after reviewing the diagram, English descriptions, and sample data, the user requests to capture comprehensive information about the schools a student has attended Specifically, the user emphasizes the need to record not only the names of the schools but also their locations, including city and state, as well as the type of school, such as community college, university, or high school.

The user has indicated that the attribute "aaaaaa" should be classified as an entity, as it represents something about which we want to record information Initially, we intended to document schools attended, but the focus has shifted to gathering details about the schools themselves A key indicator for considering an attribute as an entity is the necessity to store information related to it Consequently, we transition from Figure 5.1 to Figure 5.2, where "aaaaaa" is established as a distinct entity alongside "aaaaaaa." We designate "aaaaaaaaaaaA" as the unique key for the entity "aaaaaa."

Te next step would be to defne a relationship between the two entities

FIGURE 5.1 aaaaaaa Entity with a Multivalued Attribute

Beyond the First Entity Diagram • 109

Two ER Diagrams: One of aaaaaaa and one of aaaaaa

Databases are specifically structured to organize and store related information efficiently For instance, it would be illogical to combine data about students with foreign currencies or to merge details of airline flights with employees from a tennis ball factory.

110 • Database Design Using ER Diagrams

In a database, it is essential to establish a collection of related data entities, as they contribute to a coherent structure For instance, an attribute like "school" can be considered a significant part of an entity, highlighting the importance of relationships between different data elements.

“part of” student), but now school has become an entity by itself What we must do now is relate the aaaaaa entity to the aaaaaaa entity

In Figure 5.2, we have two entities, but they appear as though they are independent To make the aaaaaa entity and the aaaaaaa entity

Def ning a Relationship for Our New Entity

Beyond the First Entity Diagram • 111 function as a database, we have to add something: the relationship the entity aaaaaa has to the entity aaaaaaa

In an ER diagram, a relationship signifies a connection between two or more entities or between an entity and itself, known as a recursive relationship Typically represented by a verb or verb phrase, the relationship name illustrates the link between entities Understanding these relationships equips us with the necessary tools to create a database description using ER diagrams.

In the Chen-like model, a relationship is depicted by a diamond on the line joining the two entities together, as shown in Figure 5.3

Figure 5.3 illustrates a two-way relationship between two entities, represented by a verb that connects them It is essential to express all relationships bidirectionally For instance, in the Chen-like model, one might describe the relationship as "Entity A performs action on Entity B" or "Entity B is acted upon by Entity A."

The degree of a relationship indicates the number of entities involved For instance, in a binary relationship, two entities participate, as illustrated in Figure 5.3.

We now have a powerful tool for creating database descriptions through Entity-Relationship (ER) diagrams These diagrams capture essential information about entities, referred to as x and y, and illustrate the relationships between them effectively.

Our growing and amended methodology is discussed next

Step 1 Select one primary entity from the database requirements description and show the attributes to be recorded for that entity Label keys if appropriate and show some sample data

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Step 3 Examine attributes in the primary entity (possibly with user assistance) to fnd out if information about one of the attributes is to be recorded

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Defne the relationship back to the original entity

Step 4 Show some sample data

112 • Database Design Using ER Diagrams

A Preliminary Grammar for ER Diagrams

In Chapter 4, we established a grammar to define an entity, and now we have incorporated a relationship into our diagram To enhance our English description of the proposed database, we aim to present sample data that will clarify our direction Additionally, we plan to expand our list of grammatical expressions.

For each relationship, we add the following comment (in loose English [for now]):

A(n) Entityl Relationship Entityvn2 (active voice) and a(n) Entity2 Relationship Entityl (passive voice)

A aaaaaaa attends a aaaaaa, and a aaaaaa is attended by a aaaaaaa

Users ultimately determine the appropriateness of the expressions we use, and they may prefer different verb tenses or select verbs that better reflect the situation For instance, a user might choose to describe a relationship as "aaaaaaas will matriculate at aaaaaaa." As an exercise, you will be tasked with providing a comprehensive description of the ER diagram in Figure 5.3, detailing all entities, attributes, keys, and relationships involved.

After analyzing the original primary entity for "suspicious" attributes, we can now enhance our data set For instance, if the user wishes to include details about the automobiles owned or driven by the students, we can expand our description accordingly.

We want to record information about students—their name and student numbers In addition to information about students, we want to record

In expanding the primary entity diagram, we can incorporate essential automobile details such as the vehicle identification number, car make, body style, color, and model year By selecting aaaaaaa as the primary entity, we can effectively integrate this vital vehicle information to enhance our data structure.

The automobile is a distinct entity that requires information recording In our methodology, we can initially categorize the automobile as an attribute, later advancing it to the status of an entity during the migration process This relationship is illustrated in Figure 5.4 of the Chen-like model, although we will set aside the discussion of the related entity for now.

If the aaaaaaaaaa attribute is added to the aaaaaaa entity and later identified as needing its own entity, we can create the aaaaaaaaaa entity and establish the appropriate relationship within the model Figure 5.4 may suffice for users who do not wish to store details about the automobiles.

In our initial diagram, we could have identified the attribute "aaaaaaaaaa" as an entity, allowing us to represent it alongside the entity "aaaaaaa." By acknowledging "aaaaaaaaaa" as an entity, we would illustrate both entities and explore the relationship between them, as depicted in Figure 5.5, which showcases the connection between "aaaaaaa" and "aaaaaaaaaa."

A aaaaaaa Entity with an Attribute Called aaaaaaaaaa

114 • Database Design Using ER Diagrams

An ER Diagram of the aaaaaaaaaaaaaaaaaa Database

In Chen-like notation, we define the relationship between two entities, such as "aaaaaaa" and "aaaaaaaaaa," using a verb; in this instance, we select "drive," as illustrated in Figure 5.5 While users may later modify this relationship, the initial assumption is that it implies "A student drives an automobile." Alternative verbs to describe the relationship could include "register" or "own." This relationship is classified as a binary relationship.

In ER diagrams, relationships between entities are typically labeled with descriptive names However, when a relationship is challenging to articulate or remains unclear, a two-letter code is employed to represent it For instance, the relationship denoted as "SA" signifies an acknowledgment of the existence of a connection, even if the specifics are not fully defined.

Beyond the First Entity Diagram • 115

An ER Diagram of the aaaaaaaaaaaaaaaaaa Database with an “Unknown,” “Yet- to-Be-Determined” Relationship are not clear on exactly what to call it (aaAaAaaaaaaaaAaaaaaaaaaa)

Of course, if we were confdent of “drive” as the relationship, we would use drive

In the context of entities and relationships, entities are represented as nouns while relationships are depicted as verbs For instance, in the drive relationship, Students (N) drive (V) Automobiles (N).

In this article, we explore the concept of relationships in English grammar, specifically focusing on how students relate to automobiles For instance, we can express this relationship by stating, "Students (N) are related to (V) Automobiles (N)" or "A student (N) is related to (V) an Automobile (N)." In the following chapter, we will further elaborate on this English description and the relationship aspect of the diagram.

At this point, we have introduced the aaaaaaaaAaaaaaaaaaa, and aaaaaa entities With all three entities, the aaaaaaaaaaaaaaaaaaa aaaaaa database would look like Figure 5.7

116 • Database Design Using ER Diagrams

An ER Diagram of the aaaaaaaaaaaaaaaaaaaaaaaaa Database

1 Can the nature of an entity change over time? Explain

3 What are the diferences between an entity and a relationship?

4 When would it be preferable to consider an attribute an entity?

5 Does it make sense to have an entity with one attribute? Why or why not?

Beyond the First Entity Diagram • 117

Some situations may unfold for which a relationship might be unclear For example, consider this user description of a desired database:

Create a database for aaaaaaaaa and aaaaaaaaaaAaaaaaaaaa will have a AaaaaAaAAaaaaaaaAaAAaaaaaAaaaaaa, and aaaaaaaaAaaaaaa aaaaaaaaa will have a aaaaaaaaAaaaaaaAaAAaaaa, and aaaaaaa

In this database, we clearly have two entities: aaaaaaaa and aaaaaaaa

We want to store information about customer s (name, address, ) and supplier s (supplier number, supplier name, ) But, what is the connection between the two?

The current user description for designing the database is incomplete and lacks clarity, highlighting the need for two distinct components: one for customers and another for suppliers It is essential to recognize that the relationship between customers and suppliers is mediated through a company or vendor, rather than being direct At this stage, without additional entities or relationships, it is prudent to refrain from establishing connections in the overall database diagram Consequently, the development of separate databases for customers and suppliers may be necessary until further information is obtained from the user.

In database design, distinguishing between attributes and relationships can be challenging, as both convey information about an entity Attributes define the properties or characteristics of an entity, while relationships illustrate connections with other entities For instance, when creating a library database, introducing a primary entity with a specific attribute may not adequately represent an optional association that should be classified as a relationship between two entities.

In database design using ER diagrams, it's essential to recognize that many entities may not always have values for certain attributes, leading to the use of null values, particularly for items not currently on loan, such as books in a library Since only a small percentage of books are typically borrowed at any given time, the "borrower" attribute may frequently be null This prevalence of nulls suggests that the "borrower" could be better represented as an attribute of a separate entity By explicitly defining the relationship between the entities involved, database designers can more accurately categorize attributes and entities, enhancing the overall structure and clarity of the database design Understanding the differences between attributes, relationships, and entities is crucial for effective database modeling.

1 Are relationships between two entities permanent, or can the nature of this relationship change over time?

2 Are attributes of an entity permanent?

3 Does there always exist a relationship between two entities?

Our ER elicitation and design methodology is described next

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity Label keys if appropriate and show some sample data

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Step 3 Examine attributes in the primary entity (possibly with user assistance) to fnd out if information about one of the attributes is to be recorded

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Defne the relationship back to the original entity

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat step 2 to see if this entity should be further split into more entities.

Does a Relationship Exist?

Some situations may unfold for which a relationship might be unclear For example, consider this user description of a desired database:

Create a database for aaaaaaaaa and aaaaaaaaaaAaaaaaaaaa will have a AaaaaAaAAaaaaaaaAaAAaaaaaAaaaaaa, and aaaaaaaaAaaaaaa aaaaaaaaa will have a aaaaaaaaAaaaaaaAaAAaaaa, and aaaaaaa

In this database, we clearly have two entities: aaaaaaaa and aaaaaaaa

We want to store information about customer s (name, address, ) and supplier s (supplier number, supplier name, ) But, what is the connection between the two?

The current user description for designing the database is incomplete and lacks clarity, highlighting the need for a clear understanding of the relationships between customers and suppliers It’s important to note that the connection between these entities is typically mediated through a company or vendor, rather than being direct At this stage, the database structure consists of two distinct components: one for customers and another for suppliers If additional entities arise in the future, they may introduce new linking relationships; however, without further information, the current focus remains on these two separate categories Consequently, it may be necessary to develop two unrelated databases until more comprehensive details are gathered.

Attribute or Relationship?

Determining whether a characteristic is an attribute or a relationship can sometimes be challenging, as both convey information about an entity Attributes describe the qualities or properties of an entity, while relationships define its connections with other entities For instance, in a library database, if we introduce a primary entity with a specific attribute, it may not adequately represent an optional association that should instead be classified as a relationship between two entities.

In database design using ER diagrams, it’s crucial to recognize that certain attributes, like the "borrower" for library books, may often be null, indicating that not all entities are currently on loan This prevalence of null values suggests that the "borrower" attribute might be better represented as part of a separate entity By creating a distinct entity for borrowers and clearly defining the relationship between books and borrowers, database designers can more accurately categorize attributes and entities Understanding the difference between attributes, relationships, and entities is vital for effective database design.

1 Are relationships between two entities permanent, or can the nature of this relationship change over time?

2 Are attributes of an entity permanent?

3 Does there always exist a relationship between two entities?

Our ER elicitation and design methodology is described next

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity Label keys if appropriate and show some sample data

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Step 3 Examine attributes in the primary entity (possibly with user assistance) to fnd out if information about one of the attributes is to be recorded

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Defne the relationship back to the original entity

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat step 2 to see if this entity should be further split into more entities

Beyond the First Entity Diagram • 119

Step 5 Connect entities with relationships if relationships exist Step 6 Show some sample data.

Chapter Summary

In Chapter 4, we defined entities, attributes, and relationships, but in practical database design, it can often be challenging to identify whether a component should be classified as an entity, attribute, or relationship This chapter explores various techniques to effectively discern the appropriate classification for each element in database design.

Exercises

Introduction

In Chapters 4 and 5, we explored key components of entity-relationship (ER) diagrams, including entities, attributes, and relationships For effective requirement elicitation, it is essential to define structural constraints alongside relationships, as they provide crucial information on how entities relate to each other Structural constraints are categorized into two types: cardinality and participation.

In this chapter, we introduce a grammar that describes the structural constraints of relationships, aiding in the requirement elicitation process This grammar will provide a template for expressing the meaning of diagrams in precise English Additionally, we develop steps 6 and 7 of the ER design methodology, where step 6 defines the nature of a relationship in English, and step 7 focuses on presenting the current database design to the user.

This chapter focuses on the development and discussion of mapping rules for relationships, illustrated with examples and sample data Additionally, we continue our ongoing case study that started in Chapter 4 and was further explored in Chapter 5.

6.2 THE CARDINALITY RATIO OF A RELATIONSHIP

Cardinality is a rough measure of the number of entities (one or more) related to another entity (or entities) For example, as shown in Figure 6.1,

130 • Database Design Using ER Diagrams

One-to-One Relationship aaaaaaaaaaaaaaaaaaaaaaa

Many-to-One Relationship aaaaaaaaaaaaaaaaaaaaaaa

One-to-Many Relationship aaaaaaaaaaaaaaaaaaaaaaa

Many-to-Many Relationship aaaaaaaaaaaaaaaaaaaaaaaA

T e Cardinality Ratio of a Relationship

In relationships between entities, there are four types of numerical involvement: one-to-one (1:1), many-to-one (M:1), one-to-many (1:M), and many-to-many (M:N) The most prevalent relationships in data modeling are one-to-many (1:M) and many-to-many (M:N).

In the one-to-one (1:1) type relationship, one entity is associated with one other entity and vice versa For example, if in our drive relationship (see

Figure 6.2), we stated one automobile is driven by one student and one student drives one automobile, then the student/automobile relationship would be one-to-one, symbolically: aaaaaaaaaaaaaaaaaaaaaaa

Diagrammatically, we can represent a 1:1 relationship as shown in Figure 6.1A

In a many-to-one relationship, multiple students are linked to a single automobile, while that automobile can also be associated with numerous students This dynamic illustrates the interconnectedness between students and vehicles, highlighting the unique relationship each student has with a specific automobile.

In our analysis, we opted for the phrase “is associated with” instead of “drive” to avoid ambiguity, as the statement “many students drive one automobile” can be interpreted in multiple ways Choosing a specific verb for relationships in initial diagrams may lead to misinterpretations unless the analyst is certain it accurately reflects the user's intent Alternatively, we could have used “is related to” for a more neutral stance regarding the precise verb choice.

We will refine the terminology used to describe relationships A relationship in this context refers to a scenario where a family owns a single car that is utilized by multiple family members.

132 • Database Design Using ER Diagrams

Diagrammatically, we can represent an M:1 relationship as shown in Figure 6.1B

In a one-to-many (1:M) relationship, one student can be linked to multiple automobiles, while each automobile is associated with only one student It is crucial to clearly identify which entity represents the "one" side and which represents the "many" side in this relationship.

Diagrammatically, we can represent a 1:M relationship as shown in Figure 6.1C

In many-to-many relationships, multiple instances of one entity are linked to numerous instances of another entity, represented as M:N, indicating that M occurrences of one type correspond to N occurrences of another.

If our SA relationship were many to many, a student would be associated with many automobiles and an automobile with many students:

In this scenario, if we consider SA as "drive," it illustrates that multiple students can drive multiple cars, ideally not all at once Each student may operate several cars, while any car can be driven by various students For instance, envision a family with several vehicles where any member can drive any car, highlighting the flexibility and shared use among individuals.

Diagrammatically, we can represent an M:N relationship as shown in Figure 6.1D

In expressing cardinality, this x:x ratio, where x = 1 or M or N, is called a cardinality ratio

To accurately represent the current situation for our students and automobiles, we must model reality based on user input By actively listening to the user, we make informed assumptions and create a model This model is then communicated back to the user in structured English for their approval or corrections.

A common pitfall in ER design is attempting to model every possible scenario, which is impractical The primary objective of developing a database is to address specific local situations, guided by the systems analysis process in software engineering.

In classical systems analysis, the database analyst engages with the user to understand their needs, creates a tailored specification, and presents the findings for feedback This process focuses on modeling the user's specific reality rather than a universal database design If discrepancies arise, the analyst can adjust the conceptual model, emphasizing the importance of mutual understanding between the analyst and the user regarding the model's representation.

In our example, we will establish a one-to-one relationship between students and automobiles, where each student is associated with a single vehicle Although there may be exceptions to this model, we will focus on defining the relationship and the information within each entity It's important to note that this conceptual model may evolve based on real-world scenarios, but for our purposes, we will begin with this specific framework.

In the Chen-like model, we illustrate the one-to-one nature of relationships by incorporating cardinality numbers on the lines of the ER diagram that link entities and relationships.

In Figure 6.2, we illustrate a 1:1 relationship between students and automobiles, indicating that each student is associated with one automobile and vice versa However, this does not imply ownership or insurance responsibilities; rather, it signifies that a student can drive at most one automobile on campus, and each automobile is driven by only one student To accurately represent this relationship, we have named it "drive," emphasizing the concept of driving within our database model For further clarification, please refer to Figure 6.2, which showcases the renamed model with 1:1 cardinality.

134 • Database Design Using ER Diagrams

ER Diagram of a aaaaaaaaaaaaaaaaaa Database with a Relationship Named drive and Cardinality Ratios.

Participation: Full/Partial

It is likely on any campus not all students will drive an automobile For our model, we could normally assume all of the automobiles on the campus

Extending Relationships • 135 are associated with a student (We are for the moment excluding faculty and staf driving by only modeling the student/automobile relationship.)

In our enhanced Chen-like ER diagrams, we use a double line to represent that every automobile is driven by a student, indicating full participation in the relationship Conversely, we maintain a single line between the student entity and the relationship to signify that not all students drive automobiles, as some may not have cars on campus These single and double lines are known as participation constraints or optionality constraints, as illustrated in Figure 6.3.

The double line signifies full participation in a relationship, often referred to as mandatory by some designers This means that if a relationship requires full or mandatory participation, there cannot be any missing values for that attribute For instance, in our database, every automobile must be associated with a student.

Single line, partial participation, often referred to as optional participation, indicates that some students may not have any connection to an automobile.

2 What kind of information does the cardinality ratio give us?

3 In how many diferent ways can two entities be involved in a car- dinality relationship? Give examples

4 What kind of information does the participation constraint give us?

5 Is it always necessary to have cardinality ratios as well as partici- pation constraints in the same ER diagram? Why? Explain.

English Descriptions

We refine English grammar to articulate the impact of relationships on entities through our structural constraints, establishing a standardized method for expressing these relationships This standardized language will be consistently utilized throughout the discussion.

136 • Database Design Using ER Diagrams

An ER Diagram of the database outlines the relationships, cardinality ratios, and participation involved in the data model Utilizing a standardized language to describe these ER diagrams enhances communication with users during the systems analysis process, enabling effective feedback and clarifying the precise meaning of the relationships depicted.

In the Chen-like model, the double lines defne full participation as in,

“Automobiles fully participate in the drive relationship.” Better yet, the double lines invite us to state the relationship as:

Automobiles must be driven by one (and only one) student

T e must comes from the full (mandatory) participation and the one from the cardinality.

Te grammar for describing the partial or optional relationship for the aaaaaaa entity to the aaaaaaaaaa entity would be:

Students may drive one and only one automobile

In Entity-Relationship (ER) diagrams, the concept of cardinality is crucial, as it defines the relationships between entities For instance, a student may own one automobile, while each automobile must be associated with exactly one student Understanding these relationships is essential for accurately interpreting ER diagrams, as illustrated in Figure 6.4.

Tighter English

We strongly recommend an English sentence accompany each diagram to reinforce the meaning of the fgure (refer to Figure 6.4) English is of en an ambiguous language T e statement:

Automobiles must be driven by one and only one student actually means:

Automobiles, which are in the database, must be driven by one and only one student

Te relationship should not be stated loosely, as in:

Tis could be vaguely interpreted

138 • Database Design Using ER Diagrams

T e aaaaaaaaaaaaaaaaaa Database: Translating the Diagram into English

Another way to put this is:

Every automobile must be driven by one and only one student Students may drive one and only one automobile

To clarify the relationship statement, we will utilize the English expression previously illustrated and define four possible patterns for articulating this relationship It is essential that all binary relationships are expressed in two ways, reflecting both sides of the relationship.

In the following examples, we aim to adhere closely to the established pattern, while also prioritizing common sense and proper grammar when necessary It is perfectly acceptable to rephrase the exact wording for clarity, as long as the original message remains unchanged.

From the k side, full participation (k =1 or M) : Te x’s, recorded in the database, must be related to one and only one y No x is related to more than one y

Students must be advised by one advisor or,

Students, recorded in the database, must be advised by one and only one advisor No student is advised by more than one advisor

The phrase recorded in the database serves as a valuable tool for database designers who may generalize beyond the specific issue at hand While one might argue that certain conditions could be true or false, the key question is whether those scenarios will actually arise in the context of this particular database Often, a negative statement can effectively clarify the meaning of the relationship, ensuring a more precise understanding of the data structure.

In a k-partial participation model (where k = 1 or M), each x may correspond to a unique y, although not all x entries in the database are guaranteed to have a relationship with y Additionally, some x may remain unrelated to any y, and each x is restricted to associating with only one y.

Some students join a fraternity which becomes:

140 • Database Design Using ER Diagrams

Students, but not necessarily all students (recorded in the database), may join a fraternity Some students may not join a fraternity Students may not join more than one fraternity

In the context of k-side participation (where k equals 1 or M), it is essential that each x recorded in the database is associated with multiple y entities To clarify relationships, it may be beneficial to include statements such as "No x is related to a non-y" or "Non-x's are not related to a y," depending on the intended meaning of the assertion.

Automobiles are driven by (registered to) many students which means:

Automobiles, recorded in our database, must be driven by many (one or more) students

There are several ideas implied here

First, we are only talking about vehicles registered at this school

This database exclusively registers student-owned vehicles, ensuring that any car listed must be both registered and operated by at least one student.

Fourth, the “one or more” comes from the cardinality constraint

It is important to refrain from making assumptions about the relationship between the y and M sides in relation to the x, as this topic is addressed in a different pattern Inferring additional relationships beyond the one discussed can lead to inaccuracies, such as claiming that all students drive cars or are connected to a vehicle, which are both false statements.

From the k side, partial participation (k = 1 or M): x, but not necessarily all x (recorded in the database), may be related to many (zero or more) y ’s

Some x may not be related to a y

Some courses may require (use) many books which restated becomes:

Courses, but not necessarily all courses (recorded in the database), may use many (zero or more) textbooks Some courses may not require textbooks

When modeling relationships in a database, the phrase "zero or more" is used to denote cardinality due to partial participation If the English phrasing appears incorrect, it may indicate an inappropriate model choice This grammatical structure serves two main purposes: it helps restate the database design for a "naive user" and facilitates clarity among designers regarding the database's meaning While a detailed English version can be overwhelming for designers, it is crucial to recognize that statements like "x are related to one y" can have multiple interpretations unless clarified with explicit constraints Additionally, negation statements can aid in defining requirements, but they should be used judiciously to avoid unnecessary complexity Ultimately, incorporating negative or non-contradictory grammar is beneficial for requirement elicitation, provided it does not lead to contradictions or confusion.

6.5.5 Summary of the Patterns and Relationships

Te relationship is: x:y::1(full):1 and is diagrammatically shown by Figure 6.5

Te relationship is: x:y::M(full):1 and is diagrammatically shown by Figure 6.6

142 • Database Design Using ER Diagrams

Chen Model of 1(full):1 Relationship: Pattern 1

Chen Model of M(full):1 Relationship: Pattern 1

Tis pattern implies an instance of aaaaaaa must participate in a relationship with aaaaaaa and can only exist for one (and only one) of aaaaaaa

Chen Model of 1(partial):1 Relationship: Pattern 2 and is diagrammatically shown by Figure 6.7

Te relationship is: x:y::M(partial):1 and is diagrammatically shown by Figure 6.8

In this pattern, certain instances of aaaaaaa may occur independently of aaaaaaa; however, when aaaaaaa is connected to aaaaaaa, it can only be linked to a single, specific instance of aaaaaaa.

Te relationship is: x:y::1(full):M and is diagrammatically shown by Figure 6.9

144 • Database Design Using ER Diagrams

Chen Model of M(partial):1 Relationship: Pattern 2

Chen Model of 1(full):M Relationship: Pattern 3 and is diagrammatically shown by Figure 6.10

Tis pattern implies an instance of aaaaaaa must participate in a rela- tionship with aaaaaaa and can exist for more than one of aaaaaaa

Te relationship is: x:y::1(partial):M and is diagrammatically shown by Figure 6.11

Chen Model of M(full):N Relationship: Pattern 3

Chen Model of 1(partial):M Relationship: Pattern 4

Te relationship is: x:y::M(partial):N and is diagrammatically shown by Figure 6.12

In this pattern, certain instances of aaaaaaa may occur independently of aaaaaaa; however, when aaaaaaa is connected to aaaaaaa, it can be associated with multiple instances of aaaaaaa.

146 • Database Design Using ER Diagrams

Chen Model of M(partial):N Relationship: Pattern 4

1 Sketch an ER diagram showing the participation ratios (full/ partial) and cardinalities for the following: a Students must be advised by one advisor, and an advisor can advise many students b Students, but not necessarily all students, may join a frater- nity Some students may not join a fraternity Students may not join more than one fraternity A fraternity may have many students (in its membership).

Our refned methodology may now be restated with the relationship information added

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity Label keys, if appropriate, and show some sample data

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Step 3 Examine attributes in the primary entity (possibly with user assistance) to fnd out if information about one of the attributes is to be recorded

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Defne the relationship back to the original entity

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat step 2 to see if this entity should be further split into more entities

Step 5 Connect entities with relationships if relationships exist Step 6 State the exact nature of the relationships in structured English from all sides; for example, if a relationship is A:B::1:M, then there is a relationship from A(1) to B(M) and from B(M) back to A(1)

Step 7 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Ref ne the diagram as necessary

Step 8 Show some sample data.

Some Examples of Other Relationships

This section explores three additional examples of relationships, specifically focusing on two 1:M (one-to-many) relationships and one M:N (many-to-many) relationship These examples aim to enhance understanding of the concepts presented earlier, as 1:M and M:N relationships are prevalent in database design.

6.6.1 An Example of the One-to-Many Relationship (1:M)

In database design, 1:M and M:1 relationships represent different perspectives of the same issue, highlighting the importance of identifying which entity is designated as 1 and which as M For instance, in a one-to-many relationship between dorm rooms and students, a single dorm room can accommodate multiple students, while many students can reside in one dorm room This relationship exemplifies a 1:M scenario, illustrating the need for clarity in defining entity roles within the database structure.

In Figure 6.13 (the Chen-like model), the name we chose for the aaaaa aaaaaaa relationship was occupy

148 • Database Design Using ER Diagrams

An ER Diagram (without Attributes) of a 1:M Relationship

Note, not all dorms have students living in them; hence, the participa- tion of dorms in the relationship is partial Informally:

Dorms may be occupied by many students

Furthermore, all students may not reside in dorms, so the relationship of aaaaaaa to aaaa is also partial:

Students may occupy a dorm room

Now, let us restate the relationships in the short and long English forms.

T e frst statement, dorms may be occupied by many students, f ts pattern

6.6.1.1 Pattern 4–1:M, From the 1 Side, Partial Participation

“Some x are related to many y.”

Terefore, the more precise statement is: x, but not necessarily all x, (recorded in the database) may be related to many (zero or more) y’s Some x’s are not related to a y or

Dorms, but not necessarily all dorms, (recorded in the database) may be occupied by many (zero or more) students

Students may occupy a dorm room

6.6.1.2 Pattern 2—M(Partial):1, From M Side, Optional Participation

“Some x’s are related to one y.”

The comprehensive interpretation of the statement indicates that while x is recorded in the database, it does not imply that all instances of x are associated with a single y Some instances of x may lack a relationship with y, and importantly, no instance of x can be linked to more than one y.

T is x and y notation resolves into x = students, y = dorms, and hence:

Not all students in the database occupy a dorm, and each student is allowed to occupy only one dorm room at a time Some students may not have a dorm room, while no student is permitted to occupy more than one dorm simultaneously.

A student may occupy a (one and only one) dorm and a dorm may be occupied by many students

150 • Database Design Using ER Diagrams

An ER Diagram (without Attributes) of an M:1 Relationship

6.6.2 An Example of the Many-to-One Relationship (M:1)

In our database model for a school, each student is assigned to a specific parking lot, which we categorize as entities like East Lot 7 and North Lot 28 This relationship can be represented as many automobiles assigned to one parking lot, illustrating a many-to-one relationship The corresponding diagram is depicted in Figure 6.14, showcasing this relationship without additional attributes.

The relationship between aaaaaaaaaaA and PaaaaaaaaaaA is characterized by complete participation, indicating that every automobile is assigned to a parking lot, and each parking lot is designated for students' automobiles.

Te grammatical expressions of this relationship are discussed next

6.6.2.1 Pattern 1—M:1, From the M Side, Full Participation

T e x, recorded in the database, must be related to one and only one y No x is related to more than one y

Extending Relationships • 151 x = automobile, y = parking lot, relationship = park

Automobiles must be parked in a single designated parking lot, as each vehicle is recorded in the database No automobile is allowed to occupy more than one parking space, ensuring that every vehicle parks in only one location.

Te inverse is discussed next

6.6.2.2 Pattern 3–1:M, From the 1 Side, Full Participation

T e x, recorded in the database, must be related to many (one or more) y ’s

(No x is related to a non-y” or “Non-x’s are not related to a y— the negative will depend on the sense of the statement.)

Parking lots, recorded in the database, must park many (one or more) automobiles

Te negative in this case seems misleading, so we will omit it Te point is recorded parking lots must have students parking there

Each automobile is required to be parked in a single parking lot, while every parking lot must accommodate at least one automobile, with the capacity to hold multiple vehicles.

6.6.3 An Example of the Many-to-Many Relationship (M:N)

In the classic M:N relationship, students enroll in multiple courses, and each course is attended by numerous students This interconnection illustrates the fundamental nature of the relationship, as depicted in Figure 6.15.

The term "enroll" effectively illustrates the relationship between student participation and course enrollment Student participation is characterized as mandatory, while course enrollment is described as optional This distinction, though arbitrary, highlights that both aspects can vary based on individual user needs and preferences The choice of using "full" for participation and "partial" for enrollment emphasizes the different levels of commitment involved The grammatical nuances of this relationship will be explored further.

152 • Database Design Using ER Diagrams

An ER Diagram (without Attributes) of an M:N Relationship

6.6.3.1 Pattern 3—M:N, From the M Side, Full Participation

T e x, recorded in the database, must be related to many (one or more) y

No students are enrolled in non-courses, and non-students do not enroll in courses This indicates that there is no relationship between students and non-courses, as well as between non-students and courses.

Students, recorded in the database, must be enrolled in many (one or more) courses

Te inverse is explained next

6.6.3.2 Pattern 4—N:M, From the N Side, Partial Participation

T e x, but not necessarily all x (recorded in the database), may be related to many (one or more) y Some x may not be related to y x = course, y = student, relationship = enroll

Courses, but not necessarily all courses (recorded in the database), may enroll many (one or more) students Some courses may not enroll students

A student must enroll in one or more courses, and a course may have one or more students enrolled in it

The term "course partiality" likely refers to courses listed in the database that are not currently enrolling students, which may indicate potential offerings or courses that are no longer available If a course is included in the database solely based on student enrollment, then the participation constraint is considered full, altering the understanding of the entity-relationship.

This database design emphasizes the importance of storing information solely about active students, highlighting the reality of user data needs While it’s possible to include all students, even inactive ones, the choice to focus on active participants reflects a more accurate representation of the relationships involved Throughout this chapter, all examples illustrate binary relationships, including the forthcoming example that further reinforces this concept.

1 Give an example of a 1(full):1 relationship? Does such a relation- ship always have to be mandatory? Explain with examples

2 Give an example of a 1(partial):1 relationship? Does such a rela- tionship always have to be optional? Explain with examples

3 Give an example of a M(full):N relationship? Would such a relationship always be optional or mandatory? Explain with examples

4 Give an example of a M(partial):N relationship? Would such a relationship always be optional or mandatory? Explain with examples.

One Final Example

As a fnal example to conclude the chapter, we present one more prob- lem and then our methodology 1 Consider a model for a simplif ed airport

This article discusses the process of database design using Entity-Relationship (ER) diagrams It focuses on the attributes of two entities, referred to as "aaaaaaaaaa" and "aaaaaaa," detailing their specific characteristics The attributes for "aaaaaaaaaa" include "aaaaAaAAaaaaaaaaaaaaaaAa" and "aaaaaaaaa aAaaaaaaA," while the attributes for "aaaaaaa" consist of "AAaAaaaaaaaAaAAaaaaaaaaaaaAaA," "AaaaaaaaaaaaA," and "aaaaaaaaaaa." An ER diagram is to be created to visually represent these relationships and attributes.

Note: We are leaving out many attributes we could consider Assume this is all of the information we choose to record

Te solution is given next

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity Label keys if appropriate and show some sample data

Suppose we choose aaaaaaaaa as our primary entity aaaaaaaaa has the following attributes: AaaaaaaaaaAaAaaaaaaAaAAaaaa[frst, middle, last], aaaaaaaaaaaaaaA

We draw this much of the diagram, choosing aaaaaaaaaaAaaaaaa as a key and noting the composite attribute aaaa Tis diagram is shown in Figure 6.16

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Tis database records data about aaaaaaaaaa For each passenger, we record AaaaaaaaaaaAaaaaaaAaAAaaaaA [frst, middle, last], pieces_of_luggage (aaaaaaaaaaaaaa)

For each attribute, there is a unique corresponding value that remains indivisible Additionally, each attribute will have a singular recording associated with it, ensuring that these values are also not subject to subdivision When dealing with composite attributes, the same principle applies.

For each aaaaaaaaa, we will record their aaaa, which is com- posed of faaaaaAaaaaaa, and aaaa AaAaaaaAaaaaaa, and aaaa are the com- ponent parts of name

For each aaaaaaaaa, we will have the following primary key: aaaaaaaaaaAaaaaaaAaA

Note, we have chosen aaaaaaaaaaAaaaaaa as a primary key for aaaaaaaaa

If this were not true, another means of unique identifcation would be necessary Here, this is all the information we are given

Step 3 Examine attributes in the primary entity (possibly with user assistance) to fnd out if information about one of the attributes is to be recorded

No further information is suggested

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat step 2 to see if this entity should be further split into more entities

Te other entity in this problem is aaaaaa, with attributes AaAaaaaaaa, aaaaaaaaaaaAaAAaaaaaaaaaaaAaAAaaaaaaaAaaaaAa

156 • Database Design Using ER Diagrams

Again, we use structured English as in the following

Tis database records data about Flights For each aaaaaa, we record aAaaaaaaa, aaaaaaaaaaa, aaaaaaaaaaa, and aaaaaaaaaaa

For each aaaaaa, there will be one and only one aAaaaaaaa T e value for aAaaaaaaa will not be subdivided

For each aaaaaa, there will be one and only one recording of aaaaaaaa aaaa Te value for aaaaaaaaaaa will not be subdivided

For each aaaaaa, there will be one and only one recording of aaaaaaaA aaaa Te value for aaaaaaaaaaa will not be subdivided

For each aaaaaa, there will be one and only one recording of aaaaaaaA aaaa Te value for aaaaaaaaaaa will not be subdivided

For the key(s): One candidate key (strong entity):

For each aaaaaa, we will have the following primary key: aAaaaaaaa

We are assuming aAaaaaaaa is unique

Step 5 Connect entities with relationships if relationships exist

What relationship is there between fights and passengers?

All passengers will fy on one fight All fights will have multiple pas- sengers Te diagram for this problem is illustrated in Figure 6.17 and Figure 6.18

In this analysis, we have decided to represent one fight for each passenger in our database The specifications do not clarify whether to use a singular or plural representation, so we opted for a singular approach Additionally, we ensured that both sides are fully represented in each instance.

Recording data on passengers who did not fly on a flight or flights without passengers may seem illogical However, if the database is intended to store information about potential passengers who may not book specific flights, or about flights that lack passengers, a conceptual redesign of the database would be necessary.

The ER Diagram for the aaaaaaaaaaaaaaaa Database illustrates the entities and their attributes in a clear format Figure 6.18 adopts a concise approach to attribute description and incorporates previous steps along with sample data For initial conceptualization, Figure 6.17 can be utilized and subsequently transformed into the documentation style of Figure 6.18 Both figures necessitate the inclusion of structured English to enhance understanding.

As designers, we make a choice and then present our choice to the user

If a user chooses to store data on all fights and passengers over time, it would require a different database structure, potentially involving an M:N relationship and accommodating partial participation for non-passenger flights and passengers Ultimately, the decision rests with the user, and our current goal is to create a model for them to review and validate.

158 • Database Design Using ER Diagrams

Sample Problem—Alternate Presentation of Attributes with Explanation and Sample Data

Step 6 State the exact nature of the relationships in structured English from all sides, such as, if a relationship is A:B::1:M, then there is a relationship from A(1) to B(M) and from B(M) back to A(1)

6.7.2 Pattern 1—M:1, From the M Side, Full Participation

T e x, recorded in the database, must be related to one and only one y No x are related to more than one y

Extending Relationships • 159 x = passenger, y = fight, relationship = f y

Passengers, recorded in the database, must fy on one and only one f ight

No passenger fies on more than one f ight

6.7.3 Pattern 3–1:M, From the 1 Side, Full Participation

T e x, recorded in the database, must be related to many (one or more) y ’s x = f ight, y = passenger, relationship = f y

Flights, recorded in the database, must fy many (one or more) passengers

A passenger must fy on a fight, and a f ight must have at least one (and can have many) passengers on it

Attribute descriptions follow previous patterns and are lef to the exercises

Step 7 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Ref ne the diagram, as necessary

Step 8 Show some sample data

Mapping Relationships to a

In this section, we build upon the mapping rules introduced at the end of Chapter 4, where we explored the mapping of entities with composite and multivalued attributes Now, having examined the structural constraints of relationships, we will focus on the process of mapping relationships effectively.

For mapping binary M:N relationships, we present mapping rule 5

160 • Database Design Using ER Diagrams

Mapping rule 5 involves creating a new table to represent binary M:N relationships This new table will include the primary keys of both entities involved in the relationship, with the primary key of the table being a combination of these keys Additionally, any attributes associated with the M:N relationship should also be included in this new table.

For example, refer to Figure 6.15 If the aaaaaaa and aaaaaa tables have the data shown in Tables 6.1 and 6.2

Before implementing mapping rule 5, it is essential to confirm that the primary keys for the involved entities are defined For instance, if "aaaaaaa" and "aaaaaaa aaaaaa" serve as the primary keys for "aaaaaa" and "aaaaaa" respectively, the M:N relationship can be mapped by creating a relation named "aaaaaa," as illustrated in Table 6.3.

Sample Data for aaaaaaa Table in Figure 6.15 name.fi rst name.last name.mi student_number address

Richard Earp W 589 222 2nd St Boris Backer 909 333 Dreistrasse Helga Hogan H 384 88 Half Moon Ave Arpan Bagui K 876 33 Bloom Ave Hema Malini 505 100 Livingstone

Sample Data for aaaaaa Table in Figure 6.15 cname c_number credit_hrs

TABLE 6.3 Sample Data for aaaaaa in Figure 6.15 c_number student_number

TABLE 6.3 (Continued) Sample Data for ENROLL in Figure 6.15 c_number student_number

CGS3464 909 STA3023 589 HIST2022 384 STA3023 505 STA3023 876 HIST2022 876 HIST2022 505

Both aaaaaaaa and aaaaaaaaaaaaaa together are the primary key of the relation, aaaaaa

Te relational mapping for Figure 6.15 would be as follows: aaaaaaa(aaaa.aAaaa, aaaa.aaaa, aaaa.aa, aaaaaaaaaaaaaa, aaaaaaa) aaaaaa (aaaaa, aaaaaaaa, aaaaaaaaaa) aaaaaa(aaaaaaaa, aaaaaaaaaaaaaa)

In M:N relationships, it is common for data to emerge that aligns more closely with the relationship itself than with the individual entities involved While relationship attributes are discussed in Chapter 8, any new relationship attributes that develop will be mapped using the primary keys of the entities.

To establish a binary 1:1 relationship, incorporate the primary key of one entity as the foreign key in the other entity The determination of which entity serves as the primary key and which as the foreign key is clarified by the mapping rules outlined in this section.

Mapping rule 6 addresses the mapping of a binary 1:1 relationship where one side has full participation while the other has partial participation In this scenario, the primary key from the side with partial participation should be stored as a foreign key on the side with full participation Additionally, any attributes associated with the relationship should be included on the side where the key is added.

Chapter 8 and then embellish the mapping rules accordingly.)

162 • Database Design Using ER Diagrams

For example, refer to Figure 6.3 It says:

An automobile, recorded in the database, must be driven by one and only one student and

A student may drive one and only one automobile

Here, the full participation is on the aaaaaaaaaa side since “An auto- mobile ‘must’ be driven by a student.”

In the relational database realization of the ER diagram depicted in Figure 6.3, we incorporate the primary key from the partial participation constraint side, denoted as "aaaaaaaaaaa," into the "aaaaaaaaaa" table This primary key serves as a foreign key in the "aaaaaaaaaa" relation, ensuring proper linkage between the tables The resulting structure includes the "aaaaaaaaaa" table with attributes such as "aaaaaaaaaa," "aaaa," "aaaaaaaaaa," "aaaa," and "aaaaaaaaaaaaaa," alongside the "aaaaaaa" table featuring fields like "aaaa.aAaaa," "aaaa.aaaa," "aaaa.aa," and "aaaaaaaaaaaaaa." The data representation is further illustrated in Tables 6.4 and 6.5.

Sample Data for aaaaaaaaaa in Figure 6.3 vehicle_id make body_style color year student_number

Sample Data for aaaaaaa in Figure 6.3 name.fi rst name.last name.mi student_number address

Boris Backer 909 333 Dreistrasse Helga Hogan H 384 8 8 Half Moon Ave Arpan Bagui K 876 33 Bloom Ave Hema Malini 505 100 Livingstone

Sample Data for Multivalued Attribute in Figure 6.3 student_number school

Since aaaaaaa has a multivalued attribute school, we need to map the multivalued attribute to its own table (as per mapping rule 4, mapping multivalued attributes), as shown in Table 6.6

In this case, if the relationship had any attributes, they would be included in the aaaaaaaaaa relation since that is where the key went

Mapping rule 7—Mapping a binary 1:1 relationship when both sides have partial participation constraints

When both sides have partial participation constraints in a binary 1:1 relationship, the relationships can be mapped in one of two ways For the f rst option:

Mapping rule 7A Select either one of the relations to store the key of the other (and live with some null values)

Figure 6.2 illustrates the partial participation constraints from both sides, assuming the absence of any school attributes.

An automobile may be driven by one and only one student and

A student may drive one and only one automobile

164 • Database Design Using ER Diagrams

In a relational database, the primary key of the table 'aaaaaaaaaa' should be stored in 'aaaaaaa' in the following structure: 'aaaaaaaaaa(aaaaaaaaaa, aaaa, aaaaaaaaaa, aaaaa, aaaa)' and 'aaaaaaa(aaaa.aAaaa, aaaa.aaaa, aaaa.aa, aaaaaaaaaaaaaa, aaaaaaa, aaaaaaaaaa)' Sample data illustrating this structure can be found in Tables 6.7 and 6.8.

In the aaaaaaa relation, aaaaaaaaaaA is a foreign key

According to mapping rule 7B, you can establish a new relationship to encapsulate the connection between two related entities, similar to the approach in mapping rule 5 In this scenario, any null values will be excluded from the linking table.

Sample Data for aaaaaaaaaa in Figure 6.2 vehicle_id make body_style color year

A39583 Ford Compact Blue 1999 B83974 Chevy Compact Red 1989 E98722 Mazda Van Green 2002 F77665 Ford Compact White 1998 G99999 Chevy Van Grey 1989

Sample Data for aaaaaaa in Figure 6.2 name.fi rst name.last name.mi student_ address vehicle_id number

Richard Earp W 589 222 2nd St A39583 Boris Backer 909 333 Dreistrasse B83974 Helga Hogan H 384 88 Half Moon Ave F77665 Arpan Bagui K 876 33 Bloom Ave E98722 Hema Malini 505 100 Livingstone

We illustrate the mapping of Figure 6.2 using this rule Te relational realization would be aaaaaaaaaaaAaaaaaaaaaaAaAaaaaaAaaaaaaaaaaaAaaaaaaAaaaaAa aaaaaaaaAaaaaaaAaaaaAaaaaaaaaaaAaaaaaaaaAAaaaaaaaaaaaaaaAaAaaaaaaaAa aaaaaaaaaaaaaaaaaaaAaaaaaaaaaaaAaaaaaaaaaaaaaaAa and with some data as shown in Table 6.9

In this case, the two relations, aaaaaaa and aaaaaaaaaa, would remain as shown in Tables 6.10 and 6.11

Sample Data for Alternative Representation of aaaaaaaaaaaaaaaaaa in Figure 6.2 vehicle_id student_number

Sample Data for Alternative Representation of aaaaaaa in Figure 6.2 name.first name.last name.mi student_ address number

Boris Backer 909 333 Dreistrasse Helga Hogan H 384 8 8 Half Moon Ave Arpan Bagui K 876 33 Bloom Ave Hema Malini 505 100 Livingstone

Sample Data for Alternative Representation of aaaaaaaaaa in Figure 6.2 vehicle_id make body_style color year

A39583 Ford Compact Blue 1999 B83974 Chevy Compact Red 1989 E98722 Mazda Van Green 2002 F77665 Ford Compact White 1998 G99999 Chevy Van Grey 1989

166 • Database Design Using ER Diagrams

Mapping rule 8 addresses the mapping of a binary 1:1 relationship where both entities have full participation constraints To determine which entity should contain the key of the other, consider the semantics of the relationship If the decision remains ambiguous, refer to mapping rule 7B for guidance.

Now, assuming full participation on both sides of Figure 6.2, the two tables aaaaaaa and aaaaaaaaaa could be: aaaaaaaaAaaaaaaAaaaaAaaaaaaaaaaAaaaaaaaaAAaaaaaaaaaaaaaaAaAaaaaaaaa aaaaaaaaaaaAaaaaaaaaaaAaAaaaaaAaaaaaaaaaaaAaaaaaaAaaaaaAaaaaaaaaaaaaaaAa and with some sample data as shown in Tables 6.12 and 6.13,

In this scenario, Aaaaaaaaaaaaaaa serves as a foreign key in aaaaaaaaaa, which is derived from aaaaaaaaaa Alternatively, we could have utilized the primary key, aaaaaaaaaa, from aaaaaaaaaa for inclusion in the aaaaaaa table However, incorporating foreign keys in both tables would create redundancy within the database, which is not advisable.

Sample Data for Figure 6.2 With Full Participation on Both Sides: aaaaaaa Table name.first name.last name.mi student_number address

Richard Earp W 589 222 2nd St Boris Backer 909 333 Dreistrasse Helga Hogan H 384 88 Half Moon Ave Arpan Bagui K 876 33 Bloom Ave Hema Malini 505 100 Livingstone

Sample Data for Figure 6.2 With Full Participation on Both Sides: aaaaaaaaaa Table vehicle_id make body_style color year student_number

Next, we develop mapping rules to map binary 1:N relationships T ese mappings will depend on what kind of participation constraint the N side of the relationship has

Mapping rule 9—Mapping binary 1:N relationships when the N side has full participation Include the key of the entity on the 1 side of the relationship as a foreign key on the N side

For example, in Figure 6.13, if we assume full participation on the stu- dent side, we will have

Dorm rooms may have zero or more students and

Students must live in only and only one dorm room

Te “1 side” is aaaa; the “N side” is aaaaaaa So, a reference to aaaa (aaaaa, the key of aaaa) is included in aaaaaaa

And, if we had the following sample data as shown in Tables 6.14 and 6.15,

Sample Data for aaaaaaa in Figure 6.13 name.fi rst name.last name.mi student_number dname

Richard Earp W 589 A Boris Backer 909 C Helga Hogan H 384 A Arpan Bagui K 876 A Hema Malini 505 B

Sample Data for aaaa in Figure 6.13 dname supervisor

168 • Database Design Using ER Diagrams the relational mapping would be aaaaaaaaAaaaaaaAaaaaAaaaaaaaaaaAaaaaaaaaAAaaaaaaaaaaaaaaAaAaaaaaa aaaaaAaaaaaAaAaaaaaaaaaaAa

When mapping binary 1:N relationships where the N side has partial participation, treat the scenario similarly to a binary M:N relationship by creating a separate table for the relationship The key for this new table should be a concatenation of the keys from the related entities, and any attributes associated with the relationship should also be included in this table.

1 State the mapping rule(s) that would be used to map Figure 6.14 Map Figure 6.14 to a relational database and show some sample data

2 State the mapping rule(s) that would be used to map Figure 6.17 Map Figure 6.17 to a relational database and show some sample data.

Chapter Summary

This chapter explores cardinality and participation ratios in ER diagrams, featuring examples and diagrams of binary relationships with structural constraints based on the Chen-like model It emphasizes improved English grammar for each diagram and defines steps 7 and 8 of the ER design methodology The concluding section addresses mapping relationships, highlighting that as the model grows in complexity, we will revisit mapping rules in subsequent chapters to address this intricacy.

Let us reconsider our student example in Exercise 5.2 in which the only attributes of student are aaaaaaaaaaaaaaAand aaaa Now we have another

The HIGH SCHOOL entity represents the educational institution from which a student graduates, encompassing key attributes such as the high school's name and its geographical location, including city and state To visually represent this data structure, an ER diagram will be created using the Chen-like model, ensuring that all components are accurately described in English Following this, the ER diagram will be mapped to a relational database to facilitate data organization and retrieval.

In this scenario, we have a college dormitory represented as a single DORMITORY ROOM entity, which includes attributes for both private and double rooms The STUDENT entity encompasses attributes such as student ID, name, and contact information To visualize this structure, we can create an ER diagram using the Chen-like model, ensuring to include all relevant English descriptions for clarity Subsequently, we can map this ER diagram to a relational database, establishing the relationships between the DORMITORY ROOM and STUDENT entities to facilitate efficient data management.

The STUDENT database comprises two primary entities: Students and Campus Organizations Students possess attributes such as student ID and name, while Organizations include attributes like organization ID and organization name An ER diagram using the Chen-like model illustrates the relationships between these entities, detailing their attributes and connections Following the diagram, the entities are mapped to a relational database structure, which includes tables for Students and Organizations, along with sample data entries for each This structured approach ensures clear representation and organization of the data within the database.

The STUDENT and ADVISOR database consists of two main entities: Students and Advisors Each Student is characterized by attributes such as Student ID, name, and contact information Advisors are defined by their names, office numbers, and the major they specialize in, identified by a major code (e.g., Chemistry as CHEM, Biology as BIOL, and Computer Science as COMPSC) An Entity-Relationship (ER) diagram can be created using the Chen-like model to visually represent these entities and their relationships This diagram should include detailed English descriptions of each component Subsequently, the ER diagram can be mapped to a relational database structure, which will involve defining tables for Students and Advisors, along with their respective attributes, and providing sample data to illustrate the database's functionality.

170 • Database Design Using ER Diagrams

To effectively manage restaurant data, it is essential to create a database that captures key information such as the restaurant name, location, employee names and IDs, restaurant capacity, smoking or nonsmoking areas, hours of operation, employee salaries, and titles Each employee is assigned to only one restaurant, ensuring that every restaurant has at least one employee An Entity-Relationship (ER) diagram, modeled in the Chen style, should be constructed to visually represent these relationships, accompanied by detailed English descriptions Subsequently, this ER diagram must be mapped to a relational database schema, which includes sample data to illustrate the database's structure and functionality.

In the revised exercise 6.5, it is stated that an employee can work for multiple restaurants, while each restaurant must employ at least one individual and may have numerous employees The corresponding Chen-like ER model illustrates this relationship, highlighting the entities of "Employee" and "Restaurant" and their connections The relational mapping further delineates how these entities interact within a database structure, ensuring clarity in the representation of their associations.

In the revised exercise 6.5, each employee is required to work exclusively for one restaurant, while a restaurant can employ multiple individuals This scenario can be represented using a Chen-like Entity-Relationship (ER) model, illustrating the relationship between employees and restaurants, alongside the corresponding relational mapping.

Record the following data in a database: business name, owner, location(s), telephone numbers, delivery truck number, truck capacity, usual route description (e.g., North, West, Central, Lake, ) Draw the

ER diagram using the Chen-like model Present the relational mapping Follow the methodology and include all English descriptions of your diagram

Refer to Figure 6.19 What are the English language statements you can make about the f gure?

ER Diagram of West Florida Mall with Four Entities and Structural Constraints

Refer to Figure 6.18 Complete the diagram by adding a precise English description of each attribute Map Figure 6.18 to a relational database

172 • Database Design Using ER Diagrams

The cardinality of the relationships in the given scenarios can be summarized as follows: In the first case, each student must own exactly one car, and each car can only be owned by one student, indicating a one-to-one relationship In the second scenario, while each student can drive multiple cars, each car can only be driven by one student, establishing a one-to-many relationship from the perspective of the student Lastly, in the third situation, students can rent multiple cars, and each car can be rented by multiple students, which creates a many-to-many relationship between students and cars.

Which of these cardinality rules are optional? Which rules are manda- tory? Show these relationships diagrammatically using the Chen-like ER notation

In the previous chapters, we identified our main entities based on user specifications and defined their relationships This chapter focuses on creating the ER diagram for our case study, following steps 6 and 7 of the ER design methodology, and mapping the diagram to a relational database, complete with sample data.

Step 6 develops the structural constraints of binary relationships

Step 6 State the exact nature of the relationships in structured English from all sides; for example, when a relationship is A:B::1:M, there is a relationship from A(1) to B(M) and from B(M) back to A(1)

First, for the relationship located_in:

From MALL to STORE, this fts pattern 3, 1(full):N:

A mall must have at least one store and can have many stores

ER Diagram of West Florida Mall with Four Entities and Structural Constraints

Malls, recorded in the database, must have many (one or more) stores located in them

From aaaaa to aaaa, this fts pattern 1, M(full):1:

Many stores (one or more) must be in one mall

Stores, recorded in the database, must be in one mall

T e aaaa entity is mapped as mapped in Chapter 5, as shown in Table 6.16

174 • Database Design Using ER Diagrams

Sample Data for aaaa name address

West Florida Mall N Davis Hwy, Pensacola, FL Cordova Mall 9th Avenue, Pensacola, FL

Navy Mall Navy Blvd, Pensacola, FL

BelAir Mall 10th Avenue, Mobile, AL

Sample Data for aaaaa sloc sname snum mall_name

Rm 101 Penneys 1 West Florida Mall

Rm 102 Sears 2 West Florida Mall

Rm 109 Dollar Store 3 West Florida Mall

Rm 110 Rex 4 West Florida Mall

Next, we have to map the relationship between the aaaa entity and the aaaaa entity Tis is a binary 1:N relationship; hence, we use mapping rule 9, which states:

Include the key of the entity on the 1 side of the relationship to the N side as a foreign key

The key from the "aaaa" side will be incorporated as a foreign key in the "N" side, as illustrated by the sample data in Table 6.17.

In aaaaa, the multivalued attribute 'dept' necessitates maintaining a relationship with this attribute, as discussed in Chapter 5 Sample data illustrating this relationship is presented in Table 6.18.

Ten, for the relationship owns:

From aaaaa to aaaaa, this fts pattern 3, 1(full):M:

Owners, recorded in the database, must own one or more stores

One owner must own at least one store and may own many stores

TABLE 6.18 Sample Data for aaaaaaaaaa snum dept

Sample Data for aaaaa With Foreign Key sloc sname snum mall_name so_ssn

Rm 101 Penneys 1 West Florida Mall 879-987-0987

Rm 102 Sears 2 West Florida Mall 928-088-7654

Rm 109 Dollar Store 3 West Florida Mall 826-098-0877

Rm 110 Rex 4 West Florida Mall 982-876-8766

From aaaaa to aaaaa, this fts pattern 1, M(full):1:

Stores, recorded in the database, must have one and only one owner

Many stores can have one owner

In a 1:N relationship between aaaaa and aaaaa, we apply mapping rule 9 by taking the key from the 1 side, aaaaaa, and incorporating it as the foreign key on the N side, represented as aaaaaaAaaaaa This relationship is exemplified with sample data, as illustrated in Table 6.19.

And, the relation for the aaaaa entity remains as developed in Chapter 5 With some sample data, it is shown in Table 6.20

176 • Database Design Using ER Diagrams

Sample Data for aaaaa in Figure 6.13 so_ssn so_name so_off_phone so_address

Pkwy, Pensacola, FL 826-098-0877 Sardar (850)474-9873 109 Navy Blvd,

Pensacola, FL 928-088-7654 Bagui (850)474-9382 89 Highland Heights,

Tampa, FL 982-876-8766 Bush (850)474-9283 987 Middle Tree,

From aaaaa to aaaaaaaaaaaaa, this fts pattern 1, 1(full):1:

Stores, recorded in the database, must have one store manager

Stores must have one store manager and can only have one and only one store manager

From aaaaaaaaaaaaa to aaaaa, this also fts pattern 1, 1(full):1:

Store managers, recorded in the database, must manage one and only one store

Store managers must manage at least one store and can manage only one store

The relationship between aaaaa and aaaaaaaaaaaaa is characterized as a binary 1:1 relationship By applying mapping rule 8, the relation aaaaa can be transformed to include a foreign key derived from the key in aaaaaa aaaaaaa, as illustrated in Table 6.21 with sample data.

Te relation for the aaaaaaaaaaaaa entity remains as developed in Chapter 5 We show this with some sample data, as shown in Table 6.22 Our next step is step 7, which is

Step 7 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Ref ne the diagram, as necessary

Sample Data for aaaaa With Foreign Key sloc sname snum mall_name so_ssn sm_ssn

Sample Data for aaaaaaaaaaaaa sm_ssn sm_name sm_salary

Our relational database has been successfully mapped, excluding the data, with primary keys underlined The key elements include: aaaa (name, address), aaaaa (sloc, sname, snum), mall (name, so, ssn, sm, ssn).

AaaaaaaaaaaA snum dept aaaaa so _ ssn so _ name so _ off _ phone so _ address AaaaaaAaaaaaaaA sm _ ssn sm _ name sm _ salary

178 • Database Design Using ER Diagrams

We continue the development of this case study at the end of the next chapter

1 Modeled afer Elmasri and Navathe (2016)

Batini, C., Ceri, S., and Navathe, S.B (1992) Conceptual Database Design Redwood City,

Earp, R., and Bagui, S (2001) Extending relationships in the entity-relationship diagram

Elmasri, R., and Navathe, S.B (2016) Fundamentals of Database Systems Reading, MA:

Kroenke, D.M (2010) Database Processing Upper Saddle River, NJ: Prentice Hall

McFadden, F.R., and Hofer, J.A (2007) Modern Database Management Upper Saddle

Ramakrishnan, R., and Gehrke, J (2003) Database Management Systems New York: McGraw Hill.

The Weak Entity

Introduction

Chapters 4 and 5 introduced the concepts of the entity, the attribute, and the relationship Chapter 6 dealt with structural constraints, that is, how two entities are related to one another In this chapter, we discuss the con- cept of the “weak” entity, which is used in the Chen-like model Weak entities may not have a key attribute of their own as they are dependent on a strong or regular entity for their existence Strong entities always have a primary key Te weak entity has some restrictions on its use and gener- ates some interesting diagrams Tis chapter revisits and redef nes steps

3 and 4 of the entity-relationship (ER) design methodology to include the concept of the weak entity Grammar and mapping rules for the weak entity are also developed.

Strong and Weak Entities

Identifying a key for an entity can be challenging, particularly in cases involving weak entities, as discussed in Chapter 4 While we have primarily focused on strong entities, which possess easily identifiable keys, it's important to note that a unique identifier can consist of a single attribute or a combination of attributes Examples of unique identifiers for strong entities include student numbers, vehicle identification numbers (VIN), and driver’s license numbers.

A weak entity is an entity that relies on another entity for its existence and typically lacks a unique identifier A classic example of a weak entity is a dependent, which is associated with a primary entity, such as a parent When designing a database, it is essential to recognize the relationship between weak and strong entities to ensure proper data structure and integrity.

180 • Database Design Using ER Diagrams

In the context of database management, the entity representing employees and their dependents illustrates a multivalued attribute Each dependent is directly linked to a specific employee instance, meaning that without the corresponding employee, the dependent would not exist in the database This relationship designates the employee entity as the owner or identifying entity for the dependent, which is classified as a weak entity.

In database design, a weak entity can arise when a dependent name is represented as a multivalued attribute, as illustrated in Figure 7.1 For instance, the data corresponding to such a diagram can be found in Table 7.1, highlighting the relationship between the weak entity and its attributes.

Suppose in our conversations with the user, we discover more information is supposed to be gathered about the dependents Following

Sample Data for aaaaaaaa in Figure 7.1 fname minit lname emp ID dependents

John J Jones 0001 John, Jr; Dumas; Sally Sam S Smith 0004 Brenda; Richard

Adam A Adams 0007 John; Quincy; Maude Santosh C Saha 0009 Ranu; Pradeep; Mala

In our methodology, we recognize that dependents qualify as entities, as we are documenting information about them Therefore, we will define an entity named "aaaaaaaaa." By establishing "aaaaaaaaa" as an entity, we will enhance the diagram from Figure 7.1 to Figure 7.2.

When designing a database, attributes are typically named using singular nouns In this case, the term "dependents" is used in plural form, indicating the presence of a multi-valued attribute within the entity This design choice highlights the importance of recognizing multi-valued attributes, which will be further explored in the finalized database structure.

In the initial phase of database design, the identification of user requirements may lead to the inclusion of multi-valued attributes in the ER diagram However, these complexities will be addressed during the normalization process, which occurs after the ER diagram is converted into a relational database.

Figure 7.2 illustrates a challenge where the entity is reliant on another entity for its existence, resulting in it being classified as a weak entity This weak entity lacks a distinct identifier, as its attributes do not provide unique values Consequently, none of these attributes qualify as candidate keys on their own, highlighting the absence of a single attribute that can serve as a candidate key.

In the Chen-like model, weak entities are represented within a double box, while their corresponding relationships to owner entities are depicted in a double diamond For instance, in Figure 7.3, the weak entity is identified by its owner entity, known as the identifying entity This relationship highlights the dependency of the weak entity on its owner for identification.

Weak entities are treated similarly to strong entities regarding attributes, although weak entities do not have primary keys Figure 7.3 illustrates several attributes associated with these entities.

A dependent must be related to one employee, and an employee may have many dependents

A weak entity relies on a strong entity, which results in a mandatory participation constraint and a cardinality of one in the initial segment of the grammatical description.

T e aaaaaaaaa entity has the attributes aaaaaaAaaaaaaaaaa, and aaaaaaaaa

When managing weak entities, it is essential to determine how each instance can be identified The owner of the weak entity, referred to as aaaaaaaaa, is linked to the strong entity aaaaaaaa Therefore, the identification process combines the key of aaaaaaaa with specific information from the weak entity aaaaaaaaa The attribute aaaaa serves as a potential identifier for aaaaaaaaa and is designated as a partial key.

In Figure 7.3, the attribute "aaaaa" is highlighted with dashes, indicating that it serves as a partial key that identifies dependents but does not do so uniquely It is assumed that all dependents possess unique names, although the relationship itself has not been explicitly named.

T e aaaaaaaaaaaaaaaaaaAER Diagram and lef it as ED for aaaaaaaaaaaaaaaaaa Suitable names for the relationship might be “have” as in:

Employees may have many dependents or “dependent on” as in:

Employees may have many dependents dependent on them

184 • Database Design Using ER Diagrams

We could also have used “related to” as in:

Employees are related to many dependents

Many verb phrases, such as "dependent on" and "related to," can appear redundant or misleading Therefore, it may be more effective to leave the relationship unspecified If a user opts to define the relationship with a verb, the analyst can then rename it accordingly.

Weak Entities and Structural Constraints

Weak entities require full participation from their associated strong entities to ensure that each weak entity is uniquely identified within the database Without this total participation, there could be instances where a weak entity, such as a dependent, exists without a direct link to a strong entity, like an employee Typically, the relationship between weak and strong entities is characterized by a one-to-many (1:M) cardinality, although this is not a strict rule.

Weak Entities and the Identifying Owner

In certain scenarios, a weak entity can be linked to an owner entity while also having additional relationships For instance, in a given example, two relationships—“owns” and “drives”—connect the entities involved The weak entity, in this case, is the automobile, which relies on the existence of an employee; without an employee, the automobile cannot exist in the database The automobile's identification is tied to its owner, as indicated by the double diamond symbol representing the “owns” relationship, highlighting the full participation of the weak entity in this relationship.

In Figure 7.4, a "drives" relationship is illustrated, indicating that automobiles can be operated by employees other than their owners This highlights that every automobile is driven by at least one employee, demonstrating full participation in the driving relationship between the vehicle and the driver-employee.

A Weak Entity with Two Relationships necessarily be the actual owner To identify aaaaaaaaaa, we are saying we need the owns relationship, but other non-owner drivers may exist

In Figure 7.4, it is illustrated that an employee can own multiple automobiles To determine the specific automobiles owned by an employee, we need to consider not only the employee's ID but also the make, model, and color of the vehicles The attributes of the automobile entity, specifically the make, model, and color, are designated as partial keys (indicated by dotted underlining in Figure 7.4) because they do not uniquely identify an automobile on their own.

186 • Database Design Using ER Diagrams

The vehicle identification number (VIN) of a weak entity is crucial, as it serves as a unique identifier While the weak entity can possess a primary key, it remains classified as weak because it relies on the presence of its owner or driver employee for its existence in the database Thus, having a primary key does not automatically qualify an entity as "strong."

2 How would you identify a strong entity?

3 How would you identify a weak entity?

4 What kind of a relationship line (single or double) would be leading up to the weak entity in a Chen-like diagram?

6 What would the structural constraints of a weak entity generally be?

7 What would the cardinality of a weak entity generally be?

7.4.1 Another Example of a Weak Entity and the Identifying Owner

In the ER diagram illustrated in Figure 7.5, there are two strong entities, aaaaaa and aaa, alongside one weak entity, aaa The diagram demonstrates that the strong entity aaaaaa owns the weak entity aaa, establishing a relationship where aaaaaa is the identifying or controlling entity for aaa This ownership relationship is depicted with a double diamond, indicating that it is a weak relationship Consequently, aaa is classified as a weak entity that is dependent on the strong entity aaaaaa.

The relationship between treats and aaa does not exhibit a double diamond structure, as aaa does not own aaa In this context, treats represents a strong relationship, while aaa is considered a strong entity in relation to aaa.

Weak Entities Connected to Other

We would like to make a fnal point regarding weak entities Just because an entity is weak does not preclude it from being an owner of another weak

The diagram illustrates a weak entity relationship, specifically highlighting the inclusion of dependents' hobbies, as shown in Figure 7.6 While the necessity of recording such information may be questioned, it is assumed that the user requires it The entity representing hobbies can be categorized by their types, such as stamp collecting, baseball, or tying knots.

188 • Database Design Using ER Diagrams

T e aaaaaaaaaaaaaaaaaaaaaaaa ER Diagram observing trains, etc.) T e aaaa attribute of aaaaa is a partial key for aaaaa

T e entity aaaaaaaaa is the owner of the weak entity aaaaa, and the entity aaaaaaaa is the owner of the weak entity aaaaaaaaa

This situation is highlighted to demonstrate its possible existence, and as we delve deeper into this topic, it is essential to approach this unique scenario with caution.

1 Can a weak entity be dependent on another weak entity?

2 Can a weak entity have a relationship that is not “weak” with the identifying entity?

3 Can a weak entity be related to more than one entity (strong or weak)?

Revisiting the Methodology

The incorporation of a weak entity in an ER diagram necessitates a reevaluation of our methodology and adjustments to our approach We may identify the weak entity in one of two scenarios, as demonstrated by the transformation of the multivalued attribute known as the “dependent,” which occurs during steps 3a and 3b.

Step 3 Examine attributes in the primary entity (possibly with user assistance) to find out if information about one of the attributes is to be recorded

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Define the relationship back to the original entity

In Step 3c, when a new entity relies completely on another entity for its existence, it should be represented as a weak entity, indicated by a double box The relationship to the identifying entity is denoted with a double diamond, signifying full participation of the weak entity in the relationship Additionally, the partial key identifier(s) within the weak entity should be highlighted with a dashed underline.

Te second place a weak entity might appear would be as part of step 4 when new entities are considered:

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat step 2 to see if any attributes should be further split into more entities

190 • Database Design Using ER Diagrams

In cases where additional entities lack candidate keys, they should be represented as weak entities, as outlined in step 3c, and connected to an identifying entity The weak entity's participation in the relationship is considered full or mandatory, and the partial key identifiers within the weak entity should be indicated with a dash or dot underline.

Again, a weak entity cannot exist without an identifying entity If the weak entity is “discovered” independent of an identifying entity, the rela- tionship connection should be made immediately.

Weak Entity Grammar

In our previous discussion, we explored the grammar related to weak entities Now, we aim to refine our understanding of weak entities that lack a primary key While a weak entity can possess a primary key, it may also be categorized under item (b) To address this, we introduce part (c) to the grammar concerning keys.

(a) More than one candidate key (strong entity):

One key is chosen as the primary key (covered previously) (b) One candidate key (strong or weak entity):

Te primary key is the candidate key For each weak entity, it is assumed no weak entity will be recorded without a corresponding owner (strong) entity

For each weak entity with a primary key, we also must record the pri- mary key of the owner entity

(c) No candidate keys (weak entity):

For each (weak) entity, we do not assume any attribute will be unique enough to identify individual entities

In this case, the aaaaaaaaa entity would be depicted as:

For each aaaaaaaaa entity, we do not assume any attribute will be unique enough to identify individual entities

We now enhance this description to include the identifying entity:

A weak entity lacks a candidate key and is identified through the keys of its associated strong entity Each weak entity is defined by the combination of its partial key and the primary key of its owner, such as DEPENDENT.DEP_NAME concatenated with EMPLOYEE.EMPLOYEE_ID.

In this case, the aaaaaaaaa entity is identifed by the aaaaaaaa entity, and this second statement becomes:

Since the aaaaaaaaa entity does not have a candidate key, each aaaaaaaaa entity will be identifed by key(s) belonging to the aaaaaaaa entity plus aaaaa in the aaaaaaaaa entity.

Mapping Weak Entities to a

In this section, we develop the mapping rules for mapping weak entities to a relational database

Mapping Rule 11—Mapping weak entities Develop a new table

(relation) for each weak entity As is the case with the strong entity, include any atomic attributes from the weak entity in the table

When dealing with composite attributes, it is essential to include only the atomic components while ensuring they are properly qualified to maintain the integrity of the information Additionally, to establish a relationship between a weak entity and its owner, the primary key of the owner entity must be incorporated into the weak relationship.

Te primary key of the weak relation will be the partial key of the weak entity concatenated to the primary key of the owner entity

192 • Database Design Using ER Diagrams

When a weak entity possesses other weak entities, it is essential to first map the weak entity associated with the strong entity This requires defining the key of the weak owner-entity prior to mapping the weaker entity that is further removed from the strong entity.

For example, refer to Figure 7.3 Te aaaaaaaa relation and aaaaaaaaa relation would be mapped as aaaaaaaaaAaaaaaAaAAaaaaaAaAAaaaaaAaAAAaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaAAaaaaaAa aaaaaaaaaAaAAaaaaaaaaaaAa and with data shown in Tables 7.2 and 7.3

In this context, aaaaaaaaaaa serves as the primary key for the aaaaaaaa table The aaaaaaaaa aa from the owner relation aaaaaaaa is incorporated into the weak relation aaaaaaaaa Consequently, aaaaaaaaaaa is now included in the primary key of aaaaaaaaa Given that aaaaa is the partial key of the aaaaaaaaa relation, the final primary key for the aaaaaaaaa relation is the concatenation of aaaaa and aaaaaaaaaaa.

Now, refer to Figure 7.6 Here, the aaaaaaaaa entity is dependent on the aaaaaaaa entity, and the aaaaa entity is dependent on the

Sample Data of aaaaaaaa in Figure 7.3 fname lname minit employee_id

Sample Data of aaaaaaaaa in Figure 7.3 dname birth_date insurance employee_id

Sample Data of aaaaa in Figure 7.6 dname employee_id type years_involved

Abbie 384 singing 4 aaaaaaaaa entity Te aaaaaaaa relation and aaaaaaaaa rela- tion would be mapped as shown, and then the aaaaa relation would be mapped as: aaaaaaaaaaaaAAaaaaaaaaaaaAaAAaaaaAa aaaaaaaaaaaaaaa

And, with some sample data as shown in Table 7.4

Te partial key of aaaaa was aaaa Te primary key of the aaaaa relation now becomes aaaaaAaAAaaaaaaaaaaa, and aaaa all together

1 What are the rules for mapping weak entities? Map Figure 7.5 and show some sample data

2 When mapping weak entities, what becomes their new primary key?

3 How would you map multivalued attributes in a weak entity? Explain with an example

4 How are weak entities generally identifed?

Chapter Summary

This chapter explores the concept of weak entities, enhancing the grammar associated with them and refining the ER design methodology It also establishes mapping rules specifically for weak entities While the weak entity concept is present in the Chen-like model, it is approached differently in other ER models.

194 • Database Design Using ER Diagrams

Exercises

To create an ER diagram using a Chen-like model for a database, include attributes such as employee name, employee number, employee address, and skills, noting that an employee can possess multiple skills Enhance the diagram by adding details like skill level, certification date (if applicable), and the date the skill was first utilized Assess whether there are any weak entities within this database structure Finally, translate the ER diagram into a relational database format, ensuring proper representation of the relationships and attributes.

To create an ER diagram for sports and players, define the attributes of the SPORT entity, which include sport name, type of sport, and whether it is timed or untimed The PLAYERS entity should include attributes such as name, person ID, and date of birth Since players can participate in multiple sports, it is essential to represent this many-to-many relationship In this scenario, the PLAYERS entity may be considered weak, as its existence relies on the association with the SPORT entity.

ER diagram Map this ER diagram to a relational database

What mapping rules would be used to map Figure 7.4? Map Figure 7.4 to a relational database and show some sample data

To map Figure 7.6 to a relational database, create the following tables and attributes: TABLE1 (attribute1, attribute2, attribute3), TABLE2 (attribute1, attribute2, attribute3), TABLE3 (attribute1, attribute2, attribute3), ensuring to include the appropriate keys for each table.

“Write out” the cardinality and the participation for Figures 6.3 and 7.5 Make sure that the cardinality and participation ratios are written out from both sides

In the previous chapters, we selected our primary entities, defned the attri- butes and relationships for this case study, and mapped it to a relational

In Chapter 6, we analyzed the structural constraints of relationships within the database, making necessary adjustments to the mappings This process culminated in identifying the characteristics of a weak entity, which is essential for understanding database design and integrity.

Step 7 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Refine the diagram as necessary

Suppose we obtained some additional input from the user:

A store is required to have one or more departments, as departments cannot exist independently Each department is characterized by its name, number, and manager, and must have at least one employee assigned to it.

To effectively manage employee information in the store, it is essential to maintain accurate records for each staff member, including their name, Social Security number, and designated department Each employee is assigned to one specific department, ensuring clear organizational structure and accountability within the workplace.

In Chapter 5, we established that departments were a multivalued attribute of stores, indicating that one store can have multiple departments Upon further review, it has become clear that a new entity, aaaaaaaaaa, must be created to record specific information about it Additionally, we need to account for another new entity, aaaaaaaa Consequently, these updated specifications introduce two new strong entities, aaaaaaaaaa and aaaaaaaa, each with their respective keys, aaaaaaaaaaaaaaaaa and aaaaaaaaaaa.

First, we select an entity, aaaaaaaaaa

Now, repeating step 2 for aaaaaaaaaa:

Tis database records data about a aaaaaaaaaa

For each aaaaaaaaaa in the database, we record a department name (aaaaa) and department number (aaaa)

For each aaaaaaaaaa there will be one and only one aaaaa T e value for aaaaa will not be subdivided

196 • Database Design Using ER Diagrams

For each aaaaaaaaaa, there will be one and only one aaaa T e value for aaaa will not be subdivided

To connect each aaaaaaaaaa with the corresponding owner-entity, it is essential to link the department number to the owner entity using its unique key, aaaaaaaa Consequently, the key for aaaaaaaaaa will be formulated as store_ID combined with dnum.

Note, the language leads you to think of aaaaaaaaaa as a weak entity Next, we select our next entity, aaaaaaaa

Now, repeating step 2 for aaaaaaaa:

Tis database records data about an aaaaaaaa

For each aaaaaaaa in the database, we record an employee name (aaaaa) and employee Social Security number (aaaa)

For each aaaaaaaa, there will be one and only one aaaaa T e value for aaaaa will not be subdivided

For each aaaaaaaa, there will be one and only one aaaa T e value for aaaa will not be subdivided

For each aaaaaaaa, we will assume the aaaa will be unique (so aaaaaaaa will be a strong entity).

Tese entities have been added to the diagram in Figure 7.7

Using step 6 to determine the structural constraints of relationships, we get:

First, for the relationship, dept_of:

From aaaaa to aaaaaaaaaa, this fts pattern 3, 1(full):N:

Stores, recorded in the database, must have many (one or more) departments

ER Diagram of West Florida Mall Developed So Far.

From aaaaaaaaaa to aaaaa, this fts pattern 1, M(full):1:

Many departments (one or more) must be in one store

The relationship between aaaaa and aaaaaaaaaa can be classified as either strong or weak According to mapping rule 11, we will incorporate the key from the owner, aaaa, into the weak side, aaaaaaaaaa Consequently, the relation for aaaaaaaaaa is modified to: aaaaaaaaaaaaaaaaAaAAaaaaAaAAaaaaa.

And, with some data as shown in Table 7.5

198 • Database Design Using ER Diagrams

TABLE 7.5 Sample Data for aaaaaaaaaa in Figure 7.7 dname dnum snum

Tall Men’s Clothing 501 1 Men’s Clothing 502 1 Women’s Clothing 503 1 Children’s Clothing 504 1 Men’s Clothing 601 2

The aaaaa table remains unchanged from Chapter 6, but the store_depts relation is no longer necessary Initially, aaaaaaaaaaa were considered a multivalued attribute of STORE, leading to the existence of both the aaaaa and store_depts relations However, based on the specifications provided at the start of this case study, it is clear that aaaaaaaaaa functions as an independent entity, resulting in the replacement of the store_depts relation with the aaaaaaaaaa table.

Ten, for the relationship works_for:

From aaaaaaaa to aaaaaaaaaa, this fts pattern 1, 1(full):1:

Employees, recorded in the database, must work for one and only one department

From aaaaaaaaaa to aaaaaaaa, this fts pattern 3, 1(full):N:

Departments, recorded in the database, must have one or more employ- ees working for it

In a 1:1 relationship between aaaaaaaa and aaaaaaaaaa, both sides actively participate, allowing us to choose which side will store the other's key based on mapping rule 8 However, since the relationship is a binary 1(full):N relationship, we apply mapping rule 9 to extract the key from the 1 side (aaaaaaaaaa) and concatenate it with aaaa and aaaa This concatenated key then serves as the foreign key on the N side (aaaaaaaa), resulting in the relationship being represented as aaaaaaaaaaaaaaAaAAAaaaaAaAAaaaaAaAAaaaaAa.

AaaaaA AaaaaaA AaaaaA AaaaaaaaaaAA AAaaaaaaaaA AaaaaaaA aaaaa

We continue the development of this case study at the end of the next chapter

Sample Data for aaaaaaaa in Figure 7.7 ename essn dnum snum

Kaitlyn 987-754-9865 501 1 Dumas 276-263-9182 502 1 Katie 98-928-2726 503 1 Seema 837-937-9373 501 1 Raju 988-876-3434 601 2

And, with some sample data, as shown in Table 7.6

In summary, our relational database has so far been mapped to (without the data)

200 • Database Design Using ER Diagrams

Chen, P.P (1976) Te entity-relationship model—toward a unifed view of data ACM Transactions on Database Systems , 1(1)

Connolly, E., and Begg, C (2009) Database Systems, a Practical Approach to Design, Implementation, and Management Reading, MA: Addison-Wesley

Elmasri, R., and Navathe, S.B (2016) Fundamentals of Database Systems Reading, MA:

Ramakrishnan, R., and Gehrke, J (2003) Database Management Systems New York: McGraw Hill.

Further Extensions for ER Diagrams

Introduction

This chapter extends the basic entity-relationship (ER) model by introducing attributes of relationships and providing various examples It revisits step 6 of the ER design methodology to incorporate these attributes and explores how additional entities and relationships can be integrated into the ER model, allowing attributes and relationships to evolve into entities, including the creation of intersection entities The chapter also presents the grammar and structured English for intersection entities and introduces recursive relationships Furthermore, it examines scenarios where two entities can have multiple relationships, leading to a redefinition of step 5 in the ER design methodology The discussion includes derived and redundant relationships, refining the methodology with the addition of step 6b Lastly, an optional section presents an alternative ER notation for specifying structural constraints on relationships.

Attributes of Relationships

In Chapter 6, we considered the M:N relationship aaaaaaaaaaaaaa

T e aaaaaaaaaaaaaa relationship is M:N because students take many

In the context of Database Design Using ER Diagrams, adding the attribute "aaaaa" requires careful consideration of its placement within the diagram If "aaaaa" is associated with the "aaaaaaa" entity, it results in a multivalued attribute that necessitates a relationship with the "aaaaaa" entity for clarity Conversely, placing "aaaaa" with the "aaaaaa" entity also creates a multivalued attribute The optimal placement for "aaaaa" is within the "enroll" relationship, as it inherently connects both the "aaaaaaa" and "aaaaaa" entities For a visual representation, refer to Figure 8.1, which illustrates the correct positioning of the "aaaaa" attribute in an M:N, full:full participation model.

Figure 8.1 includes additional attributes that illustrate the relative positioning of these attributes Since aaaaa is identified by both aaaaaaa and aaaaaa, it cannot exist independently within either entity.

An attribute like aaaaa is called a relationship attribute or intersection attribute

During the design process, an intersection attribute may initially appear as a multivalued attribute on an entity, but its necessity can later be questioned, particularly when it requires another entity for identification Once it becomes clear that the attribute needs to be defined by multiple entities, it is then relocated to the relationship connecting those entities.

Relationship attributes can be represented in an Entity-Relationship (ER) diagram with various cardinalities, but they are most commonly found in binary, many-to-many (M:N) models It is essential to refine our methodology to include guidelines for identifying and defining the attributes of a relationship.

Step 6 State the exact nature of the relationships in structured English from all sides, for example, if a relationship is A:B::1:M, then there is a relationship from A to B, 1 to Many, and from B back to A, Many to 1

In Step 6a, review the list of attributes to identify any that require recognition by two or more entities If such attributes are found, assign them to the relevant relationship that connects the entities involved.

Te grammar to describe the attribute of a relationship is discussed next

Further Extensions for ER Diagrams • 203

Atomic attributes, composite attributes, multivalued attributes, and attributes of relationships are defined similarly to previous chapters, ensuring consistency in understanding their roles within data structures.

For the relationship between ENTITY1 and ENTITY2, we will record an att(j) T e att(j) depends on both entities ENTITY1 and ENTITY2 for identif cation

204 • Database Design Using ER Diagrams

For the relationship between the aaaaaaa entity and the aaaaaa entity, we will record a aaaaa attribute The aaaaa attribute depends on both aaaaaaa and aaaaaa entities for identifi cation.

Relationships Developing into Entities

In our previous discussion, we defined the M:N relationship and highlighted the presence of an attribute that should be linked to the relationship rather than to a single entity For instance, the attribute "aaaaa" does not align with either the "aaaaaaa" or the "aaaaa" entities, suggesting that the relationship itself possesses an "entity-like" quality.

There are two ways to represent the relationship-attribute situation One approach is to keep the attributes attached to the relationship itself If there are only one or two attributes, the clarity of the diagram remains intact, ensuring that users can easily understand the grammatical representation.

An alternative approach to defining relationship attributes is to treat the relationship as an entity, linking it to both identifying entities This method is illustrated in Figure 8.2, where the central entity is shown as weak, indicating its complete dependence on the strong entities It's important to note that the participation between this weak "intersection entity" and the strong entities is always full, as the weak entity requires a corresponding strong entity for its existence.

A partial relationship in the database indicates that certain classes exist but are not offered, resulting in no students enrolled in them In contrast, a full relationship would mean all classes are available and populated with students.

Now, with a aaaaaaaaaaaaaa entity (an intersecting entity), our gram- matical description of this intersecting entity would be as discussed next

Further Extensions for ER Diagrams • 205

M:N Relationship Tat Has Been Replaced with by 1:M Relationships

Tis database records data about aaaaaaaaaaaaaa combinations: aaaaaaaaaaaaaa For each aaaaaaaaaaaaaa in the database, we record a aaaaa

206 • Database Design Using ER Diagrams

For each aaaaaaaaaaaaaa combination, there will be one and only one aaaaa Te value for aaaaa will not be subdivided

(d) Intersecting entity: Te key of the intersection entity will consist of the concatenation of the foreign keys of the owner entities

In the given example, the intersection entity includes two foreign keys, aaaaaaaaaa and aaaaaaaaa The primary key for this entity is represented as aaaaaaaaaaAaAaaaaaaaaa, indicating that these attributes must be concatenated Both attributes are essential for uniquely identifying each row in the database.

Te last statement is very close (and for a user, it is hoped to be indistin- guishable) from the key statements found in the “attribute on a relation- ship” grammar given:

For the relationship between aaaaaaa and aaaaaa, we will record a aaaaa T e aaaaa depends on both entities aaaaaaa and aaaaaa for identif cation.

More Entities and Relationships

In database management, it is essential to model the information accurately, particularly when dealing with multiple entities and binary relationships A binary relationship refers to the connection between two entities While this section focuses on scenarios that require the inclusion of additional entities, all relationships will remain binary For more complex relationship combinations, such as ternary and higher, refer to Chapter 9.

Let us again reconsider the aaaaaaaaaaaaaa ER diagram (Figure 8.1) If this database were oriented toward a college, the courses would

Further extensions for ER diagrams involve incorporating instructors who are associated with specific courses Following our methodology steps 4 and 5, we can enhance our database by integrating these relationships effectively.

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat step 2 to see if this entity should be further split into more entities

Step 5 Connect entities with relationships (one or more) if rela- tionships exist

By incorporating instructors into the diagram, we illustrate the relationship where instructors teach multiple courses, and each course is assigned to a specific instructor While the exact participation may vary based on real-world scenarios, for our example, we will define this relationship clearly.

8.4.1.1 Pattern 4—x:y::1:M, From the 1 Side, Partial Participation Short version: An instructor may teach many courses which actually means:

Longer version: An instructor, but not necessarily all instructors (recorded in the database), may teach many (one or more) courses Some instruc- tors may not teach courses

8.4.1.2 Pattern 1—x:y::M:1, From the M Side, Full Participation

Short: Courses must be taught by instructors which means:

Long: Courses, recorded in the database, must be taught by one and only one instructor No course is taught by more than one instructor

In this diagram ( Figure 8.3 ), the aaaaaaaaaa entity is related to the aaaaaa entity Tere could be a relationship between the

208 • Database Design Using ER Diagrams

An ER Diagram depicting a database showcases primary keys and specific entities, with Figure 8.3 illustrating the exclusive relationships considered While alternative relationships such as advisor, mentor, counselor, and coach could be proposed, the focus remains on modeling only the existing relationships rather than potential ones.

Further Extensions for ER Diagrams • 209 be We assume the diagram represents the information given by a user to an analysist and only the information given

8.4.2 Adding More Attributes That Evolve into Entities

Incorporating buildings into our database involves recognizing them as attributes of entities such as students, courses, and instructors Initially, buildings are not considered entities since there is no intention to record detailed information about them However, as our needs evolve, we may decide to treat buildings as entities, necessitating the establishment of relationships with other entities This progression is illustrated in the ER diagrams, where we first depict buildings as attributes and later as entities with key attributes Throughout this process, it is crucial to continually assess whether each item in the ER diagram warrants detailed recording, ultimately leading to a more comprehensive design that includes additional attributes and cardinalities.

210 • Database Design Using ER Diagrams

An ER Diagram (with Only Primary Keys) Showing a aaaaaaaaaaaaaaaaaaaaaaaaa Database with Building Attributes

Further Extensions for ER Diagrams • 211

ER Diagram (with Only Primary Keys) Showing a aaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaa database

212 • Database Design Using ER Diagrams

ER Diagram Showing a aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Database

1 In Figure 8.6, why is aaaaaaaa an entity and not an attribute of another entity?

2 Why is the aaaaaaaaaaa attribute attached to the lives_in relationship rather than the aaaaaaa entity?

3 What will make you decide whether an attribute should be connected to aaaaaaa or aaaaaaa or on the relationship con- necting aaaaaaa and aaaaaaa?

4 Why are all the lines leaving aaaaaaaa (on Figure 8.6) single lines (partial participation)?

5 According to Figure 8.6, does a student have to enroll in a course?

6 According to Figure 8.6, how many courses can an instructor teach?

Further Extensions for ER Diagrams • 213

7 According to Figure 8.6, does an instructor have to teach a course?

8 According to Figure 8.6, does a course have to be taught by an instructor?

9 According to Figure 8.6, a course can be taught by how many instructors?

More Evolution of the Database

Upon reviewing the ER diagram in Figure 8.6, users may question the absence of certain attributes, such as why a specific attribute is not part of the class relationship or why there is no designation for the office relationship There could be multiple reasons for these omissions.

(a) Tis information was not mentioned in the analysis stage

(b) Te data is not necessary (there may be only one classroom per build- ing, or ofce numbers may not be recorded for advisors)

(c) It was an oversight, and the data should be added

In the context of identifying relationships among various entities, it is crucial to associate room numbers with instructors, buildings, courses, and students To enhance clarity and organization, we can evolve the existing diagram to better represent these connections, as illustrated in Figure 8.7.

In Figure 8.7, we have room number as a relationship attribute In this case, we have also added information attached to aaaaaaaaa aaaaa aaaaaaaAaaaaaaaaaaAaaaaaaaaaaaAaaaaaaaaaaaAandAaaaAaaA

8.6 ATTRIBUTES THAT EVOLVE INTO ENTITIES

In this section, we emphasize the importance of modeling "what is" rather than "what might be," highlighting how an attribute can evolve into an entity When tasked with designing a database based on user-provided data, such as course name, course number, credit hours, instructor, and book, the initial Entity-Relationship (ER) diagram serves as a foundational representation of these elements, as illustrated in Figure 8.8.

Why “might look like ”? Te answer lies in eliciting correct require- ments from our user

214 • Database Design Using ER Diagrams

ER Diagram Showing a aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Database with room_number Added to the Relationships Where Needed

A single-entity ER diagram could represent a portion of the database if all relevant information were included However, it's important to recognize that what we consider attributes may also be entities For instance, both the instructor and the book could be represented as entities in the database design We have already established that the instructor should be treated as an entity.

To enhance our database, we should incorporate additional information about instructors This would involve not only documenting basic details but also including attributes such as qualifications, teaching experience, and areas of expertise By doing so, we can create a more comprehensive profile for each instructor, ultimately improving the quality of our educational offerings.

Further Extensions for ER Diagrams • 215

The COURSE entity, featuring specific attributes, has achieved the terminal degree Incorporating additional information about the aaaaaaaaaa, the corresponding section of the ER diagram will include two entities, as illustrated in Figure 8.9.

In Figure 8.9, the entity is classified as weak due to its reliance on another entity, and it is assumed that instructor names may not be unique If instructors could be uniquely identified by an attribute such as a Social Security number and could exist independently of courses, the entity would then be considered strong, as illustrated in Figure 8.10 This section emphasizes that an entity cannot be deemed valid simply based on the desire to record information in the future; there must be a clear intent to include the data associated with the entity Additionally, whether an entity is classified as weak or strong depends on the information provided It is crucial to determine how instructors will be uniquely identified.

If there were no plans to include information about instructors, Figure 8.8 could effectively represent the database structure As an exercise, consider extending Figure 8.10 to incorporate 'aaaa' as an additional entity.

216 • Database Design Using ER Diagrams

ER Diagram of the aaaaaaaaaaaaaaaaa Database with aaaaaaaaaa as a Weak Entity

In a recursive relationship, the same entity participates more than once in diferent roles Recursive relationships are also sometimes called unary relationships

In a company's human resources department, each employee is identified by a unique employee number and name This department not only serves as a collective entity for all staff members but also facilitates relationships among individuals within the personnel framework.

Further Extensions for ER Diagrams • 217

The ER diagram for the aaaaaaaaaaaaaaaaa database illustrates a strong entity, represented by aaaaaaaaaa, focusing on the employee-supervisor relationship To effectively depict this relationship with a single entity, refer to Figure 8.11 for a visual representation.

Figure 8.11 illustrates the aaaaaaaaa entity along with its attributes The relationship "supervise" is established, linking aaaaaaaaa at both ends This relationship has a cardinality of 1:N, indicating that a supervisor can oversee multiple employees, while each employee reports to a single supervisor Additionally, there is partial participation from the supervisor's side in this relationship.

218 • Database Design Using ER Diagrams

In a classic recursive relationship, not all personnel act as supervisors, as one employee can oversee multiple others While most employees report to a single supervisor, certain individuals occupy the highest position in the hierarchy without direct supervision This structure illustrates a hierarchy where the top level is devoid of oversight, reflecting the nature of recursive relationships Furthermore, all hierarchies maintain a partial-partial relationship, emphasizing the complexity of supervisory dynamics within organizations.

So, when a relationship between individuals arises within the same entity set, it would be improper to have two entities since most of the

To avoid redundancy in a database, it's essential to recognize that entities containing similar information should not be duplicated For instance, if two separate entities were created instead of utilizing a recursive relationship, an employee's details would be recorded in multiple locations, leading to potential inconsistencies.

8.7.1 Recursive Relationships and Structural Constraints

Recursive relationships exhibit partial participation, with cardinalities that can be one-to-one, one-to-many, or many-to-many Full participation in such a relationship implies that every instance of an entity interacts with itself, which is illogical To illustrate these cardinalities in recursive relationships, we will explore examples using a human resources database.

8.7.1.1 One-to-One Recursive Relationship (Partial

Figure 8.12 illustrates an entity, aaaaaaaaa, which is connected to itself via a "married_to" relationship This indicates that an individual in this database can be married to one other person within the same database Notably, this example demonstrates a relationship that does not form a hierarchical structure.

Some instances of this relationship are shown in Figure 8.13 From Figure 8.13, we can see that Seema is married to Dev Anand, Arpan is married to Rekha, and so on

One-to-One Recursive Relationship (Partial Participation on Both Sides)

220 • Database Design Using ER Diagrams

Instances of One-to-One Recursive Relationship (Partial Participation on Both Sides)

8.7.1.2 One-to-Many Recursive Relationship

(Partial Participation on Both Sides)

Recursive Relationships

In a recursive relationship, the same entity participates more than once in diferent roles Recursive relationships are also sometimes called unary relationships

In a company's human resources department, each employee is identified by unique attributes such as an employee number and name This department serves not only as a collective entity for all staff but also facilitates various relationships among individuals within the personnel framework.

Further Extensions for ER Diagrams • 217

The ER Diagram of the aaaaaaaaaaaaaaaaa Database features aaaaaaaaaa as a strong entity, illustrating the employee-supervisor relationship To represent this relationship with a single entity, refer to Figure 8.11 for a clear depiction.

Figure 8.11 illustrates the aaaaaaaaa entity along with its attributes The relationship "supervise" is introduced, linking aaaaaaaaa at both ends This relationship exhibits a cardinality of 1:N, indicating that a supervisor can oversee multiple employees, while each employee has a single supervisor Additionally, there is partial participation from the supervisor side in this relationship.

218 • Database Design Using ER Diagrams

In a classic recursive relationship, not all personnel serve as supervisors; an employee may oversee multiple others while only partially participating in supervision Typically, most employees report to a single supervisor, yet some hold the top position in the hierarchy without a supervisor themselves Recursive relationships effectively illustrate this hierarchy, where every structure has a peak position devoid of supervision from a database perspective Moreover, all hierarchies are inherently partial-partial in nature.

So, when a relationship between individuals arises within the same entity set, it would be improper to have two entities since most of the

To avoid redundancy in database design, it is essential to utilize recursive relationships instead of creating separate entities for similar information For instance, if two distinct entities were established for employees, it would lead to duplicate records, with the same employee being listed in multiple locations within the database.

8.7.1 Recursive Relationships and Structural Constraints

Recursive relationships exhibit partial participation, allowing for cardinalities of one-to-one, one-to-many, and many-to-many Full participation in such relationships is illogical, as it implies that every instance of an entity relates to itself To illustrate these cardinalities in recursive relationships, we will refer to examples from a human resources database.

8.7.1.1 One-to-One Recursive Relationship (Partial

Figure 8.12 illustrates an entity, aaaaaaaaa, connected to itself via a married_to relationship This indicates that an individual within this database can be married to another person also listed in the same database Notably, this relationship does not represent a hierarchical structure.

Some instances of this relationship are shown in Figure 8.13 From Figure 8.13, we can see that Seema is married to Dev Anand, Arpan is married to Rekha, and so on

One-to-One Recursive Relationship (Partial Participation on Both Sides)

220 • Database Design Using ER Diagrams

Instances of One-to-One Recursive Relationship (Partial Participation on Both Sides)

8.7.1.2 One-to-Many Recursive Relationship

(Partial Participation on Both Sides)

The one-to-many recursive relationship, characterized by partial participation on both sides, is the most prevalent form of recursive relationship cardinality A typical example of this relationship is when one employee supervises multiple other employees, illustrating a hierarchical structure that is inherently partial on both ends.

Figure 8.15 illustrates the supervisory relationships within the team, highlighting that Tom Smith oversees Sudip Bagui and Tim Vaney, while Rishi Kapoor supervises Mala Saha and Korak Gupta Additionally, Korak Gupta is responsible for supervising Roop Mukerjee, demonstrating the hierarchical structure of supervision in the organization.

8.7.1.3 Many-to-Many Recursive Relationship

In a many-to-many recursive relationship, courses can serve as prerequisites for zero or more other courses, illustrating a network of interrelated subjects rather than a strict hierarchy This concept is visually represented in Figure 8.16.

Further Extensions for ER Diagrams • 221

One-to-Many Recursive Relationship (Partial Participation on Both Sides)

Instances of One-to-Many Recursive Relationship (Partial Participation on Both Sides)

222 • Database Design Using ER Diagrams

Many-to-Many Recursive Relationship (Partial Participation on Both Sides).

Multiple Relationships

So far, we have mostly discussed two entities with one relationship Tis section discusses how two entities can have more than one binary relationship

The diagram features two entities: aaaaaaa and aaaaaaa, with no other entities present in the database The aaaaaaa entity includes attributes such as aaaaAaAAaaaaaaaaAaaAaAAaaaaaa and the institution from which the student graduated The aaaaaaa entity may have attributes like aaaaAaAAaaaaA (Social Security number) and aaaaaaaaaaAaA AaaAAaaaaaaaaaA In the diagram, two distinct verbs describe the relationships between the entities: aaaaaaaa are instructed by aaaaaaa, and aaaaaaa advise aaaaaaaa Consequently, the diagram includes two unique relationships labeled as instruct and advise, each represented by its own diamond shape, as illustrated in Figure 8.17A.

The diagram illustrates that relationships are depicted as partial, indicating that some faculty members may not advise students, and some students may not receive instruction from faculty When creating ER diagrams, it is essential to represent all distinct relationships accurately Attempting to use a single relationship to represent two different concepts would be incorrect.

Further Extensions for ER Diagrams • 223

ER Diagram with Two Entities and Two Relationships

In this case, embellishments may involve intersection data related to the instruct relationship, such as a specific course example Additionally, intersection data for the advise relationship could include detailed information, as illustrated in Figure 8.17B.

Te placing of relationships in the ER diagram is covered in our ER design methodology in step 5, which we redef ne here:

Step 5 Connect entities with relationships as they are elicited

224 • Database Design Using ER Diagrams

ER Diagram with Two entities and Two Relationships and Some Intersection Attributes

In cases where multiple relationships exist, they should be included in the diagram; however, to avoid redundancy, we will simply use the phrase (one or more) to represent them.

Step 5 Connect entities with relationships (one or more) as rela- tionships are elicited

8.9 THE DERIVED OR REDUNDANT RELATIONSHIP

Many authors describe a redundant or derived relationship arising in a relationship “loop” as in Figure 8.18 Te loop notion comes from the

Further Extensions for ER Diagrams • 225

An ER diagram illustrates a database with a redundant relationship, depicted as a loop In this context, redundancy arises because students enroll in courses, and each course is associated with an instructor Therefore, a "taught_by" relationship is unnecessary, as the information can be derived without it If such a redundant relationship is present, it should be removed, but there are important considerations to keep in mind.

226 • Database Design Using ER Diagrams

First, one has to be sure the redundant relationship is truly redundant

If the added relation were advised_by instead of taught_by, then the relationship should stay because it has a completely diferent sense than taught_by

When a relationship loop is identified, it often indicates that only one of the two redundant relationships should be retained, with clear semantics guiding the choice For instance, in a scenario where "aaaaaaaaaa" is more closely associated with "aaaaaaa" than "aaaaaaa," the preferred relationship to maintain would be the original "teach." A designer may have initially introduced the "taught_by" relationship before adding "teach," and upon reviewing the diagram for loops, it becomes evident that "taught_by" is redundant.

In some cases, relationships may share an intersection attribute that indicates which relationship should be retained For instance, in Figure 8.19, the 'aaaa' attribute is associated with the teaching relationship, signifying that an instructor teaches a course during a specific time.

Te idea of derived or redundant relationships causes us to suggest one more step in our methodology:

Step 6b Examine the diagram for loops that might indicate redun- dant relationships If a relationship is truly redundant, excise the redundant relationship

Further Extensions for ER Diagrams • 227

ER Diagram Showing a aaaaaaaaaaaaaaaaaaaaaaaaa Database with a “Redundant” Relationship and a Time Attribute

2 What would you look for if you are trying to see if a relationship is recursive?

3 What kinds of structural constraints can recursive relationships have?

228 • Database Design Using ER Diagrams

4 Can recursive relationships have full participation? Why or why not?

5 How is the recursive relationship denoted diagrammatically in the Chen-like ER model?

6 Can the same two entities have more than one relationship?

7 How would you determine if a relationship is redundant?

Section 8.10 is considered an "optional section" as it provides additional information to the ER diagram, though this information is not essential for translating the diagram into a functional database While some may find this section to be a valuable descriptive tool, others may view it as unnecessary.

In previous discussions, we explored cardinality ratios, focusing on their maximum limits represented by M or N in ER diagrams As highlighted in Chapter 6, cardinality serves as an approximate indicator of how many instances from one entity set can correspond to instances in another entity set.

This section introduces an alternative Entity-Relationship (ER) notation that effectively specifies structural constraints on relationships by associating a pair of numbers, known as (min, max), with each constraint.

T is min and max may provide more information about the entities and how they are related.

The term "T e min" refers to the minimum number of instances in one entity set that are associated with an instance of another entity This value can range from zero to a defined maximum When T e min is zero, it indicates that not every instance of an entity is required to engage in the relationship, signifying partial participation Conversely, if T e min is greater than zero, it denotes full participation in the relationship.

We now present an ER diagram with (min, max) in place of 1 and M

First, let us start with the recursive relationship shown in Figure 8.20

In the aaaaaaaaa entity, the marriage status is represented by a range of (0, 1), indicating that individuals may or may not be married, as denoted by the minimum value of zero Additionally, each person can be married to a maximum of one other individual, reflecting the maximum value of one.

Next, look at Figure 8.21 From this fgure, we can say a student may not be advised by any faculty member and may be advised by up to two

Further Extensions for ER Diagrams • 229

Recursive Relationship with (min, max ) Ratios

ER Diagram Showing an Alternative ER Notation for Specifying Structural Constraints

In database design using ER diagrams, faculty members can have a minimum of zero and a maximum of two advisors Each faculty member may advise between 0 to 30 students and instruct between 0 to 40 students Conversely, every student must be instructed by one faculty member, with the possibility of being instructed by up to two While the minimum/maximum notation provides clarity, it is often redundant to maintain the single/double-line participation constraint However, due to its common usage, it is advisable to retain this notation for consistency.

1 What lower bound of cardinality does full participation imply?

2 What does a min/max ratio of (1, 1) between two entities imply?

3 What kind of participation ratio (full participation or partial participation) does a min/max ratio of (0, 1) imply?

To review, our methodology for designing ER diagrams has now evolved to:

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity Label keys if appropriate and show some sample data

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Step 3 Examine attributes in the existing (primary) entities (pos- sibly with user assistance) to find out if information about one of the entities is to be recorded

(We change “primary” to “existing” because we redo step 3 as we add new entities.)

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Define the relationship back to the original entity

Further Extensions for ER Diagrams • 231

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat steps 2 and 3 to see if this entity should be further split into more entities

Step 5 Connect entities with relationships (one or more) if rela- tionships exist

Step 6 State the exact nature of the relationships in structured English from all sides For example, if a relationship is A:B::1:M, then there is a relationship from A(1) to B(M) and from B(M) back to A(1) Step 6a Examine the list of attributes and determine whether any of them need to be identified by two (or more) entities If so, place the attribute on an appropriate relationship that joins the two entities

Step 6b Examine the diagram for loops that might indicate redun- dant relationships If a relationship is truly redundant, excise the redundant relationship

Step 7 Show some sample data

Step 8 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Refine the diagram as necessary

Te grammar to describe our entities, attributes, and keys has evolved as discussed next

Tis database records data about aaaaaa For each aaaaaa in the data- base, we record att(1), att(2), att(3), att(n)

For each aaaaaa, there will be one and only one att(j) Te value for att(j) will not be subdivided

For each aaaaaa, we will record att(j) , which is composed of x, y, z,

( x, y, z) are the component parts of att(j)

For each aaaaaa, we will record att(j ) ’s Tere may be more than one att(j) recorded for each aaaaaa

For attributes of relationships, att(j):

232 • Database Design Using ER Diagrams

For the relationship between aaaaaaa and aaaaaaa, we will record a(n) att(j) T e att(j) depends on both entities aaaaaaa and aaaaaaa for identif cation

(a) More than one candidate key (strong entity):

For each aaaaaa, we will have the following candidate keys: att(j), att(k), (where j, k are candidate key attributes)

(b) One candidate key (strong entity):

For each aaaaaa, we will have the following primary key: att(j) (c) No candidate keys (perhaps a weak entity):

For each aaaaaa, we do not assume any attribute will be unique enough to identify individual entities

(d) No candidate keys (perhaps an intersecting entity):

For each aaaaaa, we do not assume any attribute will be unique enough to identify individual entities

Recursive relationships are binary 1:1, 1:N, or M:N relationships We dis- cussed the mapping rules for these types of relationships in Chapter 6 Normally, the cardinality is 1:N

In Chapter 6, the mapping rule for two entities is explored, highlighting that in cases of recursive relationships, the same rules apply, treating a single entity as two distinct entities The primary key is re-recorded within the same table, assuming a different connotation or role Two types of mapping rules can be established to effectively map recursive entities.

Mapping Rule 12—Mapping 1:N recursive relationships Re-include the primary key of the table with the recursive relationship in the same table, giving it some other role name

For example, Figure 8.11 will be mapped to

Further Extensions for ER Diagrams • 233 aaaaaaaaaaAAaaaaaAaAAaaaaaAaAAaaaaaAaAAAaaaaaaaaaaaAaAAaaaaaaaaaA

Optional: An Alternative ER Notation

Section 8.10 is considered an "optional section" as it provides additional information to the ER diagram, but this information is not essential for translating the diagram into a functional database While some may find this section descriptively useful, others may view it as unnecessary.

In previous discussions, we explored cardinality ratios, focusing on their upper bounds represented by M or N in ER diagrams Cardinality serves as an approximate measure of how many instances from one entity set can be associated with instances from another entity set, as highlighted in Chapter 6.

This section presents an alternative Entity-Relationship (ER) notation for defining structural constraints on relationships This notation involves associating a pair of numbers, referred to as (min, max), with each structural constraint of a relationship.

T is min and max may provide more information about the entities and how they are related.

The term "T e min" refers to the minimum number of instances in one entity set that are associated with an instance of another entity This value can range from zero to a maximum limit When T e min is zero, it indicates that not every instance of an entity is required to engage in the relationship, signifying partial participation Conversely, if T e min is greater than zero, it denotes full participation in the relationship.

We now present an ER diagram with (min, max) in place of 1 and M

First, let us start with the recursive relationship shown in Figure 8.20

In the aaaaaaaaa entity, the parameter T e (min, max) of (0, 1) indicates that individuals may or may not be married, as represented by the minimum value of zero Additionally, each person can be married to a maximum of one other individual, reflecting the maximum value of one.

Next, look at Figure 8.21 From this fgure, we can say a student may not be advised by any faculty member and may be advised by up to two

Further Extensions for ER Diagrams • 229

Recursive Relationship with (min, max ) Ratios

ER Diagram Showing an Alternative ER Notation for Specifying Structural Constraints

In database design using ER diagrams, faculty members can advise between 0 and 30 students and instruct between 0 and 40 students, while each student must be instructed by at least one faculty member and may have up to two instructors The minimum and maximum constraints indicate that a faculty member can have between zero and two roles, and while the single/double-line participation notation is optional, it is recommended to retain it due to its common usage.

1 What lower bound of cardinality does full participation imply?

2 What does a min/max ratio of (1, 1) between two entities imply?

3 What kind of participation ratio (full participation or partial participation) does a min/max ratio of (0, 1) imply?

Review of the Methodology

To review, our methodology for designing ER diagrams has now evolved to:

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity Label keys if appropriate and show some sample data

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Step 3 Examine attributes in the existing (primary) entities (pos- sibly with user assistance) to find out if information about one of the entities is to be recorded

(We change “primary” to “existing” because we redo step 3 as we add new entities.)

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Define the relationship back to the original entity

Further Extensions for ER Diagrams • 231

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat steps 2 and 3 to see if this entity should be further split into more entities

Step 5 Connect entities with relationships (one or more) if rela- tionships exist

Step 6 State the exact nature of the relationships in structured English from all sides For example, if a relationship is A:B::1:M, then there is a relationship from A(1) to B(M) and from B(M) back to A(1) Step 6a Examine the list of attributes and determine whether any of them need to be identified by two (or more) entities If so, place the attribute on an appropriate relationship that joins the two entities

Step 6b Examine the diagram for loops that might indicate redun- dant relationships If a relationship is truly redundant, excise the redundant relationship

Step 7 Show some sample data

Step 8 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Refine the diagram as necessary

Te grammar to describe our entities, attributes, and keys has evolved as discussed next

Tis database records data about aaaaaa For each aaaaaa in the data- base, we record att(1), att(2), att(3), att(n)

For each aaaaaa, there will be one and only one att(j) Te value for att(j) will not be subdivided

For each aaaaaa, we will record att(j) , which is composed of x, y, z,

( x, y, z) are the component parts of att(j)

For each aaaaaa, we will record att(j ) ’s Tere may be more than one att(j) recorded for each aaaaaa

For attributes of relationships, att(j):

232 • Database Design Using ER Diagrams

For the relationship between aaaaaaa and aaaaaaa, we will record a(n) att(j) T e att(j) depends on both entities aaaaaaa and aaaaaaa for identif cation

(a) More than one candidate key (strong entity):

For each aaaaaa, we will have the following candidate keys: att(j), att(k), (where j, k are candidate key attributes)

(b) One candidate key (strong entity):

For each aaaaaa, we will have the following primary key: att(j) (c) No candidate keys (perhaps a weak entity):

For each aaaaaa, we do not assume any attribute will be unique enough to identify individual entities

(d) No candidate keys (perhaps an intersecting entity):

For each aaaaaa, we do not assume any attribute will be unique enough to identify individual entities.

Mapping Rules for Recursive Relationships

Recursive relationships are binary 1:1, 1:N, or M:N relationships We dis- cussed the mapping rules for these types of relationships in Chapter 6 Normally, the cardinality is 1:N

In Chapter 6, the mapping rules for two entities are examined, highlighting that in cases of recursive relationships, a single entity can be treated as two distinct entities This involves re-recording the primary key within the same table, but with a different meaning or function To effectively map recursive entities, two types of mapping rules can be established.

Mapping Rule 12—Mapping 1:N recursive relationships Re-include the primary key of the table with the recursive relationship in the same table, giving it some other role name

For example, Figure 8.11 will be mapped to

Further Extensions for ER Diagrams • 233 aaaaaaaaaaAAaaaaaAaAAaaaaaAaAAaaaaaAaAAAaaaaaaaaaaaAaAAaaaaaaaaaA

The primary key of the specified relation is the aaaaaaaaaaa, while the aaaaaaaa serves a different role and connotation within the same context Sample data can be found in Table 8.1.

Mapping Rule 13—Mapping M:N recursive relationships Create a separate table for the relationship (as in mapping rule 5)

In the context of course-prerequisite recursion, where courses can have multiple prerequisites and also serve as prerequisites for various other courses, it is effective to establish a dedicated table to manage these relationships An example of such a recursive relationship is illustrated in Table 8.2.

1 Map the recursive relationship shown in Figure 8.14 to a rela- tional database and show some sample data

2 If Figure 8.14 was an M:N relationship, how would you map this recursive relationship to a relational database? Show the map- ping with some sample data

Sample Data for aaaaaaaaa in Figure 8.11 fname lname minit employee_id super_id

Richard Earp W 8945 9090 Boris Yelsen 9090 null Helga Hogan H 3841 9090 Sudip Bagui K 8767 9090 Tina Tanner 5050 8945

TABLE 8.2 Mapping Recursive Relationship of Figure 8.16

COP2222 COP1111 COP2223 COP1111 COP3333 COP2222 COP1111 null COP2222 COP1112 COP1112 COP1111

234 • Database Design Using ER Diagrams

Chapter Summary

This chapter explores various aspects of binary relationships in Entity-Relationship (ER) diagrams and refines the steps in the ER design methodology The refinement process involves ongoing evaluation and reassessment of the ER diagram based on discussions with users, ensuring that the design remains aligned with their needs.

The concept of relationships possessing attributes, their evolution into entities, and the exploration of recursive, derived, and redundant relationships was illustrated through examples and diagrams The steps of the ER design methodology were refined to incorporate these insights into an evolving framework Additionally, an alternative approach was presented towards the conclusion of the chapter.

This chapter introduces ER notation for defining structural constraints on relationships, enabling readers to effectively design databases with binary relationships The subsequent chapter will explore ternary and other higher-order relationships.

Exercises

In each of the following exercises, the admonition to “construct an ER dia- gram” implies not only the diagram but also the structured grammatical description of the diagram

In Figure 8.6, the student/course/instructor/building database illustrates specific cardinalities and participation constraints The cardinality defines the numerical relationships between entities, such as how many courses a student can enroll in or how many instructors can teach a course Participation specifies whether an entity's existence depends on its relationship with another entity, indicating whether participation is total or partial The depicted cardinalities and participations are correct when they accurately reflect real-world scenarios; for instance, if a student must enroll in at least one course, this indicates total participation Conversely, they may be incorrect if they misrepresent the relationships, such as allowing a student to be associated with multiple courses without restrictions or failing to account for the possibility of an instructor teaching multiple courses.

To effectively organize and analyze horse racing data, we need to create an Entity-Relationship (ER) diagram that includes key attributes such as horse name, race, owner, post odds, post position, race date, order of finish, year-to-date earnings, and owner details including name and address By employing structured grammar, we can clearly define the relationships and constraints among these entities, ensuring a coherent representation of the data for better management and insights.

Further Extensions for ER Diagrams • 235

In this chapter, we introduced a database featuring two entities: aaaaaa and aaaaaaaaaa, as illustrated in Figure 8.10 The 'Book' attribute is associated with the entity aaaaaa We propose expanding the database to incorporate a new entity, aaaa, which could include attributes such as aaaaAaaaaaaAaaaaaaaAaaaaaaAaaaaaaaaAaaaaaaaaaA.

In the updated Figure 8.7, it is specified that each building can accommodate a maximum of 99 students Students are required to enroll in at least one class, with a limit of five classes per student Each class must have a minimum of five students and can enroll up to 35 students Instructors may choose whether to teach a class, but they can teach no more than three classes at a time Each course must have one designated instructor, and only one instructor can be assigned to a specific course Instructors may or may not have an office, with a maximum of two offices allowed per instructor Similarly, a building may or may not contain an office, with a capacity of up to 15 offices Finally, each course must be offered in one designated classroom, which cannot be shared with other courses.

In our case study, we have identified key entities and relationships, successfully mapping them to a relational database that includes sample data.

Step 7 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Refine the diagram as necessary

Suppose we got some additional input from the user:

An employee may serve as a department manager, with each manager overseeing only one department It is essential to record pertinent information about the department manager, including their name, Social Security number, the store they are associated with, and the specific department they manage.

A A A A or she is working for A department manager supervises at least one employee and may manage several employees

On reviewing these additional specifcations, we can see we have a recursive relationship developing since an employee can be a department manager supervising other employees

So, using mapping rule 12, we will reinclude the primary key of the aaaaaaaa entity in itself, giving us the following aaaaaaaa relation: aaaaaaaaaaaaaaAaAAAaaaaAaAAaaaaAaAAaaaaAaAAaa

And, with some sample data as shown Table 8.3

Tis recursive relationship is also shown in Figure 8.22

So, in summary our relational database has now developed to (without the data) aaaa

AaaaaA AaaaaaA AaaaaA AaaaaaaaaaA AaaaaaaA AaaaaaaA aaaaa

AaaaaaaA AaaaaaaaAA AaaaaaAaaaaaaAA AAaaaaaaaaaaAA aaaaaaaaaaaaa

236 • Database Design Using ER Diagrams

Sample Data for aaaaaaaa ename essn dnum snum dm_ssn

AAaaaaaAA AAAaaaaAAA AAaaaaAA AAaaaaaaAA AAaaaaAA

We continue the development of this case study at the end of the next chapter

Further Extensions for ER Diagrams • 237

ER Diagram of West Florida Mall Developed So Far

238 • Database Design Using ER Diagrams

Earp, R., and Bagui, S (2000) Binary relationships in entity-relationships in entity-relationship (ER) diagrams Data Base Management Journal, 22: 10–43

Elmasri, R., and Navathe, S.B (2016) Fundamentals of Database Systems Reading, MA:

Sanders, L (1995) Data Modeling Danvers, MA: Boyd & Fraser

Teorey, T.J., Nadeau, T., and Lightstone, S.S (2005) Database Modeling and Design: Logical Design San Francisco, CA: Morgan Kaufman.

Ternary and Higher-Order ER Diagrams

Introduction

Most relationships we encounter are binary, involving two entities However, there are instances where it's essential to connect three or more entities A relationship that connects three entities is known as a ternary or 3-ary relationship When a relationship involves more than three entities, it is classified as an n-ary relationship, with 'n' representing the total number of participating entities Relationships involving three or more entities (n ≥ 3) are often referred to as higher-order relationships.

In this chapter, we consider relationships to connect three or more enti- ties First, we look at ternary (3-ary) relationships arising for three main reasons:

(a) If we have intersection attributes requiring three diferent entities to identify an intersection attribute

(b) If we have a relationship of a relationship

(c) If we are reverse engineering

Since we discuss reverse engineering in Chapter 11, we do not discuss the development of ternary relationships from reverse engineering in this chapter

This chapter explores the creation of ternary relationships through intersection attributes and examines the associated structural constraints It highlights that ternary and other n-ary relationships can coexist with binary relationships involving the same entities, and discusses how certain ternary diagrams can be transformed into binary relationships The evolution of ternary relationships is also addressed.

This chapter redefines Step 6 of the entity-relationship (ER) design methodology to encompass ternary and higher-order relationships, enhancing the understanding of complex relationships within database design.

Binary or Ternary Relationship?

Ternary relationships become essential when binary relationships fail to adequately represent the semantics of an association involving three entities This section clarifies the distinction between binary and ternary relationships through an illustrative example, highlighting how the presence of an intersection attribute necessitates the use of a ternary relationship.

In binary relationships between entities, structural constraints such as cardinality and participation are essential Additionally, we identified that relationships can possess attributes, particularly in M:N relationships, which often lead to the emergence of what we term an intersection attribute For instance, the M:N relationship illustrated in Figure 8.1 demonstrates that the presence of an attribute like 'aaaaa' indicates the existence of an M:N binary relationship.

N-ary relationships typically arise when data must connect more than two entities For instance, in a company database featuring entities such as suppliers, products, and customers, a standard binary relationship exists where the company purchases products from a supplier The intersection attribute for this relationship is defined However, when considering the relationship between customers and the products they buy, if all customers purchase products regardless of the supplier, a simple binary relationship is established between customers and products, with its own intersection attribute.

Sample data for Figure 9.1A is illustrated in Table 9.1, while Table 9.2 presents sample data for Figure 9.1B In a different scenario, the price of a product is influenced not only by the product itself but also by the supplier To determine the price, it is essential to have a customerID, a productID, and a supplierID, indicating that the pricing attribute relies on three distinct entities.

Ternary and Higher-Order ER Diagrams • 241

Binary Relationship between aaaaaaa and aaaaaaaa and an Intersection Attribute, aaaaaaaaaaaaaaa

Binary Relationship between aaaaaaa and aaaaaaaa and an Intersection Attribute, aaaaaaaaaaaa

242 • Database Design Using ER Diagrams

Sample Data for Figure 9.1A productID supplierID wholesale_price

TABLE 9.2 Sample Data for Figure 9.1B customerID productID retail_price

Jones Beans 2.67 Smith Beans 2.67 Jones Carrots 1.57

An ER diagram illustrating a ternary relationship among three entities is presented, featuring an intersection attribute labeled "aaaaa." This scenario is visually represented in Figure 9.2.

Figure 9.2 represents the entities aaaaaaa, aaaaaaaa, and aaaaaaaa and a relationship, buy, among all three entities, shown by a single relationship diamond attached to all three entities

Ternary and Higher-Order ER Diagrams • 243

TABLE 9.3 Sample Data for Figure 9.2 customerID productID supplierID price

Jones Beans Acme 2.65 Jones Beans Baker 2.77 Jones Carrots Joe’s 1.57

Some sample data for Figure 9.2 would be as shown in Table 9.3

In a typical market scenario, customers often encounter varying prices for the same product from different manufacturers or suppliers Additionally, prices can fluctuate over time for the same supplier Customers may also purchase items at discounted rates or at full price Another relevant factor to consider is the sale date of a product, which links the customer, the specific supplier, and the pricing dynamics.

Next, we look at the structural constraints of ternary relationships.

Structural Constraints for Ternary

Ternary relationships exhibit various structural constraints represented as x:y:z, where x, y, and z can each be either 1 or "M," with "M" potentially representing different values for each variable Examples of these relationships include configurations like 1:1:1, 1:M:N, M1:M2:M3, and M:1:N Each relationship can involve full or partial participation from any of the three sides, as demonstrated by the M1:M2:M3 relationship, which illustrates partial participation across all dimensions.

9.3.1 Many to Many to Many (M1:M2:M3)

Figure 9.3 illustrates a M1:M2:M3 relationship among three entities, aaaaaaa, aaaaaaaa, and aaaaaaaa, each exhibiting partial participation This diagram demonstrates that numerous customers can purchase various products from multiple suppliers, each at different price points.

Instances of this relationship can be illustrated as shown in Figure 9.4

244 • Database Design Using ER Diagrams

ER Diagram Showing a Ternary Many-to-Many-to-Many Relationship (Partial Participation on All Sides)

Instances of a ternary many-to-many-to-many relationship for aaaaaaaaaaaaaaaaa aaaaaaaaA

Ternary and Higher-Order ER Diagrams • 245

2 What is an n -ary relationship?

3 What are higher-order relationships?

4 Using the three entities presented (aaaaaaa, aaaaaaaa, and aaaaaaaa), draw an ER diagram that depicts the follow- ing: A customer must buy one and only one product from a sup- plier at a particular price on a particular date

5 Using the three entities presented (aaaaaaa, aaaaaaaa, and aaaaaaaa), draw an ER diagram depicting the follow- ing: A supplier must supply many products to many customers at diferent prices on dif erent dates

6 Tink of some more intersection attributes for the aaaaaaa, aaaaaaaa, and aaaaaaaa ternary example presented in Figure 9.3

7 What situations might create each of the following structural constraints? a aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, partial par- ticipation on all sides b aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, partial par- ticipation on all sides c aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, full participa- tion on all sides

An Example of an n -ary Relationship

An n-ary relationship represents the connection among n entities, with a ternary example illustrating how price depends on three factors In a scenario where price is influenced by four entities, it constitutes a 4-ary relationship In this type of relationship, a single diamond shape links the four entities, as depicted in Figure 9.5 The intersection attribute is identified as aaaaa, although additional non-intersection attributes may exist for the entities but are not shown in the diagram.

246 • Database Design Using ER Diagrams

ER Diagram Showing n -ary Relationship

n -ary Relationships Do Not Preclude

Ternary relationships in a database do not exclude the possibility of existing binary relationships among entities For instance, retail vendors and suppliers may have a unique relationship that does not involve customers, characterized by a distinct pricing structure for wholesaling This binary relationship can be represented independently alongside the ternary relationship Refer to Figure 9.6 for an illustrative ER diagram showcasing both the binary and ternary relationships within the same database context.

Figure 9.6 illustrates a binary relationship between aaaaaaa and aaaaaaa, with full participation from both aaaaaaaa and aaaaaaa In this relationship, both aaaaaa and aaaaaaaa engage in purchasing aaaaaaaa, albeit in a specific manner.

Ternary and Higher-Order ER Diagrams • 247

This article discusses an ER diagram that illustrates both a tree-way and a two-way relationship, focusing on a binary relationship defined by wholesale buying The relationship is labeled to indicate its nature, excluding customer involvement Additionally, the ternary relationship has been modified for clarity, distinguishing the two relationships effectively.

Methodology and Grammar for the

We need to revisit step 6 in the ER design methodology to cover the pos- sibility of the n-ary relationship Te old version was:

248 • Database Design Using ER Diagrams

Step 6 State the exact nature of the relationships in structured English from all sides, for example, if a relationship is A:B::1:M, then there is a relationship from A to B, 1 to Many, and from B back to A, Many to 1

We add the following sentence to step 6:

When defining ternary and higher-order (n-ary) relationships, it is essential to articulate the relationship in structured English, ensuring that all entities involved in the n-ary relationship are clearly identified Additionally, it is important to specify the existing structural constraints that govern these relationships.

The grammar for an n-ary relationship requires the inclusion of all associated entities A suitable informal expression could be: "aaaaaaa Relationship (from/to/by) aaaaaaa (and) (from/to/by) aaaaaaa." It is essential to name all n entities to accurately define the relationship's attributes.

Here, if we choose some combination for Entity1, Entityn , this process resolves into

Entity3: aaaaaa aaaaaaaaa buy aaaaaaaa from aaaaaaaa It is understood aaaaaaaaaaaa will necessitate referencing all three entities to identify it

In a binary relationship, two distinct relationships are defined, while one might assume that ternary relationships require the specification of three However, since the relationship attribute has already been established, it's important to explore other potential configurations.

Ternary and Higher-Order ER Diagrams • 249

Entity3: aaaaaa aaaaaaaa are bought by aaaaaaaaa from aaaaaaaa

The informal version of the statement reveals that repetition adds minimal value to the information presented It is recommended to explore alternative combinations, but it appears that a single, well-inferred statement could effectively convey the essence of the relationship based on the situation's semantics.

An accurate grammar for n-ary relationships can be seen as an extension of the grammar designed for binary relationships In a formal grammatical framework, defining a ternary relationship requires three distinct statements, each originating from one of the involved entities This contrasts with the binary relationship, particularly in the M:N full participation scenario, where a single relationship description suffices.

9.6.1.1 Pattern 3—M:N, From the M Side, Full Participation

Short: x must be related to many y which actually means

In the database, every entry of x is associated with one or more instances of y Conversely, there are no instances of x that relate to a non-y, and similarly, non-x entries do not connect to any y The interpretation of the negative relationship will vary based on the context of the statement.

We could generalize the structural constraint patterns to the pattern given next

9.6.1.2 Pattern 3—k:M, from the k Side, Full

Short: Same as in Section 9.6.1.1

Long: Same as in Section 9.6.1.1

For the n-ary relationship, we extend the notation of the generalized state- ment using the Boolean operator “and” as shown next

250 • Database Design Using ER Diagrams

9.6.1.3 Pattern 5 ( n -ary)—x:y:z::a:b:c, From the a Side, Full/Partial Participation

Short: x must/may be related to many y and many z

Te “must” represents full participation; “may” represents a partial one

T e a cardinality will not matter T e b and c force us to say “one” or

“many” in the statement So, for example, for x as full:

In the database, the relationship between long: x and b is defined such that b can be associated with many (one or more) y values, while also being linked to a unique y value Additionally, c is similarly related to many (one or more) z values, but must also correspond to a single, unique z value.

No x is related to more than one z

No x is related to more than one y

For aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa full participation all around:

Short: aaaaaaaaa must buy many aaaaaaaa from many aaaaaaaa

Long: aaaaaaaaaa recorded in the database, must buy many (one or more) aaaaaaaa from many (one or more) aaaaaaaa

Other grammatical expressions are derived similarly

Products, recorded in the database, must be bought by many (one or more) customers from many (one or more) vendors

Vendors, recorded in the database, must sell many (one or more) prod- ucts to many (one or more) customers

A negative could be:No customer (in this database) buys products from nonvendors

As with the binary cases, the negative statements would be optional, if they make sense

Ternary and Higher-Order ER Diagrams • 251

9.6.2 Grammar in a Partial Participation, Ternary

Figure 9.7 illustrates a database representation of a graduation ceremony, highlighting the attendance of students and faculty It shows that certain students attend specific ceremonies alongside faculty members, while faculty also participate in various ceremonies with students Additionally, all ceremonies are attended by a mix of students and faculty, with the intersection attribute being a key element of the data structure.

Here, we have partial participation on the M cardinality relationships and a one-relationship with full participation Using the grammar pre- sented, we have this outcome: aaaaaaaaaaaaaaaaaaaaaaaaaa ::M1:1:M2

Short: Students may attend one graduation with many faculty

ER Diagram (with Only Primary Keys) Showing Tree-Way Relationship with Partial Participations and a One-Relationship a aa a aa a aa

252 • Database Design Using ER Diagrams

Long: Students, recorded in the database, may attend (b = 1) one and only one graduation with

(c = m) [many (one or more)] faculty

No student attends more than one graduation [with many faculty]

We put the [with many faculty] in square brackets because it is not really needed to make sense of the diagram

Faculty members listed in the database can participate in a graduation ceremony alongside numerous students, although some faculty may choose not to attend Each graduation requires the presence of both students and faculty, ensuring that no ceremony occurs without their participation.

Ternary Relationships From Relationship-Relationship Situations

Ternary relationships are essential when a scenario involves a relationship between two other relationships Traditional Chen-like ER diagrams do not accommodate this complexity, making it necessary to create a ternary relationship to accurately represent such situations.

For example, let us start with two entities: aaaaaaaaaaaaaa and aaaaaaaaaa We can initially relate the two entities as shown in Figure 9.8A A aaaaaaaaaaaaaa may review many aaaaaaaaaaa

In later stages, if certain results necessitate a review, it establishes a complex relationship, illustrated in Figure 9.8B This intricate connection is essential because the combined factors lead to a significant outcome, as depicted in Figure 9.8C.

In Figure 9.8C, the aaaaaaaaaaaaaa, the a aaaaaaa relationship, and aaa aaaaaaaaaa taken together are like creating a higher-level aggregate class composed of aaaaaaaaaaaaaa, a aaaaaaa, and aaa aaaaaaaaaa

Ternary and Higher-Order ER Diagrams • 253

A Binary Relationship between aaaaaaaaaaaaaa and aaaaaaaaaa

Tis aggregate class (of the two entities and a relationship) then needs to be related to aaaa, as shown in Figure 9.8C

To effectively represent this situation, we must establish a weak entity and connect it to the primary entity, as illustrated in Figure 9.8D The relationship serves to link the primary entity with the weak entity, which may lead to the creation of an additional entity, as depicted in the figure.

254 • Database Design Using ER Diagrams

In Figure 9.8D, we emphasize the importance of the weak entity, as the existence of a book relies on a thorough review process involving both the manuscript and the publisher Attempting to link the book solely to the manuscript, without considering the publisher's role, would fail to accurately represent the true dynamics of the publishing process.

9.8 N -ARY RELATIONSHIPS THAT MAY BE

Not all relationships among three entities qualify as ternary relationships This section explores the conditions under which certain ternary relationships can be transformed into binary relationships, as well as those that cannot be resolved in this manner.

Ternary and Higher-Order ER Diagrams • 255

A Relationship of a Relationship with a Higher-Level Aggregate Class Composed of aaaaaaaaaaaaaaaAa aaaaaaa and aaaaaaaaaa aaa

Just as the binary M:N relationship may be decomposed into two 1:M relationships, so may many n-ary relationships be decomposed First, note the decomposition of the M:N into two 1:M relationships in Figure 9.9

Te idea is to make the relationship an entity and hence form two simpler binary relationships

In Figure 9.10, we observe three binary relationships derived from Figure 9.7, where the new entity "aaaaaaaaaa" is classified as weak This entity relies on the existence of the entities "aaaaaaaaAaaaaaaa" and "aaaaaaaaaa." Essentially, "aaaaaaaaaa" represents a roll of attendees for a specific ceremonial event.

256 • Database Design Using ER Diagrams

A Relationship of a Relationship Resolved into a Ternary Relationship

Ternary and Higher-Order ER Diagrams • 257

ER Diagram of an M:N Relationship Replaced with Two 1:M Relationships

258 • Database Design Using ER Diagrams

ER Diagram (with Only Primary Keys) Showing a Tree-Way Relationship “Decomposed” into Tree Binary Relationships

1 Can all ternary relationships be expressed in the form of binary relationships? Explain

2 Come up with some attributes and entities of a relationship you think could be a ternary relationship Can this relationship be expressed in the form of a binary relationship?

Ternary and Higher-Order ER Diagrams • 259

In this section, we develop mapping rules to map n-ary relationships to a relational database

When mapping n-ary relationships, it is essential to create a new relation that includes the keys of the connected entities along with any relevant attributes of the relationship The primary key of this new relation should be formed by concatenating the keys of the connected entities.

In a ternary relationship exemplified in Figure 9.2, the entities aaaaaaaaA, aaaaaaaa, and aaaaaaaa are interconnected, with the intersection attribute being price The corresponding mapped relations include aaaaaaaaaaAAaaaaaaaaaaAaaaaaaaaaaaAaaaaaaaaaaAa, aaaaaaaaaaaaaaaaaAaAaAaAaaA, aaaaaaaaaaaaaaaaaaaAaAAaAaAaaA, and aaaaaaaaaAaaaaaaaaaaAaAaAaAaaA.

And, some sample data would as shown in Tables 9.4–9.7

Sample Data for aaa in Figure 9.2

Sample Data for aaaaaaa in Figure 9.2

260 • Database Design Using ER Diagrams

Sample Data for aaaaaaaa in Figure 9.2

TABLE 9.7 Sample Data for aaaaaaaa in Figure 9.2

1 Could Figure 9.5 be described in the form of binary relation- ships? Discuss

2 What mapping rules would you follow to map Figure 9.5?

3 Map Figure 9.5 to a relational database and show some sample data

Our ER design methodology has now fnally evolved to the following presentation:

Step 1 Select one primary entity from the database requirements description and show the attributes to be recorded for that entity Label keys if appropriate and show some sample data Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Ternary and Higher-Order ER Diagrams • 261

Step 3 Examine attributes in the existing entities (possibly with user assistance) to find out if information about one of the enti- ties is to be recorded

(We change primary to existing because we redo step 3 as we add new entities.)

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Define the relationship back to the original entity

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat steps 2 and 3 to see if this entity should be further split into more entities

Step 5 Connect entities with relationships (one or more) if rela- tionships exist

Step 6 State the exact nature of the relationships in structured English from all sides, for example, if a relationship is A:B::1:M, then there is a relationship from A(1) to B(M) and from B(M) back to A(1) For ternary and higher-order ( n -ary) relationships, state the rela- tionship in structured English being careful to mention all enti- ties for the n -ary relationship State the structural constraints as they exist

In Step 6a, review the list of attributes to identify any that should be associated with two or more entities, and appropriately place these attributes within the relevant relationship connecting the entities In Step 6b, analyze the diagram for any loops that suggest redundant relationships; if a relationship is found to be unnecessary, remove the redundant connection to streamline the design.

Step 7 Show some sample data

Step 8 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Refine the diagram as necessary

Binary relationships are the most prevalent type of relationship found in data modeling Many ER diagram notations lack the capability to represent ternary or higher-order relationships, meaning that all relationships are typically articulated in terms of binary connections.

Mapping n -ary Relationships to a

In this section, we develop mapping rules to map n-ary relationships to a relational database

Mapping Rule 14 involves creating a new relation for each n-ary relationship This new relation should include the keys of the connected entities along with any relevant attributes of the relationship The primary key for this new relation is formed by concatenating the keys of the connected entities.

In a ternary relationship, such as the one illustrated in Figure 9.2, three entities are interconnected, with a key intersection attribute, price The mapped relations include various combinations of these entities, highlighting the complexity of their interactions Understanding these relationships is crucial for effective data modeling and analysis.

And, some sample data would as shown in Tables 9.4–9.7

Sample Data for aaa in Figure 9.2

Sample Data for aaaaaaa in Figure 9.2

260 • Database Design Using ER Diagrams

Sample Data for aaaaaaaa in Figure 9.2

TABLE 9.7 Sample Data for aaaaaaaa in Figure 9.2

1 Could Figure 9.5 be described in the form of binary relation- ships? Discuss

2 What mapping rules would you follow to map Figure 9.5?

3 Map Figure 9.5 to a relational database and show some sample data.

Review of the Methodology

Our ER design methodology has now fnally evolved to the following presentation:

Step 1 Select one primary entity from the database requirements description and show the attributes to be recorded for that entity Label keys if appropriate and show some sample data Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Ternary and Higher-Order ER Diagrams • 261

Step 3 Examine attributes in the existing entities (possibly with user assistance) to find out if information about one of the enti- ties is to be recorded

(We change primary to existing because we redo step 3 as we add new entities.)

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Define the relationship back to the original entity

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat steps 2 and 3 to see if this entity should be further split into more entities

Step 5 Connect entities with relationships (one or more) if rela- tionships exist

Step 6 State the exact nature of the relationships in structured English from all sides, for example, if a relationship is A:B::1:M, then there is a relationship from A(1) to B(M) and from B(M) back to A(1) For ternary and higher-order ( n -ary) relationships, state the rela- tionship in structured English being careful to mention all enti- ties for the n -ary relationship State the structural constraints as they exist

In Step 6a, review the list of attributes to identify any that require association with two or more entities, placing these attributes on the relevant relationship connecting the entities In Step 6b, analyze the diagram for any loops that signify redundant relationships; if a relationship is indeed redundant, remove it to streamline the design.

Step 7 Show some sample data

Step 8 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Refine the diagram as necessary.

Chapter Summary

Binary relationships are the most prevalent type of relationship in data modeling Many ER diagram notations lack the capability to represent ternary or higher-order relationships, meaning that all relationships are typically expressed in binary terms.

In this chapter on database design using ER diagrams, we explored the concept of ternary relationships, which arise from specific scenarios that require the identification of all three involved entities We examined how intersection attributes necessitate the consideration of these entities collectively Additionally, we discussed the process of developing ternary relationships through reverse engineering, as elaborated in Chapter 11 The chapter also covered the structural constraints and grammar associated with ternary relationships, highlighting how some ternary or n-ary relationships can be simplified into binary relationships Finally, we outlined the mapping rules applicable to n-ary relationships.

Exercises

In Chapter 8, we initially defined a database featuring two entities, aaaaaa and aaaaaaaaaa, with "Book" as an attribute of aaaaaa To enhance this database, we propose transforming "Book" into an entity with attributes such as book title, author, price, edition, and publisher This new structure allows us to explore various relationships, including "in," "by," "write," and "teach." An ER diagram should be created to illustrate at least two relationships, one of which must be ternary Important attributes of these relationships could include roles such as author, publisher, and educator, facilitating a deeper understanding of the connections among the entities.

To create an ER diagram for a broker, security, and buyer, include essential elements such as the broker's name and address, the buyer's name and address, and the security's exchange, symbol, and price Additionally, incorporate the price of the security and the commission paid It's important to represent the number of shares held by a buyer, which can be organized by broker if desired This structured approach ensures a clear visualization of relationships between the broker, security, and buyer within the financial context.

Using three entities—aaaaaaaaaaaA aaaaa, and aaaa—draw an

An Entity-Relationship (ER) diagram illustrates the relationship between entities in a system In this case, each "aaaaa" in a "aaaa" is associated with one "aaaaaaaaaa," while each "aaaaaaaaaa" in a "aaaa" can have multiple "aaaaaes." Additionally, each "aaaaaaaaaa" of a "aaaa" occupies several "aaaas." The diagram should clearly include the cardinalities to represent these relationships accurately.

Ternary and Higher-Order ER Diagrams • 263

To create an ER diagram featuring a ternary relationship, begin by illustrating the entities involved and their interactions Clearly define the cardinalities for each relationship within the diagram It is important to note that while a ternary relationship can sometimes be represented as multiple binary relationships, doing so may complicate the data model and obscure the direct associations between all three entities Therefore, maintaining the ternary structure is often more effective for clarity and accuracy in representing complex relationships.

Elmasri, R., and Navathe, S.B (2016) Fundamentals of Database Systems Reading, MA:

Teorey, T.J., Nadeau, T., and Lightstone, S.S (2005) Database Modeling and Design: Logical Design San Francisco, CA: Morgan Kaufman

Teorey, T.J., Yang, D., and Fry, J.P (1986) A logical design methodology for relational data- bases using the extended entity-relationship model ACM Computing Surveys , 18(2): 197–222.

The Enhanced Entity-Relationship (EER) Model

Introduction

In the initial chapters of this book, we introduced the entity-relationship (ER) diagram as a conceptual tool for database design, focusing on modeling reality for users However, the basic ER model, originally developed by Chen, often falls short in capturing the complexities of modern database applications, particularly in scenarios involving generalizations and specializations, or class hierarchies To address these limitations, the enhanced entity-relationship (EER) model has been developed, which incorporates additional semantic concepts This chapter explores generalizations and specializations within the EER model, outlines a methodology and grammar for its extension, and examines shared subclasses and union types Additionally, we provide a methodology for mapping the EER diagram to a relational database.

What Is a Generalization or Specialization?

The Enhanced Entity-Relationship (EER) model expands upon the original Entity-Relationship (ER) model by incorporating concepts of generalizations and specializations, which are essential for creating class hierarchies This model introduces the notions of superclasses and subclasses, along with the concept of attribute inheritance, allowing for a more nuanced representation of data relationships For instance, a class can be defined with specific attributes that are inherited by its subclasses, enhancing the organization and structure of the database.

266 • Database Design Using ER Diagrams hierarchy, suppose we have this aaaaaaaa entity within a database for a store selling sports equipment: aaaaaaaaaaaaaaaaaaaaaaaaAaAAaaaaAaAAaaaaaaaAaA

Now, suppose the database evolves to a situation for which we want to keep information pertinent to specifc sports for some customers: aaaaaAaaaaaaaaaaaaaaaaAAaaaaaaaaAaAAaaaaaaaaaaaaaaaaaaa aaaaaaaaaAaaaaaaaaaaaaaaaaAAaaaaaaaaAaAAaaaaaaaaaaaaaaaaaaA

The entities T e aaaa and aaaaaaaa represent subclasses within the broader category of aaaaaaaa This example demonstrates a hierarchical structure where aaaaaaaa serves as the generalization at the top, while the specific sports function as its subclasses.

In an object-oriented setting, we might designate the entities like this: aaaaaaaaaAaaaaaaaaaaaaaaaAaAAaaaaAaAAaaaaaaaAa aaaaaaaaaaaaaaAaaaaaaaaA aaaaaaaaaaaaaaaaaaA aAA a aaaaaaaaaaaaaaaaaaaaaaaaaaaAaaaaaaaaaaaaaaaaaAa

In object-oriented programming, inference is a subclass of a broader category While not explicitly mentioned, this broader category encompasses various attributes The subclass inherits all the characteristics of the parent class and adds unique attributes relevant to its specific context This process exemplifies specialization, beginning with a general class and then refining it to focus on particular sports.

Te idea of classes in a database infers the ability to describe subclasses and superclasses with inheritance features

In a scenario where we aim to organize information about an entire institution, we can establish a superclass, referred to as "Entity," which encompasses various subclasses, including "Students," "Staff," and "Faculty." Each subclass contains specific information relevant to its category, while the superclass retains shared attributes applicable to all subclasses For instance, the "Entity" superclass may include common attributes such as "Name" and "ID," which the "Students" subclass would inherit, along with additional attributes unique to students This hierarchical structure allows for efficient data management and retrieval across different categories within the institution.

AAAAA aaa AAAA AAAAA AAAA AAAAA a AAAA AAAAAa a AAAA

Te Enhanced Entity-Relationship (EER) Model • 267 aaaaa Te superclass in a database is called a generalization, and the sub- classes (aaaaaaaaAaaaaa, and aaaaaaa) are called specializations

In object-oriented programming, classes encompass both simple attributes and the actions performed by their members While data typing emphasizes attributes, databases typically prioritize these characteristics over procedural actions.

Variants

In the past, programmers addressed the challenge of specializations by developing variants, which are record components that change based on other elements within the record To demonstrate the application of variants in Entity-Relationship (ER) diagrams, let’s examine a specific problem.

In modeling student-athletes, it is essential to capture various attributes that may change based on the context Each athlete is identified by specific details such as their name and a unique identifier, like a student number Additionally, it is important to document the sport that each student-athlete participates in This information can be organized in a structured format, such as a table, to effectively manage and analyze the data related to their athletic involvement.

The attribute "te aaaa" exhibits different values, known as variants, across various sports While introducing these variants appears to address data representation issues, it creates challenges in database storage and retrieval Ideally, a database should maintain consistency within a column, such as ensuring that student numbers contain nine digits uniformly However, the presence of variants leads to inconsistencies in the "aaaa" column of the "aaaaaaa" table Over the years, various solutions have been developed to tackle this variant problem within databases.

TABLE 10.1 aaaaaaa Table aaaaaaa aaaaaaa AaaaaaaaaaaAA a aa aAaaaaaAA AaaaaAA

Baker 123456789 tennis 220, state rank 14 Adams 123456788 football tackle, neck brace Jones 123455676 golf handicap 3

268 • Database Design Using ER Diagrams

To address the issue of variations in records and differing attributes within entities in ER diagrams, one effective solution is to eliminate the variants and link them back to the primary key of the associated "parent" information This approach systematically resolves the inconsistencies in data representation.

In ER diagrams, we store information about two related concepts: a generalization referred to as "aaaaaa" and its specializations, which include specific sports like tennis, football, and golf, each with unique attributes To effectively manage this data, it is beneficial to create a distinct entity for "aaaaaa" and relate the specialized sports to it Using a single "aaaaaa" entity would be overly broad, as it would not accommodate the diverse information required for each specific sport Additionally, it is essential to capture details about each sport in relation to individual student-athletes.

Instead of creating a series of weak entities for each sport based on specific attributes, there is a more effective approach that can achieve the same database outcome This method not only simplifies the representation of the data but also allows for the inclusion of inheritance concepts in the Entity-Relationship (ER) diagrams, resulting in more expressive and informative visualizations.

Examples of Generalizations or

Generalizations and specializations categorize entities, where a specialized entity emerges from generalizations that include variants To manage these variants effectively, they can be separated from the generalization and treated as subclass entities, while the original fixed component remains as the superclass or parent type In this context, the superclass is considered the parent class, and the variant components are referred to as subclasses or child classes.

In the context of parent-child class relationships, the child class inherits the attributes defined in the parent class, allowing for shared characteristics This concept of inheritance is crucial in relational databases, where the connection between parent and child data is established through table combination operators known as joins.

In our aaaaaa example, we would consider the aaaaaaa as a parent

The Enhanced Entity-Relationship (EER) Model allows for the definition of a sport as a child class, inheriting information from its parent class This structure facilitates the maintenance of data integrity and enables efficient information retrieval through join operations.

When designing a database for student-athletes, we begin by identifying essential elements such as personal identifiers and relevant details This process starts with a generalization, or superclass, to organize the data effectively Next, we focus on recording specific player information related to their respective sports, establishing a specialization from the broader class This top-down design approach ensures a structured and coherent database that meets the needs of managing student-athlete information.

When designing a database for sports, a bottom-up approach may lead to the creation of individual athlete entities, which can later be generalized into a superclass entity representing all athletes, with specific sports as subclasses This generalization relationship allows multiple entities with common attributes to be categorized under a higher-level entity Ultimately, the design process results in a hierarchy where one entity serves as the superclass (parent) and others as subclasses (children) The choice between specialization and generalization in design depends on the initial recognition of the problem at hand.

To illustrate how we might handle this generalization-specialization, parent-child class situation, suppose we have defned our entity, aaaaaaa, like this:

The ER diagram for this entity is clear and uncomplicated During the database design process, we choose to incorporate details regarding the sports played by athletes, which may lead us to create a diagram similar to Figure 10.1, featuring a variant sports flag.

Figure 10.1 presents an issue due to the presence of attributes containing other attributes The sports flag is not a composite attribute, as it lacks component parts To resolve this, we will establish separate entities for each specific sport and then connect these entities back to the main structure.

In Figure 10.2, we illustrate the creation of weak entities for each sport instead of relying on attributes of attributes These sports are classified as weak entities due to their lack of a primary key, as they depend on other entities for identification However, this diagram does not fully capture the complete relationship and context of the sports involved.

270 • Database Design Using ER Diagrams

Tea, with an attempt to add a variant attribute, transcends the notion of being merely weak entities; they represent distinct "choices." If sports were solely categorized as weak entities, one would anticipate all superclass entities to connect with each subclass, which is not the case Additionally, it is essential to respect the principle of inheritance in this context.

The specialization process allows a subclass to inherit all characteristics from its superclass In Enhanced Entity-Relationship (EER) terminology, the superclass is referred to as the primary entity, while the specific entities, such as SPORTS, are classified as subclasses.

Te Enhanced Entity-Relationship (EER) Model • 271

T e aaaaaaa Shown as a Strong-Weak Relationship Variant Attribute subclasses Te attributes like aaaaaaaa may be termed specif c attributes as they are specifc to the particular subclass

The EER scheme, illustrated in Figure 10.3, showcases distinct sports entities and their specializations In this diagram, we have created three unique sports entities, which will serve as individual information pieces for data storage.

272 • Database Design Using ER Diagrams

FIGURE 10.3 aaaaaaa with Superclass/Subclass Overlap Relationship

In the aaaaaaa entity, we define an attribute called aaaaa, which serves as a defining predicate to specify our specializations Initially, we consider athletes participating in a single sport, each with variant information However, if an athlete competes in multiple sports, the defining predicate must be multivalued As illustrated in Figure 10.3, the defining predicate can be represented on the line connecting the aaaaaaa entity to a circle marked with an "o," which indicates an "overlapping" constraint This overlap signifies that subclass entities linked to the superclass may share common characteristics, allowing a superclass entity to encompass multiple subclasses or specializations for a specific aaaaaaa.

10.3 means an athlete may participate in more than one sport

If there were a d in the circle (in place of the o) in Figure 10.3, then the enti- ties would not overlap—they would be disjoint A d would indicate athletes

Te Enhanced Entity-Relationship (EER) Model • 273

An office database can be structured with specialized entities, ensuring full participation and disjoint relationships among athletes Each athlete would engage in only one sport—either golf, tennis, or football—without overlapping participation in multiple sports In this scenario, the designation would shift from 'o' to 'd', indicating that the defining predicate is single-valued Additionally, the superclass's participation in the subclasses is optional, as represented by a single line connecting the 'o/d' designation, allowing for the possibility that some athletes may not participate in any sport.

An example of a disjoint constraint is shown in Figure 10.4 According to Figure 10.4, all the furniture in the database is a chair, a desk, or a table

In this scenario, overlapping subclasses are not applicable, as there is complete participation from the aaaaaaaaa entity within the o/d circle Every piece of furniture is required to belong to a specific subclass, which differs from the partial participation observed in the aaaaaaa example Additionally, the disjoint constraint is relevant in this context.

When designing databases using ER diagrams, it is essential to determine if subclasses of a generalization are disjoint In such cases, an entity can belong to only one subclass or specialization Additionally, the defining predicate for these disjoint subclasses must be single-valued.

Methodology and Grammar for Generalization/

We need to revisit step 6 in the ER design methodology to include gener- alization/specialization relationships Te previous version of step 6 was:

Te Enhanced Entity-Relationship (EER) Model • 275

Step 6 State the exact nature of the relationships in structured English from all sides; for example, if a relationship is A:B::1:M, then there is a relationship from A to B, 1 to Many, and from B back to A, Many to 1

When defining ternary and higher-order (n-ary) relationships, it is essential to articulate the relationship clearly in structured English, ensuring that all entities involved in the n-ary relationship are explicitly mentioned Additionally, it is important to outline the existing structural constraints associated with these relationships to provide a comprehensive understanding.

We add the following sentence to step 6:

For specialization/generalization relationships, state the relation- ship in structured English, being careful to mention all entities (subclasses or specializations) State the structural constraints as they exist

The grammar we suggest for managing specialization and generalization relationships closely resembles our approach for weak relationships Additionally, we enhance the grammar to incorporate participation constraints along with overlapping and disjoint (o/d) constraints.

Te grammatical description for weak entities was:

Weak entities lack unique attributes for individual identification, as they do not possess a candidate key Instead, they are identified through keys associated with a strong entity.

In the case of the subclasses of aaaaaaa, a frst attempt to describe the subclass identifed by a superclass becomes:

In the realm of sports, no single attribute can uniquely identify individual sports entities As a result, since sports lack a candidate key, each sport will be recognized by inheriting keys from a designated entity.

So, a more complete EER diagram grammatical pattern would say:

Each specialization lacks a unique attribute to identify individual entities, meaning there is no candidate key available Consequently, every specialization will be identified through alternative means.

276 • Database Design Using ER Diagrams key(s) inherited from the generalization Further, specializations overlap [or are disjoint] [Explain the overlapping/disjoint situation.]

Individual specialization is characterized by a defining predicate, known as the attribute name, which is included in the generalization In cases where specializations overlap, the defining predicate will be multivalued.

In the realm of sports, no single attribute can distinctly identify individual entities, as each sport lacks a unique candidate key Consequently, sports will be identified using keys inherited from a broader category Additionally, it is important to note that there is an overlap among sports, with athletes often participating in multiple disciplines.

Individual sports are characterized by a defining predicate attribute, which is essential for their identification As individuals can participate in multiple sports, this defining predicate serves as a multivalued attribute, highlighting the diverse nature of sports engagement.

Te sports we will record are aaaa,Aaaaaaa, and aaaaaaaa.

Mapping Rules for Generalizations

This section outlines mapping rules for translating generalizations and specializations into relational databases The mapping of these concepts can be achieved through various methods, and the specific rules for this process are influenced by multiple factors (Elmasri and Navathe, 2016).

(a) Te total/partial constraints of the generalization/specialization relationships

(b) Te overlapping/disjoint constraints of the generalization/special- ization relationships

(c) Te number of attributes on the specializations

(d) Whether the specializations are predicate def ned

Table 10.2 summarizes the mapping rules for generalizations and specializations, highlighting the optimal situations for their application Specifically, mapping rules 15 and 16 establish multiple relationships, while mapping rule 17 further elaborates on these connections.

Te Enhanced Entity-Relationship (EER) Model • 277

Where Each Mapping Rule Works Best

Mapping Rule Relations Created Works Best with

Mapping rule 15 Multiple • Total or partial participation

• Disjoint or overlapping relationships Mapping rule 16 Multiple • Disjoint relationships

• Total participation Mapping rule 17 Single • Only disjoint relationships

• Can be total or partial participation

• Not many attributes on specialization

• Single type attribute Mapping rule 18 Single • Better for overlapping relationships but can be used for disjoint relationships

• Many type felds—one for each subclass and 18 create single relations In each case, there are relational database trade-ofs with regard to the result

In the next few sections, we explain each of the mapping rules and the resulting trade-of s

As per Table 10.2, mapping rule 15 works well for either disjoint or over- lapping scenarios Tis rule would also work well if the specializations have many attributes

Mapping Rule 15 involves creating relations for generalization and specialization entities with disjoint or overlapping subclasses and varying participation constraints For each generalization/specialization scenario, establish a relation for the generalization entity, if not already done, and create separate relations for each specialization Ensure that all attributes are included in their respective relations, and incorporate the primary key of the generalization entity into the specialization relations, which should share the same primary key as the generalization relation.

So, using mapping rule 15, we create a separate table for the generaliza- tion (superclass) as well as for each of the specializations (subclasses) Refer to

Figure 10.3 Te generalization/specialization relationship between aaaa aaaa and aaaaaa, aaaa, and aaaaaaaa would be mapped as follows: aaaaaaaaAaaaaaAaaaaaaaAaaaaaAaaaaaaaAaaaaaaAaAAaaaaaaaAa aaaaaaaaaaaAaAaaaaaaaaaaaaaa aaaaaaaaaAaAAaaaaaaaaAaA aaaaaaaaaAaaaaAaAaaaaaaaaAa

Since aaaaaaa in Figure 10.3 contains a multivalued attribute, aaaa aaaa now becomes aaaaaaaaaAaaaaAaAaaaaaaaAaaaaaAaaaaaaaAaaaaaaAaA and

The primary key of the generalization entity 'aaaa' is incorporated into the specialization entities 'aaaaaa', 'aaaa', and 'aaaaaaaa' Additionally, the ATHLETE entity, as illustrated in Figure 10.3, includes a multivalued attribute, which is mapped by 'aaaaaaaa'.

AAAAAa aaAAAAA AAAAAa aa AAAA AAAAA AAAA AAAAA AAAA AAAAA AAAA AAAAA aaa AAAA a aaa a aa

278 • Database Design Using ER Diagrams

Sample Data for aaaaaaa in Figure 10.3

AaaaaA a AaaaaaaAA a aa AaaaaAA aaaa AaaaaaaAA aaaaaa AaaaaaaAA aaaaaa a AaaaaaAA a

398–08–0928 200 Kelvin M 6.02 football 322–00–1234 135 Sarah F 5.6 tennis 873–97–9877 165 Arjun M 6.01 golf

Sample Data for aaaaaaaa in Figure 10.3 a aaaaaa a aaaa aaaaaa aaaaaa aaaa a aaaa aaaaaa aaaaaa

239-92-0983 140 Kumar M 5.95 398-08-0928 200 Kelvin M 6.02 322-00-1234 135 Sarah F 5.6 873-97-9877 165 Arjun M 6.01 876-09-9873 145 Deesha F 5.5

AAAAAa aaAAAAA AAAAA aaa AAAA

AAAAAa aaAAAAA AAAAAa aaaa AAAA

Te Enhanced Entity-Relationship (EER) Model • 279

Showing some sample data, the table would look like Table 10.3 which would resolve to Tables 10.4–10.8

Sample Data for aaaaaaaaaaaaaA in Figure 10.3

AaaaaA a aAaaaaaAAa 239-92-0983 golf 398-08-0928 football 322-00-1234 tennis 873-97-9877 golf 876-09-9873 tennis 876-09-9873 golf

Sample Data for aaaaaa in Figure 10.3

Sample Data for aaaa in Figure 10.3

TABLE 10.8 Sample Data for aaaaaaaa in Figure 10.3

AaaaaA a aaa aaaa AaaaaaaaaA 398-08-0928 tackle 239-92-0983 quarterback 398-08-0928 full back

280 • Database Design Using ER Diagrams

The primary challenge with this mapping lies in its creation of tables that are somewhat unconventional for a relational database Typically, one anticipates a table featuring a joining attribute; however, in this case, the joining attribute is represented by a table name instead of an attribute value If the one-to-many constraint were defined differently, it could further complicate the mapping process.

If the sport-defining predicate is disjoint, it would be single-valued, meaning the original table would contain only one value for the sport Consequently, decomposing the table into separate components would be unnecessary.

Mapping rule 16 works best with disjoint subclasses and when the rela- tionship is total between the generalization and specializations

Mapping Rule 16 involves establishing generalizations and specializations with disjoint relationship constraints and total participation For each specialization entity, a distinct subclass relation should be created, incorporating the relevant attributes of each specialization Additionally, the primary key and other attributes of the generalization entity must be included in all subclass relations The primary key of these subclass relations will correspond to the primary key of the generalization entity.

To illustrate this rule, we map Figure 10.4 as follows: aaaaaaaaaaaaaaaaaaa aaAAaaaaaa aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaa aaaaaaaaaaaaaaaaaa aaAAaaaaaa aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaa aaAAaaaaaa aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaaa aaaaaaaaaaa

Mapping rule 16 allows for the creation of distinct relations for each subclass while omitting a separate relation for the superclass entity This approach is particularly effective in scenarios with a limited and fixed number of disjoint subclasses However, applying this rule in overlap relationship scenarios can lead to database redundancy, as attributes from the generalization entity would be unnecessarily duplicated In the presented tables, the inclusion of the attribute "aaaaaaaaaaaaaa" exemplifies redundancy, as it repeats information already conveyed by the table name.

The Enhanced Entity-Relationship (EER) Model is designed to accurately represent complex data structures, particularly when subclasses possess numerous attributes This approach ensures clarity and efficiency in data organization.

Although mapping rule 17 will work for both total and partial participa- tion, it will work only for disjoint relationships and,

(a) If the specializations do not have many attributes

(b) If the specializations are predicate def ned

Applying this mapping rule can lead to numerous null values when specializations possess multiple attributes Additionally, utilizing mapping rule 17 in the presence of overlapping relationships may result in redundancy within the database.

Mapping Rule 17 involves creating a single relation that encompasses both the attributes of a generalization (superclass) and its specializations (subclasses), particularly in cases of disjoint relationships with total or partial participation constraints The primary key for this relation is derived from the primary key of the generalization, ensuring that all relevant attributes are consolidated within one cohesive structure.

In the context of Figure 10.4, the defining predicate is represented as "aaaaaaaaaaaaaaaaaa," which establishes a condition of membership for the subclass "aaaaaaaaaaAaaaaAA." This defining predicate is visually depicted in the EER diagram by placing the predicate condition adjacent to the arc that links the subclass to the relationship constraint circle Additionally, the name of the defining predicate is indicated on the arc connecting the superclass to the relationship constraint circle Consequently, the mapping of Figure 10.4 follows mapping rule 17, illustrating the relationship between the subclass and the superclass through the specified predicates.

282 • Database Design Using ER Diagrams

The mapping rule in relational databases generates null values for nonparticipating attributes, such as drawers for chairs, leading to undesired nulls While allowing nulls can reduce the number of tables, the ideal approach would involve creating a separate table for each furniture type, adhering to mapping rules 15 or 16 However, this configuration, despite appearing plausible, does not conform to the third normal form (3NF), highlighting another trade-off in database design.

Mapping rule 18 is applicable for both overlapping and disjoint relationships This rule utilizes unique predicates or flags for each specialization, ensuring clarity in relationships It is particularly useful when there are multiple overlaps within a generalization.

Mapping Rule 18 involves creating a unified relation for overlapping relationships and generalizations/specializations that have multiple flags This relation should incorporate both the attributes of the superclass and the subclasses, along with the subclass flag The primary key for this relation is derived from the primary key of the superclass.

Subclasses of Subclasses

In this chapter, we explored a generalization class featuring a single superclass, which may have multiple subclasses, each potentially containing various attributes Additionally, subclasses can further subdivide into additional subclasses, leading to multiple sets of subclasses We will illustrate this concept through examples of specialization hierarchies, specialization lattices, and shared subclass parents.

Figure 10.5 illustrates subclasses of subclasses, where aaaaa and aaaaaaaaaaaa are identified as subclasses of aaaaaaaa, inheriting its attributes, which in turn inherits from aaaaaaa The EER diagram simplifies the representation by omitting subclass attributes within the football tree Among athletes, some engage in football, categorized further into hobbyists and professionals Each instance of aaaaa inherits from aaaaaaaa, and similarly, aaaaaaaaaaaa also inherits from aaaaaaaa, forming a specialization hierarchy where each subclass derives from a single parent subclass, exemplified by aaaaaaaa in Figure 10.5.

284 • Database Design Using ER Diagrams

An enhanced EER diagram can feature multiple sets of specializations, exemplified by an athlete engaged in various sports while possessing both professional and hobbyist classifications In cases where subclasses overlap, a subclass with multiple parent subclasses is termed a specialization lattice This concept is visually represented in Figure 10.6, where the subclass "aaaaaaaaaaaaAaaaaaaaaAaaaaaa" derives attributes from both the "aaaaaaaa" subclass and the "aaaaaaaaaaaa aaaaaaaa."

A shared subclass is a type of subclass that derives from multiple parent classes, allowing it to inherit attributes from each For example, a subclass named "A" can inherit from both "B" and "C," thus gaining the properties of all its superclasses This means that every instance of a shared subclass encompasses all attributes associated with its parent classes.

Te Enhanced Entity-Relationship (EER) Model • 285

We present mapping rule 19 to map shared subclasses

Mapping Rule 19 focuses on the mapping of shared subclasses, applying the same criteria used for generalizations and specializations Typically, the most effective approach for creating an optimal database in this context is Mapping Rule 15.

As an example of applying mapping rule 15, consider the mapping of Figure 10.6 : aaaaaaaaaaaaa aaaaaaAa aaaaAaAAaaaaaaAaAAaaaaaaAaAAaaaaaaaaaaaaaaaaaaaA aAaaaAaA

286 • Database Design Using ER Diagrams aaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaa aaaaaaaaAaA aaaaaaaaaaaaaa aaaaaaaaa aaaaaaaaaaaaaaaaaa aaaaaa aAaAaaA aaaaaaaaaaAaAAaaaaaaAaaaaaaAaAaAaAaaA aaaaaaaaaaaaaaaaaaaaaaaaaaaa aAaaaaa aaaaaAaAaAaaA

In this section, we applied mapping rule 15 to interpret Figure 10.6, while acknowledging that mapping rules 16 to 18 may also be suitable in different scenarios It's crucial to understand that a shared subclass has a single superclass, ensuring that all subclasses retain the same key attribute Additionally, the presence of multivalued attributes requires normalization for effective relational database management To achieve this normalization, two additional tables must be created to address the multivalued predicates.

Becomes: aaaaaaaaaaaaaa aaaaaaa aaaaa aaaaaaa aaaaaaa aaaaaaaAaaaaaaaaaaa aaaaaa aaaaaaaAaaaaaaAaaaaAaAAaaaaaaaaaaAaAaaAa

Categories or Union Types

In this chapter, we explored scenarios involving multiple superclasses and their subclasses When a subclass inherits from several superclasses, it is identified as a category or union type, distinguishing it from previous discussions that focused on multiple subclasses Each superclass functions as a standalone entity, while a category or union type allows for inheritance from multiple superclasses, unlike a shared subclass, which is limited to a single superclass in its hierarchy.

A category or union type will inherit information from any one of the superclasses; hence, the term union is used to describe the combination

The Enhanced Entity-Relationship (EER) Model illustrates how subclasses inherit information from their superclasses, represented by a union type symbol (u) connecting them Typically, superclasses have distinct keys as they represent different entities; however, there are instances where a subclass may inherit the same key from two superclasses For example, a subclass can be a subset of the union of two superclasses, sharing the same key while maintaining its unique identity.

A player may be a student or faculty member

A category or union type inherits all attributes from its associated class or classes For instance, if a player is part of the aaaaaaa class (super-class), they inherit all the characteristics of the aaaaaaa entity type This inheritance ensures that players possess the essential attributes defined by their super-class.

A Category or Union Type with Same Primary Keys (Partial Participation)

288 • Database Design Using ER Diagrams

A Category or Union Type with the Same Primary Keys (Full or Total Participation) to the aaaaaaa class (superclass), it inherits all the attributes of the aaaaaaa entity type

In the ER diagram depicted in Figure 10.8A, we illustrate a union where a bill payer can belong to multiple superclasses This means that the payer may inherit data from various sources, such as superclass A, superclass B, or other related classes, showcasing the flexibility and complexity of data relationships within the model.

10.8.1 Participation Ratios in Categories or Union Types

Categories or union types can exhibit participation constraints, as illustrated in Figure 10.7A The category or union type "aaaaaa" demonstrates partial participation, indicated by single lines connecting it to the subclass "aaaaaa." This partial participation suggests that "aaaaaa" may or may not encompass students or faculty, indicating the existence of faculty and students who do not participate as players.

If a category such as aaaaaa has full participation, as shown in Figure 10.7B, this would imply the category (or union type or subclass) aaaaaa

Te Enhanced Entity-Relationship (EER) Model • 289

Full Participation between aaaaaaa and aaaaaa

A Category or Union Type with Diferent Primary Keys (with Partial Participation)

The database design illustrated in Figure 10.7B utilizes ER diagrams to represent a specific school, focusing on tracking players rather than the entire student body It indicates that the database contains at least one entity from the union of its superclasses, which include either the aaaaaaa or aaaaaaa entities This structure emphasizes the database's purpose of managing player information effectively.

If double lines connect the aaaaaaa to the circle containing the u, as illustrated in Figure 10.7C, the player entity encompasses all faculty members but excludes some students.

In Figure 10.8B, there is complete integration between PAYOR and the superclasses aaaaaaa, aaaaaaaaaaaaaaaaa, and aaaaaa aaaaaaaaaaaaaaaaa Each payor is categorized under one of the superclasses, ensuring that relevant information is inherited from aaaaaaa, aaaaaaaaaaaaaaaaa, and aaaaaaaaaaaaaaaaaa aaaaa.

A Category or Union Type with Diferent Primary Keys (with Full or Total Participation)

Te Enhanced Entity-Relationship (EER) Model • 291

10.8.2 Mapping Categories or Union Types When

Superclasses Have the Same Primary Keys

When subclasses inherit from superclasses that share the same primary key, the mapping process is straightforward because this primary key is incorporated into the subclass relation This concept is illustrated in Figure 10.7B, and we introduce mapping rule 20 to effectively map categories or union types in scenarios where superclasses possess identical primary keys.

Mapping Rule 20 involves creating a new relation for a subclass or union type when superclasses share the same primary keys In this process, the primary key of the superclass should be included in the subclass relation, along with the subclass's additional attributes Furthermore, separate relations must be established for each superclass, which should be mapped similarly to regular entities.

Figure 10.7a would map to: aaaaaaaaaaaa aaaaa aaaaaaaa aaaaaaaaaaaa aaaaa aaaaaaaa aaaaa aaaaaaaaaaa aaaaa aaaaaa

10.8.3 Mapping Categories or Union Types When

Superclasses Have Different Primary Keys

Since superclasses are generally diferent entity types, superclasses gener- ally have diferent primary keys For example, see Figures 10.8A and 10.8B

If the superclasses have diferent primary keys, we would need to create a common key between the superclasses Tis common key is referred to as the surrogate key

We present mapping rule 21 to map categories or union types when the superclasses have diferent primary keys

Mapping Rule 21 involves handling mapping categories or union types when superclasses possess different primary keys To address this, a new relation should be established for the subclass or union type, accompanied by the creation of a surrogate key that serves as the primary key for this relation Additionally, all attributes associated with the subclass must be included in this new relation Furthermore, it is essential to create distinct relations for each superclass and map them accordingly.

292 • Database Design Using ER Diagrams them as you would map regular entities Add the surrogate key to the superclass relations as a foreign key

Figure 10.8A would map to aaaaaaa aaaaaaaaaaaaaaaaaAaaaaaaaa aaaaaaaaa aaaaaaaaaaaa aaaaa aaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaAaAaaaaa aaaaaaaa aaaaaaaaa aaaaaaAaaaaaaaaa aaaaaaaaaaaa aaaaAa

In this mapping, aaaaaaaa is the surrogate key

1 Figure 10.7A says, “A player may be a student or a faculty.” We show this as a union Could we have shown this as a disjoint rela- tionship? Discuss

2 What is the diference between a disjoint relationship and a union?

3 How would you map a category or union type with the same keys on the superclasses?

4 How would you map a category or union type with diferent keys on the superclasses?

5 When would you create a generalization/specialization rela- tionship, and when would you create a category or union type? Explain with examples

6 A shared subclass inherits attributes from ?

7 A category or union type inherits attributes from ?

8 What is the diference between a shared subclass and category or union type?

Final ER Design Methodology

Our fnal ER design methodology has fnally evolved to the presentation discussed next

Te Enhanced Entity-Relationship (EER) Model • 293

Step 1 Select one primary entity from the database requirements description and show attributes to be recorded for that entity Label keys if appropriate and show some sample data

Step 2 Use structured English for entities, attributes, and keys to describe the elicited database

Step 3 Examine attributes in the existing entities (possibly with user assistance) to find out if information about one of the enti- ties is to be recorded

(We change primary to existing because we redo step 3 as we add new entities.)

Step 3a If information about an attribute is needed, then make the attribute an entity, and then

Step 3b Define the relationship back to the original entity

Step 4 If another entity is appropriate, draw the second entity with its attributes Repeat steps 2 and 3 to see if this entity should be further split into more entities

Step 5 Connect entities with relationships (one or more) if rela- tionships exist

Step 6 State the exact nature of the relationships in structured English from all sides; for example, if a relationship is A:B::1:M, then there is a relationship from A(1) to B(M) and from B(M) back to A(1) For ternary and higher-order (n-ary) relationships, state the relation- ship in structured English, being careful to mention all entities for the n-ary relationship State the structural constraints as they exist For specialization/generalization relationships, state the relationship in structured English, being careful to mention all entities (subclasses or specializations) State the structural constraints as they exist Step 6a Examine the list of attributes and determine whether any of them need to be identified by two (or more) entities If so, place the attribute on an appropriate relationship that joins the two entities Step 6b Examine the diagram for loops that might indicate redun- dant relationships If a relationship is truly redundant, excise the redundant relationship

Step 7 Show some sample data

294 • Database Design Using ER Diagrams

Step 8 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Refine the diagram as necessary.

Chapter Summary

This chapter explores key concepts such as generalizations and specializations, overlapping and disjoint relationships, and shared subclasses within union types It examines Enhanced Entity-Relationship (EER) diagrams as outlined by Elmasri and Navathe (2016) and Connolly, Begg, and Strachan (1998) Notably, some authors, including Sanders (1995), refer to the specialization/generalization relationship as an "IsA" relationship, highlighting its significance in data modeling.

Tis chapter also concluded the development of the EER design meth- odology and mapping EER diagrams into a relational database

To design an Entity-Relationship (ER) diagram for library holdings, include key attributes such as call number, book name, authors, and library location Define the holding type predicate and incorporate disjoint, partial specializations for journals and reference books, where journals feature a renewal date attribute and reference books include checkout constraints Subsequently, map this structure to a relational database and provide sample data to illustrate the implementation.

To create an ER diagram for school computers, each computer is uniquely identified by an ID number, make, model, date acquired, and location The computers are classified into two categories: student computers and staff computers Student computers include an attribute for hours available, while staff computers have an attribute for the responsible party, indicating the owner This structure can be mapped to a relational database, which would include tables for computers, their categories, and relevant attributes, accompanied by sample data to illustrate the relationships and functionalities.

Te Enhanced Entity-Relationship (EER) Model • 295

The EER diagram features a union type, illustrating distinct categories that share common attributes, while also showcasing a disjoint relationship where subclasses do not overlap, and an overlapping relationship where subclasses can share common instances Additionally, shared subclasses are represented with different primary keys, ensuring clarity in their unique identification The diagram incorporates a minimal set of attributes to maintain simplicity and effectiveness Finally, this structure is mapped to a relational database, demonstrating how the entities and relationships are translated into tables, keys, and attributes for effective data management.

In our case study, we have identified key entities and their relationships, successfully mapping them to a relational database with sample data included.

Step 7 Present the “as designed” database to the user complete with the English for entities, attributes, keys, and relationships Refine the diagram as necessary

Suppose we obtained some additional input from the user:

A aaaaaa may be an owner, employee, or manager For each aaaaaa, we will record the aaaaAaAaAaaaaaAaaaaaaaaAaaaaaaAaAAaaaaaaa, and aaaaaA aaaaaaA

On reviewing these additional specifcations, we came up with one new entity, aaaaaa

Now, repeating step 2 for aaaaaa, we obtain the information as described next

Tis database records data about a aaaaaa

For each aaaaaa in the database, we record a person’s name (aaaaa), person’s Social Security number (aaaa), person’s phone ( aaaaaa), and person’s address (aaaa)

For each aaaaaa, there will be one and only one aaaaa(person’s name) Te value for aaaaa will not be subdivided

296 • Database Design Using ER Diagrams

Each individual will have a unique Social Security number, which cannot be divided Similarly, each person will possess a distinct phone number, also indivisible.

For each aaaaaa, there will be one and only one aaaa (person’s address ) Te value for aaaa will not be subdivided

For each aaaaaa, we will assume that the aaaa will be unique

Tese entities have been added to the diagram in Figure 10.9

Using step 6 to determine the structural constraints of relationships, we obtain the following:

Final ER Diagram of West Florida Mall

AaaaaA AaAaaaaA AaaaaA AaaaaaaaaaA AaaaaaaA AaaaAaaaAA aaaaa

AaAaaaAA AaaaaA AaaaaA AaaaaaaA AaaaaA

Te Enhanced Entity-Relationship (EER) Model • 297

As shown in Figure 10.9, there is a disjoint relationship between aaaaaa and aaaaaaaaaaaaa, aaaaa, and aaaaaaaa

Tis means a person may be an owner, store manager, or an employee (a disjoint generalization/specialization relationship)

To map this relationship, we would normally have:

Due to the presence of fields such as Social Security number, name, address, and phone number, which may pertain to an owner, store manager, or employee, we have streamlined the original entities by removing redundant attributes For instance, the employee's name can be derived from the Social Security number, eliminating the need to retain it in the relevant entity Additionally, since certain fields are identical across multiple entities, we have opted not to include them again, ensuring a more efficient and coherent data structure.

So, in summary, our relational database would fnally develop to (with- out the data): aaaa

AaaaaAA AaaaaaAA AaaaaAA AaaaaaaaaaAA AaaaaaaAA AAaaaaaaAA aaaaa

AaaaaAA AaaaaAA AaaaaAA AaaAaAaaaAA aaaaaa

Tis ends our case study

298 • Database Design Using ER Diagrams

Connolly, T., Begg, C., and Strachan, A (1998) Database Systems, a Practical Approach to

Design, Implementation, and Management Harlow, UK: Addison-Wesley

Elmasri, R., and Navathe, S.B (2016) Fundamentals of Database Systems Reading, MA: Addison-Wesley

Sanders, L (1995) Data Modeling Danvers, MA: Boyd and Fraser

Teorey, T.J., Nadeau, T., and Lightstone, S.S (2005) Database Modeling and Design: Logical

Design San Francisco, CA: Morgan Kaufman.

Relational Mapping and Reverse

Introduction

This book outlines essential rules for converting entity-relationship (ER) and enhanced entity-relationship (EER) diagrams into relational databases In this chapter, we summarize these mapping rules and explore the process of reverse engineering, which involves deriving an ER/EER diagram from an existing database It is common to encounter databases that lack accompanying ER/EER diagrams.

The ER/EER diagram serves as essential documentation for databases, much like how computer programs require documentation To facilitate understanding, we have included a chapter on reverse engineering, outlining a series of steps to create an ER/EER diagram from existing tables and data.

Steps Used to Map ER/EER Diagrams

This section outlines the essential steps for mapping an ER/EER diagram to a relational database, ensuring that the resulting tables are aligned with the third normal form (3NF) While these guidelines aim to minimize redundancy, it is crucial to verify the normalization of the database afterward Adhering to the prescribed order of steps is recommended to enhance the effectiveness of the mapping process.

300 • Database Design Using ER Diagrams correct mapping to result in tables as close to 3NF as possible Here are the steps:

Step 1: Map all the strong entities in the ER diagram

We start with mapping rule 1 to map strong entities:

When mapping strong entities, create a new table for each strong entity and designate the specified key as the primary key of that table If the entity has multiple candidate keys shown in the ER diagram, select one to serve as the primary key.

To ensure accurate mapping of attributes in a strong entity, it is essential to examine the specific rules for atomic, composite, and multivalued attributes Each type of attribute has distinct mapping guidelines, starting with the rules for atomic attributes.

Mapping Rule 2—Mapping atomic attributes For entities with atomic attributes, map the entities to a table and form columns for each atomic attribute

In a relational database, it is essential for all columns to be atomic Non-atomic attributes in the data diagram must be converted to atomic form to ensure proper mapping to the relational database.

For composite attributes, we achieve atomicity by recording only the component parts of the attribute Our next mapping rule concerns com- posite attributes

Mapping Rule 3—Mapping composite attributes For entities with composite attributes, map entities to a table and form columns of each elementary (atomic) part of the composite attribute

Te mapping rule for multivalued attributes is:

Mapping Rule 4 involves handling multivalued attributes by creating a separate table dedicated to these attributes In this new table, each value of the multivalued attribute is recorded in its own row, along with the corresponding key from the original table The key for this new table is formed by concatenating the values from the original table’s key.

Relational Mapping and Reverse Engineering • 301 multivalued attribute plus the key of the owner entity Remove the multivalued attribute from the original table

At the end of this step, all the strong entities should be mapped

Step 2 Map all the weak entities in the ER diagram

For weak entities, we use mapping rule 11

Mapping Rule 11 involves creating a new table for each weak entity in a database Similar to strong entities, it is essential to include all atomic attributes from the weak entity within this new table.

When dealing with composite attributes, it is essential to include only the atomic components and to appropriately qualify them to preserve information To establish a relationship between a weak entity and its owner, the primary key of the owner entity must be included in the weak entity table Consequently, the primary key of the weak entity table will be formed by concatenating the weak entity's partial key with the primary key of the owner entity.

In cases where a weak entity possesses other weak entities, it is essential to first map the weak entity that is connected to the strong entity The key for the weak owner entity must be established prior to mapping the weaker entity, which is the one most distant from the strong entity.

After mapping the strong entities, the next crucial step is to map the weak entities, as their keys consist of the strong entity's key combined with the weak entity's partial key Once both strong and weak entities are successfully mapped, the focus shifts to mapping the relationships between them.

When mapping relationships, it's most effective to start with binary M:N relationships At this stage, ensure you have tables for all strong and weak entities The next step is to incorporate attributes into these tables or to create new tables to accommodate the relationships.

Mapping Rule 5—Mapping M:N relationships For each M:N rela- tionship, create a new table (relation) with the primary keys of

In database design using ER diagrams, when establishing a many-to-many (M:N) relationship between two owner entities, a new table is created to represent this relationship The primary key of this new table consists of the concatenated keys from the owner entities involved Additionally, any attributes associated with the M:N relationship should be included in this new table to ensure comprehensive data representation.

Next, we will map the binary 1:1 relationships Mapping 1:M or 1:1 rela- tionships depends on participation constraints Most of the rules that fol- low involve either:

(a) Adding an attribute (a foreign key) to a table created by a previous mapping rule

(b) Adding a new table in a process similar to mapping M:N relationships

Mapping Rule 6 addresses the handling of binary 1:1 relationships where one side has full participation and the other has partial participation In such cases, the primary key from the side with partial participation should be stored as a foreign key on the side with full participation This foreign key is not underlined and should be accompanied by any attributes related to the relationship, which are included in the same table where the key is added.

Mapping Rule 7—Mapping binary 1:1 relationships when both sides have partial participation constraints

When both sides have partial participation constraints in binary 1:1 relationships, the relationships can be mapped in one of two ways:

Mapping Rule 7A Select either one of the tables to store the key of the other as a foreign key

Tis choice depends on semantics Perhaps a safer choice for mapping this type of relationship is rule 7B:

Mapping Rule 7B Depending on semantics, you can create a new table to house the relationship that would contain the key of the two related entities

Relational Mapping and Reverse Engineering • 303

When mapping binary 1:1 relationships with full participation constraints on both sides, it is essential to determine which table should contain the key of the other based on the semantics of the relationship If the decision is ambiguous, it is advisable to follow mapping rule 7B, which suggests creating a separate table to manage the relationship effectively.

Now that the M:N relationships and binary 1:1 relationships have been mapped, the next step will be to map the common binary 1:N relationships

Mapping Rule 9—Mapping binary 1:N relationships when the N side has full participation Include the key of the entity on the 1 side of the relationship as a foreign key on the N side

Mapping Rule 10 addresses binary 1:N relationships where the N side has partial participation, treating it similarly to a binary M:N relationship In this case, a separate table is created to manage the relationship, with the new table's key formed by concatenating the keys of the related entities Additionally, any attributes pertaining to the relationship should be included in this new table.

Partial participation in database relationships can lead to null values, particularly when a key from the one side is placed in the many side table, resulting in incomplete data To avoid this issue, it's advisable to establish a separate table for the one-to-many (1:N) relationship, which helps eliminate null entries Additionally, when examining 1:N relationships, it's important to note that an M:N relationship can be transformed into two 1:N relationships, ultimately yielding the same set of tables as those derived from 1:N mappings.

Our next step would be to map recursive relationships

Mapping Rule 12—Mapping 1:N recursive relationships Reinclude the primary key of the table with the recursive relationship in the same table, giving it some other role name

Mapping Rule 13—Mapping M:N recursive relationships Create a separate table for the relationship (as in mapping rule 5)

We will use mapping rule 14 to map n -ary relationships

Mapping Rule 14—Mapping n -ary relationships For each n -ary relationship, create a new table In the new table, include the keys

Reverse Engineering

After establishing a methodology for creating ER/EER diagrams and mapping them to relational databases, we now focus on reverse engineering—transforming a relational database back into an ER/EER diagram In many real-world scenarios, we encounter databases without any accompanying diagrams that illustrate their development process A reverse engineered diagram (RED) is beneficial for several reasons, including enhancing understanding of the database structure and facilitating better database management.

The reverse-engineered diagram offers a comprehensive grammatical and visual representation of the database, helping users grasp the overall structure often overlooked By translating data and tables into a diagram, we can articulate the database's meaning more effectively With the Entity-Relationship (ER) or Enhanced Entity-Relationship (EER) diagram, alongside its grammatical representation, we can enhance the database while preserving its significance Additionally, the ER diagram significantly aids in formulating queries, streamlining database interactions.

Reverse engineering a relational database involves retracing the steps from the top down to identify the diagram that could have been used in its creation This process is best initiated by identifying strong entities and subsequently filling in the remaining components of the database However, it's important to note that there are specific considerations to keep in mind during this approach.

Relational mapping and reverse engineering can be valuable tools for analyzing databases, particularly when they are not in Third Normal Form (3NF) By employing reverse engineering techniques, one can identify the sources of redundancy within the database and propose necessary modifications to enhance its structure and efficiency.

We propose a process for creating ER diagrams that includes drawing the diagram, adding English descriptions, and presenting the results to users Once the reverse engineering is complete, the final diagram will be accompanied by an English version for user review This reverse engineering process mirrors the original elucidation by involving user feedback to ensure accuracy The provided rules serve as guidelines to transition from an absence of an ER diagram to a well-documented database These rules should be viewed as flexible starting points for negotiating the development of a documented database.

The evolution of the ER diagram, along with potential changes in the database, will be influenced by user input in the final product Different mapping strategies can be applied to superclasses and subclasses, and it may become evident that an alternative mapping could be more effective than the current one Without an original ER diagram, various scenarios can emerge when the overall concept is revealed.

We suggest the following rules to afect reverse engineering (Figure 11.1)

11.3.1 Reverse Engineering Rule 1 Develop Strong Entities

When designing an ER diagram for tables with a single-attribute key, create a strong entity labeled R that encompasses all attributes of the table For instance, if the table is represented as R(aaAaaAaaAaaAa) with 'a' as the key, establish a strong entity named R and include both 'a' and the other attributes (aaAaaAaaAaAa) within the entity, clearly indicating 'a' as the key attribute.

308 • Database Design Using ER Diagrams a

AaA AaAA AAaAA AAaAA AAaAA

11.3.2 Reverse Engineering Rule 2 Look for

When identifying additional strong entities, it is essential to examine the tables for foreign keys Extract these foreign keys from the existing entity diagram to establish relationships between the entities The presence of foreign keys in a strong entity typically suggests a one-to-many (1:N) relationship, where N represents multiple occurrences.

When creating an ER diagram involving two strong entities, such as tables R and S, it is essential to recognize the relationships between their keys If table S has a key attribute 'a' that is also present in table R as a foreign key, you should exclude 'a' from the diagram for table R Instead, establish a relationship between R and S, designating the side with the foreign key as the N side in a 1:N or 1:1 relationship.

Here is another example of this situation:

A database has a list of items ordered by some customer:

And, you have another table for orders like this:

The "aaaaaa" in the ITEM table serves as a foreign key, linking it to the ORDER table, where it acts as a unique identifier for the ITEM table The corresponding ER diagram for this database is depicted in Figure 11.2B.

To accurately define cardinality and participation constraints in relationships, it's essential to analyze both the semantics of the database and the data itself The structure of the tables can provide insights; for instance, if tables R and S suggest a relationship, it is likely N:1, with R as the N side due to its foreign key Examining the data for the number of child entries associated with parent entries can reveal whether participation is partial or full Additionally, identifying null values in the data can further indicate the cardinality and participation constraints.

Relational Mapping and Reverse Engineering • 309

310 • Database Design Using ER Diagrams

Another Example of Reverse Engineering 1:N Relationships

Relational Mapping and Reverse Engineering • 311 use the word indicate because only the true (albeit unknown) semantics and the data itself would “prove” the full/partial participation

11.3.3 Reverse Engineering Rule 2a Check for

In an ER diagram, when a foreign key is removed from entity R because it serves as the key for entity S, it is essential to evaluate whether the remaining attributes in R should remain with entity R, be associated with the relationship RS, or be transferred to entity S This assessment is particularly important during the reverse mapping of a 1:x relationship, as it is possible that an attribute from the 1:x relationship was initially linked to the foreign key in the original ER diagram, or that it was part of the relationship itself.

When determining the appropriate placement of a remaining attribute, assess whether it is primarily defined by the key of an entity; if so, assign the attribute to that entity Conversely, if the attribute's identity relies on both keys, it should be categorized under the relationship RS.

In the discussed scenario, we eliminated the attribute d from the entity R in the ER diagram since d serves as the key for entity S However, after establishing S, we realized that the attribute e is meaningful only when defined in relation to both a and d, which are the keys for entities R and S, respectively.

This would imply that e was an intersection attribute on the relationship between R and S and hence would be depicted as such, as shown in Figure 11.3A

In the ORDER database example, if items ordered are uniquely identified by item_no but are ordered multiple times, the date attribute must be specified using both item_no and order_no Consequently, this adjustment transforms Figure 11.2B into Figure 11.3B.

Tis concludes the reverse mapping of obviously strong tables We now look for weak tables and multivalued attributes

11.3.4 Reverse Engineering Rule 3 Look for Weak

Examine the tables for any concatenated keys to see whether they contain any of the keys of the strong entities If they do, this could indicate a weak

312 • Database Design Using ER Diagrams

Relational Mapping and Reverse Engineering • 313

Reverse Engineering 1:N Relationships with Relationship Attributes

314 • Database Design Using ER Diagrams entity, a multivalued attribute, or a table resulting from M:N or higher- order relationship Which of these it is may depend on non-key attributes

11.3.5 Reverse Engineering Rule 3a Checking for Weak Entities

Chapter Summary

This chapter summarizes the mapping rules for converting ER and EER diagrams into relational databases, which were developed throughout the book Additionally, it introduces a comprehensive set of rules for reverse engineering ER and EER diagrams from existing relational databases.

Relational Mapping and Reverse Engineering • 323

Come up with an ER diagram for the following relational database:

Come up with an ER diagram for the following relational database:

Come up with an EER diagram for the following relational database:

324 • Database Design Using ER Diagrams

Come up with an EER diagram for the following database:

Elmasri, R., and Navathe, S.B (2016) Fundamentals of Database Systems Redwood City,

A Brief Overview of the Barker/Oracle-Like

Introduction

The entity-relationship (ER) diagram has several variations, one of which is the Barker model introduced by Richard Barker in 1990 This model was later adopted and slightly modified by Oracle Corporation In this chapter, we will explore the conventions of the Barker/Oracle model as they relate to our ER Design Methodology, highlighting its key concepts.

The combined Barker/Oracle model serves as a simplified version of both ER diagrams, facilitating a smooth transition to the more intricate designs of either system This model is not intended to serve as a comprehensive guide to the pure methodologies of Barker or Oracle, but rather highlights the minimal adjustments needed to adapt between the two.

This chapter aims to explore various models, such as Barker and Oracle, not to promote one over another but to understand how users perceive their data By examining different models, analysts can better clarify user requirements and preferences Ultimately, this discussion offers a fresh perspective on the same issue, highlighting the essential needs for effective data mapping to a relational database.

The Barker/Oracle-like model is of significant interest due to its prevalence in Oracle literature and its distinct approach to database design Unlike the Chen-like model, which primarily emphasizes data modeling, the Barker/Oracle-like model integrates data adaptation with relational database design, offering a unique perspective Understanding this model requires a solid grasp of relational database theory, highlighting its complexity We present the Barker/Oracle-like model here to showcase its differences and advantages over traditional modeling approaches.

326 • Database Design Using ER Diagrams

The ER design methodology for the Barker/Oracle-like model differs significantly from the Chen-like model, as it does not incorporate certain conventions such as composite attributes, multivalued attributes, or weak entities Instead, these concepts are addressed in the context of the relational model Due to its close alignment with the relational model from the outset, the mapping rules for the Barker/Oracle-like model are straightforward and occur directly within the diagram itself.

A First “Entity-Only” ER Diagram

We begin by creating an initial "entity-only" ER diagram using the Barker/Oracle-like model, focusing on a primary entity from a student information database: STUDENT This entity represents the key information we aim to store, highlighting the definition of an entity, while temporarily excluding any consideration of additional entities.

In the STUDENT entity, we identified five key attributes that define its characteristics Each attribute has been assigned a generic label, ensuring clarity and consistency in the data structure These attributes encompass essential information that contributes to a comprehensive understanding of the student profile.

We begin our venture into the Barker/Oracle-like model with Figure 12.1A

The Barker/Oracle-like model employs soft boxes to represent entities, with the entity names displayed in capital letters A distinct line separates the entity names from their attributes, which are listed in lowercase This structured approach facilitates clear identification and organization of data within the model.

Barker/Oracle-like Notation of ER Diagram with One Entity and Five Attributes

Overview of Barker/Oracle-like Model • 327

The Barker/Oracle-like notation for an Entity-Relationship (ER) diagram presents a single entity accompanied by five attributes, each with specified datatypes Unlike the Chen-like model, which utilizes ovals to represent attributes, this model organizes the attributes in a list directly beneath the entity name, as illustrated in Figure 12.1A.

Figure 12.1B presents an ER diagram featuring a single entity, STUDENT, along with its attributes, aaaaaAaaaaaaaaAaaaaaaaAaaaaa and aaaaa Additionally, in the Oracle-like variant of the Barker/Oracle ER diagram, the associated data types are specified.

Attributes in the Barker/Oracle-Like

In a Barker/Oracle-like model, all attributes are treated as simple or atomic, similar to those in relational databases, and the model lacks the concept of composite attributes To represent parts of composite attributes, our adaptation utilizes a dot (.) notation, as illustrated in Figure 12.2.

When designing a database, it is crucial to determine if an entity can have unknown values for its attributes For instance, in the STUDENT entity, certain attributes may be mandatory, requiring data entry, while others can be optional, allowing for blanks This distinction between mandatory and optional fields is essential for effective data management.

A missing value is called a null Hence, the mandatory attribute is said to be not null Not null means in no occasion would there be an instance of the

328 • Database Design Using ER Diagrams

Barker/Oracle-like Notation of an ER Diagram with a Composite Attribute: aaaa

The Barker/Oracle-like notation for ER diagrams effectively distinguishes between primary keys or unique identifier attributes and their mandatory or optional nature In this model, optional attributes are represented without a "not null" indication, while mandatory attributes are explicitly labeled with the phrase "not null." It's important to note that while a mandatory attribute can serve as a key, it is not required to do so Unlike the Barker/Oracle model, the Chen-like model typically does not explicitly indicate the mandatory and optional status of attributes.

In the Barker model, a primary key is denoted by a "#" symbol preceding the attribute name, as illustrated in Figure 12.3 While a primary key must be a mandatory attribute within a relational database, it is important to note that not all mandatory attributes serve as unique identifiers.

1 What do mandatory attributes (in the Barker/Oracle-like model) translate into in the Chen-like model? Discuss with examples

Overview of Barker/Oracle-like Model • 329

2 What do optional attributes (in the Barker/Oracle-like model) translate into in the Chen-like model? Discuss with examples

3 How are the primary keys shown diagrammatically in the Barker/ Oracle-like model?

Relationships in the Barker/Oracle-Like

In the Barker/Oracle-like model, relationships between entities are depicted as lines connecting them, without the use of diamonds that characterize the Chen-like model Instead, relationship phrases are positioned in lowercase near the corresponding entity ends For instance, the relationship from the STUDENT entity to the SCHOOL entity can be informally described to illustrate their connection.

And, from the other direction, from the SCHOOL entity to the STUDENT entity, we would say,

SCHOOL S are attend ed by STUDENT S.

Structural Constraints in the Barker/ Oracle-Like Model

In Barker/Oracle notation, a cardinality of 1 is represented by a single line connecting entities As illustrated in Figure 12.5, the connection between the STUDENT and AUTOMOBILE indicates a 1:1 relationship, meaning each student is associated with only one automobile, and each automobile corresponds to only one student.

The dashed line in the diagram indicates that the STUDENT entity has optional participation in the relationship, while the AUTOMOBILE entity has mandatory participation.

330 • Database Design Using ER Diagrams

Barker/Oracle-like Notation of the STUDENT Entity with a Relationship to the SCHOOL Entity

An enhanced grammar from the STUDENT entity to the AUTO- MOBILE entity would be:

A STUDENT may drive one and only one AUTOMOBILE

And, from the AUTOMOBILE entity to the STUDENT entity it would be:

An AUTOMOBILE must be driven by one and only one STUDENT

In the context of entity-relationship modeling, a solid line extending from the AUTOMOBILE entity indicates mandatory participation in a relationship, while a dashed line from the STUDENT entity signifies optional participation.

As another example, refer to Figure 12.6

Overview of Barker/Oracle-like Model • 331

A 1:1 Relationship in the Barker/Oracle-like Notation

STUDENT s must occupy DORM s, but a DORM may be occupied by many STUDENT s

In database design, a cardinality of M (many) is represented by a "crow's-foot" notation that connects to the corresponding entity For instance, Figure 12.6 illustrates a one-to-many (1:M) relationship between the DORM and STUDENT entities, highlighting how multiple students can be associated with a single dormitory.

A DORM may be occupied by zero or more STUDENTs. and

A STUDENT must occupy one and only one DORM

332 • Database Design Using ER Diagrams

A 1:M Relationship in the Barker/Oracle-like Notation

1 How is the optional relationship shown diagrammatically in the Barker/Oracle-like model?

2 How is the many relationship shown diagrammatically in the Barker/Oracle-like model?

3 Show the following using the Barker/Oracle-like notation: a A movie theater must show many movies, and movies must be shown in a movie theater b A movie theater may show many movies, and movies may be shown in a movie theater.

Dealing with the Concept of the Weak

ENTITY IN THE BARKER/ORACLE-LIKE MODEL

Te Barker or Oracle-like models do not have a concept of the “weak entity,” and the weak entity notation is also not used in Oracle literature

Overview of Barker/Oracle-like Model • 333

Unique Identifer (to Include a Weak Entity-Relationship) Shown by Placing a Bar Across the Contributing Relationship Line(s)

In our exploration of unique identifiers within relationships, we expand the concept to incorporate weak entities In the Barker/Oracle-like model, a unique identifier is visually represented by a bar that intersects the contributing relationship, as illustrated in Figure 12.7 To uniquely identify a dependent, the employee's Social Security number is required, indicating that the DEPENDENT entity cannot exist independently and is therefore classified as a weak entity This weak entity will be mapped according to the mapping rules outlined in Chapter 7.

Dealing with the Concept of Multivalued

Barker or Oracle-like models lack the concept of "multi-valued" attributes, which are essential for representing scenarios where an entity can have multiple values For instance, a student may have attended several schools, illustrating the need for multivalued attributes in data modeling.

334 • Database Design Using ER Diagrams

Te Multivalued Attribute and Te Foreign Key

In the Barker/Oracle-like model, the foreign key is shown in the appro- priate entity, whereas in the Chen-like model, foreign keys may not be

“discovered” until the database is mapped and normalized In this model, a foreign key is identifed with an asterisk (*) in front of the attribute (see

Overview of Barker/Oracle-like Model • 335

Sample Data for aaaaaaa in Figure 12.8 aaaaa aaaaa AaaaaaaaAA

Sumona Gupta 111 Mirabelle Circle, Pensacola, FL Tom Smith 198 Palace Drive, Mobile, AL

Tony Jones 329 Becker Place, Montgomery, AL Sita Pal 987 Twin Lane, North Canton, OH

Neetu Singh 109 Bombay Blvd, Calicut, CA

Sample Data for aaaaaa in Figure 12.8 aaaaa

Sumona Gupta Ferry Pass Elementary

Tom Smith Mobile Middle School

Sita Pal Tagore Primary School

Sita Pal Nehru Secondary School

Figure 12.8) An instance of this database shown in Figure 12.8 is pre- sented in Tables 12.1 and 12.2

1 Does the Barker-like model or the Oracle-like model have the concept of the weak entity? Discuss

2 Show the following using the Barker/Oracle-like notation: For a student, we are trying to store the student’s name, address, phone, books (that is, books the student borrows from the library) Map this to a relational database and show some sam- ple data

3 Does the Barker/Oracle-like notation have the concept of the multivalued attribute? Discuss

336 • Database Design Using ER Diagrams

Treatment of Foreign Keys

In the updated Barker model, foreign keys are now indicated with an asterisk, unlike the original version where they were not marked Conversely, the Oracle model incorporates foreign keys directly within the corresponding relations.

A STUDENT may drive one and only one AUTOMOBILE

An AUTOMOBILE must be driven by one and only one STUDENT

In our Barker/Oracle-like model, the primary key from the STUDENT relation, denoted as aaaaaaaA aaaaaaA, is included in the AUTOMOBILE relation, representing the N side of the relationship We indicate the foreign key by preceding it with an asterisk, as illustrated in Figure 12.9.

In the Barker/Oracle-like model, recursive relationships are illustrated in Figure 12.10, where a dotted line indicates an optional relationship, a solid line represents a mandatory relationship, and a "crow's-foot" denotes a many relationship The relationships are clearly labeled for better understanding.

Figure 12.10 shows an EMPLOYEE may supervise other EMPLOYEE s, and an EMPLOYEE may be supervised by one and only one supervisor

EMPLOYEE Note the foreign key aaaaaaaaa in the EMPLOYEE relation.

Mapping M:N Relationships

In the Barker/Oracle-like model, a key aspect is the handling of M:N relationships, which are transformed into two 1:M relationships This approach ensures clarity and simplifies the representation of complex relationships within the model.

Overview of Barker/Oracle-like Model • 337

Barker/Oracle-like Notation Showing Foreign Key intersection entity in the middle In the Chen-like model, the M:N may be presented as two 1:M relationships

Figure 12.11 is an example of an M:N relationship in the Chen-like format

In the Barker/Oracle-like model, this would be shown as in Figure 12.12

338 • Database Design Using ER Diagrams

Barker/Oracle-like Notation Recursive Relationship

An ER Diagram of an M:N Relationship in the Chen-like Model

Overview of Barker/Oracle-like Model • 339

Barker/Oracle-like Notation of an M:N Relationship Broken into Two 1:M Relationships

340 • Database Design Using ER Diagrams

1 How are recursive relationships shown in the Barker/Oracle-like model?

2 Why is it difcult to show M:N relationships in the Barker/ Oracle-like model?

3 How are the foreign keys treated in the Barker/Oracle-like model?

Chapter Summary

This chapter provides an overview of the key features of the Barker/Oracle-like model, highlighting the one-entity diagram with attributes and the distinction between optional and mandatory attributes It briefly addresses relationships and structural constraints within this model, noting that while the Barker/Oracle-like notation does not explicitly incorporate weak entities and multivalued attributes, these concepts can still be represented diagrammatically An example of a recursive relationship in the Barker/Oracle-like model is illustrated, and the chapter concludes by explaining how to convert an M:N relationship into two 1:M relationships, along with the relevant mapping rules for the Barker/Oracle-like notation.

Exercises

Redraw Figure 8.17a using the Barker/Oracle notation Map this to a rela- tional database and show some sample data

Redraw Figure 8.7 using the Barker/Oracle notation Map this to a rela- tional database and show some sample data

Overview of Barker/Oracle-like Model • 341

Barker, R (1990) Case*Method, Entity-Relationship Modeling Reading, MA: Addison Wesley

Hay, D.C (1996) Data Model Patterns New York: Dorset House

Rodgers, Ulka (1991) ORACLE: A Database Developer’s Guide Englewood Clif s, NJ:

Siau, K (2004) Advanced Topics in Database Research Hershey, PA: Idea Group

Attribute: Data pertaining to an entity For example, if an entity is

EMPLOYEE, attributes of the entity might be aaaaaAaaaaaaaaA aaaaaaaaAaA

Binary relationship: A relationship between two entities

Candidate key: An attribute or set of attributes that uniquely identif es individual occurrences of an entity type (uniquely identifes a row in a relational table)

The cardinality ratio defines the relationship between two entities by indicating how many instances of one entity are associated with another For example, in a dormitory setting, a single dorm room can accommodate multiple occupants, illustrating a one-to-many (1:M) relationship between DORM and OCCUPANT.

Category: A subclass of a superclass from which the subclass may inherit attributes of the superclass; also referred to as union type of super- class to subclass

Composite attribute: An attribute composed of multiple components, each with an independent existence Example: the composite attribute aaaa is usually stored as a aaaaaaAaaaaAaaaaaaaaaaA aaaaaaa T e name is called a “qualif er.”

Database: A shared collection of logically associated or related data Degree of a relationship: Te number of participating entities in a relationship.

Derived attribute: An attribute where a value is calculated or derived from other data Example: Aaaa is derived from Aaaaaaaaa.

An entity refers to an important element in the real world that holds significance for a user and requires representation in a database for information recording Entities can possess a physical presence, like a student or a building, or they may exist conceptually, such as a course.

Entity set: A collection of all entities of a particular entity type

Entity type: A set of entities of the same type

First normal form (1NF): A table arrangement whereby the domain of all attributes in the table must include only atomic (simple, indivis- ible) values

Foreign key: An attribute referring to a primary key of another table A foreign key is how relationships are implemented in relational databases.

Full participation occurs when every member of a set is involved in a specific relationship Additionally, functional dependency describes the connection between two attributes within a relation, where attribute Y is determined by attribute X This means that for each unique value of X, there is a consistent corresponding value of Y.

Generalization: Te process of minimizing the diferences between enti- ties by identifying their common features and removing the com- mon features into a superclass entity

Hierarchical model: All data are arranged in a top-down fashion All relationships have the cardinality one to many (1:M).

Identifying owner: Te strong entity (the owner) on which a weak entity is dependent.

Identifying relationship: A relationship of a strong/weak entity where by the weak entity is identifed by the owner entity

Key: An attribute that uniquely identifes a row of a table

Mandatory relationship: All of one entity set must participate in a rela- tionship Also known as full participation

In a many-to-many relationship, multiple rows in one table correspond to multiple rows in another table, exemplified by the scenario where numerous CUSTOMERS purchase various PRODUCTS This type of relationship often includes intersection data, such as pricing information or the date of sale.

Many to one: Many rows of one table can be related to one only one row of another table Example: Many PATIENTs are examined by one

DOCTOR or many STUDENT s attend one SCHOOL

Mapping: Te process of taking an agreed-to conceptual model (the ER dia- gram) and creating tables in a database based on the ER diagram

Multivalued attribute: An attribute that may have multiple values for a single entity

One to many: A relationship where one row of a table can be related to more than one row in another table Example: An EMPLOYEE may have many DEPENDENT s.

One to one: A relationship where one row of a table can be related to only one row of another table Example: An EMPLOYEE possesses one ID_CARD.

Optional participation: A relationship structural constraint specifying a relation may exist Example: A PERSON may be a fan of a base- ball TEAM

Participation constraint (also known as optionality): Determines whether all or some of an entity occurrence is related to another entity Example: A CUSTOMER in a restaurant may buy WINE

Primary key: A unique identifer for a row in a table in a relational data- base; a selected candidate key of an entity

Recursive relationship: A relationship among entities in the same class

Example: An EMPLOYEE is supervise d by another EMPLOYEE

A relation is defined as a populated table that contains unique, single-value entries without any duplicate rows In this table, the meaning of each column remains consistent across all rows, and while the arrangement of rows and columns is flexible, it is fixed once established and does not change thereafter.

Relationship: An association between entities

Reverse engineering: Te process of going from relational tables to a logi- cal model (or ER diagram) A backward mapping process used for documenting an existing database

Second normal form: A relation that is in frst normal form and in which each non-key attribute is fully functionally dependent on the pri- mary key.

Shared subclass: A subclass of a superclass where the shared subclasses all inherit the same attributes from the superclass and locally def ned attributes are shared as well

Simple attribute: Attribute that will always be composed of a single value. Specialization: Te process of maximizing the diferences between mem- bers of a superclass entity by identifying their distinguishing characteristics

Specialization hierarchy: A subclass inheriting from only one superclass Specialization lattice: A subclass having more than one subclass as its parent

Strong entity: An entity that is not dependent on another entity for its existence

Structural constraints define the relationship between entities by specifying the quantity of one entity related to another, such as one or more They also indicate whether these entities are required to have a relationship or not Together, the cardinality ratio and participation constraints form what is known as "structural constraints."

Subclass: An entity type that has a distinct role and is also a member of a superclass

Superclass: An entity type that includes distinct subclasses that are required to be represented in a data model

A table is a structured representation of data that consists of one or more columns, serving as an implementation of an entity Unlike a relation, which is a populated table, a table provides a tabular view that organizes and holds data effectively.

Third Normal Form (3NF) is a database normalization standard that builds upon Second Normal Form (2NF) It requires that no non-key attribute is functionally dependent on another non-key attribute, effectively eliminating transitive dependencies within the relation This ensures that the database structure is efficient and reduces redundancy, leading to improved data integrity.

Union type: A subclass having more than one superclass from which it may inherit; also referred to as a category

Unique identif er: Any combination of attributes or relationships that serves to uniquely identify an occurrence of an entity (a row in a table)

Waterfall model: A series of steps in sofware development where each step in the process is not repeated once agreed upon T e process fows one way.

Weak entity: An entity that is dependent on some other entity for its existence

A abstraction, describing databases, 72 address attribute, 81 aggregate, 81 analyst, 4 anomalies, in table maintenance,

48 – 49 , 55 atomic (simple) attributes about, 79–80, 87 in Barker/Oracle-like model, 327 in classes, 267 def ned, 34 mapping into relational database,

The article discusses various attributes in database models, including address attributes (94–97), atomic attributes (81), and their definitions within the Barker/Oracle-like model (326–328) It highlights the importance of composite attributes, derived attributes (82, 88), and the concept of concatenated keys (41, 49, 67) Additionally, it covers the transition of attributes into entities (108), the role of mandatory and optional attributes, and the principles of inheritance (265–266, 270, 286–288) The article also addresses the significance of joining attributes and the intersection of data (202, 226, 240, 242).

The Oracle-like model encompasses various concepts in relational database theory, including multivalued attributes and primary keys, which serve as unique identifiers It emphasizes the importance of atomic (simple) attributes and their evolution into entities, highlighting the properties of relationships within the database framework Additionally, the augmentation rule plays a crucial role in understanding these attributes and their sets.

The Barker/Oracle-like model, discussed on pages 325 to 341, includes essential concepts such as attributes (326–328), cardinality (329–331), and multivalued attributes (333–335) It introduces the first "entity-only" ER diagram (326–327) and addresses the significance of foreign keys (336) in database design Additionally, the model provides exercises (340) for practical application and explores the mapping of M:N relationships (336–340) It also clarifies the distinction between null and not null values (327–328) and the differences between optional and mandatory attributes.

The article discusses various aspects of database design, including mandatory and optional participation in relationships, recursive relationships, and structural constraints It highlights the concept of weak entities and the importance of Boyce-Codd Normal Form (BCNF) in ensuring database integrity Additionally, it covers the mapping of binary relationships to relational databases, including patterns observed during reverse engineering The article also addresses the significance of attributes, such as the birthdate attribute, in the context of database management.

348 • Index bottom up design approach, 269

Boyce–Codd Normal Form (BCNF) is crucial for ensuring data integrity in relational databases, addressing unique identifiers known as candidate keys Understanding cardinality within the Barker/Oracle-like model is essential, particularly in defining one-to-many relationships and their corresponding ratios In the context of reverse engineering, it is vital to recognize structural constraints, especially when dealing with weak and strong entities The cardinality ratio further informs the design of Entity-Relationship (ER) diagrams, which can be enhanced through case studies that explore design extensions for binary relationships Comprehensive design of ER diagrams, particularly beyond the first entity, is fundamental for effective database modeling, with the Enhanced Entity-Relationship (EER) model providing advanced methodologies for relationship and structural constraint analysis.

172–178 weak entities, 194–199 categories or union types, 286 – 292 def ned, 286–287 inheritance of attributes, 286–288 mapping when superclasses have diferent primary keys, 291–292,

305 mapping when superclasses have same primary keys, 291, 305 participation ratios, 288–290 in reverse engineering, 321–322

Chen-like model cardinality of relationship, 133, 135,

137 compared with Barker/Oracle-like model, 325–326

ER diagram, 76–79, 84, see also ER diagrams multivalued attribute, 84 relationship for new entity, 109, 114 relationships of relationships, 252 unique identif ers, 84 weak entities, 182 child class, in parent-child hierarchical relationship, 20 – 25 , 268 – 269 class hierarchies, 265 – 267

COBOL, 16 – 17 columns, 35 , 94 composite attributes, 80 – 81 in Barker/Oracle-like model, 327 in Chen-like model, 84 defnition of, 80–81 grammar for, 88 mapping to relational database, 96, 300 concatenated key, 41 , 49 , 67 concatenation, 41 , 206 conceptual model, 72 coral rings, 26

Data is defined as a collection of related information, which can include customer files, records, and various data models essential for effective storage and retrieval The evolution of data management has transitioned from traditional 3 × 5 cards to modern computer systems, highlighting the importance of database design, including fields of data and foreign keys Understanding different database models, such as network and relational models, is crucial for database construction, which is guided by software engineering processes Proper file design and the utilization of software packages play a significant role in building a robust database system.

The software engineering process encompasses various critical aspects, including the design and specification of databases, which are essential for effective data management Key components such as entity relationship diagrams play a vital role in the software engineering life cycle, while database models, including hierarchical, network, and relational models, provide foundational structures for data organization Understanding cardinality and relationship terminology is crucial for accurate data representation Additionally, the article discusses the importance of data modeling schemas and the principles of decomposition in database design It also highlights the significance of defining predicates and the various design approaches, such as top-down and bottom-up methodologies, in developing robust database systems.

ER diagram, beyond frst entity, 111,

ER diagram with binary relationships, extensions in design, 202, 207, 223,

176 ternary and higher-order ER diagrams,

248, 260–261 weak entities, 189–190, 195 disjoint constraint, 272 – 274 in mapping generalizations or specializations, 277, 280–282, 304 in reverse engineering, 319–321 double augmentation rule, 64

Ngày đăng: 08/05/2024, 08:36