Design Considerations

Avoid

Redunancy

We don't want the same attributes to appear in multiple places.

  • Wastes space

  • Makes it hard to keep things consistent

Incompleteness

Everything that's important should be expressed. In other words, we want every important attribute of some element of the data to be represented exactly once somewhere in our diagram.

Example: Person-Sit-Seat

Suppose that we're trying to model the idea of each person sitting in a seat. We would have seats that are characterized by their rows and columns.

Redundancy: include seat_num in Person as an attribute

Incompleteness: omit Seat and Sit, including info about seat that person is sitting in as attributes of Person

Including row, column in Person:

  • muddies the fact that the row and the column are about the seat, not the person who happens to be sitting in it.

  • makes it harder to ask questions about seats themselves. We can't have extra information about the seats.

  • can't model other aspects of seats or other relationships involving seats. For example, it's harder to model to have constraints about the seats.

  • can't model data about seats that no one is sitting in. In other words, if there are seats that no one sitting in, I wouldn't have any information about them at all in this model.

Better:

Design issues

Entity set vs. attribute

So then some of the issues that come up are there's something that I want to represent should be an entity set or is it enough for it to just be an attribute of some other entity set?

  • Does it have further attributes?

  • Will it be used as participant in relationship set?

Relationship set vs. attribute

  • Does it pair (or n-tuple) elements from several entity sets?

Attributes of relationship sets

  • Is it an attribute of one of the participating entity sets or of the pairings between entities?

Example: Person-Has-Phone

For example, think about a a person's phone. For now, let's just assume each person has one phone.

  • It's fine if we only want to associate Person's phone number with the Person

  • What if we want to also know other things about phones? Then it would be better to have the phone number as a separate attribute of the relationship set of which person has which phone.

Notice what I don't do is repeat a phone number

Design Process

Example: customers buy items. Each item has unique id, description, and baseprice. Each customer has unique id, name, etc. There may be a discount on a purchase.

  • Nouns are represented by entity sets.

  • Verbs/prepositions are represented by relationship sets.

Some possible ER diagrams for the example

If the same person wants to buy the same object multiple times, that's not represented here. In other words, customer only can buy one item of given itemID.

To avoid confusion later on, let's make it a customer ID and an item ID.
  1. We can add quantity attribute to Buy relationship set. However, this one also can't represent multiple separate purchaases.

  2. We can make Purchaase an entity set:

  1. We can make ternary relationship set instead of having Purchase associated with using two different relationship sets to connect the Purchase with the customer and with an item.

Back to Primary Keys

Recall: PK of an entity set is a collection of attributes that uniquely identifies elements of the entity set. Relationship set also has a primary key, composed of the union of the primary eys of the participating entity set.

Similar to what we did with people purchasing multiple items, if we want to allow multiple viewings, we could do that with making this into a ternary relationship set or having a separate entity set for viewings. We can represent the venue where they saw the movie and other things such as date and time as well . Then we could distinguish different viewings of the same movie.

Weak Entity Sets

Sometimes there isn't a suitable choice for primary key for an entity set without associating it with another entity set:

Some situations where this often happens:

  1. Want to represent an entity set and also partition it into subgroups. In other words, we have some entity sets, but we'd also like to partition it into different kinds of instances of the same entity set and different subgroups.

  2. Want to represent multiple copies or instances of entities. This is the most common situation where we want to use a weak entity set. Abstraction and then concrete versions of it.

  3. Want to avoid coordination that might be needed to create unique IDs

Weak Entity Set doesn't have a primary key. It has a discriminator (represented by dotted underline) and other attributes. The discriminator by itself is not unique. However, weak entity set is associated with a regular or strong entity set. Then the discriminator along with the primary key of the strong entity set is unique.

We'll have this special notation where we're going to put double boxes around the weak entity set and the relationship set between them.

Situation 1: Hat shop

Suppose I have a hat shop and I have some different kinds of hats that I'm selling. Hats have style, size, color, material, description which is based on the style (not the material), and price which depends on the material and style.

  1. Put all of those attributes as attributes of hat

I'd have the same description for all of those different materials that I have for this particular hat style, so I'm going to have some redundancy in that description.

  1. Put weak entity set

So if I know that I'm looking at baseball caps, then knowing baseball cap along with Corduroy tells me exactly what kind of cap I'm talking about. And then I would know the price of that. But just knowing corduroy by itself isn't enough.

By subdividing each style of hat into these subcategories, corduroy, canvas, cotton, etc. Now I can represent that style and its description one time.

Situation 2: Library

Let's say I have a library and in my library I have books. Each book has an ISBN number, title and multiple authors. I have multiple different copies and I'd like to know who borrowed which copy of the book, not just that they borrowed the book in general.

I'll have a separate weak entity set which could be now partially identified with its copy number - copy 1, copy 2, copy 3 etc. I might have additional information about the specific copies, like what condition they're in, when they were purchased, etc.

The copy number by itself doesn't tell me which physical book I'm talking about, but the copy number along with the ISBN of the book does uniquely identify it.

Just making a a single entity set where I throw all of the elements together makes you have a redundancy becayse the ISBN which is going to determine what the title is and who the authors are and so forth would be repeated for each particular copy of the book.

Situation 3: Student ID

I've got students and every student has a student ID. I want to be able to distinguish between two different students who have the same student ID but they are from different universities.

  • One thing I could do is coordinate all the universities in the world are going to use a universal student ID system, then I won't have any two students who have the same ID. But it's a lot of coordination.

  • Another thing is to make the student ID just a discriminator rather than a primary key of student and the student ID along with the primary key of the university that they attend is going to uniquely identify the student.

Occasionally useful to associate a weak entity set with multiple strong entity sets

There are cases where we might have two strong entity sets that we need in order to uniquely identify elements of the weak entity set.

Let's say we have different branches of the library. Brooklyn branch has copy one and Washington Square branch has its copy one with the same ISBN number. Those are two different books, and to identify a book we need to know the discriminator, which is the copy number and the book that it took the ISBN of the book, and which library branch it's coming from.

Putting that double diamond around the relationship set is important for telling us what we need as well as the discriminator in order to identify elements of the weak entity set.

Another thing that could happen is that we could have a week entity set with its discriminator 1 and some other attributes. Rather than associating it directly to a strong entity set, it could be associated with another week entity set with its discriminator 2. And then finally that's associated with a strong entity set with its primary key.

Now to uniquely identify something from weak entity set 1, we need discriminator 1, discriminator 2 and the primary key.

Last updated