
Data Modelling

Posted on April 2, 2022
data-modelling

A logical data modeler designs the data model to suit business requirements, creates and maintains the lookup data, compares versions of the data model, maintains the change log, and generates reports from the data model. A physical data modeler, by contrast, has to know the properties of the source and target databases, and needs the technical know-how to create data models from existing databases, to tune them with referential integrity, alternate keys, and indexes, and to match indexes to SQL code. It also helps if the physical data modeler knows about replication, clustering, and so on.

Relational data modeling is used in OLTP systems, which are transaction oriented. Dimensional data modeling is used in OLAP systems, which are analysis oriented.
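As a rough sketch of the difference (all table and column names below are invented, generic SQL), an OLTP model keeps normalized transaction tables, while an OLAP model reshapes the data into fact tables keyed by dimension lookups:

```sql
-- Relational (OLTP) shape: a normalized, transaction-oriented table
CREATE TABLE ORDERS (
    ORDER_NO     INTEGER PRIMARY KEY,
    CUSTOMER_NO  INTEGER NOT NULL,        -- FK to a CUSTOMERS table in the full model
    ORDER_DATE   DATE    NOT NULL,
    ORDER_AMOUNT DECIMAL(12,2)
);

-- Dimensional (OLAP) shape: a fact table keyed by dimension (lookup) keys
CREATE TABLE ORDER_FCT (
    DATE_KEY     INTEGER NOT NULL,        -- FK to a DATE_DIM dimension
    CUSTOMER_KEY INTEGER NOT NULL,        -- FK to a CUSTOMER_DIM dimension
    ORDER_COUNT  INTEGER,
    ORDER_AMOUNT DECIMAL(12,2)
);
```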

Compare different data models to see the differences between them, or the changes in the latest version, then create scripts and make the necessary changes to the database. Toad Data Modeler can be used for this.

Logical data model

• How to create an entity and add a definition and business rule?

Qn: as data modelers, are we to create both the physical and logical data models, or are there cases where the logical model already exists and the work covers only the physical modelling?

Qn: will there be situations where we are working concurrently on the same model?

Qn: what software should we use, and what is your advice: Toad vs. Erwin?

Qn: on supertype and subtype relationships, consider a case where you have a supertype employee and an attribute tax_scheme that can take either of two values.

Do you create a tax_scheme table that contains these two values and is referenced by the employees table, or, as done in one example, have child tables set up as subtypes? (Both options are sketched below.)
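A hedged sketch of the two options in question (all names, codes, and types below are assumptions, not from an actual model):

```sql
-- Option A: a lookup table holds the two allowed values (codes are invented)
CREATE TABLE TAX_SCHEME_LKP (
    TAX_SCHEME_CD VARCHAR(10) PRIMARY KEY
);
CREATE TABLE EMPLOYEE (
    EMPLOYEE_NO   INTEGER     PRIMARY KEY,
    TAX_SCHEME_CD VARCHAR(10) NOT NULL REFERENCES TAX_SCHEME_LKP (TAX_SCHEME_CD)
);

-- Option B: EMPLOYEE as a supertype with one subtype table per tax scheme,
-- each subtype sharing the supertype's primary key
CREATE TABLE EMPLOYEE_SUP (
    EMPLOYEE_NO INTEGER PRIMARY KEY
);
CREATE TABLE SCHEME_A_EMPLOYEE (
    EMPLOYEE_NO INTEGER PRIMARY KEY REFERENCES EMPLOYEE_SUP (EMPLOYEE_NO)
);
CREATE TABLE SCHEME_B_EMPLOYEE (
    EMPLOYEE_NO INTEGER PRIMARY KEY REFERENCES EMPLOYEE_SUP (EMPLOYEE_NO)
);
```

A lookup table is usually enough when the two schemes carry no scheme-specific attributes; subtypes pay off when each scheme needs its own columns or relationships.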

Tools: IBM InfoSphere Data Architect, Erwin, Toad, Aginity.

  • How to create domains?
  • How to create an attribute and add a definition, business rule, and validation rules like default values and check constraints?
  • How to create supertypes and subtypes?
  • How to create primary keys, unique constraints, foreign key relationships, and recursive relationships?
  • How to create identifying and non-identifying relationships?
  • How to assign relationship cardinality?
  • How to phrase a relationship connecting two tables?
  • How to assign role names?
  • How to create key groups?
  • How to create sequence numbers?
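As a non-authoritative illustration of a few of these items in plain DDL (every name and type below is a made-up assumption):

```sql
-- Hypothetical DEPARTMENT/EMPLOYEE pair touching several of the questions above
CREATE TABLE DEPARTMENT (
    DEPT_NO   INTEGER     PRIMARY KEY,              -- primary key
    DEPT_NAME VARCHAR(60) NOT NULL UNIQUE           -- unique constraint (alternate key)
);
CREATE TABLE EMPLOYEE (
    EMPLOYEE_NO INTEGER      PRIMARY KEY,
    EMAIL       VARCHAR(120) NOT NULL UNIQUE,       -- key group / alternate key
    STATUS      CHAR(1)      DEFAULT 'A'            -- default value
                CHECK (STATUS IN ('A', 'I')),       -- check constraint
    DEPT_NO     INTEGER      NOT NULL
                REFERENCES DEPARTMENT (DEPT_NO)     -- foreign key; non-identifying, many-to-one
);
```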

Physical data model

  • How to rename a table?
  • How to rename a column, and set validation rules like default and check constraints?
  • How to assign NULL and NOT NULL to columns?
  • How to name foreign key constraints?
  • How to connect to databases like MS Access, Oracle, Sybase, Teradata, etc.?
  • How to generate SQL code from the data model to run against databases like MS Access, Oracle, Sybase, Teradata, etc.?
  • How to create a data model from an existing database like MS Access, Oracle, Sybase, Teradata, etc.?
  • How to add database-related properties to tables and indexes?
  • How to check different versions of the data model?
  • How many data modelers can concurrently work on the same version of a data model?
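For a few of the physical-model operations above, a small sketch in SQL (Oracle-flavoured syntax assumed; the tables and columns are hypothetical and taken to already exist):

```sql
-- Oracle-style syntax shown; other databases differ slightly
ALTER TABLE EMP RENAME TO EMPLOYEE;                        -- rename a table
ALTER TABLE EMPLOYEE RENAME COLUMN DEPTNO TO DEPT_NO;      -- rename a column
ALTER TABLE EMPLOYEE MODIFY (DEPT_NO NOT NULL);            -- assign NOT NULL
ALTER TABLE EMPLOYEE
    ADD CONSTRAINT FK_EMPLOYEE_DEPARTMENT                  -- explicitly named foreign key
    FOREIGN KEY (DEPT_NO) REFERENCES DEPARTMENT (DEPT_NO);
```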

Identifying relationship: if a foreign key column (or set of foreign key columns) referencing another table becomes part of the child table's primary key, the relationship is identifying. If the foreign key column(s) become ordinary, non-primary-key columns in the child table, the relationship is non-identifying.
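A minimal sketch of the two cases (hypothetical ORDERS, ORDER_LINE, and SHIPMENT tables, generic SQL):

```sql
-- Identifying: the foreign key ORDER_NO is part of the child's primary key
CREATE TABLE ORDERS (
    ORDER_NO INTEGER PRIMARY KEY
);
CREATE TABLE ORDER_LINE (
    ORDER_NO INTEGER REFERENCES ORDERS (ORDER_NO),
    LINE_NO  INTEGER,
    PRIMARY KEY (ORDER_NO, LINE_NO)
);

-- Non-identifying: the foreign key ORDER_NO is an ordinary, non-key column
CREATE TABLE SHIPMENT (
    SHIPMENT_NO INTEGER PRIMARY KEY,
    ORDER_NO    INTEGER REFERENCES ORDERS (ORDER_NO)
);
```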

When a foreign key references the parent key (primary key) of the same table, it is called a self-referential or recursive relationship. For example, the column Manager_No in the Employee table is a foreign key that takes its values from the column Employee_No of the same Employee table.
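In DDL, that Employee example looks roughly like this (column types are assumptions):

```sql
-- Recursive relationship: Manager_No references Employee_No in the same table
CREATE TABLE EMPLOYEE (
    EMPLOYEE_NO   INTEGER PRIMARY KEY,
    EMPLOYEE_NAME VARCHAR(60),
    MANAGER_NO    INTEGER REFERENCES EMPLOYEE (EMPLOYEE_NO)
);
```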

A supertype is an entity type that has a parent-to-child relationship with one or more subtypes and contains the attributes that are common to those subtypes. Subtypes are subgroups of the supertype entity; each has its own unique attributes, which differ from subtype to subtype. Supertype and subtypes are parent and child entities respectively, and the primary keys of the supertype and each subtype are always identical. Examples: People, Bank Account, Insurance, Asset, Liability, Credit Card. When designing a data model for PEOPLE, you can have a supertype entity PEOPLE whose subtype entities are vendor, customer, and employee.
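A sketch of the PEOPLE example as tables (the subtype-specific attributes shown are invented; each subtype shares the supertype's primary key):

```sql
-- PEOPLE supertype holds the common attributes; subtypes reuse the same key
CREATE TABLE PEOPLE (
    PERSON_NO INTEGER PRIMARY KEY,
    FULL_NAME VARCHAR(100)
);
CREATE TABLE EMPLOYEE (
    PERSON_NO INTEGER PRIMARY KEY REFERENCES PEOPLE (PERSON_NO),
    HIRE_DATE DATE
);
CREATE TABLE CUSTOMER (
    PERSON_NO INTEGER PRIMARY KEY REFERENCES PEOPLE (PERSON_NO),
    CREDIT_LIMIT DECIMAL(12,2)
);
CREATE TABLE VENDOR (
    PERSON_NO INTEGER PRIMARY KEY REFERENCES PEOPLE (PERSON_NO),
    TAX_ID VARCHAR(20)
);
```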

A relationship, in the context of databases, is a situation that exists between two relational database tables when one table has a foreign key that references the primary key of the other table.

A Data model is a conceptual representation of data structures (tables) required for a database and is very powerful in expressing and communicating the business requirements.

Data models are created using either a top-down or a bottom-up approach. In the top-down approach, data models are created by understanding and analyzing the business requirements. In the bottom-up approach, data models are created from existing databases that have no data model.

Steps to learn Data Modeling:

  • Fundamentals of SQL
  • Conceptual Data Modeling
  • Logical Data Modeling
  • Physical Data Modeling
  • Dimensional Data Modeling
  • Maintenance of the data model

Important phases in the data model development life cycle

  1. Gathering Business Requirements
  2. Conceptual Data Modeling
  3. Logical Data Modeling
  4. Physical Data Modeling
  5. Development of the schema or the database
  6. Maintenance of the data model as requirements change.

Table name standardization:

  • Lookup – LKP – Used for code/type tables by which a fact table can be directly accessed. e.g. Credit Card Type Lookup – CREDIT_CARD_TYPE_LKP
  • Fact – FCT – Used for transaction tables. e.g. Credit Card Fact – CREDIT_CARD_FCT
  • Cross Reference – XREF – Tables that resolve many-to-many relationships. e.g. Credit Card Member XREF – CREDIT_CARD_MEMBER_XREF
  • History – HIST – Tables that store history. e.g. Credit Card Retired History – CREDIT_CARD_RETIRED_HIST
  • Statistics – STAT – Tables that store statistical information. e.g. Credit Card Web Statistics – CREDIT_CARD_WEB_STAT
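Applied to DDL, the conventions might look like this (the columns are illustrative assumptions, and only three of the suffixes are shown):

```sql
-- Hypothetical tables following the suffix conventions above
CREATE TABLE CREDIT_CARD_TYPE_LKP (               -- lookup / type table
    CARD_TYPE_CD   VARCHAR(10) PRIMARY KEY,
    CARD_TYPE_DESC VARCHAR(60)
);
CREATE TABLE CREDIT_CARD_FCT (                    -- transaction (fact) table
    TXN_NO       INTEGER PRIMARY KEY,
    CARD_TYPE_CD VARCHAR(10) REFERENCES CREDIT_CARD_TYPE_LKP (CARD_TYPE_CD),
    TXN_AMOUNT   DECIMAL(12,2)
);
CREATE TABLE CREDIT_CARD_MEMBER_XREF (            -- resolves a many-to-many relationship
    CARD_NO   INTEGER,
    MEMBER_NO INTEGER,
    PRIMARY KEY (CARD_NO, MEMBER_NO)
);
```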

An alternate key is a column or group of columns in a table that uniquely identifies every row in that table but is not the primary key.
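For instance, in this hypothetical table SSN acts as an alternate key beside the surrogate primary key:

```sql
-- EMPLOYEE_NO is the primary key; SSN also identifies each row, so it is an
-- alternate key, typically enforced with a UNIQUE constraint
CREATE TABLE EMPLOYEE (
    EMPLOYEE_NO INTEGER     PRIMARY KEY,
    SSN         VARCHAR(11) NOT NULL UNIQUE
);
```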

  • Data migration is the process of moving data from one system to another as a one-time event. Generally, this is data that doesn’t change over time. After the migration, the destination is the new source of migrated data, and the original source is retired. Data mapping supports the migration process by mapping source fields to destination fields.

  • Data warehousing - if the goal is to pool data into one source for analysis or other tasks, it is generally pooled in a data warehouse. When you run a query, a report, or do analysis, the data comes from the warehouse. Data in the warehouse is already migrated, integrated, and transformed. Data mapping ensures that as data comes into the warehouse, it gets to its destination the way it was intended.

Steps in data mapping

  • Step 1: Define — Define the data to be moved, including the tables, the fields within each table, and the format of the field after it’s moved. For data integrations, the frequency of data transfer is also defined.
  • Step 2: Map the Data — Match source fields to destination fields.
  • Step 3: Transformation — If a field requires transformation, the transformation formula or rule is coded.
  • Step 4: Test — Using a test system and sample data from the source, run the transfer to see how it works and make adjustments as necessary.
  • Step 5: Deploy — Once it’s determined that the data transformation is working as planned, schedule a migration or integration go-live event.
  • Step 6: Maintain and Update — For ongoing data integration, the data map is a living entity that will require updates and changes as new data sources are added, as data sources change, or as requirements at the destination change.
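To make Steps 2 and 3 concrete, here is a hedged sketch of a source-to-destination mapping with one transformation rule, expressed as SQL (all table and field names are invented):

```sql
-- Hypothetical mapping document expressed as a load statement:
--   SRC_CUSTOMER.CUST_NAME  -> DW_CUSTOMER.CUSTOMER_NAME  (direct map)
--   SRC_CUSTOMER.DOB        -> DW_CUSTOMER.BIRTH_DATE     (cast to DATE)
--   SRC_CUSTOMER.STATUS_FLG -> DW_CUSTOMER.STATUS_DESC    (code decoded to description)
INSERT INTO DW_CUSTOMER (CUSTOMER_NAME, BIRTH_DATE, STATUS_DESC)
SELECT CUST_NAME,
       CAST(DOB AS DATE),
       CASE STATUS_FLG WHEN 'A' THEN 'Active' ELSE 'Inactive' END
FROM   SRC_CUSTOMER;
```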
