Thursday, September 10, 2015

Data Modeling


Data Model
A data model is a collection of concepts that describe the structure of a database. In other words, it represents a set of business requirements in a standard framework that business users can understand.
A data model can also be defined as an integrated collection of concepts for describing and manipulating data, relationships between data, and constraints on the data in an organization.
A data model comprises three components:
• A structural part, consisting of a set of rules according to which databases can be constructed.
• A manipulative part, defining the types of operation that are allowed on the data (this includes the operations that are used for updating or retrieving data from the database and for changing the structure of the database).
• Possibly a set of integrity rules, which ensure that the data is accurate.
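The three components can be seen side by side in a small sketch. It uses Python's built-in sqlite3 module only as a convenient stand-in; the employee table and its columns are hypothetical and not part of the original text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structural part: rules describing how the data is organized.
# Integrity rules: NOT NULL and CHECK restrict which values are valid.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL CHECK (salary >= 0)
    )
    """
)

# Manipulative part: operations that update and retrieve data.
conn.execute("INSERT INTO employee (emp_id, name, salary) VALUES (1, 'Asha', 52000)")
rows = conn.execute("SELECT name, salary FROM employee").fetchall()

# The integrity rule rejects a row with a negative salary.
try:
    conn.execute("INSERT INTO employee (emp_id, name, salary) VALUES (2, 'Ravi', -10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Here the CREATE TABLE statement plays the structural role, the INSERT and SELECT statements the manipulative role, and the constraints the integrity role.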
Points to remember while designing a data model:
      1.       Identify entities: Classify each entity by its properties and characteristics, and give it a business definition.
      2.      Identify attributes: Attributes are the characteristics and properties of entities; each name should be unique and self-explanatory.
      3.      Identify relationships: Identify the relationships between entities and the attributes that take part in them.
      4.      Naming convention: Keep names short, avoid special characters and commonly used terms, and follow the same standard for all entities and attributes.
      5.      Define keys: Choose the attributes that act as keys.
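As a rough illustration of these points, the sketch below models two hypothetical entities, Customer and Order, with Python dataclasses. The entity names, attributes and sample values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    customer_id: int   # key attribute: identifies one customer
    full_name: str     # descriptive attribute with a self-explanatory name


@dataclass(frozen=True)
class Order:
    order_id: int      # key attribute of Order
    customer_id: int   # relationship: refers to Customer.customer_id
    order_total: float


customers = {
    c.customer_id: c
    for c in [Customer(1, "Asha Rao"), Customer(2, "Ravi Kumar")]
}
orders = [Order(100, 1, 250.0), Order(101, 1, 75.5), Order(102, 2, 40.0)]


def orders_for(customer_id):
    """Follow the Customer-to-Order relationship through the shared key."""
    return [o for o in orders if o.customer_id == customer_id]
```

Calling orders_for(1) returns the two orders that carry customer_id 1, which is the relationship expressed through an attribute.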
Three categories of data models
1. Object Based Data Models: Object based data models use concepts such as entities, attributes, and relationships. An entity is a distinct object (a person, place, concept, or event) in the organization that is to be represented in the database. An attribute is a property that describes some aspect of the object that we wish to record, and a relationship is an association between entities.
Types of object based data models
• Entity-Relationship
• Object Oriented
• Semantic
• Functional 
The Entity-Relationship model has emerged as one of the main techniques for modeling database design and forms the basis for the database design methodology.
The object oriented data model extends the definition of an entity to include not only the attributes that describe the state of the object but also the actions that are associated with the object, that is, its behavior. The object is said to encapsulate both state and behavior.
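A minimal sketch of encapsulating state and behavior, using a hypothetical BankAccount class (the class, its fields and its methods are illustrative assumptions):

```python
class BankAccount:
    """An object that bundles state (balance) with behavior (deposit, withdraw)."""

    def __init__(self, account_no, balance=0.0):
        self.account_no = account_no
        self._balance = balance  # state, changed only through the methods below

    @property
    def balance(self):
        return self._balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount


account = BankAccount("ACC-1", 100.0)
account.deposit(50.0)
account.withdraw(30.0)
```

The balance can only change through deposit and withdraw, so the rules about valid changes live with the data they protect.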
Entities in semantic systems are the equivalent of a record in a relational system or an object in an object-oriented system, but they do not include behavior (methods). They are abstractions used to represent real-world objects (e.g. customer) or conceptual objects (e.g. bank account).
The functional data model is now almost twenty years old. The original idea was to view the database as a collection of extensionally defined functions and to use a functional language for querying the database.
2. Physical Data Models:
Physical data models describe how data is stored in the computer, representing information such as record structures, record ordering, and access paths. There are fewer physical data models than logical data models; the most common one is the Unifying Model.
3. Record Based Logical Models:
Record based logical models are used in describing data at the logical and view levels. In contrast to object based data models, they are used to specify the overall logical structure of the database and to provide a higher-level description of the implementation. Record based models are so named because the database is structured in fixed format records of several types. Each record type defines a fixed number of fields, or attributes, and each field is usually of a fixed length.
The three most widely accepted record based data models are: 
• Hierarchical Model
• Network Model
• Relational Model 
The relational model has gained favor over the other two in recent years. The network and hierarchical models are still used in a large number of older databases.
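The fixed-format records that give record based models their name can be sketched with Python's struct module. The record layout below (a 4-byte integer id, a 20-byte name and an 8-byte balance) is a hypothetical example, not a layout taken from any particular database.

```python
import struct

# "<" means little-endian with no padding: 4 + 20 + 8 = 32 bytes per record.
RECORD_FORMAT = "<i20sd"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)


def pack_record(rec_id, name, balance):
    """Encode one record; names shorter than 20 bytes are padded with NUL bytes."""
    return struct.pack(RECORD_FORMAT, rec_id, name.encode("ascii"), balance)


def unpack_record(raw):
    """Decode one record and strip the NUL padding from the name field."""
    rec_id, name, balance = struct.unpack(RECORD_FORMAT, raw)
    return rec_id, name.rstrip(b"\x00").decode("ascii"), balance


# Because every record has the same size, record n starts at n * RECORD_SIZE.
data = pack_record(1, "Asha", 120.5) + pack_record(2, "Ravi", 80.0)
second = unpack_record(data[RECORD_SIZE:2 * RECORD_SIZE])
```

Fixed field lengths make it possible to locate a record by simple arithmetic, at the cost of padding short values and truncating long ones.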
Conceptual Data Modeling
A conceptual data model is a map of concepts and their relationships. It describes the semantics of an organization, i.e. how the organization does business and which components are involved.
It also captures the information that business associates need for decision making, and the solution is designed at this level.
Components of conceptual data modeling
      1.       Entities or objects, for example product, customer, service.
      2.      Relationships between entities.
      3.      Identifiers that distinguish the entity instances.
      4.      The attributes required to maintain each entity.
ER Model: The ER model is a conceptual data model in which data is described as entities, relationships and attributes.
Types of Attributes | Description | Example
Simple | Not divisible | Aadhaar no., Age, Sex
Composite | Divisible into smaller parts | Name (FName, MName, LName)
Single-Valued | Single value for an entity | Age
Multi-Valued | Multiple values for an entity | Academic marks of a person
Derived | Contains a calculated value | No. of customers per region
Complex | Composite and multi-valued | Shipping address
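The attribute types can be illustrated with a small Python sketch. The Customer, Name and Address classes and their fields are hypothetical names chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Name:
    # Composite attribute: divisible into smaller parts.
    first: str
    middle: str
    last: str


@dataclass
class Address:
    street: str
    city: str
    postal_code: str


@dataclass
class Customer:
    customer_id: int                                  # not divisible, one value
    name: Name                                        # composite
    region: str                                       # single-valued
    marks: List[int] = field(default_factory=list)    # multi-valued
    # Complex attribute: several values, each of them composite.
    shipping_addresses: List[Address] = field(default_factory=list)


def customers_per_region(customers):
    """Derived value: calculated from stored data instead of being stored."""
    return Counter(c.region for c in customers)


sample = [
    Customer(1, Name("Asha", "K", "Rao"), "South", marks=[82, 91]),
    Customer(2, Name("Ravi", "", "Kumar"), "South"),
    Customer(3, Name("Meera", "S", "Nair"), "North"),
]
per_region = customers_per_region(sample)
```

The derived count is recomputed on demand, so it can never disagree with the stored customers.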

Enhanced ER Modeling:
The enhanced ER model adds the following concepts to the ER model:
   ·         Subclass and superclass
   ·         Specialization (top-down)
   ·         Generalization (bottom-up)
   ·         Category (union type)
   ·         Attribute and relationship inheritance
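Subclass, superclass and attribute inheritance map naturally onto class inheritance. The sketch below assumes a hypothetical Employee superclass with Engineer and Manager subclasses.

```python
class Employee:
    """Superclass: attributes shared by every kind of employee."""

    def __init__(self, emp_id, name):
        self.emp_id = emp_id
        self.name = name

    def describe(self):
        return f"{self.emp_id}: {self.name}"


class Engineer(Employee):
    """Subclass: inherits emp_id and name, adds its own attribute."""

    def __init__(self, emp_id, name, specialty):
        super().__init__(emp_id, name)
        self.specialty = specialty


class Manager(Employee):
    """Another subclass of the same superclass."""

    def __init__(self, emp_id, name, team_size):
        super().__init__(emp_id, name)
        self.team_size = team_size


staff = [Engineer(1, "Asha", "databases"), Manager(2, "Ravi", 5)]
descriptions = [person.describe() for person in staff]
```

Splitting Employee into Engineer and Manager is the top-down direction; noticing that both share emp_id and name and introducing Employee for them is the bottom-up direction.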

Logical Data Modeling
Logical data modeling refines the conceptual data model into a detailed design that can later be implemented in a database. A logical data model is the version of the data model that represents the business requirements of an organization.
Once the conceptual data model is approved by the design architect and the functional team, the development team starts designing the logical data model. A good logical data model is created with future needs as well as current business requirements in mind. It streamlines the data structures and the relationships between entities and attributes.
A logical data model includes all required entities, attributes, key groups and relationships that represent business information and business rules.
Logical data model characteristics
·         The logical model is developed iteratively.
·         The design is independent of the database.
·         All entities and the relationships among them are specified.
·         All attributes for each entity are specified.
·         The primary key for each entity is specified.
·         Foreign keys between entities are specified.
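Because a logical model is independent of any database, it can be written down as plain data and checked for completeness. The sketch below is one possible representation; the dictionary layout, the entity names and the validate helper are assumptions made for illustration.

```python
logical_model = {
    "Customer": {
        "attributes": ["customer_id", "name", "region"],
        "primary_key": "customer_id",
        "foreign_keys": {},
    },
    "Order": {
        "attributes": ["order_id", "customer_id", "order_date"],
        "primary_key": "order_id",
        # attribute name -> entity it refers to
        "foreign_keys": {"customer_id": "Customer"},
    },
}


def validate(model):
    """Return a list of problems; an empty list means the model is consistent."""
    problems = []
    for entity, spec in model.items():
        if spec["primary_key"] not in spec["attributes"]:
            problems.append(f"{entity}: primary key is not an attribute")
        for attribute, target in spec["foreign_keys"].items():
            if attribute not in spec["attributes"]:
                problems.append(f"{entity}: foreign key {attribute} is not an attribute")
            if target not in model:
                problems.append(f"{entity}: foreign key refers to unknown entity {target}")
    return problems


issues = validate(logical_model)
```

A check like this can be rerun on each iteration of the model, before any database-specific decisions are made.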



Dimensional Data Model:
A dimensional data model comprises one or more dimension tables and a fact table.
A modeling technique is simply a way of storing data; the dimensional data model has the advantage of fast data retrieval.
In a dimensional model, everything is divided into two distinct categories: dimensions or measures. Anything we try to model must fit into one of these two categories.
Benefits of the dimensional data model:
  1. Faster data retrieval
  2. Better understandability
  3. Extensibility
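The split between dimension tables and a fact table can be sketched as a small star schema. The example uses Python's sqlite3 module; the table names, columns and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimensions: the context of each sale.
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    );
    CREATE TABLE dim_date (
        date_key      INTEGER PRIMARY KEY,
        calendar_date TEXT,
        month         TEXT
    );
    -- Fact table: measures plus keys pointing at the dimensions.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key    INTEGER REFERENCES dim_date(date_key),
        quantity    INTEGER,
        amount      REAL
    );
    INSERT INTO dim_product VALUES
        (1, 'Pen', 'Stationery'), (2, 'Notebook', 'Stationery'), (3, 'Lamp', 'Home');
    INSERT INTO dim_date VALUES
        (20150901, '2015-09-01', '2015-09'), (20150902, '2015-09-02', '2015-09');
    INSERT INTO fact_sales VALUES
        (1, 20150901, 10, 50.0), (2, 20150901, 2, 30.0),
        (3, 20150902, 1, 45.0), (1, 20150902, 4, 20.0);
    """
)

# Measures are aggregated, dimensions supply the grouping.
sales_by_category = conn.execute(
    """
    SELECT p.category, SUM(f.quantity), SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()
```

Each query follows the same shape: join the fact table to the dimensions of interest, group by dimension attributes and aggregate the measures.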


Steps to design a dimensional data model
      1.       Identify dimensions: Dimensions are the objects or context, that is, the 'things' about which something is being said.
      2.      Identify measures: Measures are the quantifiable subjects; they are often numeric or aggregate values.
      3.      Identify attributes or properties of dimensions: In practice a context has many attributes, but we select only the relevant ones.
      4.      Identify the granularity of measures: Granularity refers to the depth of information stored in the data model, i.e. how detailed the stored data is.
      5.      Store history: Make sure relevant data is stored and its history is maintained; slowly changing dimensions are used to achieve this.
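One common way to keep history is a Type 2 slowly changing dimension, where a change closes the current row and adds a new one. The text does not say which variant is meant, so the sketch below is only one possibility; the apply_scd2 function and the row layout are hypothetical.

```python
from datetime import date


def apply_scd2(history, customer_id, new_region, effective):
    """Record a region change while keeping the earlier rows as history."""
    current = next(
        (
            row
            for row in history
            if row["customer_id"] == customer_id and row["end_date"] is None
        ),
        None,
    )
    if current is not None and current["region"] == new_region:
        return history  # nothing changed, so no new row is needed
    if current is not None:
        current["end_date"] = effective  # close the old version
    history.append(
        {
            "customer_id": customer_id,
            "region": new_region,
            "start_date": effective,
            "end_date": None,  # None marks the current version
        }
    )
    return history


history = []
apply_scd2(history, 1, "North", date(2015, 1, 1))
apply_scd2(history, 1, "South", date(2015, 9, 10))
apply_scd2(history, 1, "South", date(2015, 9, 11))  # no change, no new row
```

After these calls the history holds two rows for customer 1: a closed "North" row and an open "South" row, so facts can be reported against the region that was valid at the time.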
Physical Data Modeling
Physical data modeling is the set of processes for developing the data structures of a data warehouse in the selected database.
Important points while developing a physical data model:
      1.       Convert entities into physical tables.
      2.      Convert relationships into foreign keys.
      3.      Convert attributes into physical columns.
      4.      Convert unique identifiers into constraints defined on the physical table columns.
      5.      Create indexes, views (materialized views) and dimensions in the tablespace.
      6.      Maintain data types based on the logical and conceptual data models.
Apart from these, a few more points need to be considered: how to insert, update and retrieve data in the physical tables, how to store non-structured data in tables, and performance tuning and memory utilization.
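The conversion steps can be sketched with Python's sqlite3 module. SQLite is used only because it ships with Python; it has no materialized views or tablespaces, so an ordinary view stands in for them here, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    -- Entities become tables, attributes become typed columns.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE,   -- unique identifier as a constraint
        region      TEXT NOT NULL
    );
    -- The relationship becomes a foreign key.
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL
    );
    CREATE INDEX idx_order_customer ON customer_order(customer_id);
    CREATE VIEW region_sales AS
        SELECT c.region, SUM(o.amount) AS total_amount
        FROM customer AS c
        JOIN customer_order AS o ON o.customer_id = c.customer_id
        GROUP BY c.region;
    """
)

conn.execute("INSERT INTO customer VALUES (1, 'asha@example.com', 'North')")
conn.execute("INSERT INTO customer_order VALUES (10, 1, 100.0)")
conn.execute("INSERT INTO customer_order VALUES (11, 1, 20.0)")

# An order for a customer that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO customer_order VALUES (12, 99, 5.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

region_totals = conn.execute("SELECT region, total_amount FROM region_sales").fetchall()
```

In a warehouse-grade database the same steps apply, with storage, partitioning and indexing options chosen for that product.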





