Supervised and Unsupervised Learning in Dimensional Modeling
Before looking at supervised and unsupervised learning, we need to understand what dimensional modeling is and why supervised or unsupervised learning is needed in it.
Understanding Dimensional Modeling:
Dimensional modeling is a way of simplifying the entity-relationship model. We are talking about databases here, and dimensional modeling is specifically a technique for data warehouse design. It serves as a simpler logical model that is optimized for decision support. OLAP (Online Analytical Processing) is a class of applications built for decision making rather than for just storing and retrieving data.
FACTS and DIMENSIONS:
OLAP is built around two things: FACTS and DIMENSIONS. They matter for this discussion because FACTS hold the supporting quantitative information, such as numeric values and measurements, while DIMENSIONS hold descriptive groupings such as area, time, and products, or products arranged in a specific order or in hierarchical groups.
A dimensional model has a fact table in the center with a multi-part key, together with a set of dimension tables.
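To make the fact table and dimension tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows (fact_sales, dim_time, dim_product, dim_area, and so on) are made up for illustration and are not taken from any real warehouse. The fact table's multi-part key is assumed to be the combination of the three dimension keys.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory schema: one fact table plus three dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_area    (area_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
        -- Fact table: the multi-part key is built from the dimension keys,
        -- the remaining columns are the quantitative FACTS.
        CREATE TABLE fact_sales (
            time_id    INTEGER REFERENCES dim_time(time_id),
            product_id INTEGER REFERENCES dim_product(product_id),
            area_id    INTEGER REFERENCES dim_area(area_id),
            units_sold INTEGER,
            revenue    REAL,
            PRIMARY KEY (time_id, product_id, area_id)
        );
    """)
    # Hypothetical sample rows, for illustration only.
    conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                     [(1, 2014, 1), (2, 2014, 2)])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Pen", "Stationery"), (2, "Lamp", "Home")])
    conn.executemany("INSERT INTO dim_area VALUES (?, ?, ?)",
                     [(1, "City A", "North"), (2, "City B", "South")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)", [
        (1, 1, 1, 10, 50.0),
        (1, 2, 1, 2, 80.0),
        (2, 1, 2, 20, 100.0),
        (2, 2, 2, 1, 40.0),
    ])
    return conn


def revenue_by_region(conn):
    """Aggregate a fact (revenue) along a dimension attribute (region)."""
    return conn.execute("""
        SELECT a.region, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_area AS a ON a.area_id = f.area_id
        GROUP BY a.region
        ORDER BY a.region
    """).fetchall()
```

Calling revenue_by_region(build_star_schema()) sums the revenue fact per region, which is the kind of decision-support question this layout is meant to answer.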
Relating Dimensional Modeling with OLAP & Discussing Supervised & Unsupervised Learning Usage:
To keep it short and simple, Online Analytical Processing involves FACTS and DIMENSIONS, which are respectively quantitative and descriptive supporting information. Let us have a look at the supervised and unsupervised learning methods.
In the supervised learning method, we have supporting information about our data set, which helps the data modeling process.
In the unsupervised learning method, we do not have such supporting information to give as input to the data modeling technique, so nothing is available to facilitate the modeling process.
How is supervised learning used in DM?
It is now clear that in OLAP we are using dimensional modeling with the supervised learning method. Here we know the number of attributes, say “m”, and as further records are added the database grows to “n” records. We call this the data matrix, where n > m, that is, there are more records than columns/attributes.
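As a rough illustration of the data matrix, the sketch below turns a list of records into an n x m matrix when the m attributes are known in advance. The attribute names and the helper functions to_data_matrix and shape are hypothetical and chosen only for this example.

```python
# The m attributes are known up front (hypothetical names for illustration).
ATTRIBUTES = ["units_sold", "revenue", "discount"]


def to_data_matrix(records, attributes):
    """Return an n x m list of rows: one row per record, one column per known attribute."""
    return [[float(record[name]) for name in attributes] for record in records]


def shape(matrix):
    """Return (n, m) for a list-of-rows matrix."""
    n = len(matrix)
    m = len(matrix[0]) if matrix else 0
    return n, m
```

With five records and the three attributes above, shape(to_data_matrix(records, ATTRIBUTES)) gives (5, 3), so n > m as described. The number of columns stays fixed while the number of rows grows with the database.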
How is unsupervised learning used in DM?
If the supporting information is not available and the learning method is purely unsupervised, the quality of the results will suffer, although the method can still perform one-way clustering and two-way clustering. When we do not know about the attributes, every record has to be compared with every other record, giving an “n” by “n” matrix that we call the similarity/dissimilarity matrix. Building it takes O(n x n x m) time.
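The cost of the dissimilarity matrix can be seen in a short sketch. Euclidean distance is assumed here as the dissimilarity measure, which the discussion above does not specify, and the function name dissimilarity_matrix is hypothetical. Each of the roughly n x n pairs needs a pass over all m attributes, which is where the n x n x m cost comes from.

```python
import math


def dissimilarity_matrix(data):
    """Return the n x n matrix of pairwise Euclidean distances between rows.

    `data` is an n x m list of rows. Each pair of rows costs O(m) to compare,
    so the whole matrix costs O(n * n * m).
    """
    n = len(data)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # One pass over the m attributes for this pair of records.
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(data[i], data[j])))
            # The matrix is symmetric, so fill both halves at once.
            dist[i][j] = d
            dist[j][i] = d
    return dist
```

Because the matrix is symmetric, only half of the pairs are actually computed, but the growth is still quadratic in the number of records. This is why the approach becomes slow as the database grows.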
Last updated: March 19, 2014