Knowledge Discovery in Databases, Tools and Techniques
Presented by
Wing-Yeung (Paul) Chong
Agenda
Concept of knowledge discovery in
databases (KDD)
KDD process
Major KDD techniques
Benefits from KDD
Applications of KDD
Conclusion
Concept of Knowledge Discovery in Databases
Basic concept:
conversion of raw data into previously unknown knowledge for business advantage
knowledge discovery is carried out with data mining technology
Concept of Knowledge Discovery in Databases (cont’d)
What is data mining then?
An artificial intelligence tool that
provides the capability to discover new and meaningful information from existing data
exceeds the human capacity to analyze large data sets
KDD Process
What is the KDD process?
It transforms raw data into patterns or trends
It includes several major processes, hundreds of minor processes, and many techniques
KDD Process (cont’d)
[Figure: KDD process flow diagram. Labels: Data, External data, Data selection, Selected data, Input data for algorithms, Data analysis, Patterns or trends, Interpretation and evaluation, New knowledge]
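The stages named in the process diagram can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names (`select`, `analyze`, `interpret`), the record fields, and the sample data are hypothetical and do not come from the presentation.

```python
from collections import Counter


def select(records, external, year):
    """Data selection: keep one year of records and merge in external data."""
    selected = [r for r in records if r["year"] == year]
    # Attach the (hypothetical) external region lookup to each selected record.
    return [{**r, "region": external.get(r["store"], "unknown")} for r in selected]


def analyze(rows):
    """Data analysis: count sales per (region, product) pair."""
    return Counter((r["region"], r["product"]) for r in rows)


def interpret(patterns, min_count):
    """Interpretation and evaluation: keep patterns seen at least min_count times."""
    return {pair: n for pair, n in patterns.items() if n >= min_count}


# Hypothetical raw data and external data.
records = [
    {"store": "A", "product": "tea", "year": 1995},
    {"store": "A", "product": "tea", "year": 1995},
    {"store": "B", "product": "tea", "year": 1995},
    {"store": "B", "product": "soap", "year": 1994},
]
external = {"A": "north", "B": "south"}

rows = select(records, external, 1995)
patterns = analyze(rows)
knowledge = interpret(patterns, min_count=2)
print(knowledge)
```

The `min_count` threshold stands in for the evaluation step: a person still has to decide which patterns are frequent enough to count as knowledge.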
Major KDD techniques
Statistical Approach
uses rule discovery and is based on data relationships
e.g., linear regression
Classification Approach
groups data according to similarities or classes
e.g., data cleaning, decision tree approach
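The two examples named for the statistical and classification approaches, linear regression and decision trees, can be sketched in a few lines. This is a minimal sketch, not a production method: `fit_line` is ordinary least squares for a single predictor, `fit_stump` is only a one-level decision tree (a full tree would apply such splits recursively), and all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_stump(values, labels):
    """One-level decision tree: choose the threshold with the fewest errors.

    The stump predicts True when value > threshold.
    """
    best_threshold, best_errors = None, len(values) + 1
    for t in sorted(set(values)):
        errors = sum((v > t) != label for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# Statistical approach: hypothetical advertising spend versus sales.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(ad_spend, sales)

# Classification approach: hypothetical monthly charges and a "profitable" label.
monthly_charges = [20, 35, 50, 80, 120, 150]
is_profitable = [False, False, False, True, True, True]
threshold = fit_stump(monthly_charges, is_profitable)

print(slope, intercept, threshold)
```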
Major KDD techniques (cont’d)
Deviation and Trend Analysis
involves pattern detection by filtering important trends
e.g., analysis of traffic on large telecommunications networks
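Deviation and trend analysis can be illustrated with a simple sketch: estimate the trend with a trailing moving average, then flag points whose residual from that trend is unusually large. The function name `find_deviations`, the `window` and `k` parameters, and the traffic figures are assumptions made for illustration; the presentation does not specify an algorithm.

```python
from statistics import mean, pstdev


def find_deviations(series, window=3, k=1.5):
    """Return indexes whose value deviates strongly from the recent trend.

    The trend at index i is the mean of the previous `window` values.
    A point is flagged when its residual lies more than k population
    standard deviations from the mean residual.
    """
    residuals = [
        series[i] - mean(series[i - window:i])
        for i in range(window, len(series))
    ]
    if not residuals:
        return []
    center, spread = mean(residuals), pstdev(residuals)
    return [
        i + window
        for i, r in enumerate(residuals)
        if abs(r - center) > k * spread
    ]


# Hypothetical call volume per minute with one spike at index 5.
traffic = [100, 102, 101, 103, 102, 180, 104, 103]
flagged = find_deviations(traffic)
print(flagged)
```

A trailing average is distorted for a few steps after a spike, so a more robust trend estimate (for example, a median) may be preferable on real traffic data.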
KDD benefits
Find and retain a company's most profitable customers
Segment the market for a targeted approach
Predict the future and identify factors to secure a desired effect
Summarize historical data
Applications of KDD
WalMart
allows more than 3,500 suppliers to access data on their products and perform data analysis
suppliers use this data to identify customer buying patterns at the store display level
In 1995, WalMart computers processed over 1 million complex data queries
Applications of KDD (Cont’d)
American Express
has half a terabyte of information about its
customers' charges on its 35 million cards,
finds patterns that predict what product categories
individual purchasers might be interested in.
AT&T
uses deviation and trend analysis as a KDD technique to analyze heavy traffic on the AT&T network,
checking for faulty behavior in the network
Conclusion
With KDD, previously unknown knowledge can be discovered quickly.
However, patterns generated by data mining are useless without human interpretation.
An intelligent human is still required to:
construct query questions;
understand the identified patterns;
digest and absorb all the patterns to create more useful new knowledge.