In a typical organization, about 80% of data is unstructured. This data can come from social networks, mobile technology, emails, blogs, and similar sources. For an organization, it becomes very important to accelerate learning and discovery, and to be ready to retrieve, process, and store this data.
Challenge
This sort of information is hard to track and analyze. We need a systematic way to:
Discover - Automatically identify and tag key attributes and entities within content.
Refine - Drill down based on key attributes, entities, and extracted dimensions.
Visualize - Highlight trends.
Deliver - Support broader delivery of information to other processes and applications using standards.
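The Discover and Refine steps above can be illustrated with a minimal Python sketch. It assumes that simple regular expressions are enough to tag two entity types (email addresses and hashtags). The patterns, function names, and sample documents are hypothetical and only meant to show the idea; real tools use far richer extractors.

```python
import re
from collections import Counter

# Hypothetical patterns for two entity types (an assumption for illustration).
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "hashtag": re.compile(r"#\w+"),
}

def discover(text):
    """Discover: map each entity type to the sorted unique matches in the text."""
    return {name: sorted(set(p.findall(text))) for name, p in PATTERNS.items()}

def refine(tagged_docs, entity_type, value):
    """Refine: drill down to the documents tagged with a given entity value."""
    return [doc for doc, tags in tagged_docs if value in tags[entity_type]]

def trends(tagged_docs, entity_type):
    """Count how often each entity value appears across all documents."""
    counts = Counter()
    for _, tags in tagged_docs:
        counts.update(tags[entity_type])
    return counts

# Made-up sample content standing in for emails, posts, and so on.
docs = [
    "Contact alice@example.com about the #outage report.",
    "Another #outage today, see #status page.",
    "Invoice sent to bob@example.org.",
]
tagged = [(d, discover(d)) for d in docs]
outage_docs = refine(tagged, "hashtag", "#outage")
hashtag_counts = trends(tagged, "hashtag")
```

The counts returned by `trends` are the kind of input a Visualize step could chart, and the tagged documents could be passed to other applications in the Deliver step.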
Data Mining of Unstructured Data
Data mining of unstructured data is a three-step process.
Explore - Use a combination of techniques to locate the relevant set of information from larger sets.
Understand - Discover what the information contains.
Analyze - Take a combination of structured and unstructured information, look for the trends, patterns, and relationships inherent in the data, and use them to make better business decisions.
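The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the `region` field, the stopword list, and the function names are all hypothetical, and keyword matching stands in for the much richer techniques real tools apply.

```python
from collections import Counter, defaultdict

# Hypothetical records: a structured field (region) plus unstructured text (comment).
records = [
    {"region": "north", "comment": "Delivery was late and the box was damaged"},
    {"region": "north", "comment": "Late delivery again"},
    {"region": "south", "comment": "Great service, fast delivery"},
    {"region": "south", "comment": "Billing question about my invoice"},
]

STOPWORDS = {"the", "and", "was", "a", "my", "about", "again"}

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation and stopwords."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]

def explore(records, keyword):
    """Explore: locate the relevant subset of a larger collection."""
    return [r for r in records if keyword in tokenize(r["comment"])]

def understand(records):
    """Understand: summarize what the text contains via term frequencies."""
    return Counter(w for r in records for w in tokenize(r["comment"]))

def analyze(records, term):
    """Analyze: combine text with a structured field to see where a term concentrates."""
    by_region = defaultdict(int)
    for r in records:
        if term in tokenize(r["comment"]):
            by_region[r["region"]] += 1
    return dict(by_region)

relevant = explore(records, "delivery")
top_terms = understand(relevant).most_common(3)
late_by_region = analyze(relevant, "late")
```

In this made-up data, the Analyze step shows that complaints mentioning "late" come only from the north region, which is the kind of pattern that could inform a business decision.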
Some of the data mining tools that can be used:
IBM Case Manager - A new software offering designed to automate content-centric processes and manage unstructured content such as scanned images, electronic documents, web pages, video, email, and text messages. It integrates content and process management with advanced analytics, business rules, collaboration, and social software.
Microsoft Data Mining - Applies to SQL Server 2005 and SQL Server 2008. The Microsoft data mining tools leverage the strengths of Microsoft SQL Server data management software and the Microsoft Office system.
SAS Enterprise Miner - An integrated suite that provides a user-friendly GUI front end to the SEMMA (Sample, Explore, Modify, Model, Assess) process.
Oracle Data Mining (ODM) - Provides a GUI, a PL/SQL interface, and a Java interface to Attribute Importance, Bayes Classification, Association Rules, Clustering, SVM, and more.