Bayesian Web Document Classification through Mining and Refining of Association Word

  • Jung Hyun Lee

Abstract

Existing Bayesian document classification does not accurately reflect semantic relations when expressing the characteristics of a document. To resolve this problem, this paper proposes a Bayesian document classification method based on mining and refining association words. The Apriori algorithm extracts the characteristics of a test document in the form of association words, which reflect semantic relations, and mines association words from the training documents. If association words are mined from the training documents with the Apriori algorithm alone, inappropriate association words are included among them, which reduces the accuracy of document classification. To compensate for this weakness, we refine the association words with a genetic algorithm. A Naive Bayes classifier then classifies test documents based on the refined association words. To evaluate its performance, the proposed method is compared with three alternatives: Bayesian document classification through mining of association words (without refinement), Bayesian document classification using TF-IDF, and simple Bayesian classification.
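
The mining and classification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the toy documents, the support threshold, the restriction to word pairs, and the Laplace smoothing are all assumptions made for illustration, and the genetic-algorithm refinement step is omitted entirely. The mining function applies only the first Apriori pruning step (dropping words that are infrequent on their own) before counting co-occurring word sets.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def mine_association_words(docs, min_support=0.5, size=2):
    """Return word sets of `size` that co-occur in at least `min_support` of docs.

    Apriori-style sketch: words that are infrequent on their own are pruned
    first, because they cannot belong to any frequent word set.
    """
    n = len(docs)
    word_counts = Counter(w for d in docs for w in set(d))
    frequent = {w for w, c in word_counts.items() if c / n >= min_support}
    set_counts = Counter()
    for d in docs:
        for combo in combinations(sorted(set(d) & frequent), size):
            set_counts[combo] += 1
    return {combo for combo, c in set_counts.items() if c / n >= min_support}


def features(doc, assoc_words):
    """Association words whose member words all occur in the document."""
    words = set(doc)
    return [aw for aw in assoc_words if set(aw) <= words]


def train(labeled_docs, assoc_words):
    """Count documents per class and association-word occurrences per class."""
    class_docs = Counter()
    feat_counts = defaultdict(Counter)
    for doc, label in labeled_docs:
        class_docs[label] += 1
        for aw in features(doc, assoc_words):
            feat_counts[label][aw] += 1
    return class_docs, feat_counts


def classify(doc, assoc_words, model):
    """Naive Bayes decision with Laplace smoothing (an assumption of this sketch)."""
    class_docs, feat_counts = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        denom = sum(feat_counts[label].values()) + len(assoc_words)
        for aw in features(doc, assoc_words):
            score += math.log((feat_counts[label][aw] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the paper.
training = [
    (["game", "team", "score", "win"], "sports"),
    (["team", "score", "coach"], "sports"),
    (["stock", "market", "price"], "finance"),
    (["market", "price", "bank"], "finance"),
]

# Mine association words separately for each class, then pool them.
assoc_words = set()
for label in {lbl for _, lbl in training}:
    class_only = [doc for doc, lbl in training if lbl == label]
    assoc_words |= mine_association_words(class_only, min_support=1.0)

model = train(training, assoc_words)
print(classify(["team", "score", "fans"], assoc_words, model))
```

In this sketch a refinement step would sit between mining and training, filtering `assoc_words` before they are used as features.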
