Semi-supervised classification of texts using particle learning for probabilistic automata
This chapter presents a novel online learning system for classification of email texts based on particle learning for composite mixture models involving probabilistic automata. The composite mixture structure allows specification of a joint probability model for heterogeneous collections of independent variables without requiring complex embeddings via generalized linear models or copula techniques. The chapter is organized as follows. Section 12.2 presents a hierarchical model representation of a class of probabilistic automata and derives the particle learning algorithm for obtaining parameter estimates. Section 12.3 describes a framework for semisupervised text classification problems based on a composite mixture model formulation. Section 12.4 discusses an application of the classifier to a spam detection dataset. Section 12.5 concludes with a discussion of computational considerations and future work.
Keywords: online learning framework, active learning, semisupervised classification, particle learning, Bayesian models, Monte Carlo approach