\documentclass[11pt,american]{article}
\usepackage[american]{babel}
%\selectlanguage{english}
\usepackage{t1enc}
\begin{document}
\begin{center}
{\Large\bf Construction of decision trees using the MCMC algorithm}
\end{center}
\
The construction of decision trees is a commonly used and easily applied
way of supervised learning. The aim is the prediction of a binary
target variable on the basis of many predictor variables. The algorithm
divides the field of predictors along a predictor variable one after
another. The goal is that the target variable should be more and more
homogeneous along the resulting partition. I modified the CART algorithm
developed by Breiman and others \cite{Br84}-- which aims for the minimizing
of a concave risk function defined on the partitions generated by the
trees. I applied the Markov Chain Monte Carlo method so doing stochastic
searches on the set of decision trees. (It was first proposed in a Bayesian
framework by Chipman and others \cite{Ch98}.) By empirical experience
finding the optimal tree this technique is much more effective than the former
deterministic methods.
My main result is the examination of the given algorithm and its
variations, using the theory of general state--space Markov chains,
\cite{Ti96}.
Three different cases can be distinguished: the pure categorical, the
pure continuous and the case of mixed predictor variables. I construct a
theoretical tree, where the joint distribution is supposed to be known.
Using the so called drift-criterion, \cite{MT93}, I prove that the
constructed Markov
chain is geometrically ergodic with the desired stationary distribution
constructed of the risk function. Moreover, I examine the case
of the cost-complexity risk function, witch prevents the trees being too large.
\
\begin{thebibliography}{99}
\bibitem{Br84} {\sc Breiman, Friedman, Olsen \& Stone,}
{\em Classification and Regression Trees.} Wadsworth International
Group, Belmont, California. 1984.
\bibitem{Ch98} {\sc Chipman, George \& McCulloch,}
{\em Bayesian CART Model Search.} JASA, 1998, 93,
No. 443, 935--960.
\bibitem{MT93} {\sc Meyn \& Tweedie,}
{\em Markov Chains and Stochastic Stability.} Springer. 1993.
\bibitem{Ti96} {\sc Tierney,}
{\em Introduction to general state-space Markov Chain Theory.}
In: Gilks--Richardson--Spiegelhalter: Markov Chain Monte Carlo in Practice.
Chapman\&Hall/CRC. 1996.
\end{thebibliography}
\end{document}