In this dissertation, we analyze distributional asymptotics of the randomized search algorithm \quickselect. In the first half of this dissertation, we define a sequence of tree-indexed processes closely related to the operation of the \quickselect\ search algorithm (also known as {\tt Find}) for all the various values of~$n$ (the number of input keys) and~$m$ (the rank of the desired order statistic among the keys). As a ``master theorem'' we establish convergence of these processes in a certain Banach space, from which known distributional convergence results as $n \to \infty$ about the number of key or symbol comparisons are recovered both (a) when $m / n \to \alpha \in [0, 1]$, and (b) in the worst case over the choice of~$m$. In the second half, we consider an algorithm related to \quickselect, introduced in \cite{vcff2009}, called \quickval. Motivated by previous limiting results of Fill and Nakama \cite{fn2013} and by the $L^2$ asymptotics we prove in \refCh{c:exact_L2}, we define what we call the \emph{\quickval\ residual} and prove its convergence in distribution to a scale-mixture of centered Gaussians in \refCh{c:quickval_residual}. We also highlight an application of this result to \quickmin\ and \quickmax.
\quickselect\ (also known as {\tt Find}), introduced by Hoare \cite{h1961}, is a randomized algorithm for selecting a specified order statistic from an input sequence of objects, or rather their identifying labels usually known as \emph{keys}. The keys can be numeric or symbol strings, or indeed any labels drawn from a given linearly ordered set. Suppose we are given keys $y_1,\ldots, y_n$ and we want to find the $m^{\rm th}$ smallest among them.
The algorithm first selects a key (called the pivot) uniformly at random. It then compares every other key to the pivot, thereby determining the rank, call it $r$, of the pivot among the~$n$ keys. If $r = m$, then the algorithm terminates, returning the pivot key as output. If $r > m$, then the algorithm is applied recursively to the keys smaller than the pivot to find the $m$th smallest among those; while if $r < m$, then the algorithm is applied recursively to the keys larger than the pivot to find the $(m - r)$th smallest among those.
More formal descriptions of \quickselect\ can be found in \cite{h1961} and \cite{k1972}, for example.
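For concreteness, the recursion just described can be sketched in Python. This is only an illustrative sketch (assuming distinct keys and using list filtering rather than in-place partitioning), not the implementation analyzed in this dissertation.

```python
import random

def quickselect(keys, m):
    """Return the m-th smallest of the (distinct) keys; illustrative sketch."""
    pivot = random.choice(keys)               # pivot chosen uniformly at random
    smaller = [y for y in keys if y < pivot]
    larger = [y for y in keys if y > pivot]
    r = len(smaller) + 1                      # rank of the pivot among the n keys
    if r == m:
        return pivot                          # pivot is the desired order statistic
    elif r > m:
        return quickselect(smaller, m)        # recurse on keys smaller than the pivot
    else:
        return quickselect(larger, m - r)     # recurse on keys larger than the pivot
```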
When the desired rank equals~1, so that \quickselect\ is searching for the minimum of the keys, we will use the name \quickmin; similarly, \quickmax\ denotes the search for the maximum (rank~$n$).
When considering asymptotics of the cost of \quickselect\ as the number of keys tends to infinity, it becomes necessary to let the order statistic $m_n$ depend on the number of keys $n$. When $m_n/n \rightarrow \alpha$, we refer to \quickselect\ finding the $m_n^{\rm th}$ order statistic in $n$ keys as \quickquant($n,\alpha$).
The \quickval\ algorithm \cite{vcff2009} is useful in the analysis of asymptotic costs of \quickquant. For $n$ keys, \quickval($n,\alpha$) searches for the $\alpha$-quantile of the \emph{population} from which the $n$ keys are drawn. On the other hand, \quickquant($n,\alpha$)
finds the (approximate) $\alpha$-quantile from the \emph{sample} of $n$~keys.
A detailed description of \quickval\ is postponed until we discuss the probabilistic sources of the keys (see \refS{s:quickval_prelim}).
Observe that, for fixed~$n$ and a given sequence $(y_1, \ldots, y_n)$ of keys, it is possible to build the randomness needed to run \quickselect\ for \emph{every} value of $m \in \{1, \dots, n\}$ on a single probability space, as follows. Let $\pi$ denote a uniformly random permutation of $\{1, \dots, n\}$, and consider the sequence $(z_1 , \dots, z_n)$ with $z_i := y_{\pi_i}$ for $i = 1, \dots, n$. Regardless of the value of~$m$, choose $z_1$ as the initial pivot; and when the algorithm is applied recursively, apply it to the appropriate sequence of $z_i$-values listed in the \emph{same} relative order as within $(z_1, \dots, z_n)$.
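The coupling just described can be sketched as follows. A single random permutation of the keys supplies all the randomness; each run then deterministically takes the first element as pivot and preserves relative order in recursive calls. (An illustrative sketch only, assuming distinct keys.)

```python
import random

def quickselect_coupled(z, m):
    """QuickSelect on the pre-shuffled sequence z, always taking z[0] as pivot
    and keeping keys in the same relative order in recursive calls."""
    pivot = z[0]
    smaller = [y for y in z[1:] if y < pivot]   # relative order preserved
    larger = [y for y in z[1:] if y > pivot]
    r = len(smaller) + 1                        # rank of the pivot
    if r == m:
        return pivot
    elif r > m:
        return quickselect_coupled(smaller, m)
    else:
        return quickselect_coupled(larger, m - r)

# One uniformly random permutation serves every value of m simultaneously:
y = [10, 4, 7, 1, 8]
z = random.sample(y, len(y))                    # z_i = y_{pi(i)}
results = [quickselect_coupled(z, m) for m in range(1, len(z) + 1)]
```

Since every run shares the permutation $z$, the outputs for $m = 1, \dots, n$ are defined on a single probability space, as described above.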
The cost of running \quickselect\ can be measured by assessing the cost of comparing keys. We assume that every comparison of two (distinct) keys costs some amount that is perhaps dependent on the values of the keys, and then the cost of the algorithm is the sum of the comparison costs.
Until recently, it has been customary to assign unit cost to each comparison of two keys, irrespective of their values. We denote the (random) key-comparisons-count cost for {\tt QuickSelect} by $K_{n, m}$. As we have explained, for fixed~$n$ one can use a single uniformly random permutation of $\{1, \dots, n\}$ to build a single probability space on which all of the random variables $K_{n, m}$ with $1 \leq m \leq n$ are defined. [Note also that the joint distribution of $K_{n, 1}, \dots, K_{n, n}$ does not depend on the initial sequence $(y_1, \dots, y_n)$ of distinct keys.] Among other things, this opens up the possibility of studying the distribution of $\max_m K_{n, m}$, the cost of so-called ``worst-case {\tt Find}'', in which an adversary is allowed to choose the rank of the key sought by the \quickselect\ algorithm. Our motivation for the first half of this dissertation was to investigate the large-$n$ behavior of worst-case {\tt Find} for more general cost functions.
There have been many studies of the random variables $K_{n, m}$, including \cite{d1984}, \cite{mms1995}, \cite{gr1996}, \cite{MR1454110}, \cite{g1998}, \cite{d2001}, \cite{ht2002}, \cite{df2010}, and \cite{fh2010}; and several corresponding studies, including \cite{p1995}, \cite{lm1996}, and \cite{ms1998}, of the number(s) of key comparisons for an extension of \quickselect\ called {\tt MultipleQuickSelect} that searches simultaneously for multiple order statistics. Gr{\"{u}}bel and R{\"{o}}sler \cite{gr1996} analyzed a modified version of \quickselect\ that splits the collection of keys into two sets, those smaller than the pivot and those greater than or equal to the pivot, rather than into three sets (one of which has the pivot as its only element) as considered in this dissertation. They studied (see especially their Theorem~4) the limiting behavior of this modified \quickselect\ through the convergence (in distribution, in the Skorohod topology on the space $D[0, 1]$ of \emph{c\`{a}dl\`{a}g} functions on the unit interval $[0, 1]$) of a sequence $X_1, X_2, \dots$ of stochastic processes defined by $X_n(\alpha) := n^{-1} K_{n,\lfloor n\alpha\rfloor + 1}$ for $\alpha\in[0,1)$ and $X_n(1) := n^{-1} K_{n, n}$.
R\"{u}schendorf~\cite[Examples 4.1--4.2]{MR1454110} utilized the contraction method to prove that
the scale-normalized key-comparisons-count cost $n^{-1} \max_m K_{n, m}$ of worst-case {\tt Find} (the version considered in this dissertation) converges in distribution. Devroye~\cite{d2001} presented an alternative proof of the latter result.
But unit cost is not always a reasonable model for comparing two keys. For example, if each key is a string of symbols, then a more realistic model for the cost of comparing two keys is the value of the first index at which the two symbol strings differ. To date, only a few papers (\cite{vcff2009}, \cite{fn2009}, and \cite{fn2013}) have considered \quickselect\ from this more realistic symbol-comparisons perspective. As in~\cite{fn2013}, in this dissertation we will treat a rather general class of cost functions that includes both key-comparisons cost and symbol-comparisons cost.
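As a small illustration of the symbol-comparisons model just described, the cost of comparing two distinct symbol strings is the first (1-based) index at which they differ, that is, one more than the length of their longest common prefix. The helper name below is hypothetical, introduced only for this sketch.

```python
def symbol_comparison_cost(u, v):
    """Cost of comparing two distinct symbol strings: the first (1-based)
    index at which they differ.  Under the unit-cost (key-comparisons)
    model this cost would instead be identically 1."""
    i = 0
    while i < min(len(u), len(v)) and u[i] == v[i]:
        i += 1                       # common prefix extends one more symbol
    return i + 1                     # first index of disagreement
```

For example, comparing the strings \texttt{0110} and \texttt{0111} costs 4, since the strings agree in their first three symbols.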
In our set-up (to be described in detail in \refCh{S:set-up}) for this dissertation, we will consider a variety of probabilistic models (called \emph{probabilistic sources}) for how a key is generated as an infinite-length string of symbols, but we will always assume that the keys form an infinite sequence of independent and identically distributed and almost surely distinct symbol strings. This gives us, on a single probability space, all the randomness needed to run {\tt QuickSelect} for \emph{every} value of~$n$ and \emph{every} value of $m \in \{1, \dots, n\}$ by always choosing the \emph{first} key in the sequence as the pivot (and maintaining initial relative order of keys when the algorithm is applied recursively); this is what is meant by the \emph{natural coupling} (cf.~\cite[Section~1]{Fil2013}) of the runs of the algorithm for varying~$n$ and~$m$. As explained in~\cite[Section~1]{Fil2013}, the coupling allows us to consider stronger forms of convergence than convergence in distribution, such as almost sure convergence and convergence in $L^p$.
Whatever cost function is used for comparisons of two keys, denote the corresponding total cost of \quickselect\ (under the natural coupling) in selecting the $m$th order statistic from the first $n$ keys by ${\tt FIND}(n, m)$. Let $m_n \in \{1, \dots, n\}$ for every $n$, and suppose that $m_n / n \to \alpha \in [0,1]$.
Fill and Nakama~\cite{fn2013} prove, under certain ``tameness'' conditions (to be reviewed later) on the probabilistic source and the cost function, that
$n^{-1} {\tt FIND}(n,m_n)$ converges both in $L^p$ and almost surely to a limiting random variable. We complement their result by proving analogous results for the cost of worst-case {\tt Find}, namely, $\max_{1\leq m \leq n} {\tt FIND}(n,m)$. Our new results and (under somewhat stronger hypotheses than assumed
in~\cite{fn2013}) the results of Fill and Nakama~\cite{fn2013} are both obtained rather effortlessly from a ``master theorem'', \refT{T:proc_conv}, which establishes convergence in a certain Banach space of a certain sequence of tree-indexed processes closely related to the operation of \quickselect\ for all the various values
of~$n$ and~$m$.
The ubiquitous \quicksort, also introduced by Hoare \cite{h1962}, is a divide-and-conquer algorithm designed, in the same style as \quickselect, to sort a collection of values efficiently.
We will describe the algorithm as acting on an array of data; note, however, that the algorithm is agnostic to the data representation. Given an array of $n$ values (termed \emph{keys}, as in \quickselect), if $n = 0$ the algorithm returns the empty list (trivially sorted)
and otherwise a pivot is chosen uniformly at random. The pivot is compared to the remaining $n-1$ keys, and the array is divided into the subarray of keys smaller than the pivot (say $X_{<}$) and the subarray of keys larger than the pivot (say $X_{>}$). The keys are rearranged so that the entire subarray $X_{<}$ lies to the left of the pivot and $X_{>}$ lies to the right. The algorithm is then called recursively on both subarrays $X_{<}$ and $X_{>}$. A more complete description can be found in any algorithms textbook (e.g.,\ \cite{clrs2009}).
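The recursion just described might be sketched as follows; this is an illustrative sketch assuming distinct keys (as holds almost surely in our setting), returning a new list rather than rearranging the array in place.

```python
import random

def quicksort(keys):
    """Illustrative sketch of randomized QuickSort on distinct keys."""
    if not keys:
        return []                              # n = 0: trivially sorted
    pivot = random.choice(keys)                # pivot chosen uniformly at random
    smaller = [y for y in keys if y < pivot]   # the subarray X_<
    larger = [y for y in keys if y > pivot]    # the subarray X_>
    return quicksort(smaller) + [pivot] + quicksort(larger)
```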
R\'{e}gnier \cite{r1989} proved almost sure convergence of the key-comparison cost of \quicksort\ to a limit random variable.
Using the contraction method, Neininger \cite{n2013} proved that the suitably normalized difference between the key-comparison cost of \quicksort\ and its almost sure limit converges in law to a standard normal distribution. Fuchs \cite{f2015} provided an alternate proof of Neininger's result using the method of moments. Gr\"{u}bel and Kabluchko \cite{gk2014} extended the result of Neininger and Fuchs to what they call almost sure weak convergence. They proved a functional central limit theorem for the Biggins martingale \cite{b1992} on the branching random walk and applied this result to \quicksort\ by appealing to the binary search tree characterization of \quicksort\ (we state a similar characterization for \quickselect\ in \refS{S:qsel_processes}).
We follow in the footsteps of Neininger and Fuchs and prove a similar limit law for \quickval. Our technique applies the classical central limit theorem to an approximation of the cost of \quickval\ in which we condition on a fixed number $K$ of initial pivots. We then prove a series of reductions, using results such as Slutsky's theorem (e.g.,\ \cite[Theorem~A.14.9]{bd2001}), until we reach the desired convergence.
An outline for this dissertation is as follows. First, in \refCh{S:set-up} we carefully describe our set-up and, in some detail, discuss probabilistic sources, cost functions, and tameness; we also discuss the idea of \emph{seeds}, which allow us a unified treatment of all sources. In \refS{S:prelim} we state and prove a number of useful lemmas.
In \refCh{S:proc_conv} (specifically, our master \refT{T:proc_conv}) we prove that the ``\quickselect\ tree processes'' to which we alluded earlier converge in a certain Banach space described in \refD{norm} and \refP{p:banach}. Some
consequences of \refT{T:proc_conv} are provided in \refCh{S:conseq}; the highlight is \refC{C:worst}, which gives sufficient conditions for $L^p$-convergence of the cost of worst-case {\tt Find}. In \refS{S:old}, we use \refT{T:proc_conv} to provide (under an additional restriction) very simple proofs of Theorems~3.1 and~4.1 in~\cite{fn2013}; the latter concerns $L^p$-convergence of the cost of \quickselect\ for fixed~$\alpha$. In \refS{S:AS_conv} we complement the $L^p$-convergence result of \refC{C:worst} for the cost of worst-case {\tt Find} by providing a tameness condition under which the scale-normalized cost of worst-case {\tt Find} converges almost surely.
We motivate our scaling of the \quickval\ residual by $\sqrt{n}$ in \refCh{c:exact_L2} by first calculating the exact $L^2$ distance between the normalized key-comparison cost of \quickmin\ and its almost sure limit random variable.
In \refCh{c:quickval_residual}, we consider the \quickval\ residual random variable $R_n^{\rm (V)}$. Our work there culminates in \refT{t:quickval_residual_limit} in \refS{s:quickval_clt}, which gives conditions for convergence in law of the \quickval\ residual to a scale-mixture of centered Gaussians.
We approach the proof by proposing approximations to $\SV$ and $S$ that consider only a fixed number of pivots (as opposed to letting the number of pivots considered tend to infinity as $n\rightarrow\infty$). We prove convergence of the \quickval\ residual in \refS{s:quickval_clt} through an application of the classical central limit theorem and a series of reductions. Observing that the operation of \quickmin\ is equivalent to \quickval\ when $\alpha=0$, we immediately conclude \refC{c:quickmin}.
In \refS{s:quickval_moments}, we prove, using arguments similar to those in the proof of \refT{t:quickval_residual_limit}, that the moments of $R_n^{\rm (V)}$ converge to the moments of the limiting scale-mixture of centered Gaussians.
In \refCh{c:quickmin_moments} we consider the moments of the \quickmin\ residual and prove that they converge to the moments of a distribution defined by a distributional fixed-point equation whose unique solution is a scale-mixture of centered Gaussians.
A recurrence for moments of the candidate limit distribution is derived in \refS{s:limit_recurrence}.
In \refS{S:quickmin_limit_finite_series}, we prove that a scale-mixture of centered Gaussians is the unique solution to the distributional fixed-point equation.
In \refS{S:quickmin_finite}, we derive a recurrence
for the moments of the \quickmin\ residual under key-comparison cost, and prove that each recurrence converges (in a suitable sense)
to the recurrence derived in \refS{s:limit_recurrence}.
\noindent
\begin{Remark}
\emph{
As we recall from~\cite{fn2013} at the end of our \refS{S:sources},
many common sources, including memoryless and Markov sources, have the property that the source-specific cost function~$\beta$ corresponding to the symbol-comparisons cost for comparing keys is $\epsilon$-tame for every $\epsilon > 0$. Thus, for such sources, the conclusions of two of our main results, \refT{T:proc_conv} and \refC{C:worst}, hold for every $p \in [2, \infty)$; and the almost-sure convergence theorem (\refT{T:AS_conv}) for worst-case {\tt Find} and the \quickval\ residual convergence theorem (\refT{t:quickval_residual_limit}) also apply to all such sources.
}
\end{Remark}