\documentclass[11pt]{article}
\usepackage{amsmath,amssymb,amsthm,graphicx}
\usepackage{fullpage}
\usepackage[capitalise,nameinlink]{cleveref}
\crefname{lemma}{Lemma}{Lemmas}
\crefname{fact}{Fact}{Facts}
\crefname{theorem}{Theorem}{Theorems}
\crefname{corollary}{Corollary}{Corollaries}
\crefname{claim}{Claim}{Claims}
\crefname{example}{Example}{Examples}
\crefname{problem}{Problem}{Problems}
\crefname{setting}{Setting}{Settings}
\crefname{definition}{Definition}{Definitions}
\crefname{assumption}{Assumption}{Assumptions}
\crefname{subsection}{Subsection}{Subsections}
\crefname{section}{Section}{Sections}
\DeclareMathOperator*{\E}{\mathbb{E}}
\let\Pr\relax
\DeclareMathOperator*{\Pr}{\mathbb{P}}
\newcommand{\eps}{\varepsilon}
\newcommand{\inprod}[1]{\left\langle #1 \right\rangle}
\newcommand{\R}{\mathbb{R}}
\newcommand{\handout}[5]{
\noindent
\begin{center}
\framebox{
\vbox{
\hbox to 5.78in { {\bf CS 270: Combinatorial Algorithms and Data Structures
} \hfill #2 }
\vspace{4mm}
\hbox to 5.78in { {\Large \hfill #5 \hfill} }
\vspace{2mm}
\hbox to 5.78in { {\em #3 \hfill #4} }
}
}
\end{center}
\vspace*{4mm}
}
\newcommand{\lecture}[4]{\handout{#1}{#2}{#3}{Scribe: #4}{Lecture #1}}
\newtheorem{theorem}{Theorem}[section]
\newtheorem*{theorem*}{Theorem}
\newtheorem{itheorem}{Theorem}
\newtheorem{subclaim}{Claim}[theorem]
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem*{proposition*}{Proposition}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem*{lemma*}{Lemma}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem*{conjecture*}{Conjecture}
\newtheorem{fact}[theorem]{Fact}
\newtheorem*{fact*}{Fact}
\newtheorem{exercise}[theorem]{Exercise}
\newtheorem*{exercise*}{Exercise}
\newtheorem{hypothesis}[theorem]{Hypothesis}
\newtheorem*{hypothesis*}{Hypothesis}
\newtheorem{conjecture}[theorem]{Conjecture}
\theoremstyle{definition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{setting}[theorem]{Setting}
\newtheorem{construction}[theorem]{Construction}
\newtheorem{example}[theorem]{Example}
\newtheorem{question}[theorem]{Question}
\newtheorem{openquestion}[theorem]{Open Question}
% \newtheorem{algorithm}[theorem]{Algorithm}
\newtheorem{problem}[theorem]{Problem}
\newtheorem{protocol}[theorem]{Protocol}
\newtheorem{assumption}[theorem]{Assumption}
\newtheorem{exercise-easy}[theorem]{Exercise}
\newtheorem{exercise-med}[theorem]{Exercise}
\newtheorem{exercise-hard}[theorem]{Exercise$^\star$}
\newtheorem{claim}[theorem]{Claim}
\newtheorem*{claim*}{Claim}
\newtheorem{remark}[theorem]{Remark}
\newtheorem*{remark*}{Remark}
\newtheorem{observation}[theorem]{Observation}
\newtheorem*{observation*}{Observation}
\usepackage{amsfonts, dsfont}
\begin{document}
\lecture{26 --- April 27, 2023}{Spring 2023}{Prof.\ Jelani Nelson}{Jelani Nelson}
\section{Lower bounds and the cell probe model}
Typically in this class we have assumed the {\em word RAM} model of computation. That is, our machine has some constant number of registers and runs a program stored in memory. The first instruction of the program is stored at some particular memory address $M_0$, the next one at $M_1$, etc.\ (and some instructions can be \texttt{JUMP}s, which set the program counter to some other value). These instructions all operate on {\em words}. A word is a basic unit of storage, which we assume is $w$ bits. For example, the \texttt{ADD} instruction can add two words at a time (modulo $2^w$), memory addresses are $w$ bits long (and hence the total memory of the machine is at most $2^w$ words), etc.
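As a toy illustration (the constant \texttt{W} and the helper name are our own, not part of any real instruction set), $w$-bit word arithmetic can be simulated by reducing modulo $2^w$ after each operation:

```python
W = 8                  # word size in bits (toy choice; real machines use 32 or 64)
MASK = (1 << W) - 1    # 0xFF when W = 8

def word_add(a, b):
    """ADD on w-bit words: ordinary addition, then reduce modulo 2^w."""
    return (a + b) & MASK
```

For instance, with $w = 8$ we have \texttt{word\_add(200, 100) == 44}, since $300 \bmod 256 = 44$.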
How do we prove lower bounds on data structures and algorithms in this model? One of the most robust ways is to only count \texttt{LOAD} and \texttt{STORE} instructions, since these are instructions that pretty much any real machine has, whereas some of the other instructions are not necessarily universal (e.g.\ \texttt{POPCOUNT} and \texttt{MostSignificantBit} instructions). That is, we only count memory reads and writes. This model of complexity, where all computation is free and only reads/writes from/to memory are counted, is known as the {\em cell probe model} of Yao \cite{Yao78}. In the remainder of this note, we will be proving lower bounds in this model.
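To make the cell-probe accounting concrete, here is a hypothetical sketch (the class and method names are ours) of a memory that charges one unit per \texttt{LOAD} or \texttt{STORE} and nothing for computation:

```python
class CountingMemory:
    """Toy cell-probe accounting: an array of w-bit cells where only
    reads and writes are charged; all other computation is free."""

    def __init__(self, num_cells, w=64):
        self.cells = [0] * num_cells
        self.mask = (1 << w) - 1   # cells hold w-bit words
        self.probes = 0            # total LOAD + STORE count

    def load(self, addr):
        self.probes += 1           # charge one probe per read
        return self.cells[addr]

    def store(self, addr, value):
        self.probes += 1           # charge one probe per write
        self.cells[addr] = value & self.mask
```

A data structure implemented against such a memory would have its cell-probe complexity measured by the final value of \texttt{probes}, regardless of how much free computation it performs in between.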
\section{Partial sums lower bound}
Consider the dynamic partial sums problem over some group $G$ with operation `$+$' on an $n$-dimensional array $A$. Initially the array has all entries set to $0$ (the identity element of the group).
\begin{itemize}
\item \texttt{update}$(i, b)$: $A\text{[}i\text{]}\leftarrow b$
\item \texttt{query}$(i)$: return $\sum_{j=1}^iA\text{[}j\text{]}$
\end{itemize}
Working with a word size of $w$, it is natural to consider $G$ to be the cyclic group $\mathbb{Z}_{2^w}$. There is a data structure solving this problem with $O(\lg n)$ query and update time: build a complete binary search tree with $[n]$ as leaves. Each node $u$ stores a group element $g_u$, and we maintain the invariant that the answer to \texttt{query}$(i)$ is the sum of all node-associated group elements on the root-to-leaf path to leaf $i$. This invariant can be maintained with $O(\lg n)$ time per update and query. Note that the group does not {\em have} to be $\mathbb{Z}_{2^w}$. Below we show a lower bound for an even simpler problem: the one where the group is $\mathbb{Z}_2$ (note this problem is easier, since it is equivalent to saying we only want to find the least significant bit of the answer for $G = \mathbb{Z}_{2^w}$).
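For concreteness, here is a minimal sketch of one standard way to achieve these $O(\lg n)$ bounds, a Fenwick (binary indexed) tree over $\mathbb{Z}_{2^w}$; it is a compact cousin of the node-labeled tree described above, not the exact construction from lecture.

```python
class PartialSums:
    """Dynamic partial sums over Z_{2^w} via a Fenwick tree.
    update(i, b) sets A[i] <- b; query(i) returns A[1] + ... + A[i] mod 2^w.
    Indices are 1-based, matching the notation in the notes."""

    def __init__(self, n, w=64):
        self.n, self.mod = n, 1 << w
        self.A = [0] * (n + 1)      # current array values
        self.tree = [0] * (n + 1)   # Fenwick tree of partial sums

    def _add(self, i, delta):       # internal: A[i] += delta, O(lg n) probes
        while i <= self.n:
            self.tree[i] = (self.tree[i] + delta) % self.mod
            i += i & (-i)           # jump to next responsible node

    def update(self, i, b):         # A[i] <- b
        self._add(i, b - self.A[i])
        self.A[i] = b % self.mod

    def query(self, i):             # prefix sum A[1] + ... + A[i]
        s = 0
        while i > 0:
            s = (s + self.tree[i]) % self.mod
            i -= i & (-i)           # strip lowest set bit
        return s
```

Both operations touch $O(\lg n)$ cells, so both run in $O(\lg n)$ time in the cell probe model as well.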
We show a lower bound for dynamic partial sums over $\mathbb Z_2$ due to Fredman and Saks \cite{FredmanS89} of $t_q = \Omega(\lg n / \lg(t_u w))$, where $t_u$ is the update time and $t_q$ is the query time. In particular, this implies $\max\{t_u, t_q\} = \Omega(\lg n / \lg\lg n)$. The optimal lower bound of $\max\{t_u, t_q\} = \Omega(\lg n)$ was not shown until 15 years later by P\v{a}tra\c{s}cu and Demaine \cite{PatrascuD04} --- we will not show that today.
The lower bound is proven via what is known as the {\em chronogram technique}. We will describe the chronogram technique as a member of the family of what we call encoding arguments. Essentially, the argument says that if we had a data structure with a query time $t_q$ that is too small, then we could use that data structure as an encoding scheme to compress elements of some set $S$ into $\ll \lg|S|$ bits. This is impossible by the pigeonhole principle, so $t_q$ must be large. We now give the details. Almost all of my understanding of how this works is due to a conversation with Kasper Green Larsen. In particular, the presentation of the chronogram technique below {\em is} slightly different from both the original \cite{FredmanS89} and the treatment in Miltersen's survey \cite{Miltersen99}; I personally find the explanation given below slightly more intuitive.
Consider a data structure $\mathcal D$ that works on operation sequences that look as follows. The operation sequence has $n$ updates, followed by one uniformly random query. We group these $n$ updates together into what we call {\em epochs}. Epoch $1$ is the last epoch of updates (right before the query), and epoch $2$ comes right before it, etc. Epoch $i$ will be a sequence of $\beta^i$ updates for some $\beta$ we will choose later. Thus the number of epochs is $\lg_{\beta} n$.
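As a sanity check on the epoch sizes: the total number of updates across all $\lg_\beta n$ epochs is
\[
\sum_{i=1}^{\lg_\beta n} \beta^i \;=\; \frac{\beta^{\lg_\beta n + 1} - \beta}{\beta - 1} \;\le\; \frac{\beta}{\beta - 1}\cdot n \;=\; O(n),
\]
so, up to a constant factor, the sequence indeed consists of $n$ updates.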
Recall that there is an $n$-dimensional array $A$ being updated in dynamic partial sums. In the updates of epoch $i$, we update the set of array entries with index of the form $j\cdot (n/\beta^i)$ for $j=1,\ldots,\beta^i$. The entries are assigned independent uniform random bit values. We use $b_{i,1},\ldots,b_{i,\beta^i}$ to denote these random bit values, where $b_{i,j}$ is the value assigned to the entry $j\cdot (n/\beta^i)$ during the updates of epoch $i$.
At the end of epoch $1$, we execute a single uniformly random partial sum query, i.e.\ \texttt{query}$(k)$ for a uniformly random $k\in[n]$.
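The hard operation sequence above can be generated mechanically. The following sketch (function name and seeding are our own choices; it assumes $n$ is a power of $\beta$) emits the updates epoch by epoch, later epochs carrying smaller epoch numbers, followed by the random query index:

```python
import random

def hard_sequence(n, beta, seed=0):
    """Chronogram hard distribution: epoch i consists of beta^i updates
    to indices j*(n/beta^i), j = 1..beta^i, with uniform random bits.
    Epoch lg_beta(n) is executed first and epoch 1 last; afterwards a
    single uniform query index k in [n] is drawn."""
    rng = random.Random(seed)
    num_epochs, m = 0, n
    while m > 1:                   # num_epochs = lg_beta(n)
        m //= beta
        num_epochs += 1
    updates = []
    for i in range(num_epochs, 0, -1):      # later epochs have smaller i
        step = n // beta**i
        for j in range(1, beta**i + 1):
            updates.append((j * step, rng.randrange(2)))
    k = rng.randrange(1, n + 1)             # uniform query index
    return updates, k
```

For $n = 16$ and $\beta = 2$ this produces $16 + 8 + 4 + 2 = 30$ updates across four epochs, which is $\Theta(n)$ as expected.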
Now, we color the memory cells of the data structure (there are at most $2^w$ memory cells) after the entire sequence of updates has been processed. A memory cell $c$ is colored with color $i$ if its contents were changed during epoch $i$ but not in any epoch $j < i$; since smaller-numbered epochs occur later in time, this means epoch $i$ is the last epoch that wrote to $c$.