<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Nikhil Vytla</title>
        <link>https://nikhilvytla.com/</link>
        <description>Nikhil Vytla's Blog</description>
        <lastBuildDate>Thu, 06 Nov 2025 17:40:51 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>Nikhil Vytla</title>
            <url>https://nikhilvytla.com/avatar.png</url>
            <link>https://nikhilvytla.com/</link>
        </image>
        <copyright>CC BY-NC-SA 4.0 2025 - PRESENT © Nikhil Vytla</copyright>
        <atom:link href="https://nikhilvytla.com/feed.xml" rel="self" type="application/rss+xml"/>
        <item>
            <title><![CDATA[An Interactive Look at Bayes' Theorem]]></title>
            <link>https://nikhilvytla.com/posts/bayes-theorem</link>
            <guid isPermaLink="true">https://nikhilvytla.com/posts/bayes-theorem</guid>
            <pubDate>Fri, 05 Sep 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[
<h2>Introduction</h2>
<p>Hello! I'm currently experimenting with more interactive notebook playgrounds to visually explore statistics and AI/ML. This is a first draft of a notebook diving into the wonders of Bayes' Theorem and conditional probability. Enjoy!</p>
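<p>While the notebook below is the main event, here's a quick standalone sketch (with illustrative numbers, not taken from the notebook) of the kind of update Bayes' Theorem enables:</p>
<pre><code class="language-python"># Bayes' theorem for a diagnostic test:
# P(disease | positive) = P(positive | disease) P(disease) / P(positive)
def posterior(prior, sensitivity, false_positive_rate):
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# 1% prevalence, 95% sensitivity, 5% false-positive rate:
# a positive test still leaves only ~16% probability of disease.
print(round(posterior(0.01, 0.95, 0.05), 3))  # 0.161
</code></pre>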
<h2>Molab Notebook</h2>
<iframe
    id="inlineFrameBayes"
    title="Bayes Theorem Molab"
    src="https://molab.marimo.io/notebooks/nb_qdtZuHRiZTstYLoUFnLs4P/app"
    width="100%"
    height="1000"
></iframe>
]]></content:encoded>
            <author>nikhil@nikhilvytla.com (Nikhil Vytla)</author>
        </item>
        <item>
            <title><![CDATA[Site Maintenance & TODOs]]></title>
            <link>https://nikhilvytla.com/posts/maintain-site</link>
            <guid isPermaLink="true">https://nikhilvytla.com/posts/maintain-site</guid>
            <pubDate>Fri, 04 Jul 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<h2>TODOs</h2>
<ul>
<li>
<p>Migrate blog posts from <a href="https://nikhil-vytla.github.io/interactive-blog">interactive-blog</a> to <a href="/posts">/posts</a>.</p>
<ul>
<li>Enable pretty printing + line highlighting with Shiki</li>
<li>Replace custom Pyodide implementation with Marimo notebooks (either widgets/iframes or <a href="https://docs.marimo.io/guides/island_example/">islands 🏝️</a>, an experimental feature)</li>
</ul>
</li>
<li>
<p>Upload updated resume</p>
</li>
<li>
<p>Add more photos</p>
</li>
<li>
<p>Update projects (and project categories to reflect topics e.g. CS+SG, etc.) and demos page with additional repos/videos</p>
</li>
<li>
<p>Update bookmarks</p>
</li>
<li>
<p>Add blogs for inspiration note</p>
</li>
<li>
<p>Update media</p>
</li>
<li>
<p>Confirm that math works (may need to fiddle with mathjax3 CSS more to fix aligns!)</p>
</li>
<li>
<p>Confirm that code renders work</p>
</li>
<li>
<p>Add custom art simulations (a la plum/dots)</p>
</li>
<li>
<p>Write up a short blog post on OG image generation - perhaps build a tool for it as well!</p>
</li>
<li>
<p>Write up a short blog post on the link minifier (and other projects - maybe case studies for more serious projects, or multi-part blogs...)</p>
</li>
<li>
<p>Write smaller blog posts on fundamentals from stats, CS, and HDS, plus potential projects</p>
</li>
<li>
<p>Write paper breakdowns, DL and MI fundamentals, and learnings from AI/ML in healthcare</p>
</li>
<li>
<p>Aim to write one post per week, if not faster (shorter posts count too)</p>
</li>
<li>
<p>Add bookmarks for visualization</p>
</li>
</ul>
<h2>Completed</h2>
<ul>
<li>Move this code to new repo: <a href="https://github.com/nikhil-vytla/nikhilvytla.com">nikhil-vytla/nikhilvytla.com</a></li>
<li>Setup Netlify</li>
<li>Remove domain name redirect from old site: <a href="https://nikhil-vytla.github.io">nikhil-vytla.github.io</a></li>
</ul>
]]></content:encoded>
            <author>nikhil@nikhilvytla.com (Nikhil Vytla)</author>
        </item>
        <item>
            <title><![CDATA[Causal Representation Learning]]></title>
            <link>https://nikhilvytla.com/posts/causal-representation-learning</link>
            <guid isPermaLink="true">https://nikhilvytla.com/posts/causal-representation-learning</guid>
            <pubDate>Wed, 15 Jan 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[
<h2>Introduction &amp; Motivation</h2>
<p>What is causal representation learning? Rather than merely extracting low-dimensional features, its methods aim to encode the causal factors of variation underlying observed data. To move beyond distributional robustness, we seek models that can reason under interventions and answer counterfactual queries. This requires mathematical tools from causal inference and practical algorithms to discover disentangled, <strong>causally meaningful</strong> representations.</p>
<h2>Generative Model &amp; SCM Formalization</h2>
<p>Let $\mathbf{x} \in \mathbb{R}^D$ be an observed datapoint. Assume that $\mathbf{x}$ is generated by underlying causal variables $\mathbf{z} = [z_1, ..., z_K]$. Under a Structural Causal Model (SCM):</p>
<p>$$<br>
z_j = f_j(\mathrm{PA}_j, n_j), \quad \text{where} ~ n_j ~\text{i.i.d. noise}, ; \mathrm{PA}_j=\text{parents of }z_j<br>
\mathbf{x} = g(\mathbf{z}, n_x)<br>
$$</p>
<p>Goal: Find an encoder $E: \mathbf{x} \mapsto \mathbf{\hat{z}}$ s.t. $\mathbf{\hat{z}}$ is (approximately) causally correct and disentangled. We would like</p>
<p>$$<br>
\text{do}(\hat{z}_k = c) \implies \text{systematic change in observation corresponding to } z_k<br>
$$</p>
<h2>Common Approaches &amp; Algorithms</h2>
<ul>
<li><code>FactorVAE</code>, <code>BetaVAE</code>: Encourage disentanglement by penalizing the total correlation of the aggregate posterior $q(\mathbf{z})$ (FactorVAE) or by upweighting the KL term of $q(\mathbf{z}|\mathbf{x})$ against the prior (BetaVAE).</li>
<li><code>CausalVAE</code>: Explicit structure priors enforce relations between $z_i$ reflecting causal graph.</li>
<li><code>Invariant Risk Minimization (IRM)</code>: Learn features $\Phi(\mathbf{x})$ such that $\arg\min_{w} \sum_{e\in \mathcal{E}} R^e(w \circ \Phi)$ yields an invariant predictor across environments.</li>
</ul>
<pre><code class="language-python"># Example: FactorVAE loss (PyTorch-like pseudocode)
# `reparameterize` and `total_correlation` are assumed helper functions.
def factor_vae_loss(x, encoder, decoder, discriminator, beta=6.0):
    z_mu, z_logvar = encoder(x)
    z = reparameterize(z_mu, z_logvar)        # z = mu + sigma * eps
    recon_x = decoder(z)
    recon_loss = F.mse_loss(recon_x, x)
    tc = total_correlation(z, discriminator)  # density-ratio TC estimate
    return recon_loss + beta * tc
</code></pre>
<h2>Key Concepts &amp; Equations</h2>
<ul>
<li><strong>Total Correlation</strong>: $\mathrm{TC}(\mathbf{z}) = \mathrm{KL}(q(\mathbf{z}) \,\Vert\, \prod_j q(z_j))$ (measures independence of latent dimensions)</li>
<li><strong>Identifiability</strong>: Under certain conditions (e.g., multiple environments, known interventions), causal features are learnable.</li>
<li><strong>Structural Hamming Distance</strong>: For causal graph evaluation, $\mathrm{SHD}(G_1,G_2)$ counts mismatched edges between graphs.</li>
</ul>
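<p>As a quick illustration (a minimal sketch, not from any causal-discovery library), SHD can be computed directly from directed adjacency matrices; this version counts a reversed edge as a single error:</p>
<pre><code class="language-python">import numpy as np

def shd(A, B):
    # Count entries where the two adjacency matrices disagree...
    A, B = np.asarray(A), np.asarray(B)
    disagreements = int((A != B).sum())
    # ...but a reversed edge produces two disagreements; count it once.
    reversals = int(((A == 1) * (B == 0) * (A.T == 0) * (B.T == 1)).sum())
    return disagreements - reversals

G1 = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])  # chain: 1 to 2, 2 to 3
G2 = np.array([[0, 1, 0], [0, 0, 0], [0, 1, 0]])  # edge 2 to 3 reversed
print(shd(G1, G2))  # 1
</code></pre>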
<p><strong>Mathematical Example</strong>:</p>
<pre><code class="language-python"># SCM causal simulation
import numpy as np

def scm_sim():
    x1 = np.random.normal()            # exogenous cause
    x2 = 2*x1 + np.random.normal()     # depends on x1
    x3 = x2 - x1 + np.random.normal()  # depends on x1 and x2
    return np.stack([x1, x2, x3])
</code></pre>
<p>This code simulates a simple, linear SCM. Changing (&quot;intervening on&quot;) $x_1$ propagates through the system.</p>
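<p>One way to make the intervention explicit is to add a do-style argument to the simulator above (a small hypothetical extension, not a library API):</p>
<pre><code class="language-python">import numpy as np

def scm_sim(do_x1=None):
    # do(x1 = c) severs x1 from its noise and fixes it to c
    x1 = do_x1 if do_x1 is not None else np.random.normal()
    x2 = 2 * x1 + np.random.normal()
    x3 = x2 - x1 + np.random.normal()
    return np.array([x1, x2, x3])

# Under do(x1 = 3): E[x2] = 2*3 = 6 and E[x3] = 6 - 3 = 3.
samples = np.stack([scm_sim(do_x1=3.0) for _ in range(20000)])
print(samples.mean(axis=0).round(1))  # roughly [3.0, 6.0, 3.0]
</code></pre>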
<h2>Evaluation Metrics</h2>
<ul>
<li><strong>Modularity</strong>: Measure sensitivity of each dimension $z_j$ to changes in data or interventions.</li>
<li><strong>Disentanglement</strong>: Metrics like DCI and Mutual Information Gap (MIG).</li>
</ul>
<h2>Open Challenges</h2>
<ul>
<li>Proving identifiability with finite data.</li>
<li>Efficient algorithms for high-dimensional, non-linear SCMs.</li>
<li>Bridging deep learning and graphical model causal theory.</li>
</ul>
<h2>References</h2>
<ol>
<li><a href="https://arxiv.org/abs/1811.12359">Locatello et al. (2019). Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations</a></li>
<li><a href="https://arxiv.org/abs/2102.11107">Schölkopf et al. (2021). Towards Causal Representation Learning</a></li>
<li><a href="https://arxiv.org/abs/2210.13583">Subramanian et al. (2022) Learning Latent Structural Causal Models</a></li>
</ol>
]]></content:encoded>
            <author>nikhil@nikhilvytla.com (Nikhil Vytla)</author>
        </item>
        <item>
            <title><![CDATA[Markdown Syntax Guide]]></title>
            <link>https://nikhilvytla.com/posts/markdown-syntax-guide</link>
            <guid isPermaLink="true">https://nikhilvytla.com/posts/markdown-syntax-guide</guid>
            <pubDate>Fri, 04 Oct 2024 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[
<p>Here's some example Markdown syntax.</p>
<h2>Headings</h2>
<p>The following HTML <code>&lt;h1&gt;</code>—<code>&lt;h6&gt;</code> elements represent six levels of section headings. <code>&lt;h1&gt;</code> is the highest section level while <code>&lt;h6&gt;</code> is the lowest.</p>
<h1>H1</h1>
<h2>H2</h2>
<h3>H3</h3>
<h4>H4</h4>
<h5>H5</h5>
<h6>H6</h6>
<h2>Paragraph</h2>
<p>Xerum, quo qui aut unt expliquam qui dolut labo. Aque venitatiusda cum, voluptionse latur sitiae dolessi aut parist aut dollo enim qui voluptate ma dolestendit peritin re plis aut quas inctum laceat est volestemque commosa as cus endigna tectur, offic to cor sequas etum rerum idem sintibus eiur? Quianimin porecus evelectur, cum que nis nust voloribus ratem aut omnimi, sitatur? Quiatem. Nam, omnis sum am facea corem alique molestrunt et eos evelece arcillit ut aut eos eos nus, sin conecerem erum fuga. Ri oditatquam, ad quibus unda veliamenimin cusam et facea ipsamus es exerum sitate dolores editium rerore eost, temped molorro ratiae volorro te reribus dolorer sperchicium faceata tiustia prat.</p>
<p>Itatur? Quiatae cullecum rem ent aut odis in re eossequodi nonsequ idebis ne sapicia is sinveli squiatum, core et que aut hariosam ex eat.</p>
<h2>Images</h2>
<h3>Syntax</h3>
<pre><code class="language-markdown">![Alt text](./full/or/relative/path/of/image)
</code></pre>
<h3>Output</h3>
<p><img src="https://i.giphy.com/Gty2oDYQ1fih2.gif" alt="electricity"></p>
<h2>Blockquotes</h2>
<p>The blockquote element represents content that is quoted from another source, optionally with citations which may be within a <code>footer</code> or <code>cite</code> element, and optionally with in-line changes such as annotations and abbreviations.</p>
<h3>Blockquote without attribution</h3>
<h4>Syntax</h4>
<pre><code class="language-markdown">&gt; Tiam, ad mint andaepu dandae nostion secatur sequo quae.
&gt; **Note** that you can use _Markdown syntax_ within a blockquote.
</code></pre>
<h4>Output</h4>
<blockquote>
<p>Tiam, ad mint andaepu dandae nostion secatur sequo quae.<br>
<strong>Note</strong> that you can use <em>Markdown syntax</em> within a blockquote.</p>
</blockquote>
<h3>Blockquote with attribution</h3>
<h4>Syntax</h4>
<pre><code class="language-markdown">&gt; Don't communicate by sharing memory, share memory by communicating.&lt;br&gt;
&gt; — &lt;cite&gt;Rob Pike[^1]&lt;/cite&gt;
</code></pre>
<h4>Output</h4>
<blockquote>
<p>Don't communicate by sharing memory, share memory by communicating.<br><br>
— Rob Pike[^1]</p>
</blockquote>
<h3>Citations/Footnotes</h3>
<h4>Syntax</h4>
<pre><code class="language-markdown">[^1]: The above quote is excerpted from Rob Pike's [talk](https://www.youtube.com/watch?v=PAAkCSZUG1c) during Gopherfest, November 18, 2015.
</code></pre>
<h4>Output</h4>
<p>See bottom of page.</p>
<p>[^1]: The above quote is excerpted from Rob Pike's <a href="https://www.youtube.com/watch?v=PAAkCSZUG1c">talk</a> during Gopherfest, November 18, 2015.</p>
<h2>Tables</h2>
<h3>Syntax</h3>
<pre><code class="language-markdown">| Italics   | Bold     | Code   |
| --------- | -------- | ------ |
| _italics_ | **bold** | `code` |
</code></pre>
<h3>Output</h3>
<table>
<thead>
<tr>
<th>Italics</th>
<th>Bold</th>
<th>Code</th>
</tr>
</thead>
<tbody>
<tr>
<td><em>italics</em></td>
<td><strong>bold</strong></td>
<td><code>code</code></td>
</tr>
</tbody>
</table>
<h2>Colors</h2>
<h3>Syntax</h3>
<pre><code class="language-markdown">- &lt;span font-bold font-mono text-amber&gt;MAJOR&lt;/span&gt;: Increment when you make incompatible API changes.
- &lt;span font-bold font-mono text-lime&gt;MINOR&lt;/span&gt;: Increment when you add functionality in a backwards-compatible manner.
- &lt;span font-bold font-mono text-blue&gt;PATCH&lt;/span&gt;: Increment when you make backwards-compatible bug fixes.
</code></pre>
<h3>Output</h3>
<ul>
<li><span font-bold font-mono text-amber>MAJOR</span>: Increment when you make incompatible API changes.</li>
<li><span font-bold font-mono text-lime>MINOR</span>: Increment when you add functionality in a backwards-compatible manner.</li>
<li><span font-bold font-mono text-blue>PATCH</span>: Increment when you make backwards-compatible bug fixes.</li>
</ul>
<h2>Math/Equation Rendering</h2>
<p>We use MathJax for rendering LaTeX. Here are a few examples of algorithms and equations:</p>
<h3><a href="https://en.wikipedia.org/wiki/Merge_sort">Merge sort</a></h3>
<h4>Syntax</h4>
<pre><code class="language-markdown">The time complexity of merge sort is $O(n \log n)$.

The recursive relation is:

$$
T(n) = \begin{cases}
1 &amp; \text{if } n = 1 \\
2T(n/2) + n &amp; \text{if } n &gt; 1
\end{cases}
$$

Big-O notation: $\textcolor{cyan}{f(n)} = O(\textcolor{magenta}{g(n)})$ means $\exists \textcolor{red}{c}, n_0$ such that $\textcolor{cyan}{f(n)} \leq \textcolor{red}{c} \cdot \textcolor{magenta}{g(n)}$ for all $n \geq n_0$.
</code></pre>
<h4>Output</h4>
<p>The time complexity of merge sort is $O(n \log n)$.</p>
<p>The recursive relation is:</p>
<p>$$<br>
T(n) = \begin{cases}<br>
1 &amp; \text{if } n = 1 \\<br>
2T(n/2) + n &amp; \text{if } n &gt; 1<br>
\end{cases}<br>
$$</p>
<p>Big-O notation: $\textcolor{cyan}{f(n)} = O(\textcolor{magenta}{g(n)})$ means $\exists \textcolor{red}{c}, n_0$ such that $\textcolor{cyan}{f(n)} \leq \textcolor{red}{c} \cdot \textcolor{magenta}{g(n)}$ for all $n \geq n_0$.</p>
<h3>Maxwell's Equations</h3>
<h4>Syntax</h4>
<pre><code class="language-markdown">In table form:
| equation | description |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
| $\nabla \cdot \vec{\mathbf{B}}  = 0$ | divergence of $\vec{\mathbf{B}}$ is zero |
| $\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t}  = \vec{\mathbf{0}}$ | curl of $\vec{\mathbf{E}}$ is proportional to the rate of change of $\vec{\mathbf{B}}$ |
| $\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} = \frac{4\pi}{c}\vec{\mathbf{j}}    \nabla \cdot \vec{\mathbf{E}} = 4 \pi \rho$ | _???_ |

In array form:

$$
\begin{array}{c}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} &amp;
= \frac{4\pi}{c}\vec{\mathbf{j}}    \nabla \cdot \vec{\mathbf{E}} &amp; = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} &amp; = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} &amp; = 0
\end{array}
$$
</code></pre>
<h4>Output</h4>
<p>In table form:</p>
<table>
<thead>
<tr>
<th>equation</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>$\nabla \cdot \vec{\mathbf{B}}  = 0$</td>
<td>divergence of $\vec{\mathbf{B}}$ is zero</td>
</tr>
<tr>
<td>$\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t}  = \vec{\mathbf{0}}$</td>
<td>curl of $\vec{\mathbf{E}}$ is proportional to the rate of change of $\vec{\mathbf{B}}$</td>
</tr>
<tr>
<td>$\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} = \frac{4\pi}{c}\vec{\mathbf{j}}    \nabla \cdot \vec{\mathbf{E}} = 4 \pi \rho$</td>
<td><em>???</em></td>
</tr>
</tbody>
</table>
<p>In array form:</p>
<p>$$<br>
\begin{array}{c}<br>
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} &amp;<br>
= \frac{4\pi}{c}\vec{\mathbf{j}}    \nabla \cdot \vec{\mathbf{E}} &amp; = 4 \pi \rho \\<br>
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} &amp; = \vec{\mathbf{0}} \\<br>
\nabla \cdot \vec{\mathbf{B}} &amp; = 0<br>
\end{array}<br>
$$</p>
<h3>Homomorphism</h3>
<h4>Syntax</h4>
<pre><code class="language-markdown">A homomorphism is a map between two algebraic structures of the same type (that is of the same name), that preserves the operations of the structures. This means a map $f:A \to B$ between two sets $A$, $B$ equipped with the same structure such that, if $\cdot$ is an operation of the structure (supposed here, for simplification, to be a binary operation), then

$$
\begin{equation}
f(x\cdot y)=f(x)\cdot f(y)
\end{equation}
$$

for every pair $x$, $y$ of element of $A$. One says often that $f$ preserves the operation or is compatible with the operation.

Formally, a map $f:A \to B$ preserves an operation $\mu$ of arity $\mathsf{k}$, defined on both $A$ and $B$ if

$$
\begin{equation}
f(\mu_A(a_1,\ldots,a_k))=\mu_B(f(a_1),\ldots,f(a_k))
\end{equation}
$$

for all elements $a_1,\ldots,a_k$ in $A$.
</code></pre>
<h4>Output</h4>
<p>A homomorphism is a map between two algebraic structures of the same type (that is of the same name), that preserves the operations of the structures. This means a map $f:A \to B$ between two sets $A$, $B$ equipped with the same structure such that, if $\cdot$ is an operation of the structure (supposed here, for simplification, to be a binary operation), then</p>
<p>$$<br>
\begin{equation}<br>
f(x\cdot y)=f(x)\cdot f(y)<br>
\end{equation}<br>
$$</p>
<p>for every pair $x$, $y$ of element of $A$. One says often that $f$ preserves the operation or is compatible with the operation.</p>
<p>Formally, a map $f:A \to B$ preserves an operation $\mu$ of arity $\mathsf{k}$, defined on both $A$ and $B$ if</p>
<p>$$<br>
\begin{equation}<br>
f(\mu_A(a_1,\ldots,a_k))=\mu_B(f(a_1),\ldots,f(a_k))<br>
\end{equation}<br>
$$</p>
<p>for all elements $a_1,\ldots,a_k$ in $A$.</p>
<h2>Code Blocks</h2>
<h3>Syntax</h3>
<p>We can use 3 backticks <code>```</code> on a new line, write our code snippet, and then close with 3 backticks on another new line. To highlight language specific syntax, we can type the language name after the first 3 backticks (e.g. <code>html</code>, <code>javascript</code>, <code>css</code>, <code>markdown</code>, <code>typescript</code>, <code>txt</code>, <code>bash</code>, <code>python</code>, etc).</p>
<pre><code class="language-markdown">```html
&lt;!doctype html&gt;
&lt;html lang=&quot;en&quot;&gt;
  &lt;head&gt;
    &lt;meta charset=&quot;utf-8&quot; /&gt;
    &lt;title&gt;Example HTML5 Document&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;p&gt;Test&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;
```
</code></pre>
<h3>Output</h3>
<pre><code class="language-html">&lt;!doctype html&gt;
&lt;html lang=&quot;en&quot;&gt;
  &lt;head&gt;
    &lt;meta charset=&quot;utf-8&quot; /&gt;
    &lt;title&gt;Example HTML5 Document&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;p&gt;Test&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;
</code></pre>
<h2>More Code Renders</h2>
<h3>Syntax</h3>
<blockquote>
<p>[!NOTE]<br>
Only some transformers have been configured, see <a href="https://shiki.style/packages/transformers">https://shiki.style/packages/transformers</a> for more!</p>
</blockquote>
<h3>Output</h3>
<pre><code class="language-ts">// transformerNotationDiff
console.log('hewwo') // [!code --]
console.log('hello') // [!code ++]
console.log('goodbye')
</code></pre>
<pre><code class="language-python"># transformerNotationHighlight
from sklearn.linear_model import LinearRegression

model = LinearRegression().fit([[1], [2], [3]], [2, 4, 6]) # [!code hl]
print(model.predict([[4]]))  # Output: [8]
</code></pre>
<pre><code class="language-python"># transformerNotationWordHighlight
# [!code word:RandomForestClassifier]
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
print(RandomForestClassifier().fit(X, y).score(X, y))  # ~0.97
</code></pre>
<pre><code class="language-python"># transformerNotationFocus
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True) # [!code focus]
print(RandomForestClassifier().fit(X, y).score(X, y))  # ~0.97
</code></pre>
<pre><code class="language-ts">// transformerNotationErrorLevel
console.log('No errors or warnings')
console.error('Error') // [!code error]
console.warn('Warning') // [!code warning]
</code></pre>
<pre><code class="language-js">// transformerMetaHighlight
console.log('2')
console.log('3')
console.log('4')
console.log('5')
</code></pre>
<pre><code class="language-js">// transformerMetaWordHighlight
const msg = 'Hello World'
console.log(msg)
console.log(msg) // prints Hello World
</code></pre>
<h2>List Types</h2>
<h3>Ordered List</h3>
<h4>Syntax</h4>
<pre><code class="language-markdown">1. First item
2. Second item
3. Third item
</code></pre>
<h4>Output</h4>
<ol>
<li>First item</li>
<li>Second item</li>
<li>Third item</li>
</ol>
<h3>Unordered List</h3>
<h4>Syntax</h4>
<pre><code class="language-markdown">- List item
- Another item
- And another item
</code></pre>
<h4>Output</h4>
<ul>
<li>List item</li>
<li>Another item</li>
<li>And another item</li>
</ul>
<h3>Nested list</h3>
<h4>Syntax</h4>
<pre><code class="language-markdown">- Fruit
  - Apple
  - Orange
  - Banana
- Dairy
  - Milk
  - Cheese
</code></pre>
<h4>Output</h4>
<ul>
<li>Fruit
<ul>
<li>Apple</li>
<li>Orange</li>
<li>Banana</li>
</ul>
</li>
<li>Dairy
<ul>
<li>Milk</li>
<li>Cheese</li>
</ul>
</li>
</ul>
<h2>Other Elements — abbr, sub, sup, kbd, mark</h2>
<h3>Syntax</h3>
<pre><code class="language-markdown">&lt;abbr title=&quot;Graphics Interchange Format&quot;&gt;GIF&lt;/abbr&gt; is a bitmap image format.

H&lt;sub&gt;2&lt;/sub&gt;O

X&lt;sup&gt;n&lt;/sup&gt; + Y&lt;sup&gt;n&lt;/sup&gt; = Z&lt;sup&gt;n&lt;/sup&gt;

Press &lt;kbd&gt;CTRL&lt;/kbd&gt; + &lt;kbd&gt;ALT&lt;/kbd&gt; + &lt;kbd&gt;Delete&lt;/kbd&gt; to end the session.

Most &lt;mark&gt;salamanders&lt;/mark&gt; are nocturnal, and hunt for insects, worms, and other small creatures.
</code></pre>
<h3>Output</h3>
<p><abbr title="Graphics Interchange Format">GIF</abbr> is a bitmap image format.</p>
<p>H<sub>2</sub>O</p>
<p>X<sup>n</sup> + Y<sup>n</sup> = Z<sup>n</sup></p>
<p>Press <kbd>CTRL</kbd> + <kbd>ALT</kbd> + <kbd>Delete</kbd> to end the session.</p>
<p>Most <mark>salamanders</mark> are nocturnal, and hunt for insects, worms, and other small creatures.</p>
<h2>Testing out Marimo</h2>
<iframe
  src="https://marimo.app/l/9bnuyz?embed=true&show-chrome=false"
  width="100%"
  height="300"
  frameborder="0"
></iframe>
]]></content:encoded>
            <author>nikhil@nikhilvytla.com (Nikhil Vytla)</author>
        </item>
        <item>
            <title><![CDATA[Central Limit Theorem: Part 2]]></title>
            <link>https://nikhilvytla.com/posts/central-limit-theorem-p2</link>
            <guid isPermaLink="true">https://nikhilvytla.com/posts/central-limit-theorem-p2</guid>
            <pubDate>Mon, 12 Feb 2024 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[
<h2>Quick Recap</h2>
<p><strong>New to the Central Limit Theorem?</strong> Start with <a href="/posts/central-limit-theorem-p1">Part 1</a> first!</p>
<p><strong>For everyone else</strong>: The CLT tells us that sample means become normally distributed, regardless of the original data distribution. With samples of <code>n ≥ 30</code>, we can make confident statements about populations using statistics. Now let's put this power to work!</p>
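<p>A small simulation (with illustrative numbers, not from the original posts) shows the recap in action, starting from a heavily skewed population:</p>
<pre><code class="language-python">import numpy as np

rng = np.random.default_rng(0)
# Skewed exponential population: mean = 2, std = 2
population = rng.exponential(scale=2.0, size=100_000)

# Means of 10,000 samples, each of size n = 30
means = rng.choice(population, size=(10_000, 30)).mean(axis=1)

print(round(means.mean(), 1))  # close to the population mean of 2.0
print(round(means.std(), 2))   # close to sigma / sqrt(n) = 2 / sqrt(30), about 0.37
</code></pre>
<p>Even though individual battery lives (or wait times, or incomes) are skewed, the distribution of sample means is approximately normal and tightens as $n$ grows.</p>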
<h2>Real-World Case Study: Battery Factory Quality Control</h2>
<h3>The Scenario</h3>
<p>You're a quality control manager at a smartphone battery factory. Your boss wants batteries that last at least 20 hours on average, but testing every single battery would be expensive and time-consuming. Plus, some tests are destructive!</p>
<p><strong>The Challenge:</strong></p>
<ul>
<li>🏭 <strong>Population</strong>: Millions of batteries produced daily</li>
<li>❓ <strong>Unknown</strong>: True population mean battery life</li>
<li>🎯 <strong>Goal</strong>: Determine if a batch meets the 20-hour requirement</li>
<li>💰 <strong>Constraint</strong>: Can only test a small sample due to cost</li>
</ul>
<h3>Enter the CLT Hero</h3>
<p>The Central Limit Theorem saves the day! Here's how:</p>
<ol>
<li><strong>Sample</strong>: Test 50 batteries from a batch (<code>n = 50 &gt; 30</code> ✓)</li>
<li><strong>Results</strong>: Sample mean = 20.3 hours, sample std = 2.1 hours</li>
<li><strong>Apply CLT</strong>: The sampling distribution of means is approximately normal</li>
<li><strong>Make Decision</strong>: Use confidence intervals to assess the entire batch</li>
</ol>
<h3>The Mathematical Solution</h3>
<p>Using the CLT, we can construct a 95% confidence interval:</p>
<p>$$<br>
\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 20.3 \pm 2.01 \frac{2.1}{\sqrt{50}} = 20.3 \pm 0.60<br>
$$</p>
<p><strong>Result</strong>: We're 95% confident the true population mean is between <strong>19.70 and 20.90 hours</strong>.</p>
<p><strong>Decision</strong>: Since the entire confidence interval is above 20 hours, we can confidently approve this batch for shipment! 🚀</p>
<h3>Why This Works (The CLT Magic)</h3>
<ol>
<li><strong>Large Enough Sample</strong>: <code>n = 50</code> is sufficient for CLT to kick in</li>
<li><strong>Normal Distribution</strong>: Sample means follow a normal distribution regardless of how individual battery lives are distributed</li>
<li><strong>Predictable Precision</strong>: Standard error = $\frac{\sigma}{\sqrt{n}}$ decreases as sample size increases</li>
<li><strong>Quantified Uncertainty</strong>: We know exactly how confident we can be</li>
</ol>
<h3>Complete Code Implementation</h3>
<pre><code class="language-python">import numpy as np
from scipy import stats
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Simulate the battery testing scenario
np.random.seed(42)

# Create a realistic battery population (slightly skewed, mean ~20.3)
true_population = np.random.gamma(shape=4, scale=5.075, size=100000)

# Sample 50 batteries for testing
sample_data = np.random.choice(true_population, 50)

# Calculate sample statistics
sample_mean = np.mean(sample_data)
sample_std = np.std(sample_data, ddof=1)  # ddof=1 for sample std
n = len(sample_data)

# 95% confidence interval using CLT
# Using t-distribution since we don't know population std
margin_of_error = stats.t.ppf(0.975, n-1) * (sample_std / np.sqrt(n))
ci_lower = sample_mean - margin_of_error
ci_upper = sample_mean + margin_of_error

# Display results
print(&quot;🔋 Battery Quality Control Results&quot;)
print(&quot;=&quot; * 40)
print(f&quot;Sample size: {n} batteries&quot;)
print(f&quot;Sample mean: {sample_mean:.2f} hours&quot;)
print(f&quot;Sample std: {sample_std:.2f} hours&quot;)
print(f&quot;95% Confidence Interval: [{ci_lower:.2f}, {ci_upper:.2f}] hours&quot;)
print(f&quot;Meets 20-hour requirement: {'✅ YES' if ci_lower &gt; 20 else '❌ NO'}&quot;)
print(f&quot;Margin of error: ±{margin_of_error:.2f} hours&quot;)

# Create visualization
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=[&quot;Population Distribution&quot;, &quot;Sample Data&quot;,
                   &quot;Confidence Interval&quot;, &quot;Sampling Distribution&quot;],
    specs=[[{&quot;type&quot;: &quot;histogram&quot;}, {&quot;type&quot;: &quot;histogram&quot;}],
           [{&quot;type&quot;: &quot;scatter&quot;}, {&quot;type&quot;: &quot;histogram&quot;}]]
)

# Population distribution
fig.add_trace(
    go.Histogram(x=true_population[:1000], name=&quot;Population&quot;,
                histnorm=&quot;probability density&quot;, showlegend=False),
    row=1, col=1
)

# Sample data
fig.add_trace(
    go.Histogram(x=sample_data, name=&quot;Sample&quot;,
                histnorm=&quot;probability density&quot;, showlegend=False),
    row=1, col=2
)

# Confidence interval visualization
ci_x = [ci_lower, ci_upper, ci_upper, ci_lower, ci_lower]
ci_y = [0, 0, 1, 1, 0]
fig.add_trace(
    go.Scatter(x=ci_x, y=ci_y, fill=&quot;toself&quot;, name=&quot;95% CI&quot;,
              fillcolor=&quot;lightblue&quot;, line=dict(color=&quot;blue&quot;)),
    row=2, col=1
)
fig.add_vline(x=sample_mean, line=dict(color=&quot;red&quot;, width=3),
              annotation_text=f&quot;Sample Mean: {sample_mean:.2f}h&quot;,
              row=2, col=1)
fig.add_vline(x=20, line=dict(color=&quot;green&quot;, width=2, dash=&quot;dash&quot;),
              annotation_text=&quot;Requirement: 20h&quot;,
              row=2, col=1)

# Sampling distribution (theoretical)
x_range = np.linspace(sample_mean - 3*margin_of_error,
                     sample_mean + 3*margin_of_error, 100)
sampling_dist = stats.norm.pdf(x_range, sample_mean, sample_std/np.sqrt(n))
fig.add_trace(
    go.Scatter(x=x_range, y=sampling_dist, mode=&quot;lines&quot;,
              name=&quot;Sampling Distribution&quot;, line=dict(color=&quot;purple&quot;)),
    row=2, col=2
)

fig.update_layout(height=600, title=&quot;CLT in Action: Battery Quality Control&quot;)
fig.show()
</code></pre>
<h2>Confidence Intervals: Your Statistical Superpower</h2>
<p>Once you understand CLT, confidence intervals become your go-to tool for making decisions with uncertainty. The general formula is:</p>
<p>$$<br>
\text{Estimate} \pm \text{Margin of Error}<br>
$$</p>
<p>Where the margin of error depends on:</p>
<ul>
<li><strong>Confidence Level</strong>: How sure do you want to be? (90%, 95%, 99%)</li>
<li><strong>Sample Size</strong>: Larger samples = smaller margin of error</li>
<li><strong>Variability</strong>: More spread in data = larger margin of error</li>
</ul>
<h3>Confidence Level Trade-offs</h3>
<pre><code class="language-python"># Compare different confidence levels
confidence_levels = [0.90, 0.95, 0.99]
colors = ['green', 'blue', 'red']

fig = go.Figure()

for i, conf_level in enumerate(confidence_levels):
    alpha = 1 - conf_level
    t_critical = stats.t.ppf(1 - alpha/2, n-1)
    margin = t_critical * (sample_std / np.sqrt(n))

    # Add confidence interval
    fig.add_shape(
        type=&quot;rect&quot;,
        x0=sample_mean - margin, x1=sample_mean + margin,
        y0=i*0.3, y1=(i+1)*0.3,
        fillcolor=colors[i], opacity=0.3,
        line=dict(color=colors[i], width=2)
    )

    fig.add_annotation(
        x=sample_mean, y=i*0.3 + 0.15,
        text=f&quot;{conf_level*100:.0f}%: ±{margin:.2f}&quot;,
        showarrow=False
    )

fig.add_vline(x=sample_mean, line=dict(color=&quot;black&quot;, width=2))
fig.add_vline(x=20, line=dict(color=&quot;orange&quot;, width=2, dash=&quot;dash&quot;))

fig.update_layout(
    title=&quot;Confidence Level Trade-offs&quot;,
    xaxis_title=&quot;Battery Life (hours)&quot;,
    yaxis_title=&quot;Confidence Level&quot;,
    height=400
)
fig.show()
</code></pre>
<p><strong>Key Insight</strong>: Higher confidence = wider intervals. There's always a trade-off between certainty and precision!</p>
<h2>Interactive Challenge: Test Your Understanding</h2>
<p><strong>Scenario</strong>: You're analyzing customer satisfaction scores (1-10 scale) for a new app. You survey 40 users and get a mean of 7.2 with a standard deviation of 1.8.</p>
<p><strong>Questions</strong>:</p>
<ol>
<li>Can you apply the CLT here? (Check: <code>n ≥ 30</code>? ✓)</li>
<li>What's the 95% confidence interval for the true mean satisfaction?</li>
<li>If you wanted a margin of error of only ±0.2, how many users would you need to survey?</li>
</ol>
<p><strong>Try it yourself, then check the solution below!</strong></p>
<details>
<summary>Click for Solution</summary>
<pre><code class="language-python">import numpy as np
import plotly.graph_objects as go
from scipy import stats

# Given data
n = 40
sample_mean = 7.2
sample_std = 1.8

# 95% confidence interval
margin_of_error = stats.t.ppf(0.975, n-1) * (sample_std / np.sqrt(n))
ci_lower = sample_mean - margin_of_error
ci_upper = sample_mean + margin_of_error

print(f&quot;95% CI: [{ci_lower:.2f}, {ci_upper:.2f}]&quot;)

# For margin of error = 0.2
desired_margin = 0.2
z_score = 1.96  # for 95% confidence
required_n = ((z_score * sample_std) / desired_margin) ** 2

print(f&quot;Required sample size for ±0.2 margin: {int(np.ceil(required_n))} users&quot;)

# Visualization
fig = go.Figure()

# Current confidence interval
fig.add_shape(
    type=&quot;rect&quot;,
    x0=ci_lower, x1=ci_upper, y0=0, y1=1,
    fillcolor=&quot;lightblue&quot;, opacity=0.5,
    line=dict(color=&quot;blue&quot;, width=2)
)

# Desired confidence interval
desired_ci_lower = sample_mean - desired_margin
desired_ci_upper = sample_mean + desired_margin
fig.add_shape(
    type=&quot;rect&quot;,
    x0=desired_ci_lower, x1=desired_ci_upper, y0=1.2, y1=2.2,
    fillcolor=&quot;lightgreen&quot;, opacity=0.5,
    line=dict(color=&quot;green&quot;, width=2)
)

fig.add_vline(x=sample_mean, line=dict(color=&quot;red&quot;, width=3))

fig.add_annotation(x=sample_mean, y=0.5, text=f&quot;Current: n={n}&quot;, showarrow=False)
fig.add_annotation(x=sample_mean, y=1.7, text=f&quot;Desired: n={int(np.ceil(required_n))}&quot;, showarrow=False)

fig.update_layout(
    title=&quot;Sample Size vs. Precision Trade-off&quot;,
    xaxis_title=&quot;Satisfaction Score&quot;,
    yaxis_title=&quot;&quot;,
    height=300
)
fig.show()
</code></pre>
<p><strong>Answers:</strong></p>
<ol>
<li>Yes! <code>n = 40 &gt; 30</code>, so CLT applies</li>
<li>95% CI: [6.62, 7.78]</li>
<li>You'd need about 312 users for that precision!</li>
</ol>
<p><strong>Key Lesson</strong>: Precision is expensive! Going from ±0.58 to ±0.2 requires almost 8x more data.</p>
</details>
<h2>What CLT Doesn't Do (Important Limitations)</h2>
<p>While CLT is powerful, it's not magic. Here's what it can't help with:</p>
<h3>🚫 <strong>Biased Samples</strong></h3>
<p><strong>Problem</strong>: If your sample isn't representative, CLT won't fix that.<br>
<strong>Example</strong>: Surveying only iPhone users about phone preferences won't tell you about Android users!<br>
<strong>Solution</strong>: Focus on proper sampling methodology first.</p>
<h3>🚫 <strong>Very Small Samples</strong></h3>
<p><strong>Problem</strong>: CLT needs &quot;sufficiently large&quot; samples.<br>
<strong>Example</strong>: For very skewed data, <code>n = 5</code> won't cut it.<br>
<strong>Solution</strong>: Use bootstrap methods or exact distributions for small samples.</p>
<h3>🚫 <strong>Dependent Data</strong></h3>
<p><strong>Problem</strong>: CLT assumes independence.<br>
<strong>Example</strong>: Stock prices over time influence each other.<br>
<strong>Solution</strong>: Use time series analysis or account for correlation structure.</p>
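<p>To see why independence matters, here's a minimal sketch (the AR(1) process with coefficient 0.8 is an illustrative assumption): for autocorrelated data, the naive sigma/sqrt(n) standard error badly understates how much the sample mean actually varies.</p>

```python
import numpy as np

np.random.seed(0)

def ar1_series(n, phi=0.8):
    """Generate an AR(1) series: x[t] = phi * x[t-1] + noise[t]."""
    x = np.zeros(n)
    noise = np.random.normal(0, 1, n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

n = 200
n_series = 2000

# Empirical spread of the sample mean across many correlated series
means = np.array([np.mean(ar1_series(n)) for _ in range(n_series)])
empirical_se = np.std(means)

# Naive CLT-style standard error, pretending observations are independent
naive_se = np.std(ar1_series(n), ddof=1) / np.sqrt(n)

print(f"Naive SE (assumes independence): {naive_se:.3f}")
print(f"Actual SE of the sample mean:    {empirical_se:.3f}")
```

<p>With positive autocorrelation, the true standard error comes out roughly three times larger than the naive formula suggests, so confidence intervals built on independence would be far too narrow.</p>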
<h3>🚫 <strong>Infinite Variance</strong></h3>
<p><strong>Problem</strong>: Some theoretical distributions have infinite variance.<br>
<strong>Example</strong>: Cauchy distribution (rare in practice).<br>
<strong>Solution</strong>: Use robust statistics or different theoretical frameworks.</p>
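<p>A quick simulation makes the infinite-variance problem concrete (a sketch using NumPy's standard Cauchy sampler): averaging more Cauchy draws doesn't tighten the distribution of the mean at all, because the mean of n standard Cauchy variables is itself standard Cauchy.</p>

```python
import numpy as np

np.random.seed(42)

def iqr_of_sample_means(n, n_samples=5000):
    """Spread (IQR) of sample means for samples of size n from a standard Cauchy."""
    means = np.mean(np.random.standard_cauchy((n_samples, n)), axis=1)
    return np.percentile(means, 75) - np.percentile(means, 25)

# For a well-behaved distribution this spread shrinks like 1/sqrt(n).
# For the Cauchy it doesn't shrink at all: the IQR stays near 2 for any n.
for n in [10, 100, 1000]:
    print(f"n = {n:>4}: IQR of sample means ~ {iqr_of_sample_means(n):.2f}")
```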
<h2>Case Studies Across Industries</h2>
<h3>🏥 Medical Research: Drug Trial</h3>
<p><strong>Scenario</strong>: Testing a new blood pressure medication.</p>
<ul>
<li><strong>Population</strong>: All patients with hypertension</li>
<li><strong>Sample</strong>: 200 patients in clinical trial</li>
<li><strong>Measurement</strong>: Change in systolic blood pressure</li>
<li><strong>CLT Application</strong>: Confidence interval for mean improvement</li>
</ul>
<pre><code class="language-python"># Simulate drug trial data
np.random.seed(123)
bp_reduction = np.random.normal(12, 8, 200)  # Mean reduction: 12 mmHg

n = len(bp_reduction)
sample_mean = np.mean(bp_reduction)
sample_std = np.std(bp_reduction, ddof=1)

# 95% confidence interval
margin_of_error = stats.t.ppf(0.975, n-1) * (sample_std / np.sqrt(n))
ci_lower = sample_mean - margin_of_error
ci_upper = sample_mean + margin_of_error

print(f&quot;Drug Trial Results:&quot;)
print(f&quot;Mean BP reduction: {sample_mean:.1f} mmHg&quot;)
print(f&quot;95% CI: [{ci_lower:.1f}, {ci_upper:.1f}] mmHg&quot;)
print(f&quot;Significant improvement: {'✅ YES' if ci_lower &gt; 0 else '❌ NO'}&quot;)
</code></pre>
<h3>🗳️ Political Polling: Election Prediction</h3>
<p><strong>Scenario</strong>: Predicting election results.</p>
<ul>
<li><strong>Population</strong>: All eligible voters</li>
<li><strong>Sample</strong>: 1,000 survey respondents</li>
<li><strong>Measurement</strong>: Proportion supporting candidate A</li>
<li><strong>CLT Application</strong>: Confidence interval for vote share</li>
</ul>
<pre><code class="language-python"># Simulate polling data
np.random.seed(456)
support_rate = 0.52  # True support rate: 52%
poll_responses = np.random.binomial(1, support_rate, 1000)

n = len(poll_responses)
sample_prop = np.mean(poll_responses)
sample_std = np.sqrt(sample_prop * (1 - sample_prop))  # Std of a single Bernoulli (0/1) response

# 95% confidence interval for proportion
margin_of_error = 1.96 * (sample_std / np.sqrt(n))
ci_lower = sample_prop - margin_of_error
ci_upper = sample_prop + margin_of_error

print(f&quot;Polling Results:&quot;)
print(f&quot;Support rate: {sample_prop:.1%}&quot;)
print(f&quot;95% CI: [{ci_lower:.1%}, {ci_upper:.1%}]&quot;)
print(f&quot;Margin of error: ±{margin_of_error:.1%}&quot;)
</code></pre>
<h3>🌐 A/B Testing: Website Optimization</h3>
<p><strong>Scenario</strong>: Testing two website designs.</p>
<ul>
<li><strong>Population</strong>: All website visitors</li>
<li><strong>Sample</strong>: 5,000 visitors per variant</li>
<li><strong>Measurement</strong>: Conversion rate</li>
<li><strong>CLT Application</strong>: Compare confidence intervals</li>
</ul>
<pre><code class="language-python"># Simulate A/B test data
np.random.seed(789)
conversion_a = np.random.binomial(1, 0.08, 5000)  # Control: 8%
conversion_b = np.random.binomial(1, 0.095, 5000)  # Variant: 9.5%

def analyze_conversion(data, name):
    n = len(data)
    rate = np.mean(data)
    std = np.sqrt(rate * (1 - rate))
    margin = 1.96 * (std / np.sqrt(n))

    print(f&quot;{name}:&quot;)
    print(f&quot;  Conversion rate: {rate:.2%}&quot;)
    print(f&quot;  95% CI: [{rate-margin:.2%}, {rate+margin:.2%}]&quot;)
    return rate, margin

rate_a, margin_a = analyze_conversion(conversion_a, &quot;Control (A)&quot;)
rate_b, margin_b = analyze_conversion(conversion_b, &quot;Variant (B)&quot;)

# Test for a significant difference. Note: the margins above already include
# the 1.96 factor, so the CI half-width for the difference is their quadrature sum
diff = rate_b - rate_a
diff_margin = np.sqrt(margin_a**2 + margin_b**2)
significant = abs(diff) &gt; diff_margin

print(f&quot;\nDifference: {diff:.2%}&quot;)
print(f&quot;Statistically significant: {'✅ YES' if significant else '❌ NO'}&quot;)
</code></pre>
<h2>Advanced Topics: When CLT Gets Interesting</h2>
<h3>Sample Size Calculation</h3>
<p><strong>Question</strong>: How many samples do you need for a given precision?</p>
<p><strong>Formula</strong>:</p>
<p>$$<br>
n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2<br>
$$</p>
<p>Where:</p>
<ul>
<li>$z_{\alpha/2}$ = critical value (1.96 for 95% confidence)</li>
<li>$\sigma$ = population standard deviation (estimated)</li>
<li>$E$ = desired margin of error</li>
</ul>
<pre><code class="language-python">def calculate_sample_size(confidence_level, margin_of_error, std_dev):
    &quot;&quot;&quot;Calculate required sample size for given precision.&quot;&quot;&quot;
    alpha = 1 - confidence_level
    z_critical = stats.norm.ppf(1 - alpha/2)

    n = (z_critical * std_dev / margin_of_error) ** 2
    return int(np.ceil(n))

# Example: Customer satisfaction survey
required_n = calculate_sample_size(
    confidence_level=0.95,
    margin_of_error=0.2,
    std_dev=1.8
)

print(f&quot;Required sample size: {required_n}&quot;)
</code></pre>
<h3>Bootstrap vs. CLT</h3>
<p><strong>Bootstrap</strong>: A computer-intensive alternative that resamples your observed data instead of leaning on the CLT's normality result, which makes it handy for small or skewed samples.</p>
<pre><code class="language-python">def bootstrap_ci(data, n_bootstrap=10000, confidence_level=0.95):
    &quot;&quot;&quot;Calculate confidence interval using bootstrap.&quot;&quot;&quot;
    bootstrap_means = []
    n = len(data)

    for _ in range(n_bootstrap):
        bootstrap_sample = np.random.choice(data, size=n, replace=True)
        bootstrap_means.append(np.mean(bootstrap_sample))

    alpha = 1 - confidence_level
    lower_percentile = (alpha/2) * 100
    upper_percentile = (1 - alpha/2) * 100

    ci_lower = np.percentile(bootstrap_means, lower_percentile)
    ci_upper = np.percentile(bootstrap_means, upper_percentile)

    return ci_lower, ci_upper

# Compare CLT vs Bootstrap
small_sample = np.random.exponential(2, 15)  # Small, skewed sample

# CLT approach
clt_mean = np.mean(small_sample)
clt_std = np.std(small_sample, ddof=1)
clt_margin = stats.t.ppf(0.975, len(small_sample)-1) * (clt_std / np.sqrt(len(small_sample)))
clt_ci = (clt_mean - clt_margin, clt_mean + clt_margin)

# Bootstrap approach
bootstrap_ci_result = bootstrap_ci(small_sample)

print(f&quot;CLT CI: [{clt_ci[0]:.2f}, {clt_ci[1]:.2f}]&quot;)
print(f&quot;Bootstrap CI: [{bootstrap_ci_result[0]:.2f}, {bootstrap_ci_result[1]:.2f}]&quot;)
</code></pre>
<h2>Practice Problems with Solutions</h2>
<h3>Problem 1: Coffee Shop Revenue</h3>
<p><strong>Question</strong>: A coffee shop's daily revenue has a mean of $1,200 and standard deviation of $300. If you calculate the average revenue over 25 days, what's the probability this average exceeds $1,300?</p>
<details>
<summary>Solution</summary>
<pre><code class="language-python"># Given information
mu = 1200  # Population mean
sigma = 300  # Population std
n = 25  # Sample size
target = 1300  # Target value

# Sampling distribution parameters
sampling_mean = mu
sampling_std = sigma / np.sqrt(n)  # Standard error

# Calculate probability
z_score = (target - sampling_mean) / sampling_std
prob = 1 - stats.norm.cdf(z_score)

print(f&quot;Sampling distribution: N({sampling_mean}, {sampling_std:.1f})&quot;)
print(f&quot;Z-score: {z_score:.2f}&quot;)
print(f&quot;P(sample mean &gt; $1300) = {prob:.4f} or {prob:.2%}&quot;)
</code></pre>
<p><strong>Answer</strong>: About 4.78% chance</p>
</details>
<h3>Problem 2: Manufacturing Quality</h3>
<p><strong>Question</strong>: Light bulbs have lifespans with mean 1000 hours and standard deviation 200 hours. In a sample of 64 bulbs, what's the probability the sample mean is between 950 and 1050 hours?</p>
<details>
<summary>Solution</summary>
<pre><code class="language-python"># Given information
mu = 1000
sigma = 200
n = 64
lower_bound = 950
upper_bound = 1050

# Sampling distribution
sampling_std = sigma / np.sqrt(n)

# Calculate z-scores
z_lower = (lower_bound - mu) / sampling_std
z_upper = (upper_bound - mu) / sampling_std

# Calculate probability
prob = stats.norm.cdf(z_upper) - stats.norm.cdf(z_lower)

print(f&quot;Sampling distribution: N({mu}, {sampling_std:.1f})&quot;)
print(f&quot;Z-scores: {z_lower:.2f} to {z_upper:.2f}&quot;)
print(f&quot;P(950 &lt; sample mean &lt; 1050) = {prob:.4f} or {prob:.2%}&quot;)
</code></pre>
<p><strong>Answer</strong>: About 95.45% chance</p>
</details>
<h2>Key Takeaways for Practitioners</h2>
<p>🎯 <strong>CLT is your foundation</strong>: Most statistical inference relies on it</p>
<p>📊 <strong>Confidence intervals &gt; point estimates</strong>: Always quantify uncertainty</p>
<p>🔍 <strong>Sample size matters</strong>: But there are diminishing returns</p>
<p>⚠️ <strong>Check your assumptions</strong>: Independence, sufficient sample size, representative sampling</p>
<p>🛠️ <strong>Multiple tools available</strong>: CLT, bootstrap, exact methods - choose appropriately</p>
<p>💡 <strong>Context is king</strong>: Statistical significance ≠ practical significance</p>
<h2>What's Next?</h2>
<p>Now that you've mastered CLT applications, you're ready to explore:</p>
<ul>
<li><strong>Hypothesis Testing</strong>: Using CLT to test specific claims about populations</li>
<li><strong>Regression Analysis</strong>: How CLT underlies the assumptions in linear models</li>
<li><strong>ANOVA</strong>: Comparing multiple groups using CLT principles</li>
<li><strong>Bayesian Statistics</strong>: A different approach to uncertainty quantification</li>
</ul>
<h2>Further Reading</h2>
<ul>
<li><strong>Advanced</strong>: &quot;Mathematical Statistics with Applications&quot; by Wackerly, Mendenhall, and Scheaffer</li>
<li><strong>Practical</strong>: &quot;Practical Statistics for Data Scientists&quot; by Bruce &amp; Bruce</li>
<li><strong>Online</strong>: Duke's <a href="https://www.coursera.org/learn/inferential-statistics-intro">Inferential Statistics course</a> on Coursera</li>
<li><strong>Interactive</strong>: Try the examples in this article with your own data!</li>
</ul>
<hr>
<p><em>Remember: The Central Limit Theorem isn't just theory - it's the practical foundation for making confident decisions with data in the real world!</em> 🚀</p>
]]></content:encoded>
            <author>nikhil@nikhilvytla.com (Nikhil Vytla)</author>
        </item>
        <item>
            <title><![CDATA[Central Limit Theorem: Part 1]]></title>
            <link>https://nikhilvytla.com/posts/central-limit-theorem-p1</link>
            <guid isPermaLink="true">https://nikhilvytla.com/posts/central-limit-theorem-p1</guid>
            <pubDate>Sun, 07 Jan 2024 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<h2>Why Should You Care?</h2>
<p>Ever wonder how Netflix can recommend movies you'll love based on ratings from millions of users? Or how medical researchers can test a drug on 1,000 patients and confidently say it works for everyone? How can political polls survey just 1,000 people and predict the behavior of millions of voters?</p>
<p>The answer lies in one of statistics' most powerful and elegant theorems: the <strong>Central Limit Theorem</strong>. It's the mathematical reason why small samples can tell us big truths about the world.</p>
<h2>The Problem We're Solving</h2>
<p>Imagine you're trying to understand the average height of all adults in your country. Testing everyone would be impossible - that's millions of people! But somehow, measuring just a few hundred people can give you a remarkably accurate estimate.</p>
<p>This seems like magic, but it's actually math. The Central Limit Theorem explains why this &quot;averaging effect&quot; works for ANY type of data - not just heights, but everything from battery life to stock prices to exam scores.</p>
<h2>The Big Picture (No Math Yet!)</h2>
<p>Think of the CLT like a really good friend who always brings you back to normal after a chaotic day. Here's the intuitive idea:</p>
<p><strong>Individual data points are unpredictable</strong> → One coin flip, one test score, one battery life measurement<br>
<strong>But averages become predictable</strong> → Average of 100 coin flips, average test score of a class, average battery life of a batch</p>
<p>The CLT is like the Marvel multiverse - no matter how different the individual universes (data distributions), the overall story (sampling distribution) follows predictable patterns. Whether your original data is:</p>
<ul>
<li>Completely random and chaotic 🎲</li>
<li>Heavily skewed to one side 📈</li>
<li>Has multiple peaks like a camel's back 🐪</li>
</ul>
<p>...when you start taking averages of samples, something beautiful happens: those averages cluster around the true population mean in a perfect bell curve!</p>
<h3>A Simple Analogy</h3>
<p>Imagine you're at a carnival with a bunch of friends, and you all decide to play different games:</p>
<ul>
<li><strong>Alice</strong> plays ring toss (skill-based, consistent results)</li>
<li><strong>Bob</strong> plays the lottery wheel (completely random)</li>
<li><strong>Charlie</strong> plays basketball shots (mostly good, occasional bad shots)</li>
</ul>
<p>If you look at their individual game results, they're all over the place. But if you average their scores over many rounds, something magical happens - all three friends end up with averages that cluster around predictable values, and those averages follow a nice, normal bell curve pattern!</p>
<p>That's the CLT in action: <strong>chaos becomes order through averaging</strong>.</p>
<h2>What You Need to Know First</h2>
<p>Before we dive deeper, make sure you're comfortable with:</p>
<ul>
<li><strong>Mean (average)</strong>: Add up all numbers, divide by count</li>
<li><strong>Standard deviation</strong>: How spread out your data is</li>
<li><strong>Normal distribution</strong>: That classic bell curve shape</li>
<li><strong>Sampling</strong>: Taking a subset of a larger group</li>
</ul>
<p>Don't worry if you're rusty - the CLT is surprisingly intuitive once you see it in action!</p>
<h2>The Formal Definition</h2>
<p>Now for the mathematical beauty. The Central Limit Theorem states:</p>
<blockquote>
<p>For a sample size $n$ that is sufficiently large, the sampling distribution of the sample mean approaches a normal distribution, regardless of the shape of the original population distribution.</p>
</blockquote>
<p>In math notation:</p>
<p>$$<br>
\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0,1)<br>
$$</p>
<p><strong>Translation</strong>: No matter how weird your original data looks, if you take enough samples and calculate their averages, those averages will form a beautiful normal distribution centered on the true population mean.</p>
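<p>You can check this numerically with a small sketch (the exponential population and the specific sizes here are illustrative assumptions): standardize the means of many samples from a skewed distribution using the formula above and see whether they behave like $N(0,1)$.</p>

```python
import numpy as np

np.random.seed(42)

# Heavily skewed population: exponential with mean 2 (its std is also 2)
mu, sigma = 2.0, 2.0
n = 50             # size of each sample
n_samples = 10000  # number of repeated samples

# Draw many samples, compute each sample mean, then standardize
sample_means = np.mean(np.random.exponential(mu, (n_samples, n)), axis=1)
z = (sample_means - mu) / (sigma / np.sqrt(n))

# If the CLT holds, z should look like a standard normal: mean ~0, std ~1
print(f"Mean of z: {np.mean(z):.3f}  (expect ~0)")
print(f"Std of z:  {np.std(z):.3f}  (expect ~1)")
```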
<h2>The &quot;Magic&quot; Number 30</h2>
<p>Statisticians love to say $n \ge 30$ like it's some sacred commandment carved in statistical stone. It's not magic - it's just a decent rule of thumb for when the CLT starts working well.</p>
<p>(Though I admit, 30 does have a nice ring to it compared to $n \ge 27.3$...)</p>
<p>The truth is more nuanced:</p>
<ul>
<li><strong>Normal-ish data</strong>: CLT works with samples as small as 5-10</li>
<li><strong>Moderately skewed data</strong>: Usually need 15-30 samples</li>
<li><strong>Heavily skewed data</strong>: Might need 50+ samples</li>
<li><strong>Really bizarre distributions</strong>: Sometimes need 100+ samples</li>
</ul>
<p>Think of it like learning to ride a bike - some kids need training wheels longer than others, but eventually everyone gets there!</p>
<h2>Visual Proof: Seeing is Believing</h2>
<p>Let's watch the CLT work its magic. Here's a simple example with dice rolls:</p>
<pre><code class="language-python">import numpy as np
import matplotlib.pyplot as plt

# Set random seed for reproducibility
np.random.seed(42)

# Simulate rolling a die (uniform distribution, definitely not normal!)
die_rolls = np.random.randint(1, 7, 10000)

# Now let's see what happens when we average different numbers of rolls
sample_sizes = [1, 2, 5, 30]
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
axes = axes.flatten()

for i, n in enumerate(sample_sizes):
    # Generate 1000 sample means
    sample_means = []
    for _ in range(1000):
        sample = np.random.choice(die_rolls, n)
        sample_means.append(np.mean(sample))

    # Plot histogram
    axes[i].hist(sample_means, bins=30, alpha=0.7, density=True)
    axes[i].set_title(f'Sample Size n = {n}')
    axes[i].set_xlabel('Sample Mean')
    axes[i].set_ylabel('Density')

fig.suptitle('Central Limit Theorem: From Uniform Die Rolls to Normal Averages')
plt.tight_layout(rect=[0, 0, 1, 0.96])  # leave room for the suptitle
plt.show()
</code></pre>
<p><strong>What to Notice:</strong></p>
<ul>
<li><strong>n=1</strong>: Flat distribution (just like individual die rolls)</li>
<li><strong>n=2</strong>: Starting to peak in the middle</li>
<li><strong>n=5</strong>: Looking more bell-shaped</li>
<li><strong>n=30</strong>: Beautiful normal distribution! 🎉</li>
</ul>
<p>The original die rolls were completely uniform (flat), but the averages became normal. That's the CLT magic!</p>
<h2>Common Misconceptions (Don't Fall for These!)</h2>
<h3>❌ &quot;CLT makes the original data normal&quot;</h3>
<p><strong>Wrong!</strong> CLT makes the <em>sample means</em> normal, not the original data. Your individual data points can still be as weird as they want.</p>
<p><strong>Think of it like this</strong>: Individual people at a party might be introverts, extroverts, or somewhere in between. But if you average the &quot;social energy&quot; of random groups at the party, those group averages will be remarkably consistent and normally distributed.</p>
<h3>❌ &quot;You need normal data to start with&quot;</h3>
<p><strong>Nope!</strong> That's the whole point - CLT works with ANY distribution. Exponential, uniform, bimodal, you name it.</p>
<h3>❌ &quot;Bigger samples are always better&quot;</h3>
<p><strong>Not necessarily!</strong> Due to the $\sqrt{n}$ in the denominator, going from 100 to 400 samples only doubles your precision. There are diminishing returns, and sometimes the cost isn't worth it.</p>
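<p>Here's a tiny sketch of that diminishing return (sigma = 10 is an arbitrary illustrative value): each quadrupling of the sample size only halves the margin of error.</p>

```python
import numpy as np

sigma = 10.0  # assumed population standard deviation (illustrative)
z = 1.96      # critical value for 95% confidence

# Margin of error shrinks with sqrt(n): 4x the data buys only 2x the precision
margins = {n: z * sigma / np.sqrt(n) for n in [100, 400, 1600]}
for n, m in margins.items():
    print(f"n = {n:>4}: margin of error = +/-{m:.2f}")
```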
<h3>❌ &quot;CLT works with any sample size&quot;</h3>
<p><strong>Be careful!</strong> Very small samples ($n &lt; 10$) from skewed distributions won't work well. Always check your assumptions!</p>
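<p>One way to see this concretely (a sketch using simulated exponential data, chosen for its skew): measure how often nominal 95% t-intervals actually contain the true mean at different sample sizes. With tiny skewed samples, coverage falls noticeably short of the advertised 95%.</p>

```python
import numpy as np
from scipy import stats

np.random.seed(7)

def t_interval_coverage(n, n_sims=5000, mu=2.0):
    """Fraction of nominal 95% t-intervals that contain the true mean."""
    hits = 0
    for _ in range(n_sims):
        sample = np.random.exponential(mu, n)  # skewed population, true mean mu
        se = np.std(sample, ddof=1) / np.sqrt(n)
        margin = stats.t.ppf(0.975, n - 1) * se
        if abs(np.mean(sample) - mu) < margin:
            hits += 1
    return hits / n_sims

print(f"Coverage at n = 5:   {t_interval_coverage(5):.1%}")    # well below 95%
print(f"Coverage at n = 100: {t_interval_coverage(100):.1%}")  # close to 95%
```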
<h2>Why This Matters: The Big Picture</h2>
<p>The CLT is the foundation that makes modern statistics possible. It explains:</p>
<ul>
<li><strong>Why polls work</strong>: 1,000 people can represent millions</li>
<li><strong>Why quality control works</strong>: Test a few products, understand the whole batch</li>
<li><strong>Why medical trials work</strong>: Study some patients, help everyone</li>
<li><strong>Why A/B testing works</strong>: Test with some users, apply to all users</li>
</ul>
<p>Without the CLT, we'd be stuck testing everything and everyone. It's the mathematical principle that lets us make confident decisions with incomplete information.</p>
<h2>A Quick Reality Check</h2>
<p>Here's what the CLT is really saying:</p>
<blockquote>
<p>&quot;Hey, I know your data is messy and unpredictable. But if you take enough samples and average them, I promise those averages will behave nicely and predictably. Trust me on this one!&quot;</p>
</blockquote>
<p>And remarkably, math keeps this promise every single time.</p>
<h2>Key Takeaways</h2>
<p>🎯 <strong>The Big Idea</strong>: Sample means become normal, regardless of the original data distribution</p>
<p>📏 <strong>The Rule</strong>: Generally need <code>n ≥ 30</code>, but depends on how skewed your data is</p>
<p>🔍 <strong>The Power</strong>: Allows us to make confident statements about populations using small samples</p>
<p>⚠️ <strong>The Catch</strong>: Doesn't fix biased sampling, needs independence, requires sufficient sample size</p>
<p>🛠️ <strong>The Applications</strong>: Everywhere! Quality control, medical research, polling, A/B testing, finance</p>
<h2>What's Next?</h2>
<p>Now that you understand the intuition behind CLT, you're ready to see it in action!</p>
<p><strong>Ready to put CLT to work?</strong> Explore some real world applications of CLT in <a href="/posts/central-limit-theorem-p2">Part 2</a> where we'll dive deep into:</p>
<ul>
<li>🏭 <strong>Complete case study</strong>: Battery factory quality control</li>
<li>📊 <strong>Confidence intervals</strong>: Your statistical superpower</li>
<li>🧮 <strong>Interactive challenges</strong>: Test your understanding</li>
<li>⚠️ <strong>Limitations</strong>: When CLT doesn't work</li>
<li>💻 <strong>Full code examples</strong>: Ready-to-run implementations</li>
</ul>
<p>The CLT isn't just a mathematical curiosity - it's the foundation that makes data-driven decisions possible in an uncertain world!</p>
<h2>Further Reading</h2>
<ul>
<li><strong>Books</strong>: &quot;The Signal and the Noise&quot; by Nate Silver (great for intuition)</li>
<li><strong>Online</strong>: Khan Academy's Statistics course</li>
<li><strong>Interactive</strong>: Play with CLT simulations at <a href="https://seeing-theory.brown.edu">Seeing Theory</a></li>
</ul>
<hr>
<p><em>Remember: The Central Limit Theorem is your friend who brings order to chaos. Trust in the power of averaging!</em> 🚀</p>
]]></content:encoded>
            <author>nikhil@nikhilvytla.com (Nikhil Vytla)</author>
        </item>
        <item>
            <title><![CDATA[Supporting OSS]]></title>
            <link>https://nikhilvytla.com/posts/supporting-oss</link>
            <guid isPermaLink="true">https://nikhilvytla.com/posts/supporting-oss</guid>
            <pubDate>Wed, 31 May 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[Open-source projects form the backbone of the internet, and it feels more important than ever before to support them.]]></description>
            <content:encoded><![CDATA[<p>[[toc]]</p>
<p>Every day, billions of people rely on software that runs the modern world. From the web browsers displaying this text to the servers hosting it, from the networks routing data packets to the cryptographic libraries securing transactions, nearly every digital interaction depends on open-source software (OSS).</p>
<p>Supporting open-source projects can happen in a few ways. Contributing code, fixing bugs, and discussing feature requests are great ways for folks to get involved, especially budding new software engineers.</p>
<p>I'd also like to bring attention to a less visible but equally important facet of support: <strong>funding</strong>. I've had the pleasure of contributing to and maintaining open-source libraries over the years, and while I can't speak for everyone, I believe that most open-source maintainers do it for the <a href="https://opensource.org/about#:~:text=source%20ecosystem%20thrives.-,Mission,-The%20Open%20Source">mission</a>, not for the money.</p>
<p>That being said, most projects do operate on financial life support, maintained by volunteers or small teams that still need to keep the lights on and make a living so that they can (hopefully) spend more time maintaining and building their projects. Money doesn't grow on trees, and it sure doesn't grow in GitHub issue comments.</p>
<h2>Underappreciated Foundations</h2>
<p>I'm not exactly sure why, but open-source software seems to operate on a paradox: the more critical a project becomes, the more invisible it often is to end users.</p>
<p>Consider these statistics:</p>
<ul>
<li><strong>SQLite</strong> is embedded in every iPhone, Android device, and web browser, yet runs on donations</li>
<li><strong>OpenSSL</strong> secures nearly every HTTPS connection on the internet with a team smaller than most coffee shops</li>
<li><strong>curl</strong> powers data transfer for billions of applications but relies entirely on volunteer contributions</li>
<li><strong>FFmpeg</strong> processes virtually every video you watch online, and is maintained by a handful of developers</li>
</ul>
<p>There's a famous <a href="https://www.explainxkcd.com/wiki/index.php/2347:_Dependency">XKCD comic</a> that illustrates this:</p>
<p><img src="https://www.explainxkcd.com/wiki/images/d/d7/dependency.png" alt="XKCD #2347: Dependency" title="XKCD #2347: Dependency"></p>
<p>A brief aside on types of open-source projects:</p>
<p>Some projects are more &quot;user-facing&quot;. These projects are directly interacted with by developers and users daily, and benefit from large community forums and active discussions.</p>
<p>Others are &quot;below-the-surface&quot;. These are projects that most devs depend on indirectly without ever realizing it!</p>
<p>In a similar vein to project/code visibility, some project maintainers and communities are more vocal about needing funding/financial support, and others (for a variety of reasons) are perhaps less so.</p>
<p>Regardless of their levels of visibility, at the end of the day, both types of projects have three things in common:</p>
<ol>
<li>they are vital to the OSS ecosystem,</li>
<li>they rely on small, dedicated teams of volunteers to continue building safer, scalable, more robust, and frankly kick-ass software that benefits all of us, and</li>
<li>they deserve our support. 💪</li>
</ol>
<h2>Jenga Blocks All the Way Down</h2>
<p>Companies (many of them <a href="https://netflixtechblog.com/why-we-use-and-contribute-to-open-source-software-1faa77c2e5c4">household names</a>) generate trillions in revenue using free open-source software, yet most contribute only a fraction back to the projects they depend on. In today's chronically online world, this funding imbalance undermines both our technical capabilities and our economic strength.</p>
<p>In short:</p>
<ol>
<li>
<p>Without sustainable funding, critical projects depend on the goodwill of overworked volunteers who often burn out, expanding the attack surface through unpatched security holes and mounting technical debt.</p>
</li>
<li>
<p>Underfunded projects lack resources for proper security audits, testing infrastructure, and timely vulnerability responses.</p>
</li>
<li>
<p>Without funding for new features and optimizations, critical infrastructure stagnates, limiting technological innovation across the entire ecosystem.</p>
</li>
</ol>
<p>This results in:</p>
<ul>
<li><strong>Large positive externalities</strong>: One open-source project can create billions in economic value</li>
<li><strong>Tragedy of the commons</strong>: Everyone benefits from open-source but few contribute to maintenance</li>
<li><strong>Single points of failure</strong>: Key maintainers become irreplaceable bottlenecks</li>
</ul>
<p>Some innovative organizations have begun addressing this. <a href="https://eslint.org/blog/2022/02/paying-contributors-sponsoring-projects/">ESLint forwards sponsorships to dependencies</a>, and the <a href="https://openssf.org/">Open Source Security Foundation</a> coordinates security improvements. These efforts are promising, but they're only the start!</p>
<h2>A List of Projects to Consider Supporting</h2>
<p>The following list contains examples of projects that form different parts of the backbone of our digital infrastructure and the open internet. Note that this is a non-exhaustive list.</p>
<h3>Core Infrastructure &amp; Cryptography</h3>
<p><strong><a href="https://www.openssl.org/">OpenSSL</a></strong></p>
<ul>
<li><strong>What it does</strong>: Provides cryptographic functionality for virtually every secure web connection</li>
<li><strong>Impact</strong>: Secures trillions of dollars in online transactions daily</li>
<li><strong>Funding</strong>: <a href="https://github.com/sponsors/openssl">GitHub Sponsors</a> | <a href="https://openssl-foundation.org/donate/">OpenSSL Foundation</a></li>
</ul>
<p><strong><a href="https://curl.se/">curl</a></strong></p>
<ul>
<li><strong>What it does</strong>: Command-line tool and library for transferring data with URLs</li>
<li><strong>Impact</strong>: Used by virtually every programming language and application for HTTP requests</li>
<li><strong>Funding</strong>: <a href="https://github.com/sponsors/bagder">GitHub Sponsors</a> | <a href="https://opencollective.com/curl">Open Collective</a></li>
</ul>
<h3>Internet Infrastructure</h3>
<p><strong><a href="https://letsencrypt.org/">Let's Encrypt</a></strong></p>
<ul>
<li><strong>What it does</strong>: Free, automated certificate authority providing TLS certificates</li>
<li><strong>Impact</strong>: Enabled HTTPS for 95% of all websites, securing billions of connections</li>
<li><strong>Funding</strong>: <a href="https://letsencrypt.org/donate/">Donations</a> | <a href="https://letsencrypt.org/become-a-sponsor/">Corporate sponsorship</a></li>
</ul>
<p><strong><a href="https://wikimediafoundation.org/">Wikimedia Foundation</a></strong></p>
<ul>
<li><strong>What it does</strong>: Operates Wikipedia and related knowledge projects</li>
<li><strong>Impact</strong>: Provides free access to human knowledge for billions of people</li>
<li><strong>Funding</strong>: <a href="https://donate.wikimedia.org/">Individual donations</a> | Corporate partnerships</li>
</ul>
<p><strong><a href="https://archive.org/">Internet Archive</a></strong></p>
<ul>
<li><strong>What it does</strong>: Digital library preserving websites, books, movies, music, and software</li>
<li><strong>Impact</strong>: Preserves digital heritage and provides free access to historical information</li>
<li><strong>Funding</strong>: <a href="https://archive.org/donate/">Donations</a> | Grants</li>
</ul>
<h3>Development Tools &amp; Libraries</h3>
<p><strong><a href="https://ffmpeg.org/">FFmpeg</a></strong></p>
<ul>
<li><strong>What it does</strong>: Multimedia framework for recording, converting, and streaming audio/video</li>
<li><strong>Impact</strong>: Powers virtually every video platform, from YouTube to Netflix</li>
<li><strong>Funding</strong>: <a href="https://ffmpeg.org/donations.html">Donations</a></li>
</ul>
<h3>Programming Language Foundations</h3>
<p><strong><a href="https://www.python.org/psf/">Python Software Foundation</a></strong></p>
<ul>
<li><strong>What it does</strong>: Supports development of the Python programming language</li>
<li><strong>Impact</strong>: Python powers AI/ML, web development, and scientific computing globally</li>
<li><strong>Funding</strong>: <a href="https://www.python.org/psf/donations/">Donations</a> | <a href="https://www.python.org/psf/sponsorship/">Corporate sponsorship</a></li>
</ul>
<h2>The Path Forward</h2>
<p>Broadly, I believe it's important to recognize that open-source infrastructure is a public good that deserves to be invested in, just like roads, bridges, and utilities.</p>
<p>It's equally important to show appreciation for OSS, to contribute and maintain code, and, if you're able, to donate!</p>
<p>For the record, I'm not proposing a silver bullet, and I don't believe it falls solely on the general population to &quot;foot the bill&quot;. Large companies, especially those that benefit the most from OSS, need to step up to the plate.</p>
<p>Just like how <a href="https://ssir.org/articles/entry/the_curb_cut_effect">curb cuts, initially designed for wheelchair users, ended up benefiting all of society</a>, the impacts of open-source projects benefit diverse, interdisciplinary communities around the world! ◡̈</p>
<hr>
<p><em>The projects listed above represent just the tip of the iceberg. If you have another project you'd like for me to feature on this list, please let me know!</em></p>
]]></content:encoded>
            <author>nikhil@nikhilvytla.com (Nikhil Vytla)</author>
        </item>
        <item>
            <title><![CDATA[Caramel Churro Chex Mix]]></title>
            <link>https://nikhilvytla.com/posts/caramel-churro-chex-mix</link>
            <guid isPermaLink="true">https://nikhilvytla.com/posts/caramel-churro-chex-mix</guid>
            <pubDate>Sat, 15 Jan 2022 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<figure>
  <img src="https://nikhilvytla.com/images/caramel-churro-chex-mix.jpg" alt="Caramel Churro Chex Mix" />
  <figcaption>A bowl filled with yummy, crunchy, delicious Caramel Churro Chex Mix!</figcaption>
</figure>
<h3>Prep Time</h3>
<p>15 min</p>
<h3>Cook Time</h3>
<p>7 min</p>
<h2>Shopping List</h2>
<ul>
<li>4 1/2 c Rice Chex™ cereal</li>
<li>4 1/2 c Corn Chex™ cereal</li>
<li>1 c packed brown sugar</li>
<li>1/2 c (1 stick) salted butter</li>
<li>1/4 c light corn syrup</li>
<li>1/4 tsp baking soda</li>
<li>2/3 c granulated sugar</li>
<li>2 tsp ground cinnamon</li>
</ul>
<h2>Instructions</h2>
<ol>
<li>Preheat oven to 350°F. Place cereal in a large, heat-safe bowl. Line a large baking sheet with foil and spray with nonstick baking spray.</li>
<li>In a small bowl, combine cinnamon and sugar and set aside.</li>
<li>In a heavy saucepan over medium heat, heat the brown sugar, butter, and corn syrup until the mixture comes to a boil. Let boil for one minute, stirring constantly, then remove from heat and stir in the baking soda. Pour over the cereal and stir until the cereal is evenly coated.</li>
<li>Spread cereal on prepared baking sheet. Sprinkle evenly with cinnamon/sugar mixture. Bake for about 5 minutes, then flip with a spatula and bake for 3 more minutes, until cereal turns golden brown. Remove from oven and let cool completely, then break up and store in an airtight container.</li>
</ol>
]]></content:encoded>
            <author>nikhil@nikhilvytla.com (Nikhil Vytla)</author>
        </item>
    </channel>
</rss>