<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Interview on Neil's Space</title><link>https://neilmin.com/tags/interview/</link><description>Recent content in Interview on Neil's Space</description><image><title>Neil's Space</title><url>https://neilmin.com/images/papermod-cover.png</url><link>https://neilmin.com/images/papermod-cover.png</link></image><generator>Hugo</generator><language>en-US</language><lastBuildDate>Sat, 13 Jun 2026 07:00:00 -0700</lastBuildDate><atom:link href="https://neilmin.com/tags/interview/index.xml" rel="self" type="application/rss+xml"/><item><title>How RocksDB Works: A Minimal LSM-Tree Primer</title><link>https://neilmin.com/posts/how-rocksdb-works/</link><pubDate>Sat, 13 Jun 2026 07:00:00 -0700</pubDate><guid>https://neilmin.com/posts/how-rocksdb-works/</guid><description>I spent some time really learning how RocksDB works while prepping for interviews, and these are my notes: what RocksDB is, how data gets written and read, what compaction does in the background, and the unavoidable trade-off between the three amplification factors. Not exhaustive — just the core LSM-tree ideas, shared for anyone else trying to get it.</description><content:encoded><![CDATA[<p>While prepping for interviews, I spent some time really digging into how RocksDB works — how its storage engine is designed, how data gets written, and how it gets read back. RocksDB (and the LSM-tree underneath it) is one of those things a lot of people have heard of but can&rsquo;t quite explain — I couldn&rsquo;t either, before I sat down with it. Once it clicked, I wrote up the core ideas as these notes, to share with anyone else trying to get it.</p>
<p>I won&rsquo;t claim this is exhaustive or deeply expert, but I hope it leaves you (and future me) with a clear overall picture of how RocksDB actually works.</p>
<h2 id="what-rocksdb-is">What RocksDB is</h2>
<p>In one line: <strong>an embeddable, persistent key-value store</strong>.</p>
<ul>
<li><strong>Embeddable</strong>: it isn&rsquo;t a standalone server like MySQL — it&rsquo;s a library you compile directly into your program, which cuts out inter-process communication overhead.</li>
<li><strong>Persistent</strong>: data lives on disk and survives restarts; after a crash, writes that reached the write-ahead log are recovered.</li>
<li>Forked from Google&rsquo;s <strong>LevelDB</strong> in 2012, written in C++, optimized specifically for <strong>SSDs</strong> and <strong>write-heavy</strong> workloads. Meta, Microsoft, Netflix, and Uber all use it.</li>
<li>It is <strong>not distributed</strong> — replication and sharding are your job at a higher layer.</li>
</ul>
<p>The operations it exposes are humble: <code>put(key, value)</code> to write, <code>get(key)</code> to read, <code>delete(key)</code> to remove, <code>merge(key, value)</code> to combine, and <code>iterator.seek()</code> for range scans.</p>
<h2 id="the-core-idea-the-lsm-tree">The core idea: the LSM-tree</h2>
<p>Everything in RocksDB is built on the <strong>LSM-tree (Log-Structured Merge-Tree)</strong>.</p>
<p>The core tension it tackles: <strong>disks hate random writes and love sequential ones</strong>. The LSM-tree&rsquo;s trick is to buffer writes in memory, keep them sorted, then flush them to disk sequentially all at once. In other words, it <strong>batches a flood of random writes into sequential writes</strong> — and that&rsquo;s the fundamental reason it writes so fast.</p>
<p>Structurally, data is split across many levels: the top level lives in memory, and below it sit level after level on disk, numbered L0, L1, L2… The deeper you go, the older and larger the data (each level is typically ~10× the one above it).</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>memory   ┌──────────────────────────────┐
</span></span><span style="display:flex;"><span>         │  MemTable (writable, sorted)  │  ← new data lands here first
</span></span><span style="display:flex;"><span>         └──────────────────────────────┘
</span></span><span style="display:flex;"><span>- - - - - - - - - - - - - - - - - - - - - -  flush
</span></span><span style="display:flex;"><span>disk     L0   [SST] [SST] [SST]      ← newest; key ranges may overlap across files
</span></span><span style="display:flex;"><span>         L1   [SST][SST][SST][SST]   ← no overlap within a level, and bigger
</span></span><span style="display:flex;"><span>         L2   [SST][SST] ......      ← older and larger the deeper you go (~×10)
</span></span><span style="display:flex;"><span>         ...
</span></span></code></pre></div><p>This structure dates back to 1996 and was designed for write-intensive workloads. Besides RocksDB, Bigtable, HBase, and Cassandra are LSM-tree based, and MongoDB&rsquo;s WiredTiger engine offers an LSM mode alongside its default B-tree.</p>
<h2 id="writing-how-data-gets-in">Writing: how data gets in</h2>
<p>A single write lands in <strong>two</strong> places at once:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>put(key, value)
</span></span><span style="display:flex;"><span>      │
</span></span><span style="display:flex;"><span>      ├──► WAL       (appended sequentially to disk, for crash safety)
</span></span><span style="display:flex;"><span>      │
</span></span><span style="display:flex;"><span>      └──► MemTable  (kept sorted in memory)
</span></span><span style="display:flex;"><span>                  │  fills up at ~64MB
</span></span><span style="display:flex;"><span>                  ▼
</span></span><span style="display:flex;"><span>            turns read-only; a background thread flushes it to one SST file → L0
</span></span></code></pre></div><p><strong>MemTable</strong>: the in-memory write buffer where every insert, update, and delete goes first. It&rsquo;s kept <strong>sorted by key</strong> internally (the default implementation is a <strong>skip list</strong>), which is what makes the later flush and range queries efficient. One detail: a delete doesn&rsquo;t actually erase anything — it writes a <strong>tombstone</strong> record meaning &ldquo;this key is deleted.&rdquo; The real cleanup is left to compaction later.</p>
<p><strong>WAL (Write-Ahead Log)</strong>: the MemTable is in memory, so a power loss would wipe it. To guard against that, every write also <strong>appends</strong> a record to a WAL file on disk: key, value, operation type, and a checksum. After a crash, RocksDB replays the WAL to reconstruct the MemTable. Note the WAL is <strong>appended in write order, not sorted</strong>, because it is optimizing purely for speed.</p>
<p><strong>Flush</strong>: once a MemTable fills up, it turns read-only and a fresh one takes over; a background thread then flushes the read-only MemTable into a single <strong>SST file</strong> on L0. Once that&rsquo;s done, the corresponding WAL can be discarded. Because the MemTable was already sorted, this flush is one <strong>sequential write</strong> — which is the whole point of the LSM-tree.</p>
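<p>The write path above can be sketched in a few lines of Python. This is a toy model, not RocksDB&rsquo;s API: <code>ToyWritePath</code>, <code>TOMBSTONE</code>, and <code>memtable_limit</code> are names made up for illustration, a plain dict stands in for the skip list, in-memory lists stand in for the WAL file and the SST files, and the flush runs inline instead of on a background thread.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">TOMBSTONE = object()  # sentinel meaning "this key is deleted"


class ToyWritePath:
    """Toy model of WAL + MemTable + flush (hypothetical, not the RocksDB API)."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit  # entry count; RocksDB sizes by bytes
        self.wal = []        # stand-in for the append-only log file
        self.memtable = {}   # stand-in for the sorted skip list
        self.l0 = []         # flushed "SST files", oldest first

    def put(self, key, value):
        self._write("put", key, value)

    def delete(self, key):
        # A delete is just another write: it records a tombstone.
        self._write("delete", key, TOMBSTONE)

    def _write(self, op, key, value):
        self.wal.append((op, key, value))  # 1. append to the log, in arrival order
        self.memtable[key] = value         # 2. update the in-memory buffer
        if len(self.memtable) == self.memtable_limit:
            self._flush()

    def _flush(self):
        # Emit the buffer in key order as one immutable run, then drop the log.
        self.l0.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.wal = []


db = ToyWritePath(memtable_limit=3)
db.put("b", 2)
db.put("a", 1)
db.delete("b")   # overwrites b in the MemTable with a tombstone
db.put("c", 3)   # third distinct key triggers the flush
print(len(db.l0), len(db.wal))
</code></pre></div>
<p>After the last <code>put</code>, the single L0 run holds <code>a</code>, the tombstone for <code>b</code>, and <code>c</code> in key order, and the log is empty because everything it protected is now in that run.</p>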
<h2 id="what-an-sst-file-looks-like">What an SST file looks like</h2>
<p>An <strong>SST (Static Sorted Table)</strong> is the file that actually holds data on disk, and it&rsquo;s never modified once written. Inside is a pile of <strong>sorted key-value pairs</strong>, laid out in a carefully designed block format (blocks default to 4KB and can be compressed with Snappy, LZ4, ZSTD, etc.).</p>
<p>An SST is roughly split into a few sections:</p>
<ul>
<li><strong>Data blocks</strong>: the sorted key-value pairs. Since adjacent keys are similar, only the differences need to be stored (delta encoding) to save space.</li>
<li><strong>Index</strong>: records, for each data block, &ldquo;last key → offset in the file,&rdquo; so a lookup can <strong>binary-search</strong> straight to the right block instead of scanning the whole file.</li>
<li><strong>Bloom filter (optional)</strong>: a probabilistic structure that very quickly answers &ldquo;this key is <strong>definitely not</strong> in this file.&rdquo; It may give a false &ldquo;yes,&rdquo; but never a false &ldquo;no&rdquo; — perfect for skipping, on a read, a whole batch of files you don&rsquo;t need to touch.</li>
</ul>
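<p>To make the &ldquo;false yes, never a false no&rdquo; behaviour concrete, here is a minimal Bloom filter sketch. It is an illustration only: <code>ToyBloomFilter</code> is a made-up name, the sizes are arbitrary, and salting SHA-256 is just a convenient way to get several hash positions, not what RocksDB does internally.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import hashlib


class ToyBloomFilter:
    """Minimal Bloom filter (illustrative sizes, hypothetical class)."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means definitely absent; True means "maybe, go check the file".
        return all(self.bits[pos] for pos in self._positions(key))


bloom = ToyBloomFilter()
for key in ("apple", "banana", "cherry"):
    bloom.add(key)

print(bloom.might_contain("banana"))  # an added key always answers True
print(bloom.might_contain("durian"))  # usually False, but True is possible
</code></pre></div>
<p>Every added key sets all of its bits, so a lookup for it can never come back False. A key that was never added comes back True only if other keys happened to set all of its positions, which is the false positive.</p>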
<h2 id="reading-how-data-gets-found">Reading: how data gets found</h2>
<p>To read a key, you search <strong>newest to oldest</strong>, level by level — newer values sit higher, older ones lower, so the first hit is the latest value:</p>
<ol>
<li>Check the active MemTable;</li>
<li>Then the read-only MemTables not yet flushed;</li>
<li>Then each SST file in L0 (L0 files can overlap in key range, so you have to check them one by one, newest to oldest);</li>
<li>From L1 down, each level has non-overlapping key ranges, so you only need to <strong>locate and check one file per level</strong>.</li>
</ol>
<p>And within a <strong>single SST file</strong>, it&rsquo;s again three steps: first ask the <strong>Bloom filter</strong> whether the key is present — if not, skip the file entirely; if so, use the <strong>index</strong> to binary-search to the right data block; finally read that block and find the key inside it.</p>
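<p>Put together, the lookup order might be sketched like this. It is a simplified model under stated assumptions: MemTables are dicts, each SST file is a sorted list of <code>(key, value)</code> pairs, a plain binary search stands in for the index, and the Bloom filter step is left out. The function names and <code>TOMBSTONE</code> are hypothetical.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import bisect

TOMBSTONE = object()  # sentinel meaning "this key is deleted"


def resolve(value):
    # A tombstone is a hit that says "deleted", so the search stops with None.
    return None if value is TOMBSTONE else value


def search_sorted_run(run, key):
    """Binary-search one sorted list of (key, value) pairs."""
    # (key,) sorts just before any (key, value), so this lands on key if present.
    i = bisect.bisect_left(run, (key,))
    if i != len(run) and run[i][0] == key:
        return True, run[i][1]
    return False, None


def get(key, memtables, l0_files, deeper_levels):
    """memtables and l0_files are ordered newest first."""
    for table in memtables:
        if key in table:
            return resolve(table[key])
    for run in l0_files:            # L0 files may overlap: check each one
        found, value = search_sorted_run(run, key)
        if found:
            return resolve(value)
    for level in deeper_levels:     # L1 and below: at most one candidate file
        last_keys = [run[-1][0] for run in level]
        i = bisect.bisect_left(last_keys, key)
        if i != len(level):
            found, value = search_sorted_run(level[i], key)
            if found:
                return resolve(value)
    return None


memtables = [{"a": 10}, {"b": TOMBSTONE}]     # active, then read-only
l0_files = [[("b", 2), ("c", 3)], [("a", 1)]]
deeper_levels = [[[("d", 4), ("e", 5)], [("x", 9)]]]

print(get("a", memtables, l0_files, deeper_levels))  # 10, newest value wins
print(get("b", memtables, l0_files, deeper_levels))  # None, tombstone shadows L0
print(get("e", memtables, l0_files, deeper_levels))  # 5, found in L1
</code></pre></div>
<p>Note how the search for <code>b</code> stops at the tombstone in the read-only MemTable and never reaches the older value sitting in L0.</p>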
<p>So the cost of a read comes down to how many levels and files you have to wade through — which leads straight into the next section.</p>
<h2 id="compaction-the-background-cleanup-that-never-stops">Compaction: the background cleanup that never stops</h2>
<p>As noted, a delete just writes a tombstone, and an update just writes a new value on top of the old one. Over time, the disk fills up with <strong>stale old versions and tombstones</strong>: they waste space <em>and</em> force reads to wade through more files.</p>
<p><strong>Compaction</strong> is the background job that cleans this up: it takes some SST files from one level, merges them with the overlapping files in the next level, <strong>throws away the shadowed old values and deleted keys</strong>, and writes fresh, clean SSTs into the lower level. Since every file is already sorted, the merge uses a <strong>k-way merge</strong> — a scaled-up version of the &ldquo;merge&rdquo; step in merge sort. It all runs on background threads, so it doesn&rsquo;t block foreground reads and writes.</p>
<p>RocksDB defaults to <strong>leveled compaction</strong>:</p>
<ul>
<li><strong>L0</strong> is special: its files <strong>may overlap</strong> in key range (since they&rsquo;re flushed straight from MemTables); compaction triggers once the L0 file count hits a threshold (4 by default).</li>
<li><strong>L1 and below</strong>: within each level, all files have <strong>non-overlapping</strong> key ranges and are globally ordered; when a level&rsquo;s total size exceeds its target, the excess is merged down into the next level — sometimes cascading down several levels in a chain.</li>
</ul>
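<p>The merge step at the heart of compaction can be sketched with <code>heapq.merge</code> from the standard library. This is an illustration, not RocksDB&rsquo;s implementation: <code>compact</code> and <code>TOMBSTONE</code> are made-up names, each input run is a sorted list of <code>(key, value)</code> pairs, and runs are passed newest first. One detail the sketch makes explicit is that a tombstone can only be dropped safely when no older level could still hold the key, so that choice is a flag here.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import heapq
from itertools import groupby

TOMBSTONE = object()  # sentinel meaning "this key is deleted"


def compact(runs, drop_tombstones):
    """Merge sorted runs (newest first), keeping only the newest value per key."""
    # Tag each entry with the age of its run so equal keys sort newest first.
    tagged = [
        [(key, age, value) for key, value in run]
        for age, run in enumerate(runs)
    ]
    # k-way merge ordered by (key, age); values are never compared.
    merged = heapq.merge(*tagged, key=lambda entry: entry[:2])
    out = []
    for key, group in groupby(merged, key=lambda entry: entry[0]):
        _, _, value = next(group)   # first entry in the group is the newest
        if value is TOMBSTONE and drop_tombstones:
            continue                # the key disappears from the output
        out.append((key, value))
    return out


newer = [("a", 1), ("b", TOMBSTONE)]
older = [("a", 0), ("b", 5), ("c", 7)]

print(compact([newer, older], drop_tombstones=True))  # [('a', 1), ('c', 7)]
</code></pre></div>
<p>With <code>drop_tombstones=False</code> the tombstone for <code>b</code> is carried into the output instead, which is what has to happen while an older copy of <code>b</code> might still exist further down.</p>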
<h2 id="its-all-trade-offs-the-three-amplifications">It&rsquo;s all trade-offs: the three amplifications</h2>
<p>The key to understanding RocksDB tuning (really, all LSM engines) is three <strong>amplification</strong> factors:</p>
<ul>
<li><strong>Space amplification</strong>: disk space actually used ÷ size of the logical data. The more stale versions and tombstones pile up, the higher it gets.</li>
<li><strong>Read amplification</strong>: how many I/O operations a single logical read actually performs. The more levels and files to wade through, the higher it gets.</li>
<li><strong>Write amplification</strong>: how many times a single logical write is actually written. The same piece of data gets rewritten to lower levels over and over during compaction, so this can get large.</li>
</ul>
<p>These three are a game of whack-a-mole: <strong>the more aggressively you compact, the smaller your space and read amplification, but the larger your write amplification</strong> — and vice versa. The right balance depends entirely on your workload, and the knobs are many and interdependent. Even the RocksDB authors admit it&rsquo;s hard to pin down the exact effect of each parameter, and recommend <strong>benchmarking a lot while keeping an eye on those three amplification factors</strong>.</p>
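<p>All three factors are plain ratios of &ldquo;what actually happened&rdquo; to &ldquo;what was logically asked for&rdquo;. A tiny worked example, with numbers that are entirely hypothetical and chosen only to make the arithmetic easy to follow:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">def ratio(actual, logical):
    """Amplification = what the engine actually did / what was logically asked."""
    return actual / logical


# Hypothetical counters, not measurements from a real system.
space_amp = ratio(actual=150, logical=100)   # GB on disk per GB of live data
write_amp = ratio(actual=1200, logical=100)  # GB written to disk per GB the app wrote
read_amp = ratio(actual=6, logical=1)        # disk reads per get() call

print(space_amp, write_amp, read_amp)  # 1.5 12.0 6.0
</code></pre></div>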
<blockquote>
<p><strong>An aside: the merge operation</strong></p>
<p>Besides put and delete, RocksDB has <code>merge</code>. When you need to apply lots of <em>incremental</em> updates to a value (say, repeatedly incrementing a counter or appending to a list), the traditional approach is read-modify-write: read it out, change it, write it back. That is clunky. <code>merge</code> lets you write just the <em>increment</em> and hands off the combining to a merge function you define, computing the final value only at read or compaction time. The <strong>upside</strong> is lower write amplification, plus it&rsquo;s thread-safe; the <strong>cost</strong> is that reads get more expensive: until the increments are consolidated, every read has to recompute them.</p>
</blockquote>
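<p>A rough sketch of the idea behind <code>merge</code>, assuming a counter as the value. None of this is RocksDB&rsquo;s real interface: <code>ToyMergeStore</code>, <code>consolidate</code>, and <code>add</code> are hypothetical names, and two dicts stand in for the stored value and the pending increments.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">class ToyMergeStore:
    """Toy model of deferred merging (hypothetical, not the RocksDB API)."""

    def __init__(self, merge_fn):
        self.merge_fn = merge_fn
        self.base = {}      # last fully computed value per key
        self.pending = {}   # increments written since then, oldest first

    def put(self, key, value):
        self.base[key] = value
        self.pending.pop(key, None)

    def merge(self, key, operand):
        # Write path: record the increment only; the current value is not read.
        self.pending.setdefault(key, []).append(operand)

    def get(self, key):
        # Read path: fold the pending increments over the base value every time.
        value = self.base.get(key)
        for operand in self.pending.get(key, []):
            value = self.merge_fn(value, operand)
        return value

    def consolidate(self, key):
        # Roughly what compaction does: fold once and store the result.
        self.base[key] = self.get(key)
        self.pending.pop(key, None)


def add(existing, operand):
    return operand if existing is None else existing + operand


store = ToyMergeStore(add)
for _ in range(3):
    store.merge("hits", 1)

print(store.get("hits"))   # 3, recomputed from three pending increments
store.consolidate("hits")
print(store.get("hits"))   # 3, now read straight from the stored value
</code></pre></div>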
<h2 id="the-bits-worth-remembering">The bits worth remembering</h2>
<p>If I keep just one mental map, it&rsquo;s this:</p>
<ul>
<li><strong>RocksDB</strong> = an embeddable, persistent KV store, descended from LevelDB, built on the <strong>LSM-tree</strong>;</li>
<li><strong>Writes</strong>: into the in-memory <strong>MemTable</strong> (sorted) + a sequential <strong>WAL</strong> (crash safety) → once full, flushed to an <strong>SST</strong> file on L0 → <strong>compaction</strong> slowly tidies things downward in the background;</li>
<li><strong>Reads</strong>: search newest to oldest, level by level, using a <strong>Bloom filter</strong> + <strong>index</strong> to skip and locate so you read as few stray files as possible;</li>
<li><strong>The essence</strong>: it trades &ldquo;write amplification&rdquo; for the high throughput of &ldquo;turning random writes into sequential ones&rdquo; — and <strong>between space, read, and write amplification, it&rsquo;s always a trade-off; there&rsquo;s no free lunch</strong>.</li>
</ul>
<p>Hold onto those few lines and the overall shape of RocksDB stands up. The finer details — skip lists, delta encoding, the various compaction strategies, how to tune the knobs — you can dive into whenever you actually need them.</p>
<blockquote>
<p>A lot of my understanding here comes from Artem Krylysov&rsquo;s <a href="https://artem.krylysov.com/blog/2023/04/19/how-rocksdb-works/">How RocksDB Works</a>, which goes into far more depth — highly recommended if you want to go deeper.</p>
</blockquote>
]]></content:encoded></item><item><title>Sorting Algorithms for Coding Interviews: A Python Reference from Bubble Sort to Timsort</title><link>https://neilmin.com/posts/sorting-algorithms-interview-reference/</link><pubDate>Sat, 13 Jun 2026 00:00:00 -0700</pubDate><guid>https://neilmin.com/posts/sorting-algorithms-interview-reference/</guid><description>A reference I put together while reviewing sorting algorithms for coding interviews: Python implementations of 11 sorts, their time and space complexity, stability, and when to use each — plus quicksort partition variants and the non-comparison sorts that are easy to forget. Skim it to self-check what you still remember.</description><content:encoded><![CDATA[<p>I&rsquo;ve been prepping for coding interviews lately, and I went back through the sorting algorithms from scratch. The process gave me a bit of a scare: a lot of this I genuinely <em>used to</em> know — how quicksort&rsquo;s partition actually works, why it degrades — and now I had to pause to remember it. By the time I got to the non-comparison sorts — counting, radix, bucket — I realized that whole area had become more or less a blank.</p>
<p>So I decided to write this review down. Partly as a reference for other people getting ready for interviews, and partly as a record for my future self: next time I need to interview, I can come back here, skim through, and quickly figure out &ldquo;this one I still know, this one I forgot, let me focus there.&rdquo;</p>
<p>How to use this post:</p>
<ul>
<li>First look at the <strong>cheat sheet</strong> below — one glance tells you which algorithms you&rsquo;ve forgotten;</li>
<li>For anything you want to dig into, use the table of contents (TOC) on the right to jump straight there;</li>
<li>Every algorithm follows the same template: <strong>one-line idea → Python implementation → complexity → stability and in-place → interview notes</strong>, so they&rsquo;re easy to compare.</li>
</ul>
<p>All the code is in Python, because it reads closest to pseudocode and makes the logic easiest to see.</p>
<h2 id="one-page-cheat-sheet">One-page cheat sheet</h2>
<p>Conclusions first. The table below covers all 11 sorts in this post. When an interviewer asks about complexity or stability, this is the table that should flash into your head.</p>
<table>
  <thead>
      <tr>
          <th>Algorithm</th>
          <th>Best</th>
          <th>Average</th>
          <th>Worst</th>
          <th>Space</th>
          <th style="text-align: center">Stable</th>
          <th style="text-align: center">In-place</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Bubble</td>
          <td>O(n)</td>
          <td>O(n²)</td>
          <td>O(n²)</td>
          <td>O(1)</td>
          <td style="text-align: center">✅</td>
          <td style="text-align: center">✅</td>
      </tr>
      <tr>
          <td>Selection</td>
          <td>O(n²)</td>
          <td>O(n²)</td>
          <td>O(n²)</td>
          <td>O(1)</td>
          <td style="text-align: center">❌</td>
          <td style="text-align: center">✅</td>
      </tr>
      <tr>
          <td>Insertion</td>
          <td>O(n)</td>
          <td>O(n²)</td>
          <td>O(n²)</td>
          <td>O(1)</td>
          <td style="text-align: center">✅</td>
          <td style="text-align: center">✅</td>
      </tr>
      <tr>
          <td>Shell</td>
          <td>O(n log n)</td>
          <td>≈O(n^1.3)</td>
          <td>O(n²)</td>
          <td>O(1)</td>
          <td style="text-align: center">❌</td>
          <td style="text-align: center">✅</td>
      </tr>
      <tr>
          <td>Merge</td>
          <td>O(n log n)</td>
          <td>O(n log n)</td>
          <td>O(n log n)</td>
          <td>O(n)</td>
          <td style="text-align: center">✅</td>
          <td style="text-align: center">❌</td>
      </tr>
      <tr>
          <td>Quick</td>
          <td>O(n log n)</td>
          <td>O(n log n)</td>
          <td>O(n²)</td>
          <td>O(log n)</td>
          <td style="text-align: center">❌</td>
          <td style="text-align: center">✅</td>
      </tr>
      <tr>
          <td>Heap</td>
          <td>O(n log n)</td>
          <td>O(n log n)</td>
          <td>O(n log n)</td>
          <td>O(1)</td>
          <td style="text-align: center">❌</td>
          <td style="text-align: center">✅</td>
      </tr>
      <tr>
          <td>Counting</td>
          <td>O(n+k)</td>
          <td>O(n+k)</td>
          <td>O(n+k)</td>
          <td>O(n+k)</td>
          <td style="text-align: center">✅</td>
          <td style="text-align: center">❌</td>
      </tr>
      <tr>
          <td>Radix</td>
          <td>O(d·(n+k))</td>
          <td>O(d·(n+k))</td>
          <td>O(d·(n+k))</td>
          <td>O(n+k)</td>
          <td style="text-align: center">✅</td>
          <td style="text-align: center">❌</td>
      </tr>
      <tr>
          <td>Bucket</td>
          <td>O(n+k)</td>
          <td>O(n+k)</td>
          <td>O(n²)</td>
          <td>O(n+k)</td>
          <td style="text-align: center">✅*</td>
          <td style="text-align: center">❌</td>
      </tr>
      <tr>
          <td>Timsort</td>
          <td>O(n)</td>
          <td>O(n log n)</td>
          <td>O(n log n)</td>
          <td>O(n)</td>
          <td style="text-align: center">✅</td>
          <td style="text-align: center">❌</td>
      </tr>
  </tbody>
</table>
<p>A few notes so the table doesn&rsquo;t mislead you:</p>
<ul>
<li><strong>Shell sort</strong>&rsquo;s complexity depends on the <em>gap sequence</em>; its best case changes with the sequence you pick, so the numbers here are just typical orders of magnitude.</li>
<li><strong>Quick sort</strong>&rsquo;s listed space is the average recursion-stack depth O(log n); the worst case degrades to O(n). It partitions in place, but the recursion itself uses the stack.</li>
<li><strong>Bucket sort</strong>&rsquo;s stability has an asterisk: it&rsquo;s only stable if the per-bucket sort (e.g. insertion sort) is stable.</li>
<li><strong>k</strong> is the range of values for counting sort, the radix (base) for radix sort, and the number of buckets for bucket sort; <strong>d</strong> is the number of digits. The complexity of the non-comparison sorts is always tied to properties of the data itself, which I&rsquo;ll get into below.</li>
</ul>
<h2 id="before-we-start-a-few-unavoidable-concepts">Before we start: a few unavoidable concepts</h2>
<p>Before going through the algorithms one by one, there are four concepts that nearly every sorting interview question relies on. Getting them straight first means I won&rsquo;t have to keep re-explaining them.</p>
<h3 id="comparison-vs-non-comparison-sorts">Comparison vs. non-comparison sorts</h3>
<p>A <strong>comparison sort</strong> decides order using only one operation: &ldquo;which of these two elements is bigger?&rdquo; Bubble, selection, insertion, shell, merge, quick, and heap are all comparison sorts. What they share: a worst-case lower bound of Ω(n log n) comparisons, which no comparison sort can beat (the reason is below).</p>
<p>A <strong>non-comparison sort</strong> doesn&rsquo;t compare; instead it uses the element values themselves to <em>compute</em> where each one belongs. Counting, radix, and bucket are all like this. Because they sidestep comparison, they can hit linear time O(n) — but the price is extra requirements on the data (e.g. it must be integers in a bounded range).</p>
<h3 id="stability">Stability</h3>
<p>If two elements have equal sort keys, and their <strong>relative order</strong> is preserved after sorting, the sort is <strong>stable</strong>.</p>
<p>A concrete example. Say you have a batch of orders already sorted by time, and now you want to re-sort by amount:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>Before (sorted by time):  ($100, 9:00)  ($50, 9:01)  ($100, 9:02)
</span></span><span style="display:flex;"><span>Stable sort (by amount):  ($50, 9:01)  ($100, 9:00)  ($100, 9:02)   ← the two $100s keep their time order
</span></span><span style="display:flex;"><span>Unstable sort:            ($50, 9:01)  ($100, 9:02)  ($100, 9:00)   ← the two $100s got scrambled
</span></span></code></pre></div><p>Why do interviewers love this? Because <strong>multi-key sorting</strong> depends on it: sort by the secondary key first, then use a stable sort on the primary key, and the secondary order is preserved. Knowing which sorts are stable (bubble, insertion, merge, counting, radix, Timsort) and which aren&rsquo;t (selection, shell, quick, heap) is almost guaranteed to come up.</p>
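<p>Python&rsquo;s built-in <code>sorted()</code> is guaranteed stable, so both the order example and the two-pass multi-key trick can be checked directly. The sample data below is made up for illustration.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">from operator import itemgetter

# (amount, time) pairs, already in time order
orders = [(100, "9:00"), (50, "9:01"), (100, "9:02")]

# Equal amounts keep their original (time) order because sorted() is stable.
by_amount = sorted(orders, key=itemgetter(0))
print(by_amount)  # [(50, '9:01'), (100, '9:00'), (100, '9:02')]

# Multi-key sort in two stable passes: secondary key first, then primary key.
people = [("bob", 30), ("alice", 30), ("carol", 25)]   # (name, age)
by_name = sorted(people, key=itemgetter(0))
by_age_then_name = sorted(by_name, key=itemgetter(1))
print(by_age_then_name)  # [('carol', 25), ('alice', 30), ('bob', 30)]
</code></pre></div>
<p>The two passes give the same result as a single sort on the tuple key <code>(age, name)</code>; the two-pass form is what you fall back on when the keys need different directions or come from different comparison rules.</p>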
<h3 id="in-place-sorting">In-place sorting</h3>
<p>If a sort needs only O(1) or O(log n) extra space, it&rsquo;s <strong>in-place</strong>. Merge sort allocates an extra O(n) array, so it isn&rsquo;t in-place; quick and heap only shuffle the original array, so they are. When an interviewer presses &ldquo;what if memory is tight?&rdquo;, this is usually what they&rsquo;re asking about.</p>
<h3 id="why-complexity-splits-into-best--average--worst">Why complexity splits into best / average / worst</h3>
<p>The same algorithm can behave wildly differently on different inputs. Quicksort is the classic case: O(n log n) on random input, but if the input is already sorted <em>and</em> you always take the first or last element as the pivot, every partition is maximally lopsided and it degrades to O(n²). When you state complexity in an interview, it&rsquo;s best to say which case you mean, because that&rsquo;s exactly where you show how deeply you understand it.</p>
<blockquote>
<p><strong>Why can&rsquo;t comparison sorts beat O(n log n)?</strong>
Any comparison sort can be drawn as a <em>decision tree</em>: each internal node is one comparison, each leaf is one possible final arrangement. There are n! possible arrangements of n elements, so the tree must have at least n! leaves. A binary tree of height h has at most 2ʰ leaves, so 2ʰ ≥ n!, i.e. h ≥ log₂(n!). By Stirling&rsquo;s approximation, log₂(n!) ≈ n log n. The tree&rsquo;s height <em>is</em> the number of comparisons in the worst case, so the lower bound is Ω(n log n). This also explains why going faster means dropping &ldquo;comparison&rdquo; entirely — which is what the non-comparison sorts do.</p>
</blockquote>
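<p>The bound is easy to evaluate for small n. The helper below (a hypothetical name) computes the smallest h with 2ʰ ≥ n!, which is the fewest worst-case comparisons any comparison sort could possibly need, and prints it next to n·log₂(n) for scale.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import math


def min_worst_case_comparisons(n):
    """Smallest h with 2**h at least n!, i.e. ceil(log2(n!))."""
    # For any integer x of 1 or more, (x - 1).bit_length() equals ceil(log2(x)).
    return (math.factorial(n) - 1).bit_length()


for n in (3, 4, 5, 10):
    bound = min_worst_case_comparisons(n)
    print(n, bound, round(n * math.log2(n), 1))
</code></pre></div>
<p>For n = 3 the bound is 3 comparisons, for n = 5 it is 7, and for n = 10 it is 22. The bound sits below n·log₂(n) but grows at the same rate, which is what the Ω(n log n) statement says.</p>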
<p>OK, concepts done. Let&rsquo;s go through them one by one.</p>
<h2 id="comparison-based-sorts">Comparison-based sorts</h2>
<h3 id="bubble-sort">Bubble Sort</h3>
<p><strong>One-line idea</strong>: compare adjacent elements pairwise, swap if out of order; each pass &ldquo;bubbles&rdquo; the current largest element to the end.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">bubble_sort</span>(arr):
</span></span><span style="display:flex;"><span>    n <span style="color:#f92672">=</span> len(arr)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> i <span style="color:#f92672">in</span> range(n <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>):
</span></span><span style="display:flex;"><span>        swapped <span style="color:#f92672">=</span> <span style="color:#66d9ef">False</span>
</span></span><span style="display:flex;"><span>        <span style="color:#75715e"># each pass bubbles the largest of the unsorted region to the right end</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">for</span> j <span style="color:#f92672">in</span> range(n <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span> <span style="color:#f92672">-</span> i):
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">if</span> arr[j] <span style="color:#f92672">&gt;</span> arr[j <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>]:
</span></span><span style="display:flex;"><span>                arr[j], arr[j <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>] <span style="color:#f92672">=</span> arr[j <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>], arr[j]
</span></span><span style="display:flex;"><span>                swapped <span style="color:#f92672">=</span> <span style="color:#66d9ef">True</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> <span style="color:#f92672">not</span> swapped:          <span style="color:#75715e"># a whole pass with no swaps means it&#39;s sorted; quit early</span>
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">break</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> arr
</span></span></code></pre></div><ul>
<li><strong>Complexity</strong>: worst and average are both O(n²); with the <code>swapped</code> early-exit, it&rsquo;s O(n) on already-sorted input. Space O(1).</li>
<li><strong>Stability / in-place</strong>: stable (only swaps on a strict greater-than), in-place.</li>
<li><strong>Interview notes</strong>: basically never used in practice, but it&rsquo;s the textbook example of &ldquo;stable + early-exit reaches O(n)&rdquo;. Watch out for that <code>swapped</code> optimization — it&rsquo;s a common gotcha.</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/sort-an-array/">912. Sort an Array</a> — there&rsquo;s no problem dedicated to bubble sort, but this generic sorting problem is a fine sandbox to practice the implementation (pure O(n²) will time out on large inputs, so it&rsquo;s practice only).</li>
</ul>
<h3 id="selection-sort">Selection Sort</h3>
<p><strong>One-line idea</strong>: each pass picks the smallest element from the unsorted region and places it at the end of the sorted region.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">selection_sort</span>(arr):
</span></span><span style="display:flex;"><span>    n <span style="color:#f92672">=</span> len(arr)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> i <span style="color:#f92672">in</span> range(n <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>):
</span></span><span style="display:flex;"><span>        min_idx <span style="color:#f92672">=</span> i
</span></span><span style="display:flex;"><span>        <span style="color:#75715e"># find the index of the minimum in the unsorted region [i, n)</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">for</span> j <span style="color:#f92672">in</span> range(i <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>, n):
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">if</span> arr[j] <span style="color:#f92672">&lt;</span> arr[min_idx]:
</span></span><span style="display:flex;"><span>                min_idx <span style="color:#f92672">=</span> j
</span></span><span style="display:flex;"><span>        arr[i], arr[min_idx] <span style="color:#f92672">=</span> arr[min_idx], arr[i]
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> arr
</span></span></code></pre></div><ul>
<li><strong>Complexity</strong>: O(n²) no matter what the input looks like — it never speeds up on sorted data. Space O(1).</li>
<li><strong>Stability / in-place</strong>: <strong>unstable</strong>, in-place. For example <code>[5a, 5b, 2]</code>: the first pass swaps <code>2</code> with <code>5a</code>, and the two 5s flip relative order.</li>
<li><strong>Interview notes</strong>: its one redeeming trait is the <strong>minimum number of swaps</strong> (at most n−1), which matters when writes are expensive. It&rsquo;s also the counterexample to &ldquo;best case can save you&rdquo; — it never does — and is often compared against insertion sort.</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/sort-an-array/">912. Sort an Array</a> — practice the implementation on this generic problem; selection sort is a good way to feel &ldquo;few swaps, but no fewer comparisons.&rdquo;</li>
</ul>
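<p>The instability is easy to reproduce by sorting tagged pairs on their number only. The variant below is a sketch for that demonstration, not a replacement for the implementation above: <code>selection_sort_by</code> is a hypothetical helper that takes a key function and works on a copy, and it uses <code>min()</code>, which returns the first minimal index just like the strict comparison does.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">def selection_sort_by(items, key):
    """Selection sort on a copy of items, ordered by key(item)."""
    arr = list(items)
    n = len(arr)
    for i in range(n - 1):
        # min() returns the first minimal index in the unsorted region [i, n)
        min_idx = min(range(i, n), key=lambda j: key(arr[j]))
        arr[i], arr[min_idx] = arr[min_idx], arr[i]
    return arr


cards = [(5, "a"), (5, "b"), (2, "c")]

unstable = selection_sort_by(cards, key=lambda card: card[0])
stable = sorted(cards, key=lambda card: card[0])

print(unstable)  # [(2, 'c'), (5, 'b'), (5, 'a')]  the two 5s swapped places
print(stable)    # [(2, 'c'), (5, 'a'), (5, 'b')]  built-in sort keeps them in order
</code></pre></div>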
<h3 id="insertion-sort">Insertion Sort</h3>
<p><strong>One-line idea</strong>: like sorting a hand of cards — go left to right, inserting each new card into its correct spot among the already-sorted cards on the left.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">insertion_sort</span>(arr):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> i <span style="color:#f92672">in</span> range(<span style="color:#ae81ff">1</span>, len(arr)):
</span></span><span style="display:flex;"><span>        key <span style="color:#f92672">=</span> arr[i]
</span></span><span style="display:flex;"><span>        j <span style="color:#f92672">=</span> i <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>        <span style="color:#75715e"># shift everything bigger than key one slot right to make room</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">while</span> j <span style="color:#f92672">&gt;=</span> <span style="color:#ae81ff">0</span> <span style="color:#f92672">and</span> arr[j] <span style="color:#f92672">&gt;</span> key:
</span></span><span style="display:flex;"><span>            arr[j <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>] <span style="color:#f92672">=</span> arr[j]
</span></span><span style="display:flex;"><span>            j <span style="color:#f92672">-=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>        arr[j <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>] <span style="color:#f92672">=</span> key
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> arr
</span></span></code></pre></div><ul>
<li><strong>Complexity</strong>: worst and average O(n²); close to O(n) on <strong>nearly-sorted</strong> input. Space O(1).</li>
<li><strong>Stability / in-place</strong>: stable (the <code>while</code> condition uses <code>&gt;</code>, not <code>&gt;=</code>), in-place.</li>
<li><strong>Interview notes</strong>: don&rsquo;t underestimate it. <strong>On small or nearly-sorted data, insertion sort beats quicksort</strong>, which is exactly why production-grade sorts like Timsort and Introsort fall back to it on small chunks. Of the three basic sorts, it&rsquo;s the most practically useful.</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/insertion-sort-list/">147. Insertion Sort List</a> — a problem built for insertion sort: insert in place on a linked list.</li>
</ul>
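<p>The &ldquo;close to O(n) on nearly-sorted input&rdquo; claim is easy to check by counting shifts. The sketch below is my own instrumented copy of the function above (the name <code>insertion_sort_shifts</code> and the counter are additions for illustration, not part of the original code).</p>

```python
def insertion_sort_shifts(arr):
    """Insertion sort on a copy; returns (sorted_list, number_of_shifts)."""
    arr = list(arr)
    shifts = 0
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]   # one shift per inversion involving key
            j -= 1
            shifts += 1
        arr[j + 1] = key
    return arr, shifts


n = 1000
nearly_sorted = list(range(n))
nearly_sorted[10], nearly_sorted[11] = nearly_sorted[11], nearly_sorted[10]
reversed_input = list(range(n, 0, -1))

print(insertion_sort_shifts(nearly_sorted)[1])    # 1
print(insertion_sort_shifts(reversed_input)[1])   # 499500
```

<p>The number of shifts equals the number of inversions in the input, which is why one misplaced pair costs one shift while a reversed array costs n(n−1)/2.</p>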
<h3 id="shell-sort">Shell Sort</h3>
<p><strong>One-line idea</strong>: an upgraded insertion sort. First do insertion sort on elements spaced by a large &ldquo;gap&rdquo;, then shrink the gap step by step; the last pass uses gap 1 (plain insertion sort), but by then the array is &ldquo;mostly sorted&rdquo;, so it flies.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">shell_sort</span>(arr):
</span></span><span style="display:flex;"><span>    n <span style="color:#f92672">=</span> len(arr)
</span></span><span style="display:flex;"><span>    gap <span style="color:#f92672">=</span> n <span style="color:#f92672">//</span> <span style="color:#ae81ff">2</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">while</span> gap <span style="color:#f92672">&gt;</span> <span style="color:#ae81ff">0</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#75715e"># insertion sort on each subsequence with stride gap</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">for</span> i <span style="color:#f92672">in</span> range(gap, n):
</span></span><span style="display:flex;"><span>            key <span style="color:#f92672">=</span> arr[i]
</span></span><span style="display:flex;"><span>            j <span style="color:#f92672">=</span> i <span style="color:#f92672">-</span> gap
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">while</span> j <span style="color:#f92672">&gt;=</span> <span style="color:#ae81ff">0</span> <span style="color:#f92672">and</span> arr[j] <span style="color:#f92672">&gt;</span> key:
</span></span><span style="display:flex;"><span>                arr[j <span style="color:#f92672">+</span> gap] <span style="color:#f92672">=</span> arr[j]
</span></span><span style="display:flex;"><span>                j <span style="color:#f92672">-=</span> gap
</span></span><span style="display:flex;"><span>            arr[j <span style="color:#f92672">+</span> gap] <span style="color:#f92672">=</span> key
</span></span><span style="display:flex;"><span>        gap <span style="color:#f92672">//=</span> <span style="color:#ae81ff">2</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> arr
</span></span></code></pre></div><ul>
<li><strong>Complexity</strong>: depends on the gap sequence. The <code>n//2</code> halving sequence above is O(n²) in the worst case; better sequences do better: Knuth&rsquo;s (generated by <code>h = 3h + 1</code>, giving 1, 4, 13, 40, …) is O(n^1.5) in the worst case, and Sedgewick&rsquo;s is about O(n^(4/3)). Space O(1).</li>
<li><strong>Stability / in-place</strong>: <strong>unstable</strong> (a gapped move can carry an element past an equal one that sits in a different subsequence), in-place.</li>
<li><strong>Interview notes</strong>: it&rsquo;s the poster child for &ldquo;making data roughly sorted first lets insertion sort go faster.&rdquo; Rarely asked directly, but worth knowing as the bridge — it pushes a simple O(n²) sort toward O(n log n).</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/sort-an-array/">912. Sort an Array</a> — use it to practice shell sort and experiment with how different gap sequences affect runtime.</li>
</ul>
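<p>For experimenting with gap sequences, here is a minimal sketch that swaps the halving sequence for Knuth&rsquo;s 1, 4, 13, 40, … gaps. The function name <code>shell_sort_knuth</code> is mine, and the &ldquo;start from the largest gap below about n/3&rdquo; rule is the common textbook convention, assumed here rather than taken from the notes above.</p>

```python
def shell_sort_knuth(arr):
    """Shell sort using Knuth's gap sequence (1, 4, 13, 40, ...)."""
    n = len(arr)
    gap = 1
    while n // 3 > gap:        # grow the gap while it is still below n // 3
        gap = 3 * gap + 1
    while gap >= 1:
        # gapped insertion sort, identical to the inner loops above
        for i in range(gap, n):
            key = arr[i]
            j = i - gap
            while j >= 0 and arr[j] > key:
                arr[j + gap] = arr[j]
                j -= gap
            arr[j + gap] = key
        gap //= 3              # 40 -> 13 -> 4 -> 1 -> 0
    return arr


print(shell_sort_knuth([9, 7, 5, 3, 1, 8, 6, 4, 2, 0]))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

<p>Only the two lines that compute the gap differ from the halving version, which makes it easy to time both on the same input.</p>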
<h3 id="merge-sort">Merge Sort</h3>
<p><strong>One-line idea</strong>: divide and conquer. Split the array in half until you can&rsquo;t split further, then merge two <strong>already-sorted</strong> small arrays into one larger sorted array.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">merge_sort</span>(arr):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> len(arr) <span style="color:#f92672">&lt;=</span> <span style="color:#ae81ff">1</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> arr
</span></span><span style="display:flex;"><span>    mid <span style="color:#f92672">=</span> len(arr) <span style="color:#f92672">//</span> <span style="color:#ae81ff">2</span>
</span></span><span style="display:flex;"><span>    left <span style="color:#f92672">=</span> merge_sort(arr[:mid])
</span></span><span style="display:flex;"><span>    right <span style="color:#f92672">=</span> merge_sort(arr[mid:])
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> merge(left, right)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">merge</span>(left, right):
</span></span><span style="display:flex;"><span>    result <span style="color:#f92672">=</span> []
</span></span><span style="display:flex;"><span>    i <span style="color:#f92672">=</span> j <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># two pointers, take the smaller of the two each time; &lt;= keeps it stable</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">while</span> i <span style="color:#f92672">&lt;</span> len(left) <span style="color:#f92672">and</span> j <span style="color:#f92672">&lt;</span> len(right):
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> left[i] <span style="color:#f92672">&lt;=</span> right[j]:
</span></span><span style="display:flex;"><span>            result<span style="color:#f92672">.</span>append(left[i])
</span></span><span style="display:flex;"><span>            i <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">else</span>:
</span></span><span style="display:flex;"><span>            result<span style="color:#f92672">.</span>append(right[j])
</span></span><span style="display:flex;"><span>            j <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    result<span style="color:#f92672">.</span>extend(left[i:])   <span style="color:#75715e"># whatever&#39;s left just gets appended</span>
</span></span><span style="display:flex;"><span>    result<span style="color:#f92672">.</span>extend(right[j:])
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> result
</span></span></code></pre></div><ul>
<li><strong>Complexity</strong>: best, average, and worst are <strong>all O(n log n)</strong> — rock solid, never degrades. Space O(n) (the merge needs an extra array).</li>
<li><strong>Stability / in-place</strong>: stable, <strong>not in-place</strong>.</li>
<li><strong>Interview notes</strong>: bulletproof complexity and naturally stable — the first choice when you need &ldquo;stable + guaranteed O(n log n) worst case.&rdquo;</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/sort-list/">148. Sort List</a> — the optimal solution for sorting a linked list is merge sort; for the array version use <a href="https://leetcode.com/problems/sort-an-array/">912. Sort an Array</a>.</li>
</ul>
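<p>Since 148. Sort List is the go-to practice problem, here is a rough top-down sketch for a singly linked list. The <code>ListNode</code> class mirrors the shape LeetCode uses but is defined here as an assumption, and <code>from_list</code> / <code>to_list</code> are hypothetical helpers for the demo. This recursive version still uses an O(log n) call stack.</p>

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next


def sort_list(head):
    """Top-down merge sort on a singly linked list; returns the new head."""
    if head is None or head.next is None:
        return head
    # find the middle with slow/fast pointers, then cut the list in two
    slow, fast = head, head.next
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
    right_head = slow.next
    slow.next = None
    left = sort_list(head)
    right = sort_list(right_head)
    # merge by rewiring next pointers; the only new node is the dummy head
    dummy = tail = ListNode()
    while left is not None and right is not None:
        if right.val >= left.val:          # ties go to the left half: stable
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left if left is not None else right
    return dummy.next


def from_list(values):
    dummy = tail = ListNode()
    for v in values:
        tail.next = ListNode(v)
        tail = tail.next
    return dummy.next


def to_list(head):
    out = []
    while head is not None:
        out.append(head.val)
        head = head.next
    return out


print(to_list(sort_list(from_list([4, 2, 1, 3]))))  # [1, 2, 3, 4]
```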
<p>Two high-frequency extensions:</p>
<blockquote>
<p><strong>Concept break: linked-list sorting and external sorting</strong></p>
<p><strong>Linked-list sorting</strong>: merge sort is especially friendly to linked lists: merging only rewires pointers, so no extra array is needed. The top-down recursive version still uses an O(log n) recursion stack; a bottom-up (iterative) version gets to true O(1) extra space. This is why the standard answer to &ldquo;sort a linked list in O(n log n)&rdquo; is merge sort, not quicksort.</p>
<p><strong>External sorting</strong>: when the data is too big to fit in memory (the classic interview question: &ldquo;how do you sort a 10 GB file with 1 GB of memory?&rdquo;), the answer is <strong>external merge sort</strong> — split the big file into chunks small enough to fit in memory, read each in, sort it, write it back to disk, then use a <em>k-way merge</em> to combine those sorted files into the final result. Merge&rsquo;s essence — &ldquo;combining multiple sorted sequences&rdquo; — is taken to the extreme here.</p>
</blockquote>
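<p>The external merge sort steps can be sketched at toy scale. This is my own illustration under loud assumptions: the data is integers written one per line, each &ldquo;chunk that fits in memory&rdquo; is just <code>chunk_size</code> values, the function name <code>external_sort</code> is made up, and the merged result is collected into a list for the demo (a real version would stream it to an output file). The k-way merge uses the standard library&rsquo;s <code>heapq.merge</code>.</p>

```python
import heapq
import os
import tempfile


def external_sort(values, chunk_size):
    """Sort ints via sorted runs on disk plus a k-way merge (toy-scale sketch)."""
    run_paths = []
    chunk = []

    def flush():
        # sort one in-memory chunk and write it out as a sorted run
        chunk.sort()
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in chunk)
        run_paths.append(path)
        chunk.clear()

    for v in values:
        chunk.append(v)
        if len(chunk) == chunk_size:
            flush()
    if chunk:
        flush()

    files = [open(p) for p in run_paths]
    try:
        # each run is read lazily, one line at a time
        streams = [(int(line) for line in f) for f in files]
        # demo only: a real version would write the merged stream to disk
        return list(heapq.merge(*streams))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)


print(external_sort([9, 3, 7, 1, 8, 2, 6], chunk_size=3))  # [1, 2, 3, 6, 7, 8, 9]
```

<p>During the merge, only one pending value per run has to sit in memory, which is what makes the approach work when the whole file does not fit.</p>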
<h3 id="quick-sort">Quick Sort</h3>
<p>This is the section I most needed to pick back up — the partition details got fuzzy after five years untouched. Let&rsquo;s take it slow.</p>
<p><strong>One-line idea</strong>: divide and conquer. Pick a pivot, <strong>partition</strong> the array into &ldquo;less than the pivot&rdquo; and &ldquo;greater than or equal to the pivot&rdquo;, put the pivot in its final place, then recurse on both sides.</p>
<h4 id="1-lomuto-partition-the-easiest-to-memorize">1. Lomuto partition (the easiest to memorize)</h4>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">quick_sort</span>(arr, low<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>, high<span style="color:#f92672">=</span><span style="color:#66d9ef">None</span>):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> high <span style="color:#f92672">is</span> <span style="color:#66d9ef">None</span>:
</span></span><span style="display:flex;"><span>        high <span style="color:#f92672">=</span> len(arr) <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> low <span style="color:#f92672">&lt;</span> high:
</span></span><span style="display:flex;"><span>        p <span style="color:#f92672">=</span> partition(arr, low, high)
</span></span><span style="display:flex;"><span>        quick_sort(arr, low, p <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>)    <span style="color:#75715e"># recurse left half</span>
</span></span><span style="display:flex;"><span>        quick_sort(arr, p <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>, high)   <span style="color:#75715e"># recurse right half</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> arr
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">partition</span>(arr, low, high):
</span></span><span style="display:flex;"><span>    pivot <span style="color:#f92672">=</span> arr[high]            <span style="color:#75715e"># Lomuto: always take the rightmost element as pivot</span>
</span></span><span style="display:flex;"><span>    i <span style="color:#f92672">=</span> low <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>                  <span style="color:#75715e"># i is the right boundary of the &#34;less than pivot&#34; region</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> j <span style="color:#f92672">in</span> range(low, high):
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> arr[j] <span style="color:#f92672">&lt;</span> pivot:
</span></span><span style="display:flex;"><span>            i <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>            arr[i], arr[j] <span style="color:#f92672">=</span> arr[j], arr[i]
</span></span><span style="display:flex;"><span>    arr[i <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>], arr[high] <span style="color:#f92672">=</span> arr[high], arr[i <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>]   <span style="color:#75715e"># put the pivot in place</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> i <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>
</span></span></code></pre></div><p>Lomuto&rsquo;s advantage is that it scans left to right in a single pass: <code>j</code> walks the array while <code>i</code> marks the end of the &ldquo;less than pivot&rdquo; region, so the logic is intuitive and easy to remember. For hand-writing quicksort in an interview, this version is the default.</p>
<h4 id="2-why-it-degrades-and-how-to-fix-it">2. Why it degrades, and how to fix it</h4>
<p>Always taking the rightmost element as pivot has a fatal flaw: <strong>when the input is already sorted (or reverse-sorted), every partition splits the array into sizes 0 and n−1</strong>, the recursion depth becomes n, complexity degrades to O(n²), and it can blow the stack.</p>
<p>The fix is to <strong>stop letting the input &ldquo;predict&rdquo; the pivot</strong> — pick one at random, or take the median of the first, middle, and last elements (median-of-three):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> random
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">partition</span>(arr, low, high):
</span></span><span style="display:flex;"><span>    rand <span style="color:#f92672">=</span> random<span style="color:#f92672">.</span>randint(low, high)
</span></span><span style="display:flex;"><span>    arr[rand], arr[high] <span style="color:#f92672">=</span> arr[high], arr[rand]   <span style="color:#75715e"># random pivot, swap it to the right, reuse the logic above</span>
</span></span><span style="display:flex;"><span>    pivot <span style="color:#f92672">=</span> arr[high]
</span></span><span style="display:flex;"><span>    i <span style="color:#f92672">=</span> low <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> j <span style="color:#f92672">in</span> range(low, high):
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> arr[j] <span style="color:#f92672">&lt;</span> pivot:
</span></span><span style="display:flex;"><span>            i <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>            arr[i], arr[j] <span style="color:#f92672">=</span> arr[j], arr[i]
</span></span><span style="display:flex;"><span>    arr[i <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>], arr[high] <span style="color:#f92672">=</span> arr[high], arr[i <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> i <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>
</span></span></code></pre></div><p>Two added lines plug the most common pitfall — degrading on sorted input. When the interviewer presses &ldquo;what about quicksort&rsquo;s worst case&rdquo;, this is the standard answer.</p>
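<p>The median-of-three option mentioned earlier has no code in these notes, so here is one possible sketch. The names <code>median_of_three</code>, <code>partition_m3</code>, and <code>quick_sort_m3</code> are mine, and picking the median by sorting three <code>(value, index)</code> pairs is just one convenient way to do it.</p>

```python
def median_of_three(arr, low, high):
    """Return the index of the median of arr[low], arr[mid], arr[high]."""
    mid = (low + high) // 2
    candidates = sorted([(arr[low], low), (arr[mid], mid), (arr[high], high)])
    return candidates[1][1]


def partition_m3(arr, low, high):
    m = median_of_three(arr, low, high)
    arr[m], arr[high] = arr[high], arr[m]   # move the chosen pivot to the right end
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):              # from here on it is plain Lomuto
        if pivot > arr[j]:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1


def quick_sort_m3(arr, low=0, high=None):
    if high is None:
        high = len(arr) - 1
    if high > low:
        p = partition_m3(arr, low, high)
        quick_sort_m3(arr, low, p - 1)
        quick_sort_m3(arr, p + 1, high)
    return arr


already_sorted = list(range(2000))
print(quick_sort_m3(already_sorted) == list(range(2000)))  # True
```

<p>On already-sorted input the median of the three samples is the middle element, so each partition splits roughly in half and the recursion stays shallow.</p>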
<h4 id="3-three-way-quicksort-handling-lots-of-duplicates">3. Three-way quicksort: handling lots of duplicates</h4>
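<p>If the array has <strong>many duplicate values</strong> (say, all 0s and 1s), plain quicksort handles it badly: the Lomuto partition above sends every element equal to the pivot to the same side, so an all-equal array splits into sizes 0 and n−1 on every call and degrades to O(n²), and a random pivot does not help. Three-way quicksort (based on the &ldquo;Dutch national flag problem&rdquo;) splits the array into <code>&lt; pivot</code>, <code>== pivot</code>, and <code>&gt; pivot</code>, and skips the whole equal-to-pivot segment:</p>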
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">quick_sort_3way</span>(arr, low<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>, high<span style="color:#f92672">=</span><span style="color:#66d9ef">None</span>):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> high <span style="color:#f92672">is</span> <span style="color:#66d9ef">None</span>:
</span></span><span style="display:flex;"><span>        high <span style="color:#f92672">=</span> len(arr) <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> low <span style="color:#f92672">&gt;=</span> high:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> arr
</span></span><span style="display:flex;"><span>    pivot <span style="color:#f92672">=</span> arr[low]
</span></span><span style="display:flex;"><span>    lt, i, gt <span style="color:#f92672">=</span> low, low, high   <span style="color:#75715e"># [low,lt)&lt;pivot  [lt,i)==pivot  (gt,high]&gt;pivot</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">while</span> i <span style="color:#f92672">&lt;=</span> gt:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> arr[i] <span style="color:#f92672">&lt;</span> pivot:
</span></span><span style="display:flex;"><span>            arr[lt], arr[i] <span style="color:#f92672">=</span> arr[i], arr[lt]
</span></span><span style="display:flex;"><span>            lt <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>            i <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">elif</span> arr[i] <span style="color:#f92672">&gt;</span> pivot:
</span></span><span style="display:flex;"><span>            arr[gt], arr[i] <span style="color:#f92672">=</span> arr[i], arr[gt]
</span></span><span style="display:flex;"><span>            gt <span style="color:#f92672">-=</span> <span style="color:#ae81ff">1</span>               <span style="color:#75715e"># the swapped-in element isn&#39;t checked yet, so i stays</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">else</span>:
</span></span><span style="display:flex;"><span>            i <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    quick_sort_3way(arr, low, lt <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>    quick_sort_3way(arr, gt <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>, high)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> arr
</span></span></code></pre></div><ul>
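<li><strong>Complexity</strong>: average O(n log n), worst O(n²). A random pivot makes the worst case vanishingly unlikely on distinct keys; heavily duplicated input also needs three-way partitioning. Space: O(log n) for the recursion stack on average, O(n) in the worst case.</li>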
<li><strong>Stability / in-place</strong>: <strong>unstable</strong> (the long-distance swaps in partitioning scramble equal elements), <strong>in-place</strong>.</li>
<li><strong>Interview notes</strong>: default to Lomuto when hand-writing; bring up random / median-of-three when asked about the worst case; bring up three-way quicksort when asked about lots of duplicates.</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/sort-an-array/">912. Sort an Array</a>. Remember to use a random pivot on submission, or sorted input will time out or overflow the recursion stack; heavily duplicated input needs three-way partitioning as well, since a random pivot alone does not help there.</li>
</ul>
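<p>To make the duplicates problem concrete, the sketch below measures how deep the recursion goes on an all-equal array. Both functions are my own instrumented copies of the Lomuto and three-way versions above (the names <code>lomuto_depth</code> and <code>three_way_depth</code> and the <code>depth</code> parameter are additions for the demo).</p>

```python
def lomuto_depth(arr, low, high, depth=1):
    """Quicksort with Lomuto partition; returns the deepest recursion level reached."""
    if low >= high:
        return depth
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if pivot > arr[j]:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    p = i + 1
    return max(lomuto_depth(arr, low, p - 1, depth + 1),
               lomuto_depth(arr, p + 1, high, depth + 1))


def three_way_depth(arr, low, high, depth=1):
    """Three-way quicksort; returns the deepest recursion level reached."""
    if low >= high:
        return depth
    pivot = arr[low]
    lt, i, gt = low, low, high
    while gt >= i:
        if pivot > arr[i]:
            arr[lt], arr[i] = arr[i], arr[lt]
            lt += 1
            i += 1
        elif arr[i] > pivot:
            arr[gt], arr[i] = arr[i], arr[gt]
            gt -= 1
        else:
            i += 1
    return max(three_way_depth(arr, low, lt - 1, depth + 1),
               three_way_depth(arr, gt + 1, high, depth + 1))


same = [7] * 200
print(lomuto_depth(list(same), 0, len(same) - 1))      # 200
print(three_way_depth(list(same), 0, len(same) - 1))   # 2
```

<p>With Lomuto, every call peels off a single element, so the depth grows linearly with n; the three-way version finishes the whole equal block in one pass.</p>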
<p>One more high-frequency extension:</p>
<blockquote>
<p><strong>Concept break: Quickselect</strong></p>
<p>&ldquo;Find the k-th largest / smallest element&rdquo; is an interview regular. If you only need the k-th one, there&rsquo;s no need to fully sort: use quicksort&rsquo;s partition, look at where the pivot landed after each partition, then <strong>continue into only the side that contains k</strong>. Average O(n), faster than sorting first and then indexing (O(n log n)); the worst case is still O(n²), so keep the random pivot.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">quickselect</span>(arr, k):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;return the k-th smallest element, k counting from 1&#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    low, high, target <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>, len(arr) <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>, k <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">while</span> low <span style="color:#f92672">&lt;=</span> high:
</span></span><span style="display:flex;"><span>        p <span style="color:#f92672">=</span> partition(arr, low, high)
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> p <span style="color:#f92672">==</span> target:
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span> arr[p]
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">elif</span> p <span style="color:#f92672">&lt;</span> target:
</span></span><span style="display:flex;"><span>            low <span style="color:#f92672">=</span> p <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>        <span style="color:#75715e"># target is on the right</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">else</span>:
</span></span><span style="display:flex;"><span>            high <span style="color:#f92672">=</span> p <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>       <span style="color:#75715e"># target is on the left</span>
</span></span></code></pre></div><p><strong>Practice</strong>: <a href="https://leetcode.com/problems/kth-largest-element-in-an-array/">215. Kth Largest Element in an Array</a> — solve it with quickselect at average O(n), a nice contrast to the heap solution.</p>
</blockquote>
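<p>Problem 215 asks for the k-th <em>largest</em>, while the quickselect above returns the k-th smallest, so the index needs flipping. Here is a self-contained sketch with the random-pivot Lomuto partition inlined; the function name <code>kth_largest</code>, the defensive copy, and the <code>ValueError</code> for an out-of-range <code>k</code> are my own choices.</p>

```python
import random


def kth_largest(nums, k):
    """Return the k-th largest element (k counts from 1); works on a copy."""
    arr = list(nums)
    target = len(arr) - k          # index of the answer in ascending order
    low, high = 0, len(arr) - 1
    while high >= low:
        # random pivot, moved to the right end, then a Lomuto partition
        r = random.randint(low, high)
        arr[r], arr[high] = arr[high], arr[r]
        pivot = arr[high]
        i = low - 1
        for j in range(low, high):
            if pivot > arr[j]:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[high] = arr[high], arr[i + 1]
        p = i + 1
        if p == target:
            return arr[p]
        if target > p:
            low = p + 1            # answer is on the right
        else:
            high = p - 1           # answer is on the left
    raise ValueError("k must be between 1 and len(nums)")


print(kth_largest([3, 2, 1, 5, 6, 4], 2))            # 5
print(kth_largest([3, 2, 3, 1, 2, 4, 5, 5, 6], 4))   # 4
```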
<h3 id="heap-sort">Heap Sort</h3>
<p><strong>One-line idea</strong>: first build the array into a <strong>max-heap</strong> (every parent ≥ its children), so the root is the maximum; swap the root to the end, shrink the heap by one, sift the new root down, and repeat until sorted.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">heap_sort</span>(arr):
</span></span><span style="display:flex;"><span>    n <span style="color:#f92672">=</span> len(arr)
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. build the heap: starting from the last non-leaf node, sift each one down</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> i <span style="color:#f92672">in</span> range(n <span style="color:#f92672">//</span> <span style="color:#ae81ff">2</span> <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>, <span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>, <span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>):
</span></span><span style="display:flex;"><span>        sift_down(arr, i, n)
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. repeatedly swap the root (max) to the end, then fix the remaining heap</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> end <span style="color:#f92672">in</span> range(n <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">0</span>, <span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>):
</span></span><span style="display:flex;"><span>        arr[<span style="color:#ae81ff">0</span>], arr[end] <span style="color:#f92672">=</span> arr[end], arr[<span style="color:#ae81ff">0</span>]
</span></span><span style="display:flex;"><span>        sift_down(arr, <span style="color:#ae81ff">0</span>, end)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> arr
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">sift_down</span>(arr, root, size):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">while</span> <span style="color:#66d9ef">True</span>:
</span></span><span style="display:flex;"><span>        largest <span style="color:#f92672">=</span> root
</span></span><span style="display:flex;"><span>        left, right <span style="color:#f92672">=</span> <span style="color:#ae81ff">2</span> <span style="color:#f92672">*</span> root <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span> <span style="color:#f92672">*</span> root <span style="color:#f92672">+</span> <span style="color:#ae81ff">2</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> left <span style="color:#f92672">&lt;</span> size <span style="color:#f92672">and</span> arr[left] <span style="color:#f92672">&gt;</span> arr[largest]:
</span></span><span style="display:flex;"><span>            largest <span style="color:#f92672">=</span> left
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> right <span style="color:#f92672">&lt;</span> size <span style="color:#f92672">and</span> arr[right] <span style="color:#f92672">&gt;</span> arr[largest]:
</span></span><span style="display:flex;"><span>            largest <span style="color:#f92672">=</span> right
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> largest <span style="color:#f92672">==</span> root:      <span style="color:#75715e"># the parent is already the largest; stop sinking</span>
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">break</span>
</span></span><span style="display:flex;"><span>        arr[root], arr[largest] <span style="color:#f92672">=</span> arr[largest], arr[root]
</span></span><span style="display:flex;"><span>        root <span style="color:#f92672">=</span> largest
</span></span></code></pre></div><ul>
<li><strong>Complexity</strong>: best, average, and worst are <strong>all O(n log n)</strong>. Building the heap is O(n) (not O(n log n) — a commonly-tested counterintuitive point), then n sift-downs of O(log n) each. Space O(1).</li>
<li><strong>Stability / in-place</strong>: <strong>unstable</strong>, <strong>in-place</strong>.</li>
<li><strong>Interview notes</strong>: among the classic sorts in this list, it&rsquo;s the <strong>only</strong> one that both guarantees worst-case O(n log n) <em>and</em> uses only O(1) extra space, so pick it when memory is extremely tight and you can&rsquo;t afford to degrade. Note that it&rsquo;s the same machinery as a <strong>priority queue / heap</strong>: <code>heapq</code>, Top-K problems, and the heap inside Dijkstra all build on this same sift operation. Maintaining a size-k min-heap to find the Top-K is a chained follow-up in this area.</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/kth-largest-element-in-an-array/">215. Kth Largest Element in an Array</a> — the classic heap problem (maintain a size-k min-heap); it can also be solved with quickselect, a nice way to contrast the two approaches.</li>
</ul>
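<p>The size-k min-heap idea can be sketched with the standard library&rsquo;s <code>heapq</code>. The function name <code>top_k_largest</code> and the choice to return the values in descending order are mine; the heap operations (<code>heappush</code>, <code>heapreplace</code>) are the real <code>heapq</code> calls.</p>

```python
import heapq


def top_k_largest(nums, k):
    """Return the k largest values in descending order, using a size-k min-heap."""
    if 0 >= k:
        return []
    heap = []                        # min-heap holding the k largest values seen so far
    for x in nums:
        if k > len(heap):
            heapq.heappush(heap, x)
        elif x > heap[0]:            # x beats the smallest of the current top k
            heapq.heapreplace(heap, x)   # pop the smallest, push x
    return sorted(heap, reverse=True)


print(top_k_largest([3, 2, 1, 5, 6, 4], 2))   # [6, 5]
```

<p>The heap never holds more than k items, so the cost is O(n log k) time and O(k) space, and the root <code>heap[0]</code> is the k-th largest value once the loop ends.</p>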
<h2 id="non-comparison-based-sorts">Non-comparison-based sorts</h2>
<p>Every algorithm so far relies on &ldquo;comparison&rdquo;, and any comparison-based sort needs Ω(n log n) comparisons in the worst case, which is why they&rsquo;re stuck at the O(n log n) line. The next three sidestep comparison, using <strong>the element values themselves</strong> as indices to place items, which lets them hit linear time. The price is requirements on the data. This is also the area where my own memory was blankest, so I&rsquo;ll go into a bit more detail.</p>
<h3 id="counting-sort">Counting Sort</h3>
<p><strong>One-line idea</strong>: count how many times each value occurs, then use a <em>prefix sum</em> to compute each value&rsquo;s position in the result and drop it straight in. Good for <strong>integers with a small value range k</strong>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">counting_sort</span>(arr):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> <span style="color:#f92672">not</span> arr:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> arr
</span></span><span style="display:flex;"><span>    lo, hi <span style="color:#f92672">=</span> min(arr), max(arr)
</span></span><span style="display:flex;"><span>    k <span style="color:#f92672">=</span> hi <span style="color:#f92672">-</span> lo <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    count <span style="color:#f92672">=</span> [<span style="color:#ae81ff">0</span>] <span style="color:#f92672">*</span> k
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> x <span style="color:#f92672">in</span> arr:                <span style="color:#75715e"># 1. count</span>
</span></span><span style="display:flex;"><span>        count[x <span style="color:#f92672">-</span> lo] <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> i <span style="color:#f92672">in</span> range(<span style="color:#ae81ff">1</span>, k):        <span style="color:#75715e"># 2. prefix sum: count[i] becomes &#34;number of elements &lt;= lo + i&#34;</span>
</span></span><span style="display:flex;"><span>        count[i] <span style="color:#f92672">+=</span> count[i <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>]
</span></span><span style="display:flex;"><span>    result <span style="color:#f92672">=</span> [<span style="color:#ae81ff">0</span>] <span style="color:#f92672">*</span> len(arr)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> x <span style="color:#f92672">in</span> reversed(arr):      <span style="color:#75715e"># 3. fill back-to-front to stay stable</span>
</span></span><span style="display:flex;"><span>        count[x <span style="color:#f92672">-</span> lo] <span style="color:#f92672">-=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>        result[count[x <span style="color:#f92672">-</span> lo]] <span style="color:#f92672">=</span> x
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> result
</span></span></code></pre></div><ul>
<li><strong>Complexity</strong>: O(n + k), where n is the element count and k is the value range. Space O(n + k).</li>
<li><strong>Stability / in-place</strong>: stable (the key is iterating <strong>back-to-front</strong> in step 3), not in-place.</li>
<li><strong>Interview notes</strong>: when k is far smaller than n (e.g. sorting a hundred thousand scores in 0–100), it crushes any O(n log n) sort. But once k is large (e.g. sorting arbitrary 32-bit integers), the space blows up — that&rsquo;s exactly its limit, and the problem radix sort exists to solve.</li>
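<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/sort-colors/">75. Sort Colors</a>: only three values (0, 1, 2), so counting sort handles it in two passes (count, then overwrite), and a three-way partition (the Dutch national flag scheme behind three-way quicksort) handles it in one.</li>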
</ul>
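<p>Stability only becomes visible when elements carry more than their key. Here is a small sketch that sorts records by an integer key; <code>counting_sort_by_key</code>, its <code>key</code> and <code>k</code> parameters, and the sample records are all made up for illustration, and it assumes every key is an integer in <code>range(k)</code>.</p>

```python
def counting_sort_by_key(items, key, k):
    """Stable counting sort of items whose key(item) is an int in range(k)."""
    count = [0] * k
    for item in items:                 # 1. count each key
        count[key(item)] += 1
    for i in range(1, k):              # 2. prefix sum: count[i] = number of keys <= i
        count[i] += count[i - 1]
    result = [None] * len(items)
    for item in reversed(items):       # 3. back-to-front keeps equal keys in input order
        idx = key(item)
        count[idx] -= 1
        result[count[idx]] = item
    return result


records = [("dana", 2), ("eli", 0), ("finn", 2), ("gus", 1), ("hana", 0)]
by_score = counting_sort_by_key(records, key=lambda r: r[1], k=3)
print(by_score)
```

<p>Records with the same score come out in the order they went in (eli before hana, dana before finn), which is exactly the property radix sort depends on.</p>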
<h3 id="radix-sort">Radix Sort</h3>
<p><strong>One-line idea</strong>: sort digit by digit. Starting from the least significant digit (ones place), run one <strong>stable</strong> counting sort per digit, all the way up to the most significant digit. Because each pass is stable, the whole thing is sorted once you finish the highest digit.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">radix_sort</span>(arr):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> <span style="color:#f92672">not</span> arr:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> arr
</span></span><span style="display:flex;"><span>    max_val <span style="color:#f92672">=</span> max(arr)
</span></span><span style="display:flex;"><span>    exp <span style="color:#f92672">=</span> <span style="color:#ae81ff">1</span>                           <span style="color:#75715e"># current digit: 1=ones, 10=tens, 100=hundreds...</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">while</span> max_val <span style="color:#f92672">//</span> exp <span style="color:#f92672">&gt;</span> <span style="color:#ae81ff">0</span>:
</span></span><span style="display:flex;"><span>        arr <span style="color:#f92672">=</span> counting_sort_by_digit(arr, exp)
</span></span><span style="display:flex;"><span>        exp <span style="color:#f92672">*=</span> <span style="color:#ae81ff">10</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> arr
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">counting_sort_by_digit</span>(arr, exp):
</span></span><span style="display:flex;"><span>    count <span style="color:#f92672">=</span> [<span style="color:#ae81ff">0</span>] <span style="color:#f92672">*</span> <span style="color:#ae81ff">10</span>                  <span style="color:#75715e"># base 10, each digit is only 0-9</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> x <span style="color:#f92672">in</span> arr:
</span></span><span style="display:flex;"><span>        count[(x <span style="color:#f92672">//</span> exp) <span style="color:#f92672">%</span> <span style="color:#ae81ff">10</span>] <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> i <span style="color:#f92672">in</span> range(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">10</span>):
</span></span><span style="display:flex;"><span>        count[i] <span style="color:#f92672">+=</span> count[i <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>]
</span></span><span style="display:flex;"><span>    result <span style="color:#f92672">=</span> [<span style="color:#ae81ff">0</span>] <span style="color:#f92672">*</span> len(arr)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> x <span style="color:#f92672">in</span> reversed(arr):           <span style="color:#75715e"># back-to-front to keep this digit&#39;s sort stable (key to radix sort&#39;s correctness)</span>
</span></span><span style="display:flex;"><span>        digit <span style="color:#f92672">=</span> (x <span style="color:#f92672">//</span> exp) <span style="color:#f92672">%</span> <span style="color:#ae81ff">10</span>
</span></span><span style="display:flex;"><span>        count[digit] <span style="color:#f92672">-=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>        result[count[digit]] <span style="color:#f92672">=</span> x
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> result
</span></span></code></pre></div><ul>
<li><strong>Complexity</strong>: O(d·(n + k)), where d is the number of digits in the largest value and k is the base (here, base 10, so k=10). Space O(n + k).</li>
<li><strong>Stability / in-place</strong>: stable, not in-place.</li>
<li><strong>Interview notes</strong>: it solves counting sort&rsquo;s &ldquo;space blows up when the range is large&rdquo; problem — by breaking a big integer into a few small digits. The version above only handles non-negative integers; to support negatives, shift everything to be non-negative first, or handle positives and negatives separately. Common interview questions: <strong>why must you go from low digit to high digit? Why must each digit&rsquo;s sort be stable?</strong> (Because sorting a higher digit relies on stability to preserve the order already established by the lower digits.)</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/maximum-gap/">164. Maximum Gap</a> — it demands linear time and space, and the standard solution is exactly radix sort or bucket sort.</li>
</ul>
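<p>The &ldquo;shift everything to be non-negative first&rdquo; trick for negatives can be sketched as below. To keep the block self-contained it uses ten plain lists per digit instead of the counting array from the version above (appending to a list is also stable, so the idea is the same); <code>radix_sort_nonnegative</code> and <code>radix_sort_signed</code> are names I made up for this sketch.</p>

```python
def radix_sort_nonnegative(arr):
    """LSD radix sort (base 10) for non-negative ints, one stable pass per digit."""
    arr = list(arr)
    largest = max(arr, default=0)
    exp = 1
    while largest // exp > 0:
        buckets = [[] for _ in range(10)]
        for x in arr:                          # appending preserves order, so the pass is stable
            buckets[(x // exp) % 10].append(x)
        arr = [x for bucket in buckets for x in bucket]
        exp *= 10
    return arr


def radix_sort_signed(arr):
    """Handle negatives by shifting so the minimum becomes 0, then shifting back."""
    if not arr:
        return []
    offset = min(arr)
    shifted = [x - offset for x in arr]        # every value is now non-negative
    return [x + offset for x in radix_sort_nonnegative(shifted)]


print(radix_sort_signed([170, -45, 75, -90, 0, 24, -2, 66]))
```

<p>The shift costs one extra pass over the data in each direction. Its downside is that a single very negative value inflates the largest shifted value, and with it the number of digit passes.</p>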
<h3 id="bucket-sort">Bucket Sort</h3>
<p><strong>One-line idea</strong>: distribute the data evenly into a number of &ldquo;buckets&rdquo; by value, sort each bucket internally, then concatenate the buckets in order. Good for <strong>uniformly distributed</strong> data.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">bucket_sort</span>(arr):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> <span style="color:#f92672">not</span> arr:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> arr
</span></span><span style="display:flex;"><span>    n <span style="color:#f92672">=</span> len(arr)
</span></span><span style="display:flex;"><span>    buckets <span style="color:#f92672">=</span> [[] <span style="color:#66d9ef">for</span> _ <span style="color:#f92672">in</span> range(n)]
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> x <span style="color:#f92672">in</span> arr:                     <span style="color:#75715e"># assume elements are uniformly distributed in [0, 1)</span>
</span></span><span style="display:flex;"><span>        buckets[int(n <span style="color:#f92672">*</span> x)]<span style="color:#f92672">.</span>append(x)
</span></span><span style="display:flex;"><span>    result <span style="color:#f92672">=</span> []
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> bucket <span style="color:#f92672">in</span> buckets:
</span></span><span style="display:flex;"><span>        insertion_sort(bucket)        <span style="color:#75715e"># stable sort within buckets keeps the whole thing stable</span>
</span></span><span style="display:flex;"><span>        result<span style="color:#f92672">.</span>extend(bucket)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> result
</span></span></code></pre></div><ul>
<li><strong>Complexity</strong>: average O(n + k) when the data is uniformly distributed, where k is the number of buckets (the code above uses k = n); the worst case (all elements crammed into one bucket) degrades to O(n²). Space O(n + k).</li>
<li><strong>Stability / in-place</strong>: depends on the per-bucket sort — stable if you use insertion sort; not in-place.</li>
<li><strong>Interview notes</strong>: its performance rides entirely on &ldquo;is the data uniformly distributed&rdquo;, which is the biggest difference from counting and radix. Counting and radix are insensitive to the shape of the data; bucket sort is sensitive to it. Classic use case: sorting a batch of floats uniformly distributed in [0, 1).</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/top-k-frequent-elements/">347. Top K Frequent Elements</a> — bucketing by frequency is the slickest solution to this one.</li>
</ul>
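<p>For the &ldquo;bucket by frequency&rdquo; idea in the LeetCode note, here is one way it might look. The function name <code>top_k_frequent</code> and the tie-breaking behaviour (first-seen order within a frequency) are my own choices for the sketch.</p>

```python
from collections import Counter


def top_k_frequent(nums, k):
    """Return up to k values, most frequent first, by bucketing values on their count."""
    if k < 1:
        return []
    freq = Counter(nums)
    # A value can appear at most len(nums) times, so index f holds values seen f times.
    buckets = [[] for _ in range(len(nums) + 1)]
    for value, f in freq.items():
        buckets[f].append(value)
    result = []
    for f in range(len(nums), 0, -1):          # walk from the highest frequency down
        for value in buckets[f]:
            result.append(value)
            if len(result) == k:
                return result
    return result


print(top_k_frequent([1, 1, 1, 2, 2, 3], 2))
```

<p>Because frequencies are integers bounded by n, the buckets are indexed directly by frequency and nothing is ever compared, so the whole thing runs in O(n) time.</p>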
<h2 id="what-gets-used-in-the-real-world-timsort">What gets used in the real world: Timsort</h2>
<p>Everything above is a &ldquo;textbook algorithm&rdquo;. But every time you call <code>sorted()</code>, what Python actually runs underneath is <strong>Timsort</strong> — a hybrid carefully tuned for real-world data. It&rsquo;s worth a section of its own, because being able to bring it up in an interview is often a plus.</p>
<p><strong>Core idea</strong>: real data is rarely fully random — it&rsquo;s often <em>partially sorted already</em>. Timsort seizes on this:</p>
<ol>
<li>First scan the array for naturally-sorted contiguous segments, called <strong>runs</strong>;</li>
<li>Pad runs that are too short up to a minimum length (<code>minrun</code>, usually 32–64) using <strong>insertion sort</strong> — as noted earlier, insertion sort is fastest on small arrays;</li>
<li>Then <strong>merge</strong> these runs pairwise following a set of rules, with a &ldquo;galloping&rdquo; mode to speed up the merges.</li>
</ol>
<p>So Timsort = <strong>the skeleton of merge sort + the small-chunk optimization of insertion sort + special-casing for already-sorted data</strong>.</p>
<ul>
<li><strong>Complexity</strong>: worst O(n log n), but down to O(n) on nearly-sorted data. Space O(n).</li>
<li><strong>Stability</strong>: stable. This is why Python&rsquo;s <code>sorted()</code> and <code>list.sort()</code> are guaranteed stable.</li>
<li><strong>Trivia</strong>: Java&rsquo;s object sort (<code>Arrays.sort(Object[])</code>) also uses a Timsort variant. C++&rsquo;s <code>std::sort</code> is typically implemented as a different hybrid, <strong>Introsort</strong>: quicksort as the base, switching to heap sort when recursion gets too deep to avoid degrading, and insertion sort on small chunks. That&rsquo;s &ldquo;quick + heap + insertion&rdquo; rolled into one, in the same spirit as Timsort: <strong>there&rsquo;s no silver bullet; production-grade sorts are all hybrids</strong>.</li>
<li><strong>LeetCode</strong>: <a href="https://leetcode.com/problems/merge-intervals/">56. Merge Intervals</a>: sort first, then sweep and merge. In Python that <code>sorted()</code> call is running Timsort, so input that already arrives mostly ordered by start time gets sorted in close to linear time.</li>
</ul>
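<p>One way to see the &ldquo;already-sorted data is cheap&rdquo; claim for yourself is to count how many comparisons <code>sorted()</code> makes. This is a rough experiment, not a benchmark: <code>count_comparisons</code> is a helper I made up, the exact counts depend on the Python version, and the seed and sizes are arbitrary.</p>

```python
import random
from functools import cmp_to_key


def count_comparisons(data):
    """Sort a copy of data with sorted() and return how many comparisons it made."""
    calls = 0

    def compare(a, b):
        nonlocal calls
        calls += 1
        return (a > b) - (a < b)       # -1, 0 or 1, like a classic cmp function

    sorted(data, key=cmp_to_key(compare))
    return calls


n = 1000
already_sorted = list(range(n))
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)     # fixed seed so the run is repeatable

sorted_cmps = count_comparisons(already_sorted)
shuffled_cmps = count_comparisons(shuffled)
print("already sorted:", sorted_cmps)
print("shuffled:      ", shuffled_cmps)
```

<p>On sorted input the whole list is a single run, so the comparison count grows roughly linearly with n, while the shuffled list needs on the order of n log n comparisons.</p>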
<h2 id="how-to-actually-choose--how-to-answer-in-interviews">How to actually choose / how to answer in interviews</h2>
<p>Here&rsquo;s everything above boiled down to a &ldquo;which one should I use&rdquo; checklist:</p>
<ul>
<li><strong>No special requirements, just want speed</strong> → quicksort (random pivot). The default for most situations.</li>
<li><strong>Need stable, and the worst case must stay O(n log n)</strong> → merge sort.</li>
<li><strong>Memory is extremely tight (need O(1) space) and can&rsquo;t degrade</strong> → heap sort.</li>
<li><strong>Very small data (a few dozen) or nearly sorted</strong> → insertion sort.</li>
<li><strong>Sorting a linked list</strong> → merge sort.</li>
<li><strong>Integers with a small value range</strong> → counting sort.</li>
<li><strong>Integers with a huge range, or fixed-length keys (e.g. fixed-width integers / strings)</strong> → radix sort.</li>
<li><strong>Data uniformly distributed over an interval</strong> → bucket sort.</li>
<li><strong>Only need the k-th largest / the median, not a full sort</strong> → quickselect.</li>
<li><strong>Data too big to fit in memory</strong> → external merge sort.</li>
</ul>
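<p>Quickselect shows up in that checklist without any code, so here is a minimal sketch of it, assuming a random pivot and a three-way partition. The name <code>quickselect_kth_largest</code> and its <code>rng</code> parameter are hypothetical, and it works on a copy so the caller&rsquo;s list is left alone.</p>

```python
import random


def quickselect_kth_largest(nums, k, rng=random):
    """Return the k-th largest element (1-indexed); average O(n) time."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")
    items = list(nums)
    target = len(items) - k            # index of the answer in ascending order
    lo, hi = 0, len(items) - 1
    while True:
        pivot = items[rng.randint(lo, hi)]
        # Three-way partition of items[lo..hi]: smaller | equal | larger than pivot
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        if target < lt:                # answer lies among the smaller elements
            hi = lt - 1
        elif target > gt:              # answer lies among the larger elements
            lo = gt + 1
        else:                          # target falls inside the block equal to pivot
            return pivot


print(quickselect_kth_largest([3, 2, 1, 5, 6, 4], 2))
```

<p>Unlike quicksort it only recurses (here, loops) into the one side that contains the answer, which is where the average O(n) comes from; the worst case is still O(n²) with unlucky pivots.</p>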
<p>A few common chained follow-ups — have the answers ready:</p>
<ul>
<li><strong>&ldquo;Which sorts are stable?&rdquo;</strong> → bubble, insertion, merge, counting, radix, bucket (when the per-bucket sort is stable), Timsort.</li>
<li><strong>&ldquo;Quicksort&rsquo;s worst case and how to avoid it?&rdquo;</strong> → with a naive first- or last-element pivot, already-sorted input degrades it to O(n²); use a random pivot or median-of-three.</li>
<li><strong>&ldquo;Can you beat O(n log n)?&rdquo;</strong> → comparison sorts can&rsquo;t: a decision tree has to distinguish all n! orderings, so its height is at least log₂(n!) = Θ(n log n). But if the data is integers in a bounded range, non-comparison sorts get you to O(n).</li>
<li><strong>&ldquo;Is there a sort that&rsquo;s O(n log n), stable, <em>and</em> in-place?&rdquo;</strong> → not among the standard ones: merge is stable but not in-place, heap is in-place but not stable, quick is in-place but not stable. In-place stable variants do exist (block merge sort), but they are complicated and rarely what the interviewer is after. A great prompt for testing whether you understand the trade-offs.</li>
</ul>
<h2 id="a-few-common-mistakes">A few common mistakes</h2>
<p>Pitfalls I stepped in (or nearly did) while reviewing:</p>
<ul>
<li><strong>Selection sort doesn&rsquo;t speed up on sorted input</strong> — it has no early-exit, so it&rsquo;s always O(n²). Don&rsquo;t confuse it with bubble / insertion.</li>
<li><strong>Building a heap is O(n), not O(n log n)</strong>. Intuitively it looks like n elements at O(log n) each, but a careful count (most nodes are near the bottom) gives O(n). A common counterintuitive question.</li>
<li><strong>Counting / radix sort fill the result back-to-front</strong> — that step is the source of stability; do it the other way and it&rsquo;s unstable, and an unstable radix sort is simply wrong.</li>
<li><strong>Quicksort&rsquo;s space isn&rsquo;t O(1)</strong> — it partitions in place, but the recursion stack is O(log n) on average and O(n) at worst.</li>
<li><strong>Bucket sort&rsquo;s worst case is O(n²)</strong> — don&rsquo;t only remember the O(n) average; it degrades when the distribution is skewed.</li>
<li><strong>Stable ≠ in-place</strong> — these are two independent dimensions, often asked together in interviews. Don&rsquo;t conflate them.</li>
</ul>
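<p>The &ldquo;building a heap is O(n)&rdquo; pitfall is easier to believe after counting the work. Below is a small sketch that builds a max-heap bottom-up and counts swaps; <code>build_max_heap_counting</code> is a made-up helper with the sift-down loop inlined, and the input size is arbitrary.</p>

```python
def build_max_heap_counting(arr):
    """Bottom-up heapify in place; returns the number of swaps performed."""
    n = len(arr)
    swaps = 0
    for root in range(n // 2 - 1, -1, -1):     # last internal node up to the root
        i = root
        while True:                            # sift arr[i] down to its place
            largest = i
            left, right = 2 * i + 1, 2 * i + 2
            if left < n and arr[left] > arr[largest]:
                largest = left
            if right < n and arr[right] > arr[largest]:
                largest = right
            if largest == i:
                break
            arr[i], arr[largest] = arr[largest], arr[i]
            swaps += 1
            i = largest
    return swaps


n = 4096
data = list(range(n))                          # ascending input forces many sift-downs
swaps = build_max_heap_counting(data)
print(n, swaps)
```

<p>Each node can sink at most as far as its own height, and about half the nodes are leaves with height 0, so the total number of swaps stays below n rather than anywhere near n log n.</p>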
<h2 id="wrapping-up">Wrapping up</h2>
<p>The framework for this topic is actually pretty clear:</p>
<ul>
<li><strong>Comparison sorts</strong> are stuck at O(n log n); among them quicksort is fastest but degrades, merge is stable but space-hungry, heap is in-place but unstable — <strong>there&rsquo;s no all-rounder, it&rsquo;s all trade-offs</strong>;</li>
<li><strong>Non-comparison sorts</strong> buy linear time by imposing requirements on the data: integer keys, a bounded range, or a uniform distribution;</li>
<li><strong>Production-grade sorts (Timsort, Introsort) are all hybrids</strong>, stitching together the strengths of several algorithms.</li>
</ul>
<p>If, like me, you&rsquo;re picking this back up after a few years, I&rsquo;d suggest running through that cheat sheet at the top: skip anything you can implement from memory and explain the complexity and stability of, and go back to the relevant section for whatever you stumble on. Next time you&rsquo;re prepping for interviews, just come back and skim it again.</p>
<p>Good luck with your interviews.</p>
]]></content:encoded></item></channel></rss>