index 0340cab..052845e 100644
@@ -10,7 +10,7 @@ the digital world is built on math. not approximately, not metaphorically — li
in 1937, claude shannon's master's thesis showed that boolean algebra (true/false, AND/OR/NOT) could be implemented with electrical switches. this single insight is the foundation of all digital computing. every transistor in your phone is computing a boolean function. a modern chip has billions of them, all doing math.
-the instruction your CPU is executing right now — reading a value from memory, comparing two numbers, jumping to a different part of the program — is a sequence of boolean operations on binary numbers. [arithmetic](/wiki/immediate/arithmetic-everywhere) at the hardware level. and boolean algebra is really [[structural/set-theory-as-thinking|set theory]] in disguise — AND is intersection, OR is union, NOT is complement. the entire digital world rests on set operations.
+the instruction your CPU is executing right now — reading a value from memory, comparing two numbers, jumping to a different part of the program — is a sequence of boolean operations on binary numbers. [[arithmetic-everywhere|arithmetic]] at the hardware level. and boolean algebra is really [[set-theory-as-thinking|set theory]] in disguise — AND is intersection, OR is union, NOT is complement. the entire digital world rests on set operations.
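the correspondence can be checked directly. a minimal sketch (values invented for illustration): treat an integer's 1-bits as a set of positions, and the bitwise operators behave exactly like set operations.

```python
a = 0b1100  # 1-bits at positions {2, 3}
b = 0b1010  # 1-bits at positions {1, 3}

def bits(n):
    """positions of the 1-bits in n, viewed as a set."""
    return {i for i in range(n.bit_length()) if n >> i & 1}

assert bits(a & b) == bits(a) & bits(b)          # AND is intersection
assert bits(a | b) == bits(a) | bits(b)          # OR is union
mask = 0b1111                                     # the "universe" of positions
assert bits(mask & ~a) == bits(mask) - bits(a)   # NOT (within a universe) is complement
print(bits(a & b))  # → {3}
```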
## algorithms and complexity
@@ -20,7 +20,7 @@ complexity theory classifies problems by how their difficulty scales:
- **O(1)**: constant time. looking up an element in a hash table. doesn't matter how big the table is.
- **O(log n)**: binary search. searching a sorted list of a million items takes ~20 steps.
- **O(n)**: linear. reading every element once.
-- **O(n log n)**: sorting. the best you can do for comparison-based sorting (connects to [ordering](/wiki/immediate/ordering-and-comparison)).
+- **O(n log n)**: sorting. the best you can do for comparison-based sorting (connects to [[ordering-and-comparison|ordering]]).
- **O(n²)**: quadratic. naive pairwise comparison. starts hurting around n = 10,000.
- **O(2ⁿ)**: exponential. brute-force search. at n = 300, there are more possibilities (~10⁹⁰) than atoms in the observable universe (~10⁸⁰).
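to make the O(log n) row concrete, here is a toy sketch (not from the original) that counts comparisons while binary-searching a sorted list of a million items:

```python
def binary_search_steps(sorted_list, target):
    """return (index, comparisons) — halving the search space each step."""
    lo, hi, steps = 0, len(sorted_list) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid, steps
        if sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps

idx, steps = binary_search_steps(list(range(1_000_000)), 765_432)
print(steps)  # at most ~20 comparisons for a million items
```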
@@ -32,7 +32,7 @@ the security of the internet rests on a mathematical asymmetry: multiplying two
RSA encryption works because of this. your browser uses it right now. the math is number theory — a branch of "pure" math considered useless for centuries. G.H. Hardy, the famous number theorist, wrote proudly that his work had no practical applications. he was wrong — number theory now secures trillions of dollars in transactions.
-this is a beautiful example of [abstraction as power](/wiki/structural/abstraction-as-power): the most abstract math becoming the most practical.
+this is a beautiful example of [[abstraction-as-power|abstraction as power]]: the most abstract math becoming the most practical.
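the asymmetry is easy to feel even at toy scale. a minimal sketch (primes chosen for illustration, vastly smaller than real RSA moduli): multiplying is one operation, while naive factoring has to search.

```python
p, q = 104_723, 104_729          # two primes — far too small for real cryptography
n = p * q                        # the easy direction: instant

def factor(n):
    """naive trial division — the 'hard direction', feasible only at toy scale."""
    d = 3
    while d * d <= n:            # n is odd here, so skip even divisors
        if n % d == 0:
            return d, n // d
        d += 2
    return n, 1

print(factor(n))  # recovers (104723, 104729) after ~50,000 trial divisions
```

real RSA moduli are 2048+ bits; trial division (and every known classical algorithm) becomes hopeless long before that size.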
## information theory
@@ -40,15 +40,15 @@ shannon (again) founded information theory in 1948. the key idea: information ca
shannon's theorems set fundamental limits on communication: there's a maximum rate at which you can transmit information through a noisy channel, and no encoding scheme can beat it. every wifi standard, every cell phone protocol, every streaming video codec is engineered to approach these mathematical limits.
-the connection to [probability](/wiki/immediate/probability-in-daily-life): information theory is deeply connected to probability. the "surprise" of an event is -log₂(probability). unlikely events carry more information than likely ones. "the sun rose today" is low-information; "a meteor hit the earth" is high-information. entropy — the average surprise — measures uncertainty.
+the connection to [[probability-in-daily-life|probability]] runs deep: the "surprise" of an event is -log₂(probability). unlikely events carry more information than likely ones. "the sun rose today" is low-information; "a meteor hit the earth" is high-information. entropy — the average surprise — measures uncertainty.
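the formulas above translate directly into code. a minimal sketch (the example probabilities are assumptions for illustration):

```python
import math

def surprise(p):
    """information content of an event with probability p, in bits."""
    return -math.log2(p)

def entropy(dist):
    """average surprise of a probability distribution."""
    return sum(p * surprise(p) for p in dist if p > 0)

print(surprise(0.999))      # "the sun rose today": ≈ 0.0014 bits
print(surprise(1e-9))       # "a meteor hit the earth": ≈ 29.9 bits
print(entropy([0.5, 0.5]))  # fair coin: 1.0 bit of uncertainty per flip
```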
## machine learning and linear algebra
-modern AI is mostly [linear algebra](/wiki/structural/linear-algebra-as-thinking) and [calculus](/wiki/structural/calculus-as-thinking). a neural network is a sequence of matrix multiplications and nonlinear functions. training is gradient descent — [multivariable calculus](/wiki/structural/multivariable-calculus-as-thinking) applied to a loss function with millions of parameters.
+modern AI is mostly [[linear-algebra-as-thinking|linear algebra]] and [[calculus-as-thinking|calculus]]. a neural network is a sequence of matrix multiplications and nonlinear functions. training is gradient descent — [[multivariable-calculus-as-thinking|multivariable calculus]] applied to a loss function with millions of parameters.
the semantic space example from my essay is a direct application: word embeddings represent meanings as vectors, and the geometry of the vector space captures semantic relationships. "woman" + "king" - "man" ≈ "queen" works because the vector arithmetic in embedding space mirrors conceptual relationships.
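a toy sketch of that arithmetic, with three-dimensional vectors invented for illustration (real embeddings have hundreds of dimensions and are learned, not hand-written):

```python
import math

# axes, loosely: [royalty, maleness, femaleness] — assumed toy values
emb = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

def add(u, v): return [a + b for a, b in zip(u, v)]
def sub(u, v): return [a - b for a, b in zip(u, v)]

def cosine(u, v):
    """similarity of direction, ignoring vector length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# king - man + woman: strip maleness from royalty, add femaleness
target = add(sub(emb["king"], emb["man"]), emb["woman"])
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)  # → queen
```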
-[[structural/topology-as-thinking|topological data analysis]] is an emerging tool — using persistent homology to find the "shape" of high-dimensional datasets that linear methods miss.
+[[topology-as-thinking|topological data analysis]] is an emerging tool — using persistent homology to find the "shape" of high-dimensional datasets that linear methods miss.
## the deep point