Create wiki/research-notes/signal-processing-workflow.md
f2a041935aa6 harrisonqian 2026-04-12 1 file
new file mode 100644
index 0000000..80cefeb
@@ -0,0 +1,69 @@
+---
+visibility: public-edit
+---
+
+# EEG/signal processing approach at a neurotech startup
+
+over a summer internship at a neurotech startup, i went from knowing nothing about EEG to running signal processing pipelines, debugging hardware, and designing experiments. this is the workflow that emerged — not textbook-clean, but real.
+
+## the problem space
+
+we were trying to detect and measure brain responses to stimuli — visual evoked potentials (VEPs), event-related potentials (ERPs), and various other signals. the core challenge: EEG data is incredibly noisy, and the signals we cared about are tiny.
+
+key signal types:
+- **VEPs**: brain response to visual stimuli (checkerboard patterns, flashes). key peaks at N75, P100, N145.
+- **ERPs**: brain response to any stimulus (visual, audio, motor). P300 was the easiest to detect.
+- **alpha waves**: oscillations in the 8-13Hz range, measurable with eyes closed.
+- **frontal alpha asymmetry**: difference in alpha power between left and right frontal regions.
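+
+the asymmetry metric above boils down to alpha-band power on each side. a minimal sketch with numpy/scipy — not our actual code; the sampling rate, channel stand-ins, and the log-ratio convention here are all assumptions:
+
+```python
+import numpy as np
+from scipy.signal import welch
+
+fs = 250  # assumed sampling rate in Hz
+
+def alpha_power(signal, fs):
+    # Welch PSD, then integrate power over the 8-13Hz alpha band
+    freqs, psd = welch(signal, fs=fs, nperseg=fs * 2)
+    band = (freqs >= 8) & (freqs <= 13)
+    return np.trapz(psd[band], freqs[band])
+
+rng = np.random.default_rng(0)
+left_f3 = rng.standard_normal(fs * 10)   # stand-in for a left frontal channel
+right_f4 = rng.standard_normal(fs * 10)  # stand-in for a right frontal channel
+
+# one common convention: ln(right alpha) - ln(left alpha)
+faa = np.log(alpha_power(right_f4, fs)) - np.log(alpha_power(left_f3, fs))
+```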
+
+## the processing pipeline
+
+### filtering
+- bandpass filter: 1-100Hz (sometimes 0.1-10Hz or 0.8-100Hz depending on what we were looking for)
+- notch filter at 60Hz for electrical noise
+- tried both hand-coded and library implementations (MNE)
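+
+the hand-coded version of that filtering step is small. a sketch with scipy (zero-phase butterworth bandpass plus a mains notch; the 250Hz sampling rate and filter orders are assumptions, and the test signal is synthetic):
+
+```python
+import numpy as np
+from scipy.signal import butter, iirnotch, sosfiltfilt, tf2sos
+
+def preprocess(raw, fs, low=1.0, high=100.0, notch=60.0):
+    # 4th-order butterworth bandpass, applied forward-backward for zero phase
+    sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
+    out = sosfiltfilt(sos, raw)
+    # narrow IIR notch at the mains frequency
+    b, a = iirnotch(notch, Q=30, fs=fs)
+    return sosfiltfilt(tf2sos(b, a), out)
+
+fs = 250
+t = np.arange(0, 4, 1 / fs)
+raw = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)  # 10Hz signal + 60Hz mains
+clean = preprocess(raw, fs)
+```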
+
+### artifact handling
+- blinks and movements were the biggest contaminants
+- MNE had artifact rejection tools but getting them working was non-trivial
+- environmental noise: footsteps, electrical devices, even heartbeats were picked up
+- "others' movements — it picks up on footsteps. your movements — move your head slightly and it messes up the data"
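+
+the crudest version of artifact rejection (before reaching for MNE's tools) is just a peak-to-peak threshold per epoch. a hand-rolled sketch — the 100µV threshold and the synthetic "blink" are illustrative assumptions, not our tuned values:
+
+```python
+import numpy as np
+
+def reject_epochs(epochs, max_ptp=100.0):
+    # drop any epoch whose peak-to-peak amplitude exceeds the threshold;
+    # blinks and head movements are typically far larger than cortical signal
+    ptp = epochs.max(axis=-1) - epochs.min(axis=-1)
+    keep = ptp <= max_ptp
+    return epochs[keep], keep
+
+rng = np.random.default_rng(1)
+epochs = rng.standard_normal((10, 500)) * 10  # 10 trials in "microvolts"
+epochs[3, 200:260] += 400                     # simulate a blink in trial 3
+clean, keep = reject_epochs(epochs)
+```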
+
+### epoch averaging
+- average over 100-200 trials per stimulus to improve signal-to-noise
+- "if our data were perfect we wouldn't have to do this but it's chill"
+- considered z-score averaging to preserve relative magnitude in spiking
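+
+the plain version of that averaging step, sketched in numpy — cut a window around each stimulus onset, baseline-correct on the pre-stimulus interval, then average (window bounds, sampling rate, and the noise-only demo data are assumptions):
+
+```python
+import numpy as np
+
+def epoch_average(data, stim_samples, fs, tmin=-0.1, tmax=0.5):
+    # uncorrelated noise in the average shrinks roughly as 1/sqrt(n_trials)
+    pre, post = int(-tmin * fs), int(tmax * fs)
+    epochs = np.stack([data[s - pre : s + post]
+                       for s in stim_samples
+                       if s - pre >= 0 and s + post <= len(data)])
+    # subtract each epoch's pre-stimulus mean (baseline correction)
+    epochs = epochs - epochs[:, :pre].mean(axis=1, keepdims=True)
+    return epochs.mean(axis=0)
+
+fs = 250
+rng = np.random.default_rng(0)
+data = rng.standard_normal(fs * 60)            # a minute of noise-only "EEG"
+stims = list(range(fs, fs * 58, fs // 2))      # fake onsets every 0.5s, ~115 trials
+avg = epoch_average(data, stims, fs)
+```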
+
+### plotting and analysis
+- filtered data overlaid with stimulus timestamps
+- FFT / power spectral density
+- averaged responses over epochs
+- peak detection (automated and manual)
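+
+automated peak detection on an averaged trace can be as simple as "largest positive deflection in a latency window" — e.g. looking for P100 between the N75 and N145 troughs. a sketch with scipy (the synthetic gaussian "VEP" and the window bounds are assumptions for illustration):
+
+```python
+import numpy as np
+from scipy.signal import find_peaks
+
+def find_p100(avg, fs, tmin=-0.1):
+    # largest positive local maximum between 75 and 145 ms post-stimulus
+    t = np.arange(len(avg)) / fs + tmin
+    idx = np.flatnonzero((t >= 0.075) & (t <= 0.145))
+    peaks, _ = find_peaks(avg[idx])
+    if len(peaks) == 0:
+        return None
+    best = peaks[np.argmax(avg[idx][peaks])]
+    return t[idx][best]  # latency in seconds
+
+fs = 250
+t = np.arange(-0.1, 0.5, 1 / fs)
+vep = np.exp(-((t - 0.100) ** 2) / (2 * 0.01 ** 2))  # fake bump near 100ms
+latency = find_p100(vep, fs)  # lands near 0.1s
+```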
+
+## what i actually learned
+
+### implementation approaches (ranked by usefulness)
+1. **library + vibe coded (MNE)**: most robust, best artifact handling
+2. **hand coded**: bin average, rolling average — good for understanding what's happening
+3. **fully vibe coded**: interpolation lines, rolling averages — sometimes produced plausible but wrong results
+
+### the LLM trap
+"data analysis is hard because: big data, for robustness can't really use LLMs that much, every time you use them a lot you fall into a pitfall that makes you cooked."
+
+this was a hard lesson. LLMs are great for boilerplate code but terrible for signal processing decisions. they'll confidently suggest parameters that produce clean-looking but meaningless results. had to learn to [[read papers|wiki/research-notes/reading-papers]] and understand the actual science.
+
+### the iteration loop
+"lots of iteration. lots of failure. graphs that don't look right, code that doesn't work. need to get correct times. interpreting graphs — sometimes good but looks bad, sometimes bad but looks good."
+
+the workflow was: write script → run → look at output → realize something is wrong → research → fix → repeat. "writing script robustly actually takes time" — couldn't just hack something together and move on.
+
+## the current testing procedure (by the end)
+
+- conditions: one eye, fixate at center, dark room
+- data processing: bandpass with low cutoff between 0.1 and 0.8Hz and high cutoff between 10 and 100Hz depending on the target signal, notch at 60Hz
+- plot: filtered data with stimuli, FFT, averaged over epochs
+- validation: alpha band testing, self-testing with flashing
+
+---
+
+*see also: [[debugging hardware|wiki/research-notes/debugging-hardware]], [[experiment design|wiki/research-notes/experiment-design]], [[reading papers|wiki/research-notes/reading-papers]]*
\ No newline at end of file