EEG/signal processing approach at a neurotech startup

during a summer internship at a neurotech startup, i went from knowing nothing about EEG to running signal processing pipelines, debugging hardware, and designing experiments. this is the workflow that emerged — not textbook-clean, but real.

the problem space

we were trying to detect and measure brain responses to stimuli — visual evoked potentials (VEPs), event-related potentials (ERPs), and various other signals. the core challenge: EEG data is incredibly noisy, and the signals we cared about are tiny.

key signal types:

  • VEPs: brain response to visual stimuli (checkerboard patterns, flashes). key peaks at N75, P100, N145.
  • ERPs: brain response to any stimulus (visual, audio, motor). P300 was the easiest to detect.
  • alpha waves: oscillations in the 8-13Hz range, measurable with eyes closed.
  • frontal alpha asymmetry: difference in alpha power between left and right frontal regions.
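
for reference, the asymmetry score is conventionally a log-ratio: ln(right alpha power) minus ln(left alpha power), usually from homologous frontal sites like F4 vs F3. a tiny sketch (function name is mine):

```python
import numpy as np

def frontal_alpha_asymmetry(alpha_right, alpha_left):
    """conventional FAA score: ln(right alpha power) - ln(left alpha power),
    typically computed from homologous frontal sites (e.g. F4 vs F3).
    positive values mean relatively more right-hemisphere alpha."""
    return np.log(alpha_right) - np.log(alpha_left)
```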

the processing pipeline

filtering

  • bandpass filter: 1-100Hz (sometimes 0.1-10Hz or 0.8-100Hz depending on what we were looking for)
  • notch filter at 60Hz for electrical noise
  • tried both hand-coded and library implementations (MNE)
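
a minimal hand-rolled version of that filtering stage using scipy rather than MNE (parameters are illustrative, not the exact ones we used):

```python
import numpy as np
from scipy.signal import butter, iirnotch, filtfilt, sosfiltfilt

def preprocess(raw, fs, band=(1.0, 100.0), notch_hz=60.0):
    """zero-phase bandpass, then a narrow notch at mains frequency,
    on a 1-D EEG trace. parameters illustrative, not tuned."""
    sos = butter(4, band, btype="bandpass", output="sos", fs=fs)
    x = sosfiltfilt(sos, raw)                    # zero-phase bandpass
    b, a = iirnotch(notch_hz, Q=30.0, fs=fs)     # narrow 60Hz notch
    return filtfilt(b, a, x)

# synthetic sanity check: 10Hz "alpha" plus 60Hz mains noise
fs = 500.0
t = np.arange(0.0, 2.0, 1.0 / fs)
sig = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)
clean = preprocess(sig, fs)
```

the synthetic check is the habit worth keeping: feed the pipeline a signal you constructed, and confirm the 60Hz component is gone while the 10Hz component survives.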

artifact handling

  • blinks and movements were the biggest contaminants
  • MNE had artifact rejection tools but getting them working was non-trivial
  • environmental noise: footsteps, electrical devices, even heartbeats were picked up
  • "others' movements — it picks up on footsteps. your movements — move your head slightly and it messes up the data"
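
one hand-coded stand-in for the MNE rejection we struggled with: drop any epoch whose peak-to-peak amplitude looks blink-sized. the threshold and names here are mine, not the values we actually used:

```python
import numpy as np

def reject_artifacts(epochs, ptp_uv=150.0):
    """drop epochs whose peak-to-peak amplitude exceeds a blink-sized
    threshold (in µV). a hand-rolled stand-in for a library reject criterion."""
    ptp = epochs.max(axis=1) - epochs.min(axis=1)
    keep = ptp < ptp_uv
    return epochs[keep], keep

rng = np.random.default_rng(0)
epochs = rng.normal(0.0, 10.0, size=(50, 250))  # 50 epochs of EEG-ish noise, in µV
epochs[3, 100:120] += 400.0                     # simulate a blink transient in epoch 3
clean, keep = reject_artifacts(epochs)
```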

epoch averaging

  • average over 100-200 trials per stimulus to improve signal-to-noise
  • "if our data were perfect we wouldn't have to do this but it's chill"
  • considered z-score averaging to preserve relative magnitude in spiking
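
the averaging step, sketched, with the z-score variant behind a flag (the slicing convention and names are mine):

```python
import numpy as np

def average_epochs(x, onsets, fs, tmin=-0.1, tmax=0.4, zscore=False):
    """slice epochs around stimulus onsets (sample indices) and average.
    zscore=True normalizes each epoch first, the variant we considered
    for preserving relative magnitude in spiking."""
    pre, post = int(-tmin * fs), int(tmax * fs)
    ep = np.stack([x[o - pre:o + post] for o in onsets])
    if zscore:
        ep = (ep - ep.mean(axis=1, keepdims=True)) / ep.std(axis=1, keepdims=True)
    return ep.mean(axis=0)

# synthetic sanity check: a small evoked bump buried in noise, ~56 trials
fs = 250
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 60 * fs)
onsets = np.arange(500, len(x) - 500, fs)   # one "stimulus" per second
bump = 2.0 * np.hanning(25)                 # evoked response at each onset
for o in onsets:
    x[o:o + 25] += bump
avg = average_epochs(x, onsets, fs)
```

averaging n trials cuts the noise floor by roughly sqrt(n), which is exactly why 100-200 trials made the tiny evoked peaks visible at all.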

plotting and analysis

  • filtered data overlaid with stimulus timestamps
  • FFT / power spectral density
  • averaged responses over epochs
  • peak detection (automated and manual)
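
the FFT / PSD check can be as simple as a Welch estimate plus a band mean, e.g. confirming the eyes-closed alpha peak (a scipy sketch, names mine):

```python
import numpy as np
from scipy.signal import welch

def alpha_power(x, fs):
    """Welch PSD, then mean power in the 8-13Hz alpha band."""
    f, p = welch(x, fs=fs, nperseg=int(2 * fs))  # 2s segments -> 0.5Hz resolution
    band = (f >= 8.0) & (f <= 13.0)
    return f, p, p[band].mean()

# eyes-closed-ish synthetic trace: 10Hz alpha riding on broadband noise
fs = 250.0
t = np.arange(0.0, 20.0, 1.0 / fs)
rng = np.random.default_rng(2)
eyes_closed = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.normal(size=t.size)
f, p, alpha = alpha_power(eyes_closed, fs)
```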

what i actually learned

implementation approaches (ranked by usefulness)

  1. library + vibe coded (MNE): most robust, best artifact handling
  2. hand coded: bin average, rolling average — good for understanding what's happening
  3. fully vibe coded: interpolation lines, rolling averages — sometimes produced plausible but wrong results

the LLM trap

"data analysis is hard because: big data, for robustness can't really use LLMs that much, every time you use them a lot you fall into a pitfall that makes you cooked."

this was a hard lesson. LLMs are great for boilerplate code but terrible for signal processing decisions. they'll confidently suggest parameters that produce clean-looking but meaningless results. had to learn to actually read papers (see reading-papers) and understand the science.

the iteration loop

"lots of iteration. lots of failure. graphs that don't look right, code that doesn't work. need to get correct times. interpreting graphs — sometimes good but looks bad, sometimes bad but looks good."

the workflow was: write script → run → look at output → realize something is wrong → research → fix → repeat. "writing script robustly actually takes time" — couldn't just hack something together and move on.

the current testing procedure (by the end)

  • conditions: one eye, fixate at center, dark room
  • data processing: bandpass with the low cutoff somewhere in 0.1-0.8Hz and the high cutoff in 10-100Hz depending on the target signal, notch at 60Hz
  • plot: filtered data with stimuli, FFT, averaged over epochs
  • validation: alpha band testing, self-testing with flashing

see also: debugging-hardware, experiment-design, reading-papers
