@harrisonqian / Work Reflections / wiki/research-notes/experiment-design.md
---
visibility: public-edit
---

# how to design tests, iterate, and ensure robustness

from running dozens of experiments at a [[neurotech startup|signal-processing-workflow]] — on brains, on hardware, on signals — i developed a sense for what separates experiments that produce useful data from experiments that produce noise.

## the experiment design framework

### define what you're measuring before you start

the most common failure: "not insanely intentional with testing. could do much better with understanding what is going on and for what reason."

concretely, before any test:

- what signal am i looking for?
- what would success look like in the data?
- what would failure look like?
- what are the confounds?

for our visual evoked potential tests, the checklist was:

- one eye (monocular)
- fixate on central target
- dark room
- 70-100cm from screen
- consider contrast
- which test type (pattern reversal at 2Hz, onset/offset, flash)

### control one variable at a time

we made the mistake of changing multiple things between tests — new stimulus pattern AND new filtering AND new electrode placement. when results changed, we couldn't tell why.

the better approach: "do something to get a VEP on myself first." test the simplest possible case. if that doesn't work, the problem is fundamental. if it does work, add complexity one variable at a time.

### the validation ladder

from simplest to most complex:

1. can you see alpha waves with eyes closed? (if no, hardware is broken)
2. can you see a response to a flash on yourself? (if no, timing or processing is broken)
3. can you see a response on another person? (if no, setup or parameters might be off)
4. can you reproduce results on a second trial? (if no, noise is too high)
5. can you see differences between conditions? (this is the actual experiment)

skipping to step 5 without passing steps 1-4 was a mistake i made repeatedly.

## iteration patterns

### the research → test → analyze loop

"lots of iteration. lots of failure."
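rung 1 of the ladder above is cheap to automate before any iteration pass. a minimal sketch in python, assuming raw single-channel EEG as a numpy array; the sampling rate, the 1.5x ratio threshold, and the function names here are placeholders, not values from our actual setup:

```python
# rung 1 of the validation ladder: alpha (8-12 Hz) power should rise
# noticeably when eyes are closed. quick band-power comparison with scipy.
# sketch only: fs, the band edges, and the ratio threshold are assumptions.
import numpy as np
from scipy.signal import welch

def band_power(x, fs, lo, hi):
    """average PSD power in [lo, hi] Hz via Welch's method."""
    freqs, psd = welch(x, fs=fs, nperseg=min(len(x), 2 * int(fs)))
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

def alpha_check(eyes_open, eyes_closed, fs=250.0, ratio=1.5):
    """True if eyes-closed alpha power exceeds eyes-open by `ratio`."""
    p_open = band_power(eyes_open, fs, 8.0, 12.0)
    p_closed = band_power(eyes_closed, fs, 8.0, 12.0)
    return p_closed > ratio * p_open
```

if this gate fails, skip steps 2-5 and check the hardware first.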
the actual workflow:

1. research: read standards, look at what parameters others use
2. set up: configure hardware, write stimulus script, prepare environment
3. run: execute the test, usually 15-45 minutes
4. analyze: process data, plot, look at results
5. interpret: is this real signal or noise? compare to expected results.
6. adjust: change one parameter and go back to step 3

steps 3-6 repeat dozens of times. "tested checkerboard on myself, tried flash on myself, tried checkerboard on another person, old stuff was all noise."

### when to pivot vs persist

"just check differences and try to maximize it" vs "maybe they are just good, we are reasonably satisfied." the tension between perfectionism and pragmatism.

heuristic: if you've tried 5 different parameter combinations and none work, the problem is probably not the parameters. step back and question the approach.

"tried many training things, didn't do much" — knowing when to pivot to a completely different method rather than tweaking the current one.

## the timing problem

timestamp synchronization was a recurring nightmare. each device (EEG headset, stimulus presentation, sensors) has its own clock. getting them aligned:

- tried using a wire to send sync pulses — "doesn't work because it sends a pulse that is picked up by the headset in a huge spike"
- tried software timestamps — "current script might not be using some settings that are important"
- tried logging approach — eventually worked but required careful validation

"need to get correct times" — without precise timing, epoch-averaging is meaningless. this is an unsexy but critical part of [[experiment design|experiment-design]] that textbooks don't emphasize enough.

## designing for robustness

### the subject experience

"if want longer tests, distraction — brain not focusing."
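an aside on the timing problem above, since the logging approach deserves a concrete shape: fit one clock against the other, then cut epochs in the corrected timebase. a minimal sketch; the sync pairs, sampling rate, and epoch window below are placeholders, not our real parameters:

```python
# align stimulus-clock times to headset-clock times with a linear fit
# (handles both constant offset and slow drift), then cut and average
# epochs around each corrected stimulus onset.
# sketch only: sync pairs, fs, and the epoch window are assumptions.
import numpy as np

def fit_clock(stim_sync, eeg_sync):
    """least-squares map from stimulus clock to EEG clock."""
    slope, intercept = np.polyfit(stim_sync, eeg_sync, 1)
    return lambda t: slope * np.asarray(t) + intercept

def epoch_average(eeg, fs, onsets_eeg_time, pre=0.1, post=0.4):
    """average EEG segments from `pre` s before to `post` s after each onset."""
    n_pre, n_post = int(pre * fs), int(post * fs)
    epochs = []
    for t in onsets_eeg_time:
        i = int(round(t * fs))
        if i - n_pre >= 0 and i + n_post <= len(eeg):
            epochs.append(eeg[i - n_pre : i + n_post])
    return np.mean(epochs, axis=0)
```

with misaligned clocks the onsets smear across samples and the evoked average flattens toward zero, which is exactly why epoch-averaging is meaningless without correct times.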
for human experiments, the subject's experience matters:

- boredom causes attention drift
- watching videos might confound the signal (occipital lobe activation, pupil changes from brightness)
- need to balance test duration with data quality

"how to have them not be bored? video? inscapes might not be bad — could confound occipital lobe, could affect pupils due to brightness."

### the engineering gamble

sometimes you have to make a call with incomplete information. "data analysis stuff: hard because big data, for robustness can't really use LLMs that much."

the meta-skill: knowing when you have enough data to make a decision vs when you're just pattern-matching on noise. "interpreting graphs — sometimes good but looks bad, sometimes bad but looks good. need to balance time scrutinizing graph and writing script."

---

*see also: [[signal processing workflow|signal-processing-workflow]], [[debugging hardware|debugging-hardware]], [[reading papers|reading-papers]]*