Publishing Decomposition / Recombination

Category: Organizing Research


Decompose an academic paper into:

  1. Conceptual Idea
  2. Experiment Idea
  3. Implementation of experiment idea
  4. Visualizations & Metrics
  5. Intuitive explanation of results
  6. Dataset
  7. Understanding why the idea works
  8. Checking whether the idea generates good scores / beats other techniques
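
One way to make the decomposition concrete is to treat each component as a separately publishable unit. The sketch below is a hypothetical illustration only: the names `PaperComponents` and `recombine` are assumptions made for this example, not part of any existing tool. It holds the eight components as optional fields, so a partial publication (say, only an idea and a dataset) is still a valid object, and separately published parts can be merged back together.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PaperComponents:
    """Hypothetical container: one optional field per component in the list above."""
    conceptual_idea: Optional[str] = None
    experiment_idea: Optional[str] = None
    implementation: Optional[str] = None
    visualizations_and_metrics: Optional[str] = None
    intuitive_explanation: Optional[str] = None
    dataset: Optional[str] = None
    why_it_works: Optional[str] = None
    benchmark_comparison: Optional[str] = None

    def present(self) -> List[str]:
        """Names of the components that have been filled in."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]

    def missing(self) -> List[str]:
        """Names of the components that are still empty."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


def recombine(*parts: PaperComponents) -> PaperComponents:
    """Merge separately published parts; the earliest non-empty value wins."""
    merged = PaperComponents()
    for part in parts:
        for f in fields(PaperComponents):
            if getattr(merged, f.name) is None:
                setattr(merged, f.name, getattr(part, f.name))
    return merged
```

For example, `recombine(PaperComponents(conceptual_idea="..."), PaperComponents(dataset="..."))` yields an object whose `missing()` list shows which components nobody has published yet.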

In many ways, ML research isn’t sufficiently scientific. There is rarely a compelling explanation of why an idea works.

There might be some contrarian thesis here.

  1. If people aren’t combining techniques to reach SOTA, then nothing actually works
  2. The goal of understanding the why behind every improvement is underserved, and you could devote yourself to building an understanding that lets you generalize. So systematically test theories of why things work, using existing methods that are poorly understood, rather than generating more poorly understood methods.
    1. When you have a working theory, use generalization to create better methods.
  3. Series of papers on how everything you believe is a lie
  4. Social media account - all the lies in machine learning

New Publishing Styles / Paradigms

  1. Pure visualization / intuitive explanation / conceptual idea, plus why the idea works, published as an accessible version of the paper
  2. Presentation of the entire paper as a reproducible notebook that can be run (this requires being able to back notebooks with distributed compute)
    1. New paradigm in reproducibility
  3. Only publish ideas, or have publications that are pure massive ideation (perhaps without expecting citation)
  4. Publications that are pure experimental method ideas
  5. Publish pure concepts that are worth instantiating, with extended justification of what is usually said at the top of the paper
  6. Create new, valuable datasets as the most effective way to improve parts of the research frontier
    1. Publish baselines on datasets that bring them to prominence, publishing starter code for working with the datasets
  7. Papers (or a book) on why every major idea on the research frontier works. Or publish the current state of our lack of understanding of each idea.
  8. Be the first to criticize bad research practices
  9. Create a service that tracks the state of the art and publishes replications of the best results on each benchmark
  10. Publishing negative results / failures that are known to have been done well (perhaps with code, so that when people give up on ideas they give them up to the community)
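
Item 9 could start from something as small as the following sketch. Everything in it is hypothetical: the `Leaderboard` class, the `Result` fields, the assumption that higher scores are better, and the rule that only replicated results count toward the state of the art are illustrative choices, not a description of an existing service.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Result:
    method: str
    score: float       # assumption: higher is better
    replicated: bool   # True once an independent rerun matched the claimed score


class Leaderboard:
    """Hypothetical in-memory store of results, keyed by benchmark name."""

    def __init__(self) -> None:
        self._results: Dict[str, List[Result]] = {}

    def submit(self, benchmark: str, result: Result) -> None:
        """Record a claimed or replicated result for a benchmark."""
        self._results.setdefault(benchmark, []).append(result)

    def state_of_the_art(self, benchmark: str) -> Optional[Result]:
        """Best replicated result, or None if nothing has been replicated."""
        replicated = [r for r in self._results.get(benchmark, []) if r.replicated]
        return max(replicated, key=lambda r: r.score, default=None)
```

Counting only replicated results is the design choice that separates this from a plain leaderboard: an unreplicated claim is stored but never reported as the best.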

Source: Original Google Doc
