vector semantics is a sense-encoding method: "a word's meaning should be tied to how it is used." We measure similarity between word vectors with cosine similarity. see also vector-space model.

motivation

idea 1: neighboring words can help infer the semantic meaning of new words: "we can define a word based on its distribution in language use."

idea 2: meaning should be a point in space, just like affective meaning (i.e. a score in each dimension). That is, a word should be a vector in n-dimensional space.

vector semantics: each word is a point based on its distribution; each word is a vector, and similar words are nearby in semantic space. The intuition is that classifiers can generalize more easily to similar but unseen words by operating on embeddings.

transposing a Term-Document Matrix

Typically we read a Term-Document Matrix column-wise, so that each document is encoded in terms of words. However, if you read it row-wise, each row gives a distribution for a word over the documents.

term-term matrix

a term-term matrix is a |V| \times |V| matrix that measures co-occurrence in some context: each cell counts the number of times the two words co-occur within a small window.

point-wise mutual information

we usually normalize a Term-Document Matrix via TF-IDF. For a term-term matrix, however, we usually normalize via PMI:
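A minimal sketch of both ideas above: building a term-term co-occurrence matrix from a toy corpus and comparing word rows with cosine similarity. The corpus, the window size of 1, and all variable names are illustrative assumptions, not from the notes.

```python
import math

# tiny illustrative corpus (an assumption, not from the notes)
corpus = [
    "i like deep learning".split(),
    "i like nlp".split(),
    "i enjoy flying".split(),
]
window = 1  # count co-occurrences within +/- 1 word

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# counts[i][j] = times vocab[i] and vocab[j] co-occur in the window
counts = [[0] * V for _ in range(V)]
for sent in corpus:
    for i, w in enumerate(sent):
        lo, hi = max(0, i - window), min(len(sent), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[idx[w]][idx[sent[j]]] += 1

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# "like" and "enjoy" share the neighbor "i", so their rows are similar
sim = cosine(counts[idx["like"]], counts[idx["enjoy"]])
```

Because "like" and "enjoy" appear in similar contexts, their co-occurrence rows point in similar directions, which is exactly the distributional intuition.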

\begin{equation} PMI(w_1, w_2) = \log \frac{p(w_1,w_2)}{p(w_1)p(w_2)} \end{equation}

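The PMI formula above can be sketched over a raw co-occurrence count matrix. This is an illustrative implementation, with probabilities estimated from row, column, and grand totals; in practice negative values are often clipped to 0 (positive PMI, PPMI), since a zero count gives log 0.

```python
import math

def pmi(counts):
    """PMI matrix from a co-occurrence count matrix:
    PMI(w1, w2) = log2( p(w1, w2) / (p(w1) * p(w2)) )."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    out = []
    for i, row in enumerate(counts):
        out_row = []
        for j, c in enumerate(row):
            if c == 0:
                # log 0 = -inf; PPMI would clip this to 0
                out_row.append(float("-inf"))
                continue
            p_ij = c / total
            p_i = row_sums[i] / total
            p_j = col_sums[j] / total
            out_row.append(math.log2(p_ij / (p_i * p_j)))
        out.append(out_row)
    return out

# toy counts: the off-diagonal pair co-occurs more than chance predicts
m = pmi([[0, 2], [2, 4]])
```

A positive entry means the pair co-occurs more often than chance; a negative one means less often.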
The intuition: "would two words appear together more often than chance?"

word2vec

see word2vec
