@article{,
title = {Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition},
journal = {},
author = {Alona Fyshe},
year = {2013},
url = {https://www.cs.cmu.edu/~afyshe/},
abstract = {This zip should contain 4 files:
- README.txt (this file)
- doc2Dep20MWU57k_1000concat2000.tab
- doc2Dep20MWU57k_1000concat2000.txt
- doc2Dep20MWU57k_1000concat2000.mat
****doc2Dep20MWU57k_1000concat2000.txt****
This file contains the 54975 word-units with POS tags. The order of the words in this file corresponds to the order of the rows in doc2Dep20MWU57k_1000concat2000.tab
****doc2Dep20MWU57k_1000concat2000.tab****
This tab-separated-value file contains the concatenated SVD matrices, created as described in "Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition" (Fyshe 2013). The size of the matrix is 54975x2000. The first 1000 dimensions (1-1000) are Document dimensions; the second 1000 (1001-2000) are Dependency dimensions. The rows appear in the same order as the word-units in doc2Dep20MWU57k_1000concat2000.txt
****doc2Dep20MWU57k_1000concat2000.mat****
For convenience, this file holds the data from doc2Dep20MWU57k_1000concat2000.tab and doc2Dep20MWU57k_1000concat2000.txt saved as two MATLAB variables: count_matrix is the concatenated SVD matrix (the tab file), and words is the list of word-units (the txt file).
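A minimal Python sketch of reading the .txt and .tab files together, as an illustration only. It assumes NumPy is installed, that the .txt file has one word-unit per line in UTF-8, and that the .tab file holds only tab-separated numbers. The function names load_vectors and split_halves are hypothetical and not part of this distribution.

```python
import numpy as np

def load_vectors(txt_path, tab_path):
    # Assumption: one word-unit per line; row i of the matrix belongs to words[i].
    with open(txt_path, encoding="utf-8") as fh:
        words = [line.strip() for line in fh]
    # Assumption: purely numeric, tab-separated rows, one row per word-unit.
    matrix = np.loadtxt(tab_path, delimiter="\t", ndmin=2)
    if matrix.shape[0] != len(words):
        raise ValueError("row count does not match word count")
    return words, matrix

def split_halves(matrix, n_doc=1000):
    # Columns 1-1000 are Document dimensions, the remainder Dependency dimensions.
    return matrix[:, :n_doc], matrix[:, n_doc:]
```

Called with the two file names above, load_vectors should return 54975 word-units and a 54975x2000 array, which takes roughly 880 MB as 64-bit floats. split_halves then gives the Document and Dependency blocks separately.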
Questions may be directed to Alona Fyshe, afyshe at cs dot cmu dot edu.
}
}