Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition
Alona Fyshe

conll2013.zip 1.32GB
Type: Dataset
Tags: Dataset

Bibtex:
@article{,
title = {Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition},
journal = {},
author = {Alona Fyshe},
year = {2013},
url = {https://www.cs.cmu.edu/~afyshe/},
abstract = {This zip should contain 4 files:
- README.txt (this file)
- doc2Dep20MWU57k_1000concat2000.tab
- doc2Dep20MWU57k_1000concat2000.txt
- doc2Dep20MWU57k_1000concat2000.mat

****doc2Dep20MWU57k_1000concat2000.txt****
This file contains the 54975 word-units with POS tags.  The order of the words in this file corresponds to the order of the rows in doc2Dep20MWU57k_1000concat2000.tab

****doc2Dep20MWU57k_1000concat2000.tab****
This tab-separated-value file contains the concatenated SVD matrices, created as described in "Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition" (Fyshe 2013). The size of the matrix is 54975x2000. The first 1000 dimensions (1-1000) are Document dimensions; the second 1000 (1001-2000) are Dependency dimensions. The rows appear in the same order as the word-units in doc2Dep20MWU57k_1000concat2000.txt

****doc2Dep20MWU57k_1000concat2000.mat****
For convenience, this file holds the data from doc2Dep20MWU57k_1000concat2000.tab and doc2Dep20MWU57k_1000concat2000.txt saved as two MATLAB variables: count_matrix is the concatenated SVD matrix (tab file), and words is the list of word-units (txt file).

Questions may be directed to Alona Fyshe, afyshe at cs dot cmu dot edu.
}
}
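
The README above says the rows of the .tab file line up with the word-units in the .txt file, and that columns 1-1000 are Document dimensions while columns 1001-2000 are Dependency dimensions. The following is a minimal sketch of reading the two files together using only the Python standard library. It assumes the .txt file holds one word-unit per line and that both files are UTF-8 text, neither of which the README states. The function name `iter_word_vectors` and the parameter `n_doc_dims` are illustrative and not part of the dataset.

```python
import csv

# Per the README: the first 1000 columns are Document dimensions,
# the remaining columns (1001-2000) are Dependency dimensions.
N_DOC_DIMS = 1000


def iter_word_vectors(txt_path, tab_path, n_doc_dims=N_DOC_DIMS):
    """Yield (word_unit, document_dims, dependency_dims) one row at a time.

    Assumptions (not confirmed by the README): the .txt file has one
    word-unit per line, and both files are UTF-8 encoded.
    """
    with open(txt_path, encoding="utf-8") as words, \
            open(tab_path, newline="", encoding="utf-8") as rows:
        reader = csv.reader(rows, delimiter="\t")
        # Row i of the matrix belongs to word-unit i, so read both in step.
        for word_line, row in zip(words, reader):
            values = [float(cell) for cell in row if cell.strip()]
            yield word_line.strip(), values[:n_doc_dims], values[n_doc_dims:]


# Example use (paths are the file names listed in the README):
# for word, doc_dims, dep_dims in iter_word_vectors(
#         "doc2Dep20MWU57k_1000concat2000.txt",
#         "doc2Dep20MWU57k_1000concat2000.tab"):
#     ...
```

Streaming row by row avoids holding the full 54975x2000 matrix in Python lists at once. For whole-matrix work, a numerical array library is likely a better fit than this sketch.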
