Name | DL | Torrents | Total Size |
PubMed Central® (PMC) | 2 | 84.51GB | 5 | 0 |
oa_bulk-ratarmount_indexes_compressed (18 files)
comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz | 18.99MB |
comm_use.A-B.xml.tar.gz.index.sqlite.gz | 22.96MB |
comm_use.C-H.txt.tar.gz.index.sqlite.gz | 23.29MB |
comm_use.C-H.xml.tar.gz.index.sqlite.gz | 28.76MB |
comm_use.I-N.txt.tar.gz.index.sqlite.gz | 23.96MB |
comm_use.I-N.xml.tar.gz.index.sqlite.gz | 30.19MB |
comm_use.O-Z.txt.tar.gz.index.sqlite.gz | 30.64MB |
comm_use.O-Z.xml.tar.gz.index.sqlite.gz | 45.13MB |
mount.sh | 1.02kB |
non_comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz | 11.00MB |
non_comm_use.A-B.xml.tar.gz.index.sqlite.gz | 10.66MB |
non_comm_use.C-H.txt.tar.gz.index.sqlite.gz | 15.41MB |
non_comm_use.C-H.xml.tar.gz.index.sqlite.gz | 14.65MB |
non_comm_use.I-N.txt.tar.gz.index.sqlite.gz | 28.51MB |
non_comm_use.I-N.xml.tar.gz.index.sqlite.gz | 28.19MB |
non_comm_use.O-Z.txt.tar.gz.index.sqlite.gz | 17.31MB |
non_comm_use.O-Z.xml.tar.gz.index.sqlite.gz | 11.84MB |
README.md | 1.04kB |
Type: Dataset
Tags: PMC, PubMed, ratarmount
Bibtex:
@article{,
title= {ratarmount indexes for PMC OpenAccess subset},
journal= {},
author= {rngadam@coderbunker.com},
year= {},
url= {},
abstract= {## the problem

The PMC Open Access bulk article set (commercial and non-commercial use) is a hefty set of files that weighs in at 79G compressed and 388G uncompressed. Decompressing the archives can itself take hours.

A BitTorrent mirror exists at: https://academictorrents.com/details/06d6badd7d1b0cfee00081c28fddd5e15e106165

## the solution

ratarmount (https://github.com/mxmlnkn/ratarmount), a Python application, uses FUSE (through fusepy) to mount a compressed archive as a disk, giving random access to the files in the archive without decompressing it first.

To achieve good performance, it creates an index (one SQLite database per archive). This set of indexes still weighs in at 1.4G uncompressed (345M compressed).

## usage

* decompress all indexes into the same directory where you downloaded oa_bulk
* install ratarmount
* use ratarmount to mount the oa_bulk archives on the disk

A sample script ```mount.sh``` is provided as an example.

## distribution

We also use BitTorrent to distribute the set of indexes.
},
keywords= {PMC, PubMed, ratarmount},
terms= {},
license= {CC BY 4.0},
superseded= {}
}
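
The usage steps in the abstract could be sketched in Python roughly as follows. This is a minimal illustration, not the contents of the provided `mount.sh`: the helper names `decompress_indexes` and `mount_command` are hypothetical, and it assumes that ratarmount picks up an index named `<archive>.index.sqlite` when it sits next to the archive, which is why the indexes are decompressed into the oa_bulk directory.

```python
import gzip
import shutil
from pathlib import Path
from typing import List


def decompress_indexes(index_dir: Path, target_dir: Path) -> List[Path]:
    """Decompress every *.index.sqlite.gz in index_dir into target_dir.

    target_dir is assumed to be the directory holding the oa_bulk archives,
    so each index ends up beside the archive it describes.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for gz_path in sorted(index_dir.glob("*.index.sqlite.gz")):
        # Drop the trailing ".gz": foo.tar.gz.index.sqlite.gz -> foo.tar.gz.index.sqlite
        out_path = target_dir / gz_path.name[: -len(".gz")]
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)  # streamed copy, keeps memory use low
        written.append(out_path)
    return written


def mount_command(archive: Path, mount_point: Path) -> List[str]:
    """Build the basic ratarmount invocation: ratarmount <archive> <mount point>."""
    return ["ratarmount", str(archive), str(mount_point)]
```

After calling `decompress_indexes`, the list returned by `mount_command` can be passed to `subprocess.run` once per archive, with an existing empty directory as the mount point. The directory layout is left to the caller because the listing does not say how `mount.sh` arranges it.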