Load WARC (Web ARChive) files into Apache Spark using 'sparklyr'. This allows reading files from the Common Crawl project <http://commoncrawl.org/>.
Version: | 0.1.5 |
Imports: | DBI, sparklyr, Rcpp |
LinkingTo: | Rcpp |
Published: | 2020-12-15 |
Author: | Yitao Li |
Maintainer: | Yitao Li <yitao at rstudio.com> |
BugReports: | https://github.com/r-spark/sparkwarc |
License: | Apache License 2.0 |
NeedsCompilation: | yes |
SystemRequirements: | C++11 |
Materials: | README |
CRAN checks: | sparkwarc results |
Reference manual: | sparkwarc.pdf |
Package source: | sparkwarc_0.1.5.tar.gz |
Windows binaries: | r-devel: sparkwarc_0.1.5.zip, r-release: sparkwarc_0.1.5.zip, r-oldrel: sparkwarc_0.1.5.zip |
macOS binaries: | r-release: sparkwarc_0.1.5.tgz, r-oldrel: sparkwarc_0.1.5.tgz |
Old sources: | sparkwarc archive |
Please use the canonical form https://CRAN.R-project.org/package=sparkwarc to link to this page.