corpustools

Managing, Querying and Analyzing Tokenized Text

CRAN Package

Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.
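The tokenized text format described above can be illustrated with a small, language-neutral sketch. The Python below is not the corpustools API. Every name in it (`tokenize`, `near`, `to_dtm`, `reconstruct`, the `space` field) is hypothetical and chosen only for illustration. It assumes one row per token with a document id and a position, and shows why keeping positions allows proximity queries, DTM export, and reconstruction of the text.

```python
import re
from collections import Counter


def tokenize(docs):
    """Return one row per token, keeping its document and position."""
    rows = []
    for doc_id, text in docs.items():
        matches = re.finditer(r"\w+|[^\w\s]", text)
        for token_id, m in enumerate(matches, start=1):
            # Remember whether whitespace followed the token, so the
            # text can be rebuilt later (an assumption of this sketch).
            space = text[m.end():m.end() + 1].isspace()
            rows.append({"doc_id": doc_id, "token_id": token_id,
                         "token": m.group(), "space": space})
    return rows


def near(rows, a, b, window):
    """Ids of documents where tokens a and b occur within `window` positions."""
    positions = {}
    for r in rows:
        key = (r["doc_id"], r["token"].lower())
        positions.setdefault(key, []).append(r["token_id"])
    hits = set()
    for (doc_id, tok), ids in positions.items():
        if tok != a:
            continue
        for j in positions.get((doc_id, b), []):
            if any(abs(i - j) <= window for i in ids):
                hits.add(doc_id)
    return sorted(hits)


def to_dtm(rows):
    """Collapse token rows into per-document term counts (a sparse DTM)."""
    dtm = {}
    for r in rows:
        dtm.setdefault(r["doc_id"], Counter())[r["token"].lower()] += 1
    return dtm


def reconstruct(rows, doc_id):
    """Rebuild a document's text from its token rows."""
    return "".join(r["token"] + (" " if r["space"] else "")
                   for r in rows if r["doc_id"] == doc_id)


docs = {
    "d1": "The quick brown fox.",
    "d2": "A fox, then much later something quick.",
}
rows = tokenize(docs)
print(near(rows, "quick", "fox", window=2))
print(to_dtm(rows)["d2"]["fox"])
print(reconstruct(rows, "d2"))
```

Because each row carries its position, the proximity query can tell apart documents that merely contain both words from documents where they occur close together. A plain DTM discards that information. In this sketch, runs of whitespace are rebuilt as a single space.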

  • Version: 0.5.1
  • R version: ≥ 3.5.0
  • License: GPL-3
  • Needs compilation? Yes
  • Last release: 05/08/2023

Dependencies

  • Depends: 1 package
  • Imports: 15 packages
  • Suggests: 5 packages
  • Linking To: 2 packages
  • Reverse Imports: 1 package
  • Reverse Suggests: 1 package