udpipe

Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

CRAN Package

This natural language processing toolkit provides language-agnostic tokenization, part-of-speech tagging, lemmatization and dependency parsing of raw text. Besides parsing text, the package lets you train annotation models on treebank data in CoNLL-U format, as described at https://universaldependencies.org/format.html. The techniques are explained in detail in the paper 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at doi:10.18653/v1/K17-3009. The toolkit also includes functions for common manipulations of text that has been enriched with the parser output: collocations, token co-occurrence, document-term matrix handling, term frequency-inverse document frequency (TF-IDF) calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.
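
To make the CoNLL-U format mentioned above more concrete, here is a minimal sketch in Python of how such a file can be read into token records. It is an illustration only and not the package's R interface: the names `Token`, `parse_conllu` and `SAMPLE`, and the two sample sentences, are invented for this example. It assumes the standard ten tab-separated CoNLL-U columns, `#` comment lines, and blank lines between sentences.

```python
from collections import Counter
from dataclasses import dataclass

# The ten tab-separated CoNLL-U columns, in order.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


@dataclass
class Token:
    """Hypothetical record holding a subset of the CoNLL-U columns."""
    sentence: int
    id: str
    form: str
    lemma: str
    upos: str
    head: str
    deprel: str


def parse_conllu(text):
    """Parse CoNLL-U text into a flat list of Token records."""
    tokens = []
    sentence = 1
    seen_token = False
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if seen_token:
                sentence += 1
                seen_token = False
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}")
        row = dict(zip(FIELDS, cols))
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if "-" in row["id"] or "." in row["id"]:
            continue
        tokens.append(Token(sentence, row["id"], row["form"], row["lemma"],
                            row["upos"], row["head"], row["deprel"]))
        seen_token = True
    return tokens


# Invented two-sentence sample; tabs are joined explicitly for readability.
SAMPLE = "\n".join([
    "# text = Dogs bark.",
    "\t".join(["1", "Dogs", "dog", "NOUN", "NNS", "Number=Plur", "2", "nsubj", "_", "_"]),
    "\t".join(["2", "bark", "bark", "VERB", "VBP", "_", "0", "root", "_", "SpaceAfter=No"]),
    "\t".join(["3", ".", ".", "PUNCT", ".", "_", "2", "punct", "_", "_"]),
    "",
    "# text = Cats sleep.",
    "\t".join(["1", "Cats", "cat", "NOUN", "NNS", "Number=Plur", "2", "nsubj", "_", "_"]),
    "\t".join(["2", "sleep", "sleep", "VERB", "VBP", "_", "0", "root", "_", "SpaceAfter=No"]),
    "\t".join(["3", ".", ".", "PUNCT", ".", "_", "2", "punct", "_", "_"]),
])

tokens = parse_conllu(SAMPLE)

# Count noun lemmas, a simple example of working with the annotated tokens.
noun_counts = Counter(t.lemma for t in tokens if t.upos == "NOUN")
print(noun_counts)
```

Keeping one record per token, with the sentence number alongside the lemma and part-of-speech tag, is what makes downstream steps such as co-occurrence counts or keyword extraction straightforward to express as filters and group-bys.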



Insights

Last 30 days

This package was downloaded 5,802 times in the last 30 days, including 195 times yesterday. Downloads per day are listed below.

Downloads per day, Mar 18 to Apr 16, 2025 (a dash marks a day outside the 30-day window):

Week of        Sun  Mon  Tue  Wed  Thu  Fri  Sat
Mar 16, 2025     -    -  191  235  198  173   97
Mar 23, 2025   130  240  226  178  161  201  105
Mar 30, 2025   112  130  273  354  259  137  128
Apr 6, 2025    136  238  201  191  261  263  243
Apr 13, 2025   150  208  188  195    -    -    -

Lowest: 97 (Mar 22, 2025). Highest: 354 (Apr 2, 2025).

Last 365 days

This package was downloaded 56,870 times in the last 365 days. The day with the most downloads was Oct 29, 2024, with 381 downloads.

Data provided by CRAN


Dependencies

  • Imports: 3 packages
  • Suggests: 4 packages
  • Linking To: 1 package
  • Reverse Imports: 6 packages
  • Reverse Suggests: 13 packages
  • Reverse Enhances: 1 package