sentencepiece
Text Tokenization using Byte Pair Encoding and Unigram Modelling
Unsupervised text tokenizer that performs byte pair encoding and unigram modelling. Wraps the 'sentencepiece' library.
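To make the byte pair encoding idea concrete, here is a minimal toy sketch in Python. It is not the package's or the 'sentencepiece' library's implementation: the function names (`merge_pair`, `learn_bpe_merges`, `segment`) are hypothetical, the corpus is a plain list of words, and ties between equally frequent pairs are broken alphabetically, which is an assumption made only to keep the result deterministic.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE training)."""
    # Each distinct word starts as a tuple of characters, weighted by its frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # nothing left to merge
        # Most frequent pair wins; ties are broken alphabetically (an assumption).
        best = min(pair_counts, key=lambda p: (-pair_counts[p], p))
        merges.append(best)
        # Apply the new merge rule to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split `word` into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


corpus = ["low", "low", "low", "lower", "lowest"]
merges = learn_bpe_merges(corpus, num_merges=2)
print(merges)                    # [('l', 'o'), ('lo', 'w')]
print(segment("lowly", merges))  # ['low', 'l', 'y']
```

Frequent character sequences such as "low" become single units, while rarer material stays split into smaller pieces. That is how a subword tokenizer can represent words it never saw during training.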
- Version: 0.2.3
- R version: ≥ 2.10
- License: MPL-2.0
- Needs compilation: Yes
- Last release: 11/13/2022
Team
- Jan Wijffels
- BNOSAC: Copyright holder
- Google Inc.: Contributor, Copyright holder (at src/sentencepiece/src)
- The Abseil Authors: Contributor, Copyright holder (at src/third_party/absl)
- Google Inc.: Contributor, Copyright holder
- Kenton Varda: Contributor, Copyright holder
- Sanjay Ghemawat: Contributor, Copyright holder
- Jeff Dean: Contributor, Copyright holder
- Laszlo Csomor: Contributor, Copyright holder
- Wink Saville: Contributor, Copyright holder
- Jim Meehan: Contributor, Copyright holder
- Chris Atenasio: Contributor, Copyright holder
- Jason Hsueh: Contributor, Copyright holder
- Anton Carver: Contributor, Copyright holder
- Maxim Lifantsev: Contributor, Copyright holder
- Susumu Yata: Contributor, Copyright holder (at src/third_party/darts_clone)
- Daisuke Okanohara: Contributor, Copyright holder
- Yuta Mori: Contributor, Copyright holder
- Benjamin Heinzerling: Contributor, Copyright holder
Insights
Downloads per day, shown as a line graph for the last 30 days or the last 365 days. Data provided by CRAN.
Dependencies
- Depends: 1 package
- Imports: 2 packages
- Suggests: 2 packages
- Linking To: 1 package
- Reverse Suggests: 1 package