OpenRefine (formerly Google Refine) is a popular open-source data cleaning tool. rrefine lets users programmatically transfer data between R and OpenRefine. Using the functions in this package, you can upload data to a project, export data from it, apply data cleaning operations to it, or delete it, all directly from R. Client libraries for automating OpenRefine tasks already exist for Python, Node.js and Ruby. rrefine extends this functionality to R users.
The development version of rrefine is available on GitHub and can be installed via devtools:
# install.packages("devtools")
devtools::install_github("vpnagraj/rrefine")
library(rrefine)
rrefine is also available on CRAN:
install.packages("rrefine")
library(rrefine)
The package includes the following functionality to interface with OpenRefine projects:
refine_upload(): Upload data to a project
refine_export(): Export data from a project
refine_delete(): Delete a project
refine_metadata(): Retrieve metadata from all projects
refine_project_summary(): Get project summary data
refine_operations(): Apply arbitrary operations to a project
refine_remove_column(): Remove a column from a project
refine_add_column(): Add a column to a project
refine_rename_column(): Rename an existing column in a project
refine_move_column(): Move a column to a new index
refine_transform(): Apply arbitrary text transformations
refine_to_lower(): Coerce text to lowercase
refine_to_upper(): Coerce text to uppercase
refine_to_title(): Coerce text to title case
refine_to_null(): Set values to NULL
refine_to_empty(): Set text values to empty string ("")
refine_to_text(): Coerce value to string
refine_to_number(): Coerce value to numeric
refine_to_date(): Coerce value to date
refine_trim_whitespace(): Remove leading and trailing whitespaces
refine_collapse_whitespace(): Collapse consecutive whitespaces to single whitespace
refine_unescape_html(): Unescape HTML in string
Descriptions and examples of usage are available in the package manual and vignette.
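As a rough illustration of how these functions can fit together, the sketch below round-trips a data frame through OpenRefine: upload, apply two text clean-ups, export, then delete the project. It assumes OpenRefine is already running locally on its default address and that rrefine is installed. The helper name clean_city_column, the project name "rrefine_demo" and the "city" column are hypothetical. Passing the column name as the first argument to the transform functions is also an assumption; check the package manual for the exact signatures in your installed version.

```r
# A minimal sketch, not a definitive workflow.
# Assumes a local OpenRefine instance is running and rrefine is installed.
# `clean_city_column`, "rrefine_demo" and the "city" column are hypothetical.
clean_city_column <- function(df, project_name = "rrefine_demo") {
  # refine_upload() reads from a file, so write the data to a temporary CSV
  csv_path <- tempfile(fileext = ".csv")
  write.csv(df, csv_path, row.names = FALSE)
  on.exit(unlink(csv_path), add = TRUE)

  # Create the OpenRefine project from the CSV
  rrefine::refine_upload(file = csv_path, project.name = project_name)
  # Once the project exists, make sure it is deleted when the function exits
  on.exit(rrefine::refine_delete(project.name = project_name), add = TRUE)

  # Assumption: the column name is the first argument of these helpers
  rrefine::refine_trim_whitespace("city", project.name = project_name)
  rrefine::refine_to_title("city", project.name = project_name)

  # Pull the cleaned data back into R
  rrefine::refine_export(project.name = project_name)
}

# Example call (requires a running OpenRefine instance):
# messy <- data.frame(city = c("  new york", "BOSTON  "))
# cleaned <- clean_city_column(messy)
```

Wrapping the steps in a function with on.exit() keeps the temporary file and the OpenRefine project from lingering if one of the intermediate calls fails.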
Feature requests, bug reports or other questions should be directed to the issue queue.