Distributed Computing

sparklyr 1.7: New data sources and spark_apply() capabilities, better interfaces for sparklyr extensions, and more!

Sparklyr 1.7 delivers much-anticipated improvements, including R interfaces for image and binary data sources, several new spark_apply() capabilities, and better integration with sparklyr extensions.

sparklyr 1.4: Weighted Sampling, Tidyr Verbs, Robust Scaler, RAPIDS, and more

Sparklyr 1.4 is now available! This release comes with delightful new features such as weighted sampling and tidyr verbs support for Spark dataframes, a robust scaler for standardizing data based on the median and interquartile range, a spark_connect interface for the RAPIDS GPU acceleration plugin, as well as a number of dplyr-related improvements.

sparklyr 1.5: better dplyr interface, more sdf_* functions, and RDS-based serialization routines

Unlike the three previous sparklyr releases, sparklyr 1.5 placed much more emphasis on enhancing existing sparklyr features than on creating new ones. As a result, many valuable suggestions from sparklyr users were addressed in a long list of bug fixes and improvements.

sparklyr 1.2: Foreach, Spark 3.0 and Databricks Connect

A new sparklyr release is now available. sparklyr 1.2 adds support for Databricks Connect, a Spark backend for the ‘foreach’ package, and interoperability improvements for working with the Spark 3.0 preview, as well as a number of bug fixes and improvements addressing user-visible pain points.
