Posit AI Blog: safetensors 0.1.0


safetensors is a new, simple, fast, and safe file format for storing tensors. The design of the file format and its original implementation are being led
by Hugging Face, and it is being widely adopted in their popular ‘transformers’ framework. The safetensors R package is a pure-R implementation that can both read and write safetensors files.

The initial version (0.1.0) of safetensors is now on CRAN.

Motivation

The main motivation for safetensors in the Python community is security. As noted
in the official documentation:

The main rationale for this crate is to remove the need to use pickle on PyTorch which is used by default.

Pickle is considered an unsafe format, because the act of loading a Pickle file can
trigger the execution of arbitrary code. This has never been a concern for torch
for R users, since the Pickle parser that is included in LibTorch only supports a subset
of the Pickle format, which doesn’t include executing code.
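
To make the risk concrete, here is a minimal Python sketch of how loading a Pickle stream can run code. It is an illustration only and is not taken from the safetensors documentation; the class name Payload is hypothetical, and the "payload" is a harmless arithmetic expression standing in for whatever an attacker might choose.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle stream runs code when loaded."""

    def __reduce__(self):
        # pickle records "call eval with this string" instead of object state.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading the bytes calls eval("6 * 7"); the Payload class itself is not needed.
result = pickle.loads(blob)
print(result)
```

The value returned by pickle.loads is whatever the embedded call returns, not a Payload object. A format that stores only metadata and raw bytes has no equivalent hook.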

However, the file format has additional advantages over other commonly used formats, including:

  • Support for lazy loading: You can choose to read a subset of the tensors stored in the file.

  • Zero copy: Reading the file does not require more memory than the file itself.
    (Technically the current R implementation does make a single copy, but that can
    be optimized out if we really need it at some point).

  • Simple: Implementing the file format is simple, and doesn’t require complex dependencies.
    This means that it’s a good format for exchanging tensors between ML frameworks and
    between different programming languages. For instance, you can write a safetensors file
    in R and load it in Python, and vice-versa.

There are additional advantages compared to other file formats common in this space, and
you can see a comparison table here.

Format

The safetensors format is described in the figure below. It’s basically a header
containing some metadata, followed by raw tensor buffers.

Diagram describing the safetensors file format.
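
As a rough illustration of that layout, the following Python sketch writes and reads a tiny file using only the standard library. It assumes the layout in the published safetensors specification: an 8-byte little-endian unsigned integer giving the header size, a JSON header mapping each tensor name to its dtype, shape, and data_offsets, and then the raw buffers. The helper names write_safetensors, read_header, and read_tensor_bytes are made up for this example and are not part of any safetensors package.

```python
import json
import os
import struct
import tempfile


def write_safetensors(path, tensors):
    """Write `tensors`, a dict of name -> (dtype, shape, raw_bytes)."""
    header, offset = {}, 0
    for name, (dtype, shape, raw) in tensors.items():
        # Offsets are relative to the start of the data section.
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header size
        f.write(header_bytes)                          # JSON metadata
        for _, _, raw in tensors.values():             # raw tensor buffers
            f.write(raw)


def read_header(path):
    """Return (header_size, header_dict) without touching tensor data."""
    with open(path, "rb") as f:
        (header_size,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_size))
    return header_size, header


def read_tensor_bytes(path, name):
    """Read the bytes of one tensor only, seeking past everything else."""
    header_size, header = read_header(path)
    begin, end = header[name]["data_offsets"]
    with open(path, "rb") as f:
        f.seek(8 + header_size + begin)
        return f.read(end - begin)


# Two small float32 tensors, packed little-endian.
weight = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
bias = struct.pack("<2f", 0.5, -0.5)

path = os.path.join(tempfile.mkdtemp(), "example.safetensors")
write_safetensors(
    path,
    {"weight": ("F32", [2, 2], weight), "bias": ("F32", [2], bias)},
)

header_size, header = read_header(path)
bias_values = struct.unpack("<2f", read_tensor_bytes(path, "bias"))
```

Because the header records where each tensor starts and ends, read_tensor_bytes can fetch "bias" without reading "weight", which is the idea behind the lazy loading mentioned earlier.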

Basic usage

safetensors can be installed from CRAN using install.packages("safetensors").

Photo by Nick Fewings on Unsplash

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: “Figure from …”.

Citation

For attribution, please cite this work as

Falbel (2023, June 15). Posit AI Blog: safetensors 0.1.0. Retrieved from https://blogs.rstudio.com/tensorflow/posts/2023-06-15-safetensors/

BibTeX citation

@misc{safetensors,
  author = {Falbel, Daniel},
  title = {Posit AI Blog: safetensors 0.1.0},
  url = {https://blogs.rstudio.com/tensorflow/posts/2023-06-15-safetensors/},
  year = {2023}
}