From Rust to Python (with bindings)

Introduction

The other day I rewrote a Python library in Rust [AT2018]. I was pretty happy with the results; however, I happen to use the library within a larger Django app, so realistically, it wasn’t useful to me unless there were Python bindings. Making native code libraries for Python is something of a complicated beast, and while the Rust tooling does not (yet) tame the beast, it will help you ride the beast without falling off.

To make a long story short, there are basically two parts to making a Python module out of Rust code: the bindings themselves, and a packaging tool (to make your wheels and upload them to PyPI). The current actively developed packaging tool is pyo3-pack. It supports four kinds of bindings:

pyo3
    Powerful library, but requires Rust nightly

rust-cpython
    What I used

cffi
    Complex, but allows you to make version agnostic builds

bin
    We don’t talk about bin

In the end, I used rust-cpython because I didn’t mind having to make separate releases for different Python versions, and I didn’t want to have to deal with the code generation that cffi requires.

Results

Run times of Python and Rust programs

==============  =============  =================  =================
Program         Source File    Task Clock (msec)  Memory usage (MB)
==============  =============  =================  =================
Pure Python     Empty file     92.3 ± 0.16%       26
Python w/ Rust  Empty file     24.0 ± 0.34%       12
Rust            Empty file     0.977 ± 1.9%       4
Pure Python     Rust homepage  102 ± 0.13%        27
Python w/ Rust  Rust homepage  25.8 ± 2.0%        13
Rust            Rust homepage  2.26 ± 1.2%        5
Pure Python     War & Peace    1580 ± 0.18%       87
Python w/ Rust  War & Peace    199 ± 2.4%         61
Rust            War & Peace    167 ± 0.0%         34
==============  =============  =================  =================

I ran these the same way I ran the tests in the previous benchmark [AT2018]. The values for the “Pure Python” version were actually just copied from that benchmark. I re-ran the Rust versions because I’d made some significant changes to the library since the last test, but it doesn’t seem to have affected performance much.

The “Python w/ Rust” version doesn’t come with a command line version (there would be no point, you could just use the Rust program), so I had to make a little script to call the Python code. It was just:

#!/usr/bin/env python3

import sys
import august
print(august.convert(sys.stdin.read()))
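For anyone who wants numbers in the same "mean ± relative deviation" shape as the table, here is a minimal sketch of a timing harness. It is not the tool used for the measurements above: the table reports task clock, which is a CPU-time measure, whereas this sketch measures wall-clock time. The function names `time_command` and `summarize` and the no-op command in the demo are made up for illustration.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, stdin_bytes=b"", runs=5):
    """Run argv `runs` times, feeding stdin_bytes, and return wall-clock msec per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, input=stdin_bytes, stdout=subprocess.DEVNULL, check=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return (mean, relative standard deviation in percent) for a list of samples."""
    mean = statistics.mean(samples)
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    relative = 100.0 * spread / mean if mean else 0.0
    return mean, relative


if __name__ == "__main__":
    # Hypothetical target: a Python process that just drains stdin.
    # Swap in the wrapper script above (or the Rust binary) to compare them.
    command = [sys.executable, "-c", "import sys; sys.stdin.read()"]
    mean, relative = summarize(time_command(command, stdin_bytes=b"<p>hello</p>"))
    print(f"{mean:.3g} msec ± {relative:.2g}%")
```

Wall-clock numbers from a harness like this will be noisier than task-clock figures, so they are best used for rough comparisons only.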

Discussion

Performance

Performance-wise, the results are pretty good. The “Python w/ Rust” version is still a little slower than the pure Rust program. Unsurprisingly, most of this (23 msec) is the time it takes to load the Python runtime [1]. Apart from that, it’s still slightly slower (about 4%) as files get larger.

The “Python w/ Rust” version does take up a fair bit more memory though: about 8 MB for the Python runtime, and then about 50% more memory with the larger files. I’m not sure why this is, but in that version, Python code is still responsible for opening stdin and passing the contents to the library, and I suspect that has something to do with it.
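The overhead estimates in the last two paragraphs can be rechecked from the results table. The sketch below treats the difference between the two empty-file runs as the cost of starting Python, subtracts it, and compares what is left against the pure Rust run. The function names are made up for illustration, and the inputs are the rounded table values, so the output is only approximate: for War & Peace it comes out near 5% extra time and a bit over 50% extra memory, in line with the rough figures in the text.

```python
# Figures copied from the results table: (task clock in msec, memory in MB).
RESULTS = {
    ("Python w/ Rust", "Empty file"): (24.0, 12),
    ("Rust", "Empty file"): (0.977, 4),
    ("Python w/ Rust", "War & Peace"): (199.0, 61),
    ("Rust", "War & Peace"): (167.0, 34),
}


def startup_overhead(results):
    """Estimate the Python startup cost (msec, MB) from the empty-file runs."""
    py_time, py_mem = results[("Python w/ Rust", "Empty file")]
    rs_time, rs_mem = results[("Rust", "Empty file")]
    return py_time - rs_time, py_mem - rs_mem


def residual_overhead(results, source):
    """Fractional extra time and memory left after removing the startup cost."""
    time_overhead, mem_overhead = startup_overhead(results)
    py_time, py_mem = results[("Python w/ Rust", source)]
    rs_time, rs_mem = results[("Rust", source)]
    extra_time = (py_time - time_overhead - rs_time) / rs_time
    extra_mem = (py_mem - mem_overhead - rs_mem) / rs_mem
    return extra_time, extra_mem


if __name__ == "__main__":
    msec, megabytes = startup_overhead(RESULTS)
    print(f"startup overhead: {msec:.1f} msec, {megabytes} MB")
    extra_time, extra_mem = residual_overhead(RESULTS, "War & Peace")
    print(f"War & Peace: {extra_time:.1%} more time, {extra_mem:.1%} more memory")
```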

The one thing I was a little surprised at was how much faster the “Python w/ Rust” version was than the “Pure Python” version on empty files. Presumably this is due to the time it took to load the Python code itself and import BeautifulSoup.
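One way to test a guess like this is to time an import in a fresh interpreter, so that nothing is already cached. The following is a minimal sketch, not something from the original benchmark; `import_time_ms` is a made-up helper, and `json` is only a stand-in module that ships with Python. Substitute whichever module you suspect.

```python
import subprocess
import sys


def import_time_ms(module_name):
    """Time importing module_name in a fresh interpreter; returns milliseconds."""
    code = (
        "import importlib, time\n"
        "start = time.perf_counter()\n"
        f"importlib.import_module({module_name!r})\n"
        "print((time.perf_counter() - start) * 1000.0)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        check=True,
        capture_output=True,
        text=True,
    )
    return float(result.stdout.strip())


if __name__ == "__main__":
    # Stand-in module; the number varies from machine to machine and run to run.
    print(f"import json: {import_time_ms('json'):.2f} msec")
```

A single run is noisy, so it is worth repeating the measurement a few times before drawing conclusions.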

Project Structure

I toyed around with project structure a little bit, but in the end, it seemed more straightforward to keep the Python bindings in a separate project. The upside to this is that each project is simpler and you won’t have to worry about Python binding code getting muddled with your regular code. Another upside is that it lets you specify different details in your Cargo.toml file, and most importantly, have a different readme file.

The downside to having a separate project is that your Cargo.toml files end up with a fair bit of overlap, and you have to separately bump your version numbers in the Python library project whenever you make a release.
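Forgetting the second version bump is easy, so a small pre-release check can help. This is a hypothetical sketch and not part of the actual project: `package_version` and `versions_match` are made-up helpers, the TOML snippets are invented, and the parser is deliberately naive (it only looks for a plain `version = "..."` line inside the `[package]` table).

```python
import re


def package_version(cargo_toml_text):
    """Return the version from the [package] table of a Cargo.toml, or None."""
    in_package = False
    for line in cargo_toml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track which table we are in; only [package] is of interest.
            in_package = stripped == "[package]"
            continue
        if in_package:
            match = re.match(r'version\s*=\s*"([^"]+)"', stripped)
            if match:
                return match.group(1)
    return None


def versions_match(library_toml, bindings_toml):
    """True only when both manifests declare the same package version."""
    library = package_version(library_toml)
    bindings = package_version(bindings_toml)
    return library is not None and library == bindings


if __name__ == "__main__":
    # Invented manifests for illustration; read the real files from disk instead.
    library = '[package]\nname = "example"\nversion = "1.2.0"\n'
    bindings = '[package]\nname = "example-py"\nversion = "1.1.0"\n'
    print("in sync" if versions_match(library, bindings) else "version mismatch")
```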

The Actual Code

The code for the project is on GitLab [AP2018]. The actual binding code is:

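// Python bindings for the august crate, built with rust-cpython.
#[macro_use] extern crate cpython;
use cpython::{PyResult, Python};
use august;

// Declares the module. The three names are the module name and the
// Python 2 and Python 3 initialization function names.
py_module_initializer!(august, initlibaugust, PyInit_august, |py, m| {
    m.add(py, "__doc__", "A library for converting HTML to plain text.")?;
    // `width` is optional on the Python side and defaults to 79.
    m.add(py, "convert", py_fn!(py, convert(input: &str, width: Option<usize>=Some(79))))?;
    Ok(())
});

// Thin wrapper: forwards to august::convert, falling back to 79 if None is passed.
fn convert(_: Python, input: &str, width: Option<usize>) -> PyResult<String> {
    Ok(august::convert(input, width.unwrap_or(79)))
}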

Almost all the work is being done by the py_module_initializer and py_fn macros in rust-cpython. py_fn has some magic that lets you set optional keyword arguments; unfortunately, it’s not very well documented.
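Seen from Python, the binding code above should behave roughly like a function with the signature `convert(input, width=79)`, where passing `width=None` also falls back to 79 because of the `unwrap_or(79)` on the Rust side. The sketch below is a pure-Python stand-in that mimics only that calling convention so it can run without the compiled module. It is not the real `august.convert`: it re-wraps plain text with `textwrap` instead of converting HTML, and it assumes the Rust parameter names are exposed as keyword names.

```python
import textwrap

DEFAULT_WIDTH = 79  # mirrors the Some(79) default in the py_fn! declaration


def convert(input, width=DEFAULT_WIDTH):
    """Stand-in with the same calling convention as the Rust binding.

    The parameter is named `input` to match the Rust signature, even though
    it shadows the Python builtin of the same name.
    """
    if width is None:  # mirrors width.unwrap_or(79) in the Rust wrapper
        width = DEFAULT_WIDTH
    return textwrap.fill(input, width=width)


text = "word " * 30

default = convert(text)                    # width omitted, default applies
narrow = convert(text, width=20)           # width passed by keyword
explicit_none = convert(text, width=None)  # None falls back to the default

print(narrow)
```

With the real module installed, the same three calls should work against `august.convert`, with HTML as the input.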

Conclusion

Rust makes writing Python libraries in native code easier than it’s ever been. The results are pretty good: even simply avoiding an import or two might save a lot of time.

Footnotes

References

[AT2018] Alan Trick. 2018-12-22. Converting a Python library to Rust.