Introduction
The other day I rewrote a Python library in Rust [AT2018]. I was pretty happy with the results; however, I happen to use the library within a larger Django app, so realistically, it wasn’t useful to me unless there were Python bindings. Making native code libraries for Python is something of a complicated beast, and while the Rust tooling does not (yet) tame the beast, it will help you ride the beast without falling off.
To make a long story short, there are basically two parts to making a Python module out of Rust code: the bindings themselves, and a packaging tool (to make your wheels and upload them to PyPI). The current actively developed packaging tool is pyo3-pack, which supports four kinds of bindings:
- pyo3: Powerful library, but requires Rust nightly
- rust-cpython: What I used
- cffi: Complex, but allows you to make version agnostic builds
- bin: We don’t talk about bin
In the end, I used rust-cpython because I didn’t mind having to make separate releases for different Python versions, and I didn’t want to have to deal with the code generation that cffi required.
Results
Run times of Python and Rust programs
Program | Source File | Task Clock (msec) | Memory usage (MB)
---|---|---|---
Pure Python | Empty file | 92.3 ± 0.16% | 26
Python w/ Rust | Empty file | 24.0 ± 0.34% | 12
Rust | Empty file | 0.977 ± 1.9% | 4
Pure Python | Rust homepage | 102 ± 0.13% | 27
Python w/ Rust | Rust homepage | 25.8 ± 2.0% | 13
Rust | Rust homepage | 2.26 ± 1.2% | 5
Pure Python | War & Peace | 1580 ± 0.18% | 87
Python w/ Rust | War & Peace | 199 ± 2.4% | 61
Rust | War & Peace | 167 ± 0.0% | 34
I ran these the same way I ran the tests in the previous benchmark [AT2018]. The values for the “Pure Python” version were actually just copied from that benchmark. I re-ran the Rust versions because I’d made some significant changes to the library since the last test, but it doesn’t seem to have affected performance much.
The “Python w/ Rust” version doesn’t come with a command line version (there would be no point, you could just use the Rust program), so I had to make a little script to call the Python code. It was just:
#!/usr/bin/env python3
import sys
import august
print(august.convert(sys.stdin.read()))
Discussion
Performance
Performance-wise, the results are pretty good. The “Python w/ Rust” version is still a little slower than the pure Rust program. Unsurprisingly, most of the difference (23 msec) is the time it takes to load the Python runtime [1]. Apart from that, it’s still slightly slower as files get larger (about 5% on War & Peace once that startup cost is subtracted).
The “Python w/ Rust” version does take up a fair bit more memory though: about 8 MB for the Python runtime, and then about 50% more memory with the larger files. I’m not sure why this is, but in that version, Python code is still responsible for reading stdin and passing the contents to the library, and I suspect that has something to do with it.
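The startup and overhead figures in this section can be rechecked from the results table. The snippet below is only a back-of-the-envelope sketch: it treats the empty-file difference between “Python w/ Rust” and “Rust” as a fixed startup cost and subtracts it from the War & Peace numbers, which is an assumption rather than something the measurements prove.

```python
# Figures copied from the results table: (task clock in msec, memory in MB).
results = {
    "empty": {"python_rust": (24.0, 12), "rust": (0.977, 4)},
    "war_and_peace": {"python_rust": (199, 61), "rust": (167, 34)},
}

def overhead_adjusted_ratio(column):
    """Subtract the empty-file overhead of the binding, then compare to Rust."""
    # Assumption: the empty-file gap is a fixed cost paid once per run.
    startup = results["empty"]["python_rust"][column] - results["empty"]["rust"][column]
    adjusted = results["war_and_peace"]["python_rust"][column] - startup
    return startup, adjusted / results["war_and_peace"]["rust"][column]

time_startup, time_ratio = overhead_adjusted_ratio(0)
mem_startup, mem_ratio = overhead_adjusted_ratio(1)
print(f"startup: {time_startup:.1f} msec, adjusted time ratio: {time_ratio:.2f}")
print(f"runtime memory: {mem_startup} MB, adjusted memory ratio: {mem_ratio:.2f}")
```

Under that assumption the adjusted run time comes out around 5% slower than Rust and the adjusted memory around 56% higher, which is in line with the rough figures quoted above given the ± 2.4% spread on the 199 msec measurement.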
The one thing I was a little surprised at was how much faster the “Python w/ Rust” version was than the “Pure Python” version on empty files. Presumably this is due to the time it took to load the Python code itself and import Beautiful Soup.
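One way to check a guess like this is to time the import on its own. The helper below is a minimal sketch, not what was used for the benchmark: time_import is a hypothetical name, and json stands in for Beautiful Soup so the example runs without third-party packages.

```python
import importlib
import sys
import time

def time_import(module_name):
    """Return the seconds taken to import a module that is not yet loaded."""
    # Drop any cached copy so the import actually runs. This only uncaches
    # the top-level module, not anything it has already pulled in.
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start

# "json" is a stand-in from the standard library; swap in "bs4" to time
# Beautiful Soup if it is installed.
elapsed = time_import("json")
print(f"import json took {elapsed * 1000:.2f} msec")
```

For a per-module breakdown of a whole script, running it with python3 -X importtime (Python 3.7 and later) gives similar information without any extra code.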
Project Structure
I toyed around with project structure a little bit, but in the end, it seemed more straightforward to keep the Python bindings in a separate project. The upside to this is that each project is simpler and you won’t have to worry about Python binding code getting muddled with your regular code. Another upside to this is it lets you specify different details in your Cargo.toml file, and most importantly, have a different readme file.
The downside to having a separate project is that your Cargo.toml files end up with a fair bit of overlap, and you have to separately bump your version numbers in the Python library project whenever you make a release.
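Forgetting to bump one of the two version numbers is easy to catch with a small check before a release. The following is a sketch under assumptions: the manifests are inlined with made-up names and versions, and package_version is a hypothetical helper that only understands a plain version = "..." line inside the [package] table.

```python
import re

def package_version(cargo_toml_text):
    """Return the version string from the [package] table of a Cargo.toml."""
    in_package = False
    for line in cargo_toml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track which table we are in; only [package] is of interest.
            in_package = stripped == "[package]"
            continue
        match = re.match(r'version\s*=\s*"([^"]+)"', stripped)
        if in_package and match:
            return match.group(1)
    return None

# Hypothetical manifests, inlined so the example runs on its own.
library_manifest = '''
[package]
name = "august"
version = "1.2.0"
'''
bindings_manifest = '''
[package]
name = "august-python"
version = "1.1.0"

[dependencies]
cpython = { version = "0.2" }
'''

lib_version = package_version(library_manifest)
bindings_version = package_version(bindings_manifest)
if lib_version != bindings_version:
    print(f"version mismatch: library {lib_version}, bindings {bindings_version}")
```

In a real setup the two strings would be read from the Cargo.toml files of each project, and the check could run as part of whatever release script is already in place.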
The Actual Code
The code for the project is on GitLab [AP2018]. The actual binding code is:
#[macro_use] extern crate cpython;

use cpython::{PyResult, Python};
use august;

// Module name, then the Python 2 and Python 3 init function names.
py_module_initializer!(august, initlibaugust, PyInit_august, |py, m| {
    m.add(py, "__doc__", "A library for converting HTML to plain text.")?;
    // width is optional on the Python side and defaults to 79.
    m.add(py, "convert", py_fn!(py, convert(input: &str, width: Option<usize>=Some(79))))?;
    Ok(())
});

// Thin wrapper that forwards to the Rust library.
fn convert(_: Python, input: &str, width: Option<usize>) -> PyResult<String> {
    Ok(august::convert(input, width.unwrap_or(79)))
}
Almost all the work is being done by the py_module_initializer and py_fn macros in rust-cpython. py_fn has some magic that lets you set optional keyword arguments. Unfortunately, it’s not very well documented.
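From the Python side, the binding above should behave roughly like a function with the signature sketched below. This is a pure-Python stand-in to illustrate the argument handling only; the body is a placeholder, not august’s conversion, and the exact behavior of the real module depends on how py_fn parses arguments.

```python
from typing import Optional

def convert(input: str, width: Optional[int] = 79) -> str:
    """Pure-Python stand-in mirroring the argument handling of the binding."""
    # Mirrors width.unwrap_or(79): an explicit None falls back to 79.
    effective_width = 79 if width is None else width
    # Placeholder body: the real module hands off to august::convert here.
    return f"<{len(input)} chars rendered at width {effective_width}>"

print(convert("<p>hi</p>"))            # width defaults to 79
print(convert("<p>hi</p>", width=40))  # width passed as a keyword argument
print(convert("<p>hi</p>", None))      # None is treated the same as the default
```

The point of declaring width as Option<usize> with a default is that callers can leave it out, pass it by name, or pass None, and the Rust function still ends up with a concrete number.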
Conclusion
Writing Python modules in Rust makes writing Python libraries in native code easier than it’s ever been. The results are pretty good: even simply avoiding an import or two might save a lot of time.