More progress was made on the NumPy front in the past month. On the compatibility front, we now pass ~130 more tests from NumPy's suite since the end of January. Currently, we pass 2336 tests out of 3265 tests run, with many of the failures representing portions of NumPy that we don't plan to implement in the near future (object dtypes, unicode, etc). There are still some failures that do represent issues, such as special indexing cases and failures to respect subclassed ndarrays in return values, which we do plan to resolve. There are also some unimplemented components and ufuncs remaining which we hope to implement, such as nditer and mtrand. Overall, the most common array functionality should be working.
Additionally, I began to take a look at some of the loops generated by our code. One widely used loop is dot, and we were running about 5x slower than NumPy's C version. I was able to optimize the dot loop and also the general array iterator to get us to ~1.5x NumPy C time on dot operations of various sizes. Further progress in this area could be made by using CFFI to tie into BLAS libraries, when available. Also, work remains in examining traces generated for our other loops and checking for potential optimizations.
To try out PyPy + NumPy, grab a nightly PyPy and install our NumPy fork. Feel free to report comments/issues to IRC, our mailing list, or bug tracker. Thanks to the contributors to the NumPy on PyPy proposal for supporting this work.
We're just finishing up a cleanup of int/long types. This work helps the py3k
branch unify these types into the Python 3 int and restore JIT compilation of
machine sized integers.
This cleanup also removes multimethods from these types. PyPy has
historically used a clever implementation of multimethod dispatch for declaring
methods of the __builtin__ types in RPython.
This multimethod scheme provides some convenient features for doing this,
however we've come to the conclusion that it may be more trouble than it's
worth. A major problem of multimethods is that they generate a large amount of
stub methods which burden the already lengthy and memory hungry RPython
translation process. Also, their implementation and behavior can be somewhat
The alternative to multimethods involves doing the work of the type checking
and dispatching rules in a more verbose, manual way. It's a little more work in
the end but less magical.
Recently, Manuel Jacob finished a large cleanup effort of the
unicode/string/bytearray types that also removed their multimethods. This work
also benefits the py3k branch: it'll help with future PEP 393 (or PEP 393
alternative) work. This effort was partly sponsored by Google's Summer of
Code: thanks Manuel and Google!
Now there's only a couple major pieces left in the multimethod removal (the
float/complex types and special marshaling code) and a few minor pieces that
should be relatively easy.
In conclusion, there's been some good progress made on py3k and multimethod
removal this winter, albeit a bit slower than we would have liked.
A quick note about the Software Transactional Memory (STM) front.
Since the previous post, we believe we progressed a lot by discovering an alternative core model for software transactions. Why do I say "believe"? It's because it means again that we have to rewrite from scratch the C library handling STM. This is currently work in progress. Once this is done, we should be able to adapt the existing pypy-stm to run on top of it without much rewriting efforts; in fact it should simplify the difficult issues we ran into for the JIT. So while this is basically yet another restart similar to last June's, the difference is that the work that we have already put in the PyPy part (as opposed to the C library) remains.
You can read about the basic ideas of this new C library here. It is still STM-only, not HTM, but because it doesn't constantly move objects around in memory, it would be easier to adapt an HTM version. There are even potential ideas about a hybrid TM, like using HTM but only to speed up the commits. It is based on a Linux-only system call, remap_file_pages() (poll: who heard about it before? :-). As previously, the work is done by Remi Meier and myself.
Currently, the C library is incomplete, but early experiments show good results in running duhton, the interpreter for a minimal language created for the purpose of testing STM. Good results means we brough down the slow-downs from 60-80% (previous version) to around 15% (current version). This number measures the slow-down from the non-STM-enabled to the STM-enabled version, on one CPU core; of course, the idea is that the STM version scales up when using more than one core.
This means that we are looking forward to a result that is much better than originally predicted. The pypy-stm has chances to run at a one-thread speed that is only "n%" slower than the regular pypy-jit, for a value of "n" that is optimistically 15 --- but more likely some number around 25 or 50. This is seriously better than the original estimate, which was "between 2x and 5x". It would mean that using pypy-stm is quite worthwhile even with just two cores.
More updates later...
Work continued on the NumPy + PyPy front steadily in December and more lightly in January. The continued focus was compatibility, targeting incorrect or unimplemented features that appeared in multiple NumPy test suite failures. We now pass ~2/3 of the NumPy test suite. The biggest improvements were made in these areas:
- Bugs in conversions of arrays/scalars to/from native types
- Fix cases where we would choose incorrect dtypes when initializing or computing results
- Improve handling of subclasses of ndarray through computations
- Support some optional arguments for array methods that are used in the pure-python part of NumPy
- Support additional attributes in arrays, array.flags, and dtypes
- Fix some indexing corner cases that arise in NumPy testing
- Implemented part of numpy.fft (cffti and cfftf)
Looking forward, we plan to continue improving the correctness of the existing implemented NumPy functionality, while also beginning to look at performance. The initial focus for performance will be to look at areas where we are significantly worse than CPython+NumPy. Those interested in trying these improvements out will need a PyPy nightly, and an install of the PyPy NumPy fork. Thanks again to the NumPy on PyPy donors for funding this work.
Since the PyPy 2.2 release last month, more progress has been made on the NumPy compatibility front. Initial work has been directed by running the NumPy test suite and targeting failures that appear most frequently, along with fixing the few bugs reported on the bug tracker.
Improvements were made in these areas:
- Many missing/broken scalar functionalities were added/fixed. The scalar API should match up more closely with arrays now.
- Some missing dtype functionality was added (newbyteorder, hasobject, descr, etc)
- Support for optional arguments (axis, order) was added to some ndarray functions
- Fixed some corner cases for string/record types
Most of these improvements went onto trunk after 2.2 was split, so if you're interested in trying them out or running into problems on 2.2, try the nightly.
Thanks again to the NumPy on PyPy donors who make this continued progress possible.
One of the RaspberryPi's goals is to be a fun toolkit for school children (and adults!) to learn programming and electronics with. Python and pygame are part of this toolkit. Recently the RaspberryPi Foundation funded parts of the effort of porting of pypy to the Pi -- making Python programs on the Pi faster!
Unfortunately pygame is written as a Python C extension that wraps SDL which means performance of pygame under pypy remains mediocre. To fix this pygame needs to be rewritten using cffi to wrap SDL instead.
RaspberryPi sponsored a CTPUG (Cape Town Python User Group) hackathon to put together a proof-of-concept pygame-cffi. The day was quite successful - we got a basic version of the bub'n'bros client working on pygame-cffi (and on PyPy). The results can be found on github with contributions from the five people present at the sprint.
While far from complete, the proof of concept does show that there are no major obstacles to porting pygame to cffi and that cffi is a great way to bind your Python package to C libraries.
Amazingly, we managed to have machines running all three major platforms (OS X, Linux and Windows) at the hackathon so the code runs on all of them!
We would like to thank the Praekelt foundation for providing the venue and The Raspberry Pi foundation for providing food and drinks!
Simon Cross, Jeremy Thurgood, Neil Muller, David Sharpe and fijal.
The next PyPy sprint will be in Leysin, Switzerland, for the ninth time. This is a fully public sprint: newcomers and topics other than those proposed below are welcome.
Goals and topics of the sprint
- Py3k: work towards supporting Python 3 in PyPy
- NumPyPy: work towards supporting the numpy module in PyPy
- STM: work towards supporting Software Transactional Memory
- And as usual, the main side goal is to have fun in winter sports :-) We can take a day off for ski.
For a change, and as an attempt to simplify things, I specified the dates as 11-19 January 2014, where 11 and 19 are travel days. We will work full days between the 12 and the 18. You are of course allowed to show up for a part of that time only, too.
Location & Accomodation
Leysin, Switzerland, "same place as before". Let me refresh your memory: both the sprint venue and the lodging will be in a very spacious pair of chalets built specifically for bed & breakfast: https://www.ermina.ch/. The place has a good ADSL Internet connexion with wireless installed. You can of course arrange your own lodging anywhere (as long as you are in Leysin, you cannot be more than a 15 minutes walk away from the sprint venue), but I definitely recommend lodging there too -- you won't find a better view anywhere else (though you probably won't get much worse ones easily, either :-)
Please confirm that you are coming so that we can adjust the reservations as appropriate. The rate so far has been around 60 CHF a night all included in 2-person rooms, with breakfast. There are larger rooms too (less expensive per person) and maybe the possibility to get a single room if you really want to.
Please register by Mercurial:
or on the pypy-dev mailing list if you do not yet have check-in rights:
You need a Swiss-to-(insert country here) power adapter. There will be some Swiss-to-EU adapters around -- bring a EU-format power strip if you have one.
We're pleased to announce PyPy 2.2.1, which targets version 2.7.3 of the Python language. This is a bugfix release over 2.2.
You can download the PyPy 2.2.1 release here:
What is PyPy?
PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7. It's fast (pypy 2.2 and cpython 2.7.2 performance comparison) due to its integrated tracing JIT compiler.
This release supports x86 machines running Linux 32/64, Mac OS X 64, Windows 32, or ARM (ARMv6 or ARMv7, with VFPv3).
Work on the native Windows 64 is still stalling, we would welcome a volunteer to handle that.
This is a bugfix release. The most important bugs fixed are:
- an issue in sockets' reference counting emulation, showing up notably when using the ssl module and calling makefile().
- Tkinter support on Windows.
- If sys.maxunicode==65535 (on Windows and maybe OS/X), the json decoder incorrectly decoded surrogate pairs.
- some FreeBSD fixes.
Note that CFFI 0.8.1 was released. Both versions 0.8 and 0.8.1 are compatible with both PyPy 2.2 and 2.2.1.
Cheers, Armin Rigo & everybody
CFFI 0.8 for CPython (2.6-3.x) has been released.
Quick download: pip install cffi --upgrade
What's new: a number of small fixes;
ffi.getwinerror(); integrated support for C99 variable-sized structures; multi-thread safety.
Update: CFFI 0.8.1, with fixes on Python 3 on OS/X, and some FreeBSD fixes (thanks Tobias).
The biggest change is that we shifted to using an external fork of numpy rather than a minimal numpypy module. The idea is that we will be able to reuse most of the upstream pure-python numpy components, replacing the C modules with appropriate RPython micronumpy pieces at the correct places in the module namespace.
The numpy fork should work just as well as the old numpypy for functionality that existed previously, and also include much new functionality from the pure-python numpy pieces that simply hadn't been imported yet in numpypy. However, this new functionality will not have been "hand picked" to only include pieces that work, so you may run into functionality that relies on unimplemented components (which should fail with user-level exceptions).
This setup also allows us to run the entire numpy test suite, which will help in directing future compatibility development. The recent PyPy release includes these changes, so download it and let us know how it works! And if you want to live on the edge, the nightly includes even more numpy progress made in November.
To install the fork, download the latest release, and then install numpy either separately with a virtualenv: pip install git+https://bitbucket.org/pypy/numpy.git; or directly: git clone https://bitbucket.org/pypy/numpy.git; cd numpy; pypy setup.py install.
EDIT: if you install numpy as root, you may need to also import it once as root before it works: sudo pypy -c 'import numpy'
Along with this change, progress was made in fixing internal micronumpy bugs and increasing compatibility:
- Fixed a bug with strings in record dtypes
- Fixed a bug where the multiplication of an ndarray with a Python int or float resulted in loss of the array's dtype
- Fixed several segfaults encountered in the numpy test suite (suite should run now without segfaulting)
We also began working on __array_prepare__ and __array_wrap__, which are necessary pieces for a working matplotlib module.
Romain and Brian