PyPy is now able to load and run CPython extension modules (i.e. .pyd and .so files) natively by using the new CPyExt subsystem. Unlike the solution presented in another blog post (where extension modules like numpy etc. were run on CPython and proxied through TCP), this solution does not require a running CPython anymore. We do not achieve full binary compatibility yet (like Ironclad), but recompiling the extension is generally enough.
The only prerequisite is that the necessary functions of CPython's C API are already implemented in PyPy. If you are a user or an author of a module and find certain functions missing in PyPy, we invite you to implement them. Up until now, a lot of people (including a lot of new committers) have stepped up and implemented a few functions to get their favorite module running. See the end of this post for a list of names.
Regarding speed, we tried the following: even though there is some overhead when running these modules, we could load CPython's regular expression engine (_sre.so), execute the spambayes benchmark from the Unladen Swallow benchmark suite (cf. speed.pypy.org), and measure a speedup: it ran twice as fast on pypy-c as with PyPy's built-in regular expression engine. From Amdahl's Law it follows that _sre.so itself must run several times faster than the built-in engine.
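That last inference can be made concrete. By Amdahl's Law, if the regex engine accounts for a fraction p of the original runtime and everything else is unchanged, an overall speedup S implies the engine's own speedup s satisfies S = 1 / ((1 - p) + p / s). A small sketch (the fraction 0.7 below is an illustrative assumption, not a measured number from the post):

```python
def component_speedup(overall_speedup, fraction):
    """Solve Amdahl's Law for the component's own speedup, given the
    overall speedup and the fraction of original runtime the component
    accounted for."""
    # overall = 1 / ((1 - fraction) + fraction / s)  =>  solve for s
    remainder = 1.0 / overall_speedup - (1.0 - fraction)
    if remainder <= 0:
        raise ValueError("overall speedup impossible for this fraction")
    return fraction / remainder

# If regex matching were, say, 70% of spambayes' runtime, a 2x overall
# speedup would imply the regex engine itself got ~3.5x faster.
print(round(component_speedup(2.0, 0.7), 2))  # → 3.5
```

Note that the smaller the fraction p, the larger the implied component speedup for the same overall factor, which is the sense in which _sre.so "must run several times faster" than the built-in engine.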
Modules currently being worked on include PIL and others. Distutils support is nearly ready. If you would like to participate or want information on how to use this new feature, come and join our IRC channel #pypy on freenode.
Amaury Forgeot d'Arc and Alexander Schremmer
Further CPyExt Contributors:
- Alex Gaynor
- Benjamin Peterson
- Jean-Paul Calderone
- Maciej Fijalkowski
- Jan de Mooij
- Lucian Branescu Mihaila
- Andreas Stührk
- Zooko Wilcox-O'Hearn
Nightly builds are what they are - pure PyPy executables with the JIT compiled in (Linux only for now). They require either a PyPy checkout or a release download. The main difference is that by default they display more debugging information than release builds, and of course they contain recent bugfixes and improvements :-)

Cheers
If you want to read a detailed analysis of why speed.pypy.org is cool, head over to Saveen Reddy's blog at MSDN.
Now that the release is done, I wanted to list and thank some people who were essential in getting it out the door, particularly because the work of some of them is usually not very visible.
Armin Rigo and Maciej Fijałkowski tirelessly worked on most aspects of the release, be it fixing the last known bugs and performance problems, packaging or general wizardry.
Amaury Forgeot d'Arc made sure that PyPy 1.2 actually supports Windows as a platform properly and compiled the Windows binaries.
Miquel Torres designed and implemented our new speed overview page, https://speed.pypy.org which is a great tool for us to spot performance regressions and to showcase our improvements to the general public.
tav designed the new user-oriented web page, https://pypy.org which is a lot nicer for people that only want to use PyPy as a Python implementation (and not be confused by how PyPy is actually made).
Holger Krekel fixed our main development server codespeak.net, even while being on vacation and not really having online connectivity. Without that, we couldn't actually have released anything.
Bartosz Skowron worked a lot on making Ubuntu packages for PyPy, which is really cool. Even though he didn't quite finish in time for the release, we will hopefully get them soon.
Thanks to all you guys!
We are pleased to announce PyPy's 1.2 release. Version 1.2 is a major milestone: it is the first release to ship a Just-in-Time compiler that is known to be faster than CPython (and Unladen Swallow) on some real-world applications (or the best benchmarks we could get for them). The main theme for the 1.2 release is speed.
The JIT is stable and we don't observe crashes. Nevertheless, we recommend treating it as beta software: a way to try out the JIT and see how it works for you.
- The JIT compiler.
- Various interpreter optimizations that improve performance as well as save memory. Read our various blog posts about these achievements.
- Introducing a new PyPy website at pypy.org made by tav and improved by the PyPy team.
- Introducing speed.pypy.org made by Miquel Torres, a new service that monitors our performance nightly.
- Ubuntu packages on PyPy's PPA, made by Bartosz Skowron; however, various troubles prevented us from having them ready as of now.
Known JIT problems (or why you should consider this beta software) are:
- The only supported platform is 32-bit x86 for now; we're looking for help with other platforms.
- It is still memory-hungry. There is no limit on the amount of RAM that the assembler can consume; it is thus possible (although unlikely) that the assembler ends up using unreasonable amounts of memory.
If you want to try PyPy, go to the download page on our excellent new site and find the binary for your platform. If the binary does not work (e.g. on Linux, because of different versions of external .so dependencies), or if your platform is not supported, you can try building from the source.
The PyPy release team,
Armin Rigo, Maciej Fijalkowski and Amaury Forgeot d'Arc
Antonio Cuni, Carl Friedrich Bolz, Holger Krekel, Samuele Pedroni and many others.
The last PyPy video from PyCon has been uploaded. It's a very short (less than 10 minutes) "keynote" talk about the state of PyPy.
Some time ago, we introduced our nightly performance graphs. This was a quick hack to allow us to see performance regressions. Thanks to Miquel Torres, we can now introduce https://speed.pypy.org, which is a Django-powered web app sporting a more polished visualisation of our nightly performance runs.
While this website is not finished yet, it's already far better than our previous approach :-)
Details about the announcement on pypy-dev can be found here.
If you are interested in having something similar for other benchmark runs, contact Miquel (tobami at gmail).
Quoting Miquel: "I would also like to note, that if other performance-oriented opensource projects are interested, I would be willing to see if we can set-up such a Speed Center for them. There are already people interested in contributing to make it into a framework to be plugged into buildbots, software forges and the like. Stay tuned!"
I recently did some benchmarking of Twisted on top of PyPy. For the very impatient: PyPy is up to 285% faster than CPython. For more patient people, there is a full explanation of what I did and how I performed the measurements, so they can judge for themselves.
The benchmarks live in twisted-benchmarks and were mostly written by Jean-Paul Calderone. Even though he called them an "initial exploratory investigation into a potential direction for future development resulting in performance oriented metrics guiding the process of optimization and avoidance of complexity regressions", they're still much, much better than the average benchmarks found out there.
The methodology was to run each benchmark for quite some time (about 1 minute), measuring the number of requests every 5 seconds. I then looked at the dump of data, subtracted the time it took for JIT-capable interpreters to warm up (up to 15s), and averaged everything after that. Averages of requests per second are in the table below (higher is better):
| benchmark | CPython | Unladen Swallow | PyPy |
|---|---|---|---|
| names | 10930 | 11940 (9% faster) | 15429 (40% faster) |
| pb | 1705 | 2280 (34% faster) | 3029 (78% faster) |
| iterations | 75569 | 94554 (25% faster) | 291066 (285% faster) |
| accept | 2176 | 2166 (same speed) | 2290 (5% faster) |
| web | 879 | 854 (3% slower) | 1040 (18% faster) |
| tcp | 105M | 119M (7% faster) | 60M (46% slower) |
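The warm-up handling can be sketched as follows (the sample numbers below are hypothetical; the real data dumps aren't reproduced here):

```python
def average_rps(samples_per_5s, warmup_seconds=15, interval=5):
    """Average requests per second, discarding the JIT warm-up period.

    samples_per_5s: request counts measured over successive 5-second
    intervals. The first `warmup_seconds` worth of samples are dropped,
    the rest are averaged and converted to requests per second.
    """
    skip = warmup_seconds // interval
    steady = samples_per_5s[skip:]
    return sum(steady) / len(steady) / interval

# Hypothetical run: slow while the JIT warms up, then steady-state.
print(average_rps([2000, 4000, 5000, 5500, 5500, 5500]))  # → 1100.0
```

For a non-JIT interpreter like CPython there is no warm-up to discard, so all samples could be averaged directly.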
To reproduce, run each benchmark with:
benchname.py -n 12 -d 5
WARNING: running TCP-based benchmarks that open a new connection for each request (web & accept) can exhaust some kernel resources; limit n or wait until the next run if you see drops in requests per second.
The first obvious thing is that the various benchmarks are more or less amenable to speedups by JIT compilation, with accept and tcp getting the smallest speedups, if any. This is understandable, since the JIT is mostly about reducing interpretation and frame overhead, which is probably not large when it comes to accepting connections. However, if you actually loop around doing something, the JIT can give you a lot of speedup.
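The 'iterations' case makes this vivid: a loop that does nothing but bytecode dispatch and frame bookkeeping is exactly the overhead a tracing JIT removes. A minimal stand-in for that kind of benchmark (not the actual twisted-benchmarks code):

```python
import time

def empty_loop(n):
    # Pure interpreter overhead: no I/O, no syscalls, just loop
    # bookkeeping. A tracing JIT compiles this down to almost nothing,
    # which is why 'iterations' shows the biggest speedup in the table.
    for _ in range(n):
        pass

start = time.time()
empty_loop(10 ** 6)
print("seconds: %.3f" % (time.time() - start))
```

By contrast, a benchmark like accept spends most of its time in the kernel, where the interpreter (and hence the JIT) has little say.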
The other obvious thing is that PyPy is the fastest Python interpreter here almost across the board (Jython and IronPython won't run Twisted), except for raw TCP throughput. However, speedups can vary, and I expect this to improve after the release, as there are points where PyPy can be improved. Regarding raw TCP throughput: this can be a problem for some applications, and we're looking forward to improving this particular bit.
The main reason to use Twisted for this comparison is the great deal of support from the Twisted team, and JP Calderone in particular, especially when it comes to providing benchmarks. If some open source project wants to be looked at by the PyPy team, please provide a reasonable set of benchmarks and infrastructure.
If, however, you're a closed source project fighting with performance problems in Python, we provide contracting to investigate how PyPy (and not only PyPy) can speed up your project.
The benchmarks:
- names - a simple DNS server
- web - a simple HTTP "hello world" server
- pb - Perspective Broker, the RPC mechanism for Twisted
- iterations - an empty Twisted loop
- accept - the number of TCP connections accepted per second
- tcp - raw socket transfer throughput
Interpreters used:
- CPython 2.6.2 - as packaged by Ubuntu
- Unladen Swallow svn trunk, revision 1109
- PyPy svn trunk, revision 71439

Twisted version used: svn trunk, revision 28580
Machine: unfortunately a 32-bit virtual machine under QEMU, running Ubuntu Karmic, on top of a quad-core Intel Q9550 with 6M cache. Courtesy of Michael Schneider.
Greetings to everybody from PyCon 2010 Atlanta. Right now I'm sitting in a sprint room with people sprinting on various projects, like CPython, Twisted etc. The conference was really great, and I've seen some good talks, although I've been too exhausted from my own talks to attend many. Probably I should stay away from proposing that many talks at the next PyCon :-)
The highlight of the sprints was that we got a common mercurial repository at python.org for Python benchmarks. We might be able to come up with "the Python benchmark suite", which will mostly consist of simple benchmarks using large Python libraries, rather than microbenchmarks. The repository was started by the Unladen Swallow people, and we already have common commit access among PyPy, CPython, Unladen Swallow, Jython and IronPython. We don't yet have a common place to run benchmarks, but we should be able to fix that soon.
Regarding the talks, there are online videos for the How to write cross-interpreter python programs and Speed of PyPy talks, among other talks from PyCon. A video of my short keynote should be up shortly.
The talks were well received as there is interest in PyPy's progress.