Skip to main content

4 weeks of GDB

Hello.

So, according to our jit plan we're mostly done with point 1, that is to provide a JIT that compiles python code to assembler in the most horrible manner possible but doesn't break. That meant mostly 4 weeks of glaring at GDB and megabytess of assembler generated by C code generated from python code. The figure of 4 weeks proves that our approach is by far superior to the one of psyco, since Armin says it's "only 4 weeks" :-)

Right now, pypy compiled with JIT can run the whole CPython test suite without crashing, which means we're done with obvious bugs and the only ones waiting for us are really horrible. (Or they really don't exist. At least they should never be about obscure Python corner cases: they can only be in the 10'000 lines of relatively clear code that is our JIT generator.)

But... the fun thing is that we can actually concentrate on optimizations! So the next step is to provide a JIT that is correct *and* actually speeds up python. Stay tuned for more :-)

Cheers,
fijal, armin & benjamin

UPDATE: for those of you blessed with no knowledge of C, gdb stands for GNU debugger, a classic debugger for C. (It's also much more powerful than python debugger, pdb, which is kind of surprising).

Comments

Alexander Kellett wrote on 2009-04-30 23:15:

*bow*

Luis wrote on 2009-05-01 00:00:

I love this kind of posts. Keep'em coming!

Unknown wrote on 2009-05-01 01:06:

This is probably the most exciting thing I've heard since I started tracking PyPy. Can't wait to see how fast JIT Python flies. :-)

René Dudfield wrote on 2009-05-01 01:56:

nice one! Really looking forward to it.

Is this for just i386? Or is this for amd64/ppc etc?

Maciej Fijalkowski wrote on 2009-05-01 02:11:

amd64 and ppc are only available in enterprise version :-)

We cannot really solve all problems at once, it's one-by-one approach.

Armin Rigo wrote on 2009-05-01 09:47:

illume: if you are comparing with Psyco, then it's definitely "any platform provided someone writes a backend for it". Writing a backend is really much easier than porting the whole of Psyco...

Our vague plans include an AMD64 backend and an LLVM-JIT one, the latter being able to target any platform that LLVM targets.

DSM wrote on 2009-05-01 10:33:

Nice!

I assume that it would be (relatively, as these things go) straightforward for those us interested to turn the x86 assembly backend into a C backend?

I know that even mentioning number-crunching applications gets certain members of the pypy team beating their heads against the wall (lurkers can read the grumbling on irc too!). But with a delegate-to-C backend, those of us who have unimplemented architectures and are in the happy regime where we don't care about compilation overhead can get the benefits of icc's excellent optimizations without having to do any of the work. We'd just need to make sure that the generated C is code that icc can handle. (There are unfortunately idioms that gcc and icc don't do very well with.)

To be clear, I'm not suggesting that the pypy team itself go this route: at the moment it feels like the rest of us should stay out of your way.. laissez les bon temps roulez! :^)

I'm asking instead if there are any obvious gotchas involved in doing so.

Tim Parkin wrote on 2009-05-01 11:28:

Congrats for stage one... exciting times for python..

proteusguy wrote on 2009-05-02 11:48:

Nice job guys! Once you announce that PyPy is pretty much of comparable (90% or better) speed to that of CPython then we will be happy to start running capacity tests of our web services environment on top of it and report back our results.

Given the growing number of python implementations has there ever been a discussion of PyPy replacing CPython as the canonical implementation of python once it consistently breaks performance & reliability issues? I don't know enough of the details to advocate such a position - just curious if there's been any official thought to the possibility.

Armin Rigo wrote on 2009-05-04 13:55:

DSM, Proteusguy: I'd be happy to answer your questions on the pypy-dev mailing list. I think that there is no answer short enough to fit a blog post comment.

Jacob Hallén wrote on 2009-05-04 14:45:

proteusguy: It is our hope that PyPy can one day replace CPython as the reference implementation, but this depends on many factors. Most of them are way out of our control. It will depend very much on the level of PyPy uptake in the community, but this is just a first step. With enough adoption, the Python developers (the people actually making new versions of CPython) need to be convinced that working from PyPy as a base to develop the language makes sense. If they are convinced, Guido may decide that it is a good idea and make the switch, but not before then.

Anonymous wrote on 2009-05-08 15:36:

See https://moderator.appspot.com/#9/e=c9&t=pypy for Guido's opinion about PyPy.

Anonymous wrote on 2009-05-10 11:19:

Surely gdb is more powerful than pdb because many more people are forced to used gdb. c code is much harder to debug than python code, and needs debugging more often than python code.

Cacas Macas wrote on 2009-05-14 08:31:

Good day.
I use Python for about 3 years and i am following your blog almost every day to see news.
I am very excited to see more Pypy, though i don't understand how to use it (?!?!) and i never managed to install it!
Wikipedia says "PyPy is a followup to the Psyco project" and i use Psyco, so Pypy must be a very good thing. I use Psyco very intense, in all my applications, but it's very easy to use.
I have Windows and Windows document is incomplete "https://codespeak.net/pypy/dist/pypy/doc/windows.html". I have MinGW compiler.
Pypy is not very friendly with users. I think more help documents would be very useful. When i will understand how to install Pypy, i will use it.
Keep up the good work!

stracin wrote on 2009-06-13 14:56:

"""Rumors have it that the secret goal is being faster-than-C which is nonsense, isn't it?"""

what does this statement from the pypy homepage mean?

that c-pypy will be faster than cpython?

or that code run in c-pypy will be faster than compiled C code? :o

because of the "nonsense" i think you mean the latter? but isn't it nonsense? :) would be awesome though.