7 comments

In the last days I finally understood how to do virtualizables. Now the frame overhead is gone. This was done with the help of discussion with Samuele, porting ideas from PyPy's first JIT attempt.

This is of course work in progress, but it works in PyPy (modulo a few XXXs, but no bugs so far). The performance of the resulting code is quite good: even with Boehm (the GC that is easy to compile to but gives a slowish pypy-c), a long-running loop typically runs 50% faster than CPython. That's "baseline" speed, moreover: we will get better speed-ups by applying optimizations on the generated code. Doing so is in progress, but it suddenly became easier because that optimization phase no longer has to consider virtualizables -- they are now handled earlier.

Update:Virtualizables is basically a way to avoid frame overhead. The frame object is allocated and has a pointer, but the JIT is free to unpack it's fields (for example python level locals) and store them somewhere else (stack or registers). Each external (out of jit) access to frame managed by jit, needs to go via special accessors that can ask jit where those variables are.

Luis wrote on 2009-06-23 22:06:

I have no clue of what you're talking about, bit it sounds great! Keep it up!!

Anonymous wrote on 2009-06-23 23:51:

What are virtualizables?

Leonardo Santagada wrote on 2009-06-24 00:06:

From what I understand virtualizables are objects that you use to represent objects that are expensive to construct. For example frame objects in python are very expensive so they are virtualizables and if a function is executed and it doesn't try to access its frame object it is never created.

Probably armin can give a more precise answer.

What I want to know, couldn't CPython have virtualizables for frame objects? I guess the answer is that it could but would involve a lot of C code.

Maciej Fijalkowski wrote on 2009-06-24 00:09:

Ok, I updated the post with quick explanation of what actually virtualizables are. Leonardo: you need compiler in the first place for that :-) Psyco has some kind of virtualizables (but psyco frames are read only).

Cheers,
fijal

Unknown wrote on 2009-06-24 10:12:

Could you use virtualizables to avoid constructing the frame at all, and then only allocate it if it is accessed?

Anonymous wrote on 2009-06-24 14:22:

@Leonardo:

I'm guessing that yes, CPython COULD have virtualizables. However, the people who built CPython a) didn't know about them, b) didn't know how to code that in "C", or c) didn't consider it a priority item.

Either way, these are the types of advantages I would imagine coding python using python would expose. Optimize what you need to, and then start to see the real ROI of PyPy!

Antonio Cuni wrote on 2009-06-24 14:50:

@Ben: no. In the current incarnation, the JITs generated by PyPy optimize only hot loops, when they are executed more than N times. At that point, the frame object has already been allocated.

The real advantage of virtualizables is that they allows to:

1) produce very fast code, as if the frame weren't allocated at all (e.g. by storing local variables on the stack or in the registers)

2) they don't compromise the compatibility with CPython; in particular, sys._getframe() & co. still works fine, because the JIT knows how and when to synchronize the virtualizable (i.e., the frame) with the values that are on the stack.

@gregturn: I don't see how you can implement something similar to virtualizables without writing a compiler, and CPython is not such a thing :-)

News from the jit front

Maciej Fijalkowski

2009-06-15 23:59

14 comments

As usual, progress is going slower then predicted, but nevertheless, we're working hard to make some progress.

We recently managed to make our nice GCs cooperate with our JIT. This is one point from our detailed plan. As of now, we have a JIT with GCs and no optimizations. It already speeds up some things, while slowing down others. The main reason for this is that the JIT generates assembler which is kind of ok, but it does not do the same level of optimizations gcc would do.

So the current status of the JIT is that it can produce assembler out of executed python code (or any interpreter written in RPython actually), but the results are not high quality enough since we're missing optimizations.

The current plan, as of now, looks as follows:

Improve the handling of GCs in JIT with inlining of malloc-fast paths, that should speed up things by a constant, not too big factor.
Write a simplified python interpreter, which will be a base for experiments and to make sure that our JIT does correct things with regard to optimizations. That would work as mid-level integration test.
Think about ways to inline loop-less python functions into their parent's loop.
Get rid of frame overhead (by virtualizables)
Measure, write benchmarks, publish
Profit

Cheers,
fijal

Anonymous wrote on 2009-06-16 08:03:

nice to see the progresses on pypy jit!!

Anonymous wrote on 2009-06-16 09:22:

Do you expect to produce jit faster, then Unladen-Swallow's LLVM based ?

Anonymous wrote on 2009-06-16 13:20:

Thanks for all the hard work, guys. Keep it up!

Anonymous wrote on 2009-06-16 13:46:

ah, this jit business is so exciting!

Anonymous wrote on 2009-06-16 17:00:

I am not really shure how this plan relates to the roadmap that was presented in April.

Armin Rigo wrote on 2009-06-16 18:15:

How this plan relates: it does not. Fijal's style is to give the current idea of the plans. Don't believe him too much :-) This and April's plan need somehow to be added to each other, or something :-)

Armin Rigo wrote on 2009-06-16 18:22:

Unladen-Swallow's LLVM JIT is a very different beast: it compiles each Python function as a unit. You can only get a uniform bit of speedup this way (maybe 2-3x). By contrast, what we are doing gives a non-uniform speedup: like Psyco, we will probably obtain speedups between 2x and 100x depending on the use case.

(Of course the plan is to be faster than Psyco in the common case :-)

Luis wrote on 2009-06-17 00:11:

Armin: regarding Unladen-Swallow, does this approach prevent coming up later with a tracing jit? Or it could be done on top of it?

Nighteh3 wrote on 2009-06-17 05:45:

Sweet !! Good luck guys :)

Maciej Fijalkowski wrote on 2009-06-17 05:55:

No no no no, trust me :-)

The thing is that I'm trying to present "current plan"
as live as it can be. Which means we might change
our mind completely. But otherwise, the whole blog
would be mostly empty and boring...

Cheers,
fijal

tobami wrote on 2009-06-17 11:22:

Could you please, elaborate on the second point about a simplified python interpreter?

tobami wrote on 2009-06-17 11:26:

Also, wouldn't it be better to refactor the plan as follows?:

- Improve the handling of GCs in JIT with inlining of malloc-fast paths, that should speed up things by a constant, not too big factor.
- Measure, write benchmarks
- Write a simplified python interpreter, which will be a base for experiments and to make sure that our JIT does correct things with regard to optimizations. That would work as mid-level integration test.
- Think about ways to inline loop-less python functions into their parent's loop.
- Measure, publish benchmarks, RELEASE 1.2
- Get rid of frame overhead (by virtualizables)
- Measure, publish benchmarks
- Iterate...

Anonymous wrote on 2009-06-17 14:01:

Concerning current ideas vs April's roadmap: I understand that plans change and that's ok of course. But as April's roadmap isn't mentioned at all, I have no idea how the current ideas relate to the previous roadmap (like the current ideas replace the old road map or parts of it / they are additional ideas and the old roadmap is postponed / they are a detailing of (parts of) April's roadmap). Maybe that's obvious to people with better pypy-knowledge than me. I understand Armin's comment that they are additional ideas.

Keep up the good work!

Branko

Anonymous wrote on 2009-06-18 14:40:

What about threading? Will we have a GIL-less interpreter in the end (assuming the GCs support that)?

ICOOOLPS Submissions

Carl Friedrich Bolz-Tereick

2009-05-16 10:24

2 comments

Both of the papers that people from the PyPy team submitted to ICOOOLPS have been accepted. They are:

"Faster than C#: efficient implementation of dynamic languages on .NET" (pdf1) by Armin, Anto and Davide Ancona, who is Anto's Ph.D. advisor

"Tracing the Meta-Level: PyPy’s Tracing JIT Compiler" (pdf2) by Carl Friedrich, Armin, Anto and Maciek

(the pdfs are obviously the submitted versions, not the final ones).

This year ICOOOLPS (Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems) is being held on July the 6th at ECOOP 2009 in Genova, Italy. Other than these two papers, Anto and Carl Friedrich will also present a PyPy tutorial, on July the 7th.

Unknown wrote on 2009-05-16 11:22:

It does seem like an odd idea to trace the bytecode of an interpreter of the bytecode of a language, rather than just tracing the bytecode for a language. For example, it requires that you annotate the interpreter to retain information that you would otherwise naturally have, and it requires that you trace lots of extra bookkeeping code in the interpreter.

Given that you're writing a JIT that traces the execution of some bytecode, what advantages does tracing the outer bytecode have over tracing the inner bytecode? Is it that the outer bytecode is simpler than the inner bytecode; if so, is there no way to (inefficiently) compile the inner bytecode to the outer bytecode?

Carl Friedrich Bolz-Tereick wrote on 2009-05-16 12:08:

John: The main reason for writing a JIT that traces the bytecode of the "outer" interpreter (which we call language interpreter in the paper) is that then we need to write only one tracing JIT in PyPy, and can use it for a variety of languages.

The tracing of the extra bookkeeping code is not a problem is not such a large problem, as the paper shows. None of these opcodes are actually part of the final trace.

If you want to discuss this more, I would suggest that we move this discussion to pypy-dev@codespeak.net which is the project mailing list. Not everybody is reading comments here :).

4 weeks of GDB

Maciej Fijalkowski

2009-04-30 22:40

15 comments

Hello.

So, according to our jit plan we're mostly done with point 1, that is to provide a JIT that compiles python code to assembler in the most horrible manner possible but doesn't break. That meant mostly 4 weeks of glaring at GDB and megabytess of assembler generated by C code generated from python code. The figure of 4 weeks proves that our approach is by far superior to the one of psyco, since Armin says it's "only 4 weeks" :-)

Right now, pypy compiled with JIT can run the whole CPython test suite without crashing, which means we're done with obvious bugs and the only ones waiting for us are really horrible. (Or they really don't exist. At least they should never be about obscure Python corner cases: they can only be in the 10'000 lines of relatively clear code that is our JIT generator.)

But... the fun thing is that we can actually concentrate on optimizations! So the next step is to provide a JIT that is correct *and* actually speeds up python. Stay tuned for more :-)

Cheers,
fijal, armin & benjamin

UPDATE: for those of you blessed with no knowledge of C, gdb stands for GNU debugger, a classic debugger for C. (It's also much more powerful than python debugger, pdb, which is kind of surprising).

Alexander Kellett wrote on 2009-04-30 23:15:

*bow*

Luis wrote on 2009-05-01 00:00:

I love this kind of posts. Keep'em coming!

Unknown wrote on 2009-05-01 01:06:

This is probably the most exciting thing I've heard since I started tracking PyPy. Can't wait to see how fast JIT Python flies. :-)

René Dudfield wrote on 2009-05-01 01:56:

nice one! Really looking forward to it.

Is this for just i386? Or is this for amd64/ppc etc?

Maciej Fijalkowski wrote on 2009-05-01 02:11:

amd64 and ppc are only available in enterprise version :-)

We cannot really solve all problems at once, it's one-by-one approach.

Armin Rigo wrote on 2009-05-01 09:47:

illume: if you are comparing with Psyco, then it's definitely "any platform provided someone writes a backend for it". Writing a backend is really much easier than porting the whole of Psyco...

Our vague plans include an AMD64 backend and an LLVM-JIT one, the latter being able to target any platform that LLVM targets.

DSM wrote on 2009-05-01 10:33:

Nice!

I assume that it would be (relatively, as these things go) straightforward for those us interested to turn the x86 assembly backend into a C backend?

I know that even mentioning number-crunching applications gets certain members of the pypy team beating their heads against the wall (lurkers can read the grumbling on irc too!). But with a delegate-to-C backend, those of us who have unimplemented architectures and are in the happy regime where we don't care about compilation overhead can get the benefits of icc's excellent optimizations without having to do any of the work. We'd just need to make sure that the generated C is code that icc can handle. (There are unfortunately idioms that gcc and icc don't do very well with.)

To be clear, I'm not suggesting that the pypy team itself go this route: at the moment it feels like the rest of us should stay out of your way.. laissez les bon temps roulez! :^)

I'm asking instead if there are any obvious gotchas involved in doing so.

Tim Parkin wrote on 2009-05-01 11:28:

Congrats for stage one... exciting times for python..

proteusguy wrote on 2009-05-02 11:48:

Nice job guys! Once you announce that PyPy is pretty much of comparable (90% or better) speed to that of CPython then we will be happy to start running capacity tests of our web services environment on top of it and report back our results.

Given the growing number of python implementations has there ever been a discussion of PyPy replacing CPython as the canonical implementation of python once it consistently breaks performance & reliability issues? I don't know enough of the details to advocate such a position - just curious if there's been any official thought to the possibility.

Armin Rigo wrote on 2009-05-04 13:55:

DSM, Proteusguy: I'd be happy to answer your questions on the pypy-dev mailing list. I think that there is no answer short enough to fit a blog post comment.

Jacob Hallén wrote on 2009-05-04 14:45:

proteusguy: It is our hope that PyPy can one day replace CPython as the reference implementation, but this depends on many factors. Most of them are way out of our control. It will depend very much on the level of PyPy uptake in the community, but this is just a first step. With enough adoption, the Python developers (the people actually making new versions of CPython) need to be convinced that working from PyPy as a base to develop the language makes sense. If they are convinced, Guido may decide that it is a good idea and make the switch, but not before then.

Anonymous wrote on 2009-05-08 15:36:

See https://moderator.appspot.com/#9/e=c9&t=pypy for Guido's opinion about PyPy.

Anonymous wrote on 2009-05-10 11:19:

Surely gdb is more powerful than pdb because many more people are forced to used gdb. c code is much harder to debug than python code, and needs debugging more often than python code.

Cacas Macas wrote on 2009-05-14 08:31:

Good day.
I use Python for about 3 years and i am following your blog almost every day to see news.
I am very excited to see more Pypy, though i don't understand how to use it (?!?!) and i never managed to install it!
Wikipedia says "PyPy is a followup to the Psyco project" and i use Psyco, so Pypy must be a very good thing. I use Psyco very intense, in all my applications, but it's very easy to use.
I have Windows and Windows document is incomplete "https://codespeak.net/pypy/dist/pypy/doc/windows.html". I have MinGW compiler.
Pypy is not very friendly with users. I think more help documents would be very useful. When i will understand how to install Pypy, i will use it.
Keep up the good work!

stracin wrote on 2009-06-13 14:56:

"""Rumors have it that the secret goal is being faster-than-C which is nonsense, isn't it?"""

what does this statement from the pypy homepage mean?

that c-pypy will be faster than cpython?

or that code run in c-pypy will be faster than compiled C code? :o

because of the "nonsense" i think you mean the latter? but isn't it nonsense? :) would be awesome though.

1.1 final released

Carl Friedrich Bolz-Tereick

2009-04-28 15:39

4 comments

We just released PyPy 1.1 final. Not much changed since the beta, apart from some more fixed bugs. Have fun with it!

Anonymous wrote on 2009-04-28 17:09:

Congratulations on the new release!

Anonymous wrote on 2009-04-28 19:09:

Congrats! This is a great project :)

Anonymous wrote on 2009-04-29 11:33:

Any chance of prebuilt binaries? I tried to compile but had to give up after 2 hours (I guess my laptop is not up to the task).

By the way, you should put the release note somewhere on the main page of the PyPy site. Currently this page gives no indication that a release of PyPy exists at all.

Armin Rigo wrote on 2009-04-30 11:08:

Thanks, added a link from the main page to release-1.1.0.html.

About binaries: there are just too many possible combinations, not only of platforms but of kinds of pypy-c. I suppose that we can list other people's pages with some of them, if they mention them to us.

Roadmap for JIT

Maciej Fijalkowski

2009-04-21 19:38

19 comments

Hello.

First a disclaimer. This post is more about plans for future than current status. We usually try to write about things that we have done, because it's much much easier to promise things than to actually make it happen, but I think it's important enough to have some sort of roadmap.

In recent months we came to the point where the 5th generation of JIT prototype was working as nice or even a bit nicer than 1st one back in 2007. Someone might ask "so why did you spend all this time without going forward?". And indeed, we spend a lot of time moving sideways, but as posted, we also spent a lot of time doing some other things, which are important as well. The main advantage of current JIT incarnation is much much simpler than the first one. Even I can comprehend it, which is much of an improvement :-)

So, the prototype is working and gives very nice speedups in range of 20-30x over CPython. We're pretty confident this prototype will work and will produce fast python interpreter eventually. So we decided that now we'll work towards changing prototype into something stable and solid. This might sound easy, but in fact it's not. Having stable assembler backend and optimizations that keep semantics is not as easy as it might sound.

The current roadmap, as I see it, looks like as following:

Provide a JIT that does not speedup things, but produce assembler without optimizations turned on, that is correct and able to run CPython's library tests on a nightly basis.
Introduce simple optimizations, that should make above JIT a bit faster than CPython. With optimizations disabled JIT is producing incredibly dumb assembler, which is slower than correspoding C code, even with removal of interpretation overhead (which is not very surprising).
Backport optimizations from JIT prototype, one by one, keeping an eye on how they perform and making sure they don't break anything.
Create new optimizations, like speeding up attribute access.
Profit.

This way, we can hopefully provide a working JIT, which gives fast python interpreter, which is a bit harder than just a nice prototype.

Tell us what you think about this plan.

Cheers,
fijal & others.

Anonymous wrote on 2009-04-21 20:58:

I think it's a great idea. If the test suite succeeds on the basic JIT, it's much easier to spot regressions when you start adding the cool stuff. It also gives you a solid foundation to build on.

Good luck, this project is amazing :)

rjw wrote on 2009-04-21 21:54:

Its not obvious from this post what would actually be the difference between the prototype and the final jit with all the prototypes optimisations. So ... it sounds like a lot of work for zero gain. I'm sure there is missing information, like what is actually missing from or wrong with the prototype ( is it in a different language? Prolog?) Without this information its impossible to judge this plan.

Michael Foord wrote on 2009-04-21 22:54:

This sounds like a very pragmatic approach and is very encouraging. Nice work guys - very much looking forward to what the future has to offer.

Tim Parkin wrote on 2009-04-21 23:06:

I'm extremely excited about seeing this happen. It is an unfortunate fact that the majority of people won't get PyPy until they see a 'big win'. Once they've noticed the big win they will start to see the 'hidden genius'. I'm glad that you are taking such a professional approach to this next phase and look forward to the day when people will start to look give PyPy the attention it deserves (if not for quite the right reason).

Alex wrote on 2009-04-22 00:34:

I agree with Michael, one of the hallmarks of Python philosophy has always been "make it right, and then make it fast", sounds like you guys have taken this to heart.

Leonardo Santagada wrote on 2009-04-22 02:52:

Great guys, the plan seems very solid and reasonable!

responding to rjw: I think the problem was that the prototype was really incomplete, putting all the complexity needed for the rest of the language could be done without removing the optimizations but would make bug finding way harder.

I hope that this could be the only new feature for the next pypy release. Focusing on the JIT might be the best way to attract many more eyes and hands to the project.

Michael Hudson-Doyle wrote on 2009-04-22 04:12:

This sounds like a very sane plan. Good luck with it!

Anonymous wrote on 2009-04-22 07:59:

I like how for once step 2 isn't "???", but a well thought out plan =).

Zemantic dreams wrote on 2009-04-22 10:20:

guys, you rock! I can't wait to see the results!

bye
Andraz Tori, Zemanta

Anonymous wrote on 2009-04-22 13:21:

Very sensible plan! Good luck guys. Here's to pypy taking over the world (-:

herse wrote on 2009-04-22 19:36:

"It's super easy to provide 95% of python in a reasonable speed, just the last 5% gets tricky."

i often come across this statement.

wouldn't it make sense then to offer a pypy compile option for producing an interpreter which leaves away those 5% in favor of speed for people who don't need those 5%?

or isn't this feasible or wanted for some reason?

i am just curious... :) pypy is an awesome project and i am looking forward to the jit!

Anonymous wrote on 2009-04-24 09:34:

The roadmap is okay. The only thing I miss is a rough timeline.

Anonymous wrote on 2009-04-24 22:18:

Tenretn hör eviece ne Pypy tan cafretn anretx. Lbisi programma o oitcenno ih ecafretn cabpöo, anretn 'retupmo ih nis secorpbut pypy eka LD oitcenno huob raa rawtfo laweri anosre Python code?

René Leonhardt wrote on 2009-04-24 23:26:

Congratulations, the LLVM backend for JIT has been accepted, I am eager to see the results :)

Armin Rigo wrote on 2009-04-28 20:18:

herse: that's an approach which is often mentioned, but which does not make sense in PyPy. The JIT is generated from the language spec; whether this spec covers 95% or 100% of Python doesn't change anything. The 95%-versus-100% debate only makes sense at another level, e.g. if we wanted to make PyPy faster without a JIT at all.

Richard Emslie wrote on 2009-04-29 23:47:

Awesome work thus far & congratulations guys. Sounds like a good strategy to having something that works. Best of luck and I'm looking forward to see how things pan out. :-)

herse wrote on 2009-04-30 05:12:

"""The JIT is generated from the language spec; whether this spec covers 95% or 100% of Python doesn't change anything."""

i see. the whole pypy idea really sounds awesome to me.

i have another question. your python interpeter is written in rpython so it is supposed to be simpler to work with than the c implementation. but i could imagine that it is incredibly hard to debug problems in pypy-c? doesn't this counterbalance the advantage again?

Maciej Fijalkowski wrote on 2009-04-30 05:58:

We're usually not debugging problems in pypy-c. It turns out that 99% of the problems you can debug by running on top of CPython, so you can test things really deeply, without compilation.

Collin Winter wrote on 2009-06-08 23:21:

This looks like a good plan. I look forward to sharing ideas with you in the future :)

When you say, "So, the prototype is working and gives very nice speedups in range of 20-30x over CPython", what benchmarks is that on? Can you be more specific?

Leysin Sprint Report

Carl Friedrich Bolz-Tereick

2009-04-21 15:47

2 comments

The Leysin sprint is nearing its end, as usual here is an attempt at a summary

of what we did.

Release Work

Large parts of the sprint were dedicated to fixing bugs. Since the easy bugs seem to have been fixed long ago, those were mostly very annoying and hard bugs. This work was supported by our buildbots, which we tried to get free of test-failures. This was worked on by nearly all participants of the sprint (Samuele, Armin, Anto, Niko, Anders, Christian, Carl Friedrich). One particularly annoying bug was the differences in the tracing events that PyPy produces (fixed by Anders, Samuele and Christian). Some details about larger tasks are in the sections below.

The work culminated in the beta released on Sunday.

Stackless

A large number of problems came from our stackless features, which do some advanced things and thus seem to contain advanced bugs. Samuele and Carl Friedrich spent some time fixing tasklet pickling and unpickling. This was achieved by supporting the (un)pickling of builtin code objects. In addition they fixed some bugs in the finalization of tasklets. This needs some care because the __del__ of a tasklet cannot run at arbitrary points in time, but only at safe points. This problem was a bit subtle to get right, and popped up nearly every morning of the sprint in form of a test failure.

Armin and Niko added a way to restrict the stack depth of the RPython-level stack. This can useful when using stackless, because if this is not there it is possible that you fill your whole heap with stack frames in the case of an infinite recursion. Then they went on to make stackless not segfault when threads are used at the same time, or if a callback from C library code is in progress. Instead you get a RuntimeError now, which is not good but better than a segfault.

Killing Features

During the sprint we discussed the fate of the LLVM and the JS backends. Both have not really been maintained for some time, and even partially untested (their tests were skipped). Also their usefulness appears to be limited. The JS backend is cool in principle, but has some serious limitations due to the fact that JavaScript is really a dynamic language, while RPython is rather static. This made it hard to use some features of JS from RPython, e.g. RPython does not support closures of any kind.

The LLVM backend had its own set of problems. For a long time it produced the fastest form of PyPy's Python interpreter, by first using the LLVM backend, applying the LLVM optimizations to the result, then using LLVM's C backend to produce C code, then apply GCC to the result :-). However, it is not clear that it is still useful to directly produce LLVM bitcode, since LLVM has rather good C frontends nowadays, with llvm-gcc and clang. It is likely that we will use LLVM in the future in our JIT (but that's another story, based on different code).

Therefore we decided to remove these two backends from SVN, which Samuele and Carl Friedrich did. They are not dead, only resting until somebody who is interested in maintaining them steps up.

Windows

One goal of the release is good Windows-support. Anders and Samuele set up a new windows buildbot which revealed a number of failures. Those were attacked by Anders, Samuele and Christian as well as by Amaury (who was not at the sprint, but thankfully did a lot of Windows work in the last months).

OS X

Christian with some help by Samuele tried to get translation working again under Mac OS X. This was a large mess, because of different behaviours of some POSIX functionality in Leopard. It is still possible to get the old behaviour back, but whether that was enabled or not depended on a number of factors such as which Python is used. Eventually they managed to successfully navigate that maze and produce something that almost works (there is still a problem remaining about OpenSSL).

Samuele and Carl Friedrich pretending to work on something

Documentation

The Friday of the sprint was declared to be a documentation day, where (nearly) no coding was allowed. This resulted in a newly structured and improved getting started document (done by Carl Friedrich, Samuele and some help of Niko) and a new document describing differences to CPython (Armin, Carl Friedrich) as well as various improvements to existing documents (everybody else). Armin undertook the Sisyphean task of listing all talks, paper and related stuff of the PyPy project.

Various Stuff

Java Backend Work

Niko and Anto worked on the JVM backend for a while. First they had to fix translation of the Python interpreter to Java. Then they tried to improve the performance of the Python interpreter when translated to Java. Mostly they did a lot of profiling to find performance bottlenecks. They managed to improve performance by 40% by overriding fillInStackTrace of the generated exception classes. Apart from that they found no simple-to-fix performance problems.

JIT Work

Armin gave a presentation about the current state of the JIT to the sprinters as well as Adrian Kuhn, Toon Verwaest and Camillo Bruni of the University of Bern who came to visit for one day. There was a bit of work on the JIT going on too; Armin and Anto tried to get closer to having a working JIT on top of the CLI.

Unknown wrote on 2009-04-22 07:46:

Guys, are you going to make a new release with the things done during the sprint? Thanks.

(pypy is a great work; Keep it up!)

vak wrote on 2009-11-03 12:30:

hi,
could you please make a new blog-post and tell us about news regarding LLVM and PyPy, please?

thanks in advance!

Beta for 1.1.0 released

Carl Friedrich Bolz-Tereick

2009-04-19 22:33

7 comments

Today we are releasing a beta of the upcoming PyPy 1.1 release. There are some Windows and OS X issues left that we would like to address between now and the final release but apart from this things should be working. We would appreciate feedback.

The PyPy development team.

PyPy 1.1: Compatibility & Consolidation

Welcome to the PyPy 1.1 release - the first release after the end of EU funding. This release focuses on making PyPy's Python interpreter more compatible with CPython (currently CPython 2.5) and on making the interpreter more stable and bug-free.

PyPy's Getting Started lives at:

https://codespeak.net/pypy/dist/pypy/doc/getting-started.html

Highlights of This Release

More of CPython's standard library extension modules are supported, among them ctypes, sqlite3, csv, and many more. Most of these extension modules are fully supported under Windows as well.

https://codespeak.net/pypy/dist/pypy/doc/cpython_differences.html https://morepypy.blogspot.com/2008/06/pypy-improvements.html

Through a large number of tweaks, performance has been improved by 10%-50% since the 1.0 release. The Python interpreter is now between 0.8-2x (and in some corner case 3-4x) slower than CPython. A large part of these speed-ups come from our new generational garbage collectors.

https://codespeak.net/pypy/dist/pypy/doc/garbage_collection.html

Our Python interpreter now supports distutils as well as easy_install for pure-Python modules.

We have tested PyPy with a number of third-party libraries. PyPy can run now: Django, Pylons, BitTorrent, Twisted, SymPy, Pyglet, Nevow, Pinax:

https://morepypy.blogspot.com/2008/08/pypy-runs-unmodified-django-10-beta.html https://morepypy.blogspot.com/2008/07/pypys-python-runs-pinax-django.html https://morepypy.blogspot.com/2008/06/running-nevow-on-top-of-pypy.html

A buildbot was set up to run the various tests that PyPy is using nightly on Windows and Linux machines:

https://codespeak.net:8099/

Sandboxing support: It is possible to translate the Python interpreter in a special way so that the result is fully sandboxed.

https://codespeak.net/pypy/dist/pypy/doc/sandbox.html https://blog.sandbox.lt/en/WSGI%20and%20PyPy%20sandbox

Other Changes

The clr module was greatly improved. This module is used to interface with .NET libraries when translating the Python interpreter to the CLI.

https://codespeak.net/pypy/dist/pypy/doc/clr-module.html https://morepypy.blogspot.com/2008/01/pypynet-goes-windows-forms.html https://morepypy.blogspot.com/2008/01/improve-net-integration.html

Stackless improvements: PyPy's stackless module is now more complete. We added channel preferences which change details of the scheduling semantics. In addition, the pickling of tasklets has been improved to work in more cases.

Classic classes are enabled by default now. In addition, they have been greatly optimized and debugged:

https://morepypy.blogspot.com/2007/12/faster-implementation-of-classic.html

PyPy's Python interpreter can be translated to Java bytecode now to produce a pypy-jvm. At the moment there is no integration with Java libraries yet, so this is not really useful.

We added cross-compilation machinery to our translation toolchain to make it possible to cross-compile our Python interpreter to Nokia's Maemo platform:

https://codespeak.net/pypy/dist/pypy/doc/maemo.html

Some effort was spent to make the Python interpreter more memory-efficient. This includes the implementation of a mark-compact GC which uses less memory than other GCs during collection. Additionally there were various optimizations that make Python objects smaller, e.g. class instances are often only 50% of the size of CPython.

https://morepypy.blogspot.com/2008/10/dsseldorf-sprint-report-days-1-3.html

The support for the trace hook in the Python interpreter was improved to be able to trace the execution of builtin functions and methods. With this, we implemented the _lsprof module, which is the core of the cProfile module.

A number of rarely used features of PyPy were removed since the previous release because they were unmaintained and/or buggy. Those are: The LLVM and the JS backends, the aspect-oriented programming features, the logic object space, the extension compiler and the first incarnation of the JIT generator. The new JIT generator is in active development, but not included in the release.

https://codespeak.net/pipermail/pypy-dev/2009q2/005143.html https://morepypy.blogspot.com/2009/03/good-news-everyone.html https://morepypy.blogspot.com/2009/03/jit-bit-of-look-inside.html

What is PyPy?

Technically, PyPy is both a Python interpreter implementation and an advanced compiler, or more precisely a framework for implementing dynamic languages and generating virtual machines for them.

The framework allows for alternative frontends and for alternative backends, currently C, Java and .NET. For our main target "C", we can "mix in" different garbage collectors and threading models, including micro-threads aka "Stackless". The inherent complexity that arises from this ambitious approach is mostly kept away from the Python interpreter implementation, our main frontend.

Socially, PyPy is a collaborative effort of many individuals working together in a distributed and sprint-driven way since 2003. PyPy would not have gotten as far as it has without the coding, feedback and general support from numerous people.

Have fun,

the PyPy release team, [in alphabetical order]

Amaury Forgeot d'Arc, Anders Hammerquist, Antonio Cuni, Armin Rigo, Carl Friedrich Bolz, Christian Tismer, Holger Krekel, Maciek Fijalkowski, Samuele Pedroni

and many others: https://codespeak.net/pypy/dist/pypy/doc/contributor.html

Benjamin Peterson wrote on 2009-04-20 01:21:

Congratulations! PyPy is becoming more and more viable every day. I hope I can continue to become more involved in this awesome project.

Anonymous wrote on 2009-04-21 01:18:

pypy is a very interesting project!

i have a question. do you think pypy-c without jit can ever reach the speed of c-python? why is it slower?

or will you put all the optimization efforts into the jit now? doesn't the performance difference matter because the jit will make it up anyway?

Maciej Fijalkowski wrote on 2009-04-21 04:36:

PyPy without jit can (and is sometimes) be faster than cpython, for various reasons, including garbage collector.

On the other hand, we rather won't sacrifice simplicity for speed and we hope that jit will go that part. Also the funny thing is that since we generate our jit, it gets better as interpreter gets simpler, because jit generator is able to find out more on it's own. So in fact we might give up on some optimizations in favor of simplicity, because jit will be happier.

Cheers,
fijal

Luis wrote on 2009-04-21 14:04:

Sorry for my anxiety, but is there any rough estimation on when the jit will be in a usable state?

Maciej Fijalkowski wrote on 2009-04-21 22:14:

Personally, I'm doing it in my free time. That means I'm giving no estimates, because it makes no sense. If you wish to go into some contractual obligations on our sides, we're up to discuss I suppose :-)

Luis wrote on 2009-04-21 22:33:

Maciej, I know how hard you are working on this. I didn't mean to sound disrespectful and I don't want to bother you... It's just that as everyone else, I'm anxoiusly looking forward to seeing pypy's magic in action. By the way, the new post is very much appreciated. Thanks!

Anonymous wrote on 2009-06-29 07:47:

I am desperately looking for some help building PyPy. I have posted a an Issue (#443) about my issues in the PyPy site.

If anyone from the release/Dev. team can give me a hand, I would seriously appreciate this!

I can be reached at wnyrodeo@yahoo.com

Thanks.

Leysin Sprint Started

Carl Friedrich Bolz-Tereick

2009-04-15 17:52

The Leysin Sprint started today. The weather is great and the view is wonderful, as usual. Technically we are working on the remaining test failures of the nightly test runs and are generally trying to fix various long-postponed bugs. I will try to give more detailed reports as the sprint progresses.

Pycon videos are online

Maciej Fijalkowski

2009-04-07 00:51

1 comment

Hi.

We didn't yet write full pycon summary, but both of our talks are now online: PyPy status talk and python in a sandbox.

Update:

Slides are also available: PyPy status talk and Python in a sandbox.

Enjoy!
fijal & holger

larsr wrote on 2009-04-08 15:25:

I found the slides to the python in a sandbox to be useful too.

JIT progress

News from the jit front

ICOOOLPS Submissions

4 weeks of GDB

1.1 final released

Roadmap for JIT

Leysin Sprint Report

Release Work

Stackless

Killing Features

Windows

OS X

Documentation

Various Stuff

Java Backend Work

JIT Work

Beta for 1.1.0 released

PyPy 1.1: Compatibility & Consolidation

Highlights of This Release

Other Changes

What is PyPy?

Leysin Sprint Started

Pycon videos are online

The PyPy blogposts

Recent Posts

Archives

Release Work

Stackless

Killing Features

Windows

OS X

Documentation

Various Stuff

Java Backend Work

JIT Work

PyPy 1.1: Compatibility & Consolidation

Highlights of This Release

Other Changes

What is PyPy?

The PyPy blogposts

Recent Posts

Archives

Tags