Nightly graphs of PyPy's performance
As you can probably see, we're very good on some benchmarks and not that great on others. Some of the bad results come from the fact that while we have done a lot of JIT-related work, other parts of PyPy have not seen that much love. Some of our algorithms for the built-in data types are inferior to CPython's. This is going to be an ongoing focus for a while.
We want to spend a couple of weeks improving the benchmarks before doing a release, in order to gather further feedback.
So... what's a revision number that I can use? Am I just supposed to guess? The page should have a reasonable default revision number.
for anyone else looking, 70700 is a reasonable place to start. (The graphs are really nice by the way, I'm not hating!)
a couple of suggestions:
1. scale for X axis (dates are likely to be more interesting than revision numbers)
1a. scale for Y axis
2. Add another line: unladen swallow performance
+1 for Anonymous's suggestions 1 and 2.
This is cool.
Unladen Swallow's perf should also be considered if possible.
Regarding revisions: by default the page points to the first revision we have graphs for, so you can just slice from there :) Also, yes, revision numbers and dates should show up; I'll fix that. We don't build unladen swallow nightly, and we don't want to run against an older version, because they're improving constantly.
Wonderful idea, great implementation (axes are needed; tooltips would be interesting for long series), impressive results.
I hope you guys exploit this to raise interest in PyPy in this pre-release period. Just take a look at the response you get to posts involving numbers, benchmarks, etc. (BTW, keep linking to the funding post) :)
A series of short posts discussing hot topics would be a sure way to keep PyPy in the news until the release, so you get as much feedback as possible.
- Possible factors in slower results (discuss points in the Some Benchmarking post);
- One-off comparisons to different CPython versions, Unladen Swallow, ShedSkin, [C|J|IronP]ython (revisit old benchmark posts?);
- Mention oprofile and the need for better profiling tools on the blog, so you can crowdsource a review of options;
- Ping the Phoronix Test Suite folks to include PyPy translation (or even these benchmarks) in their tests: Python is an important part of Linux distros;
- Don't be afraid to post press-quotable numbers and pics, blurbs about what PyPy is and how much it's been improving, etc. Mention unrelated features of the interpreter (sandboxable!), the framework (free JIT for other languages), whatever;
- The benchmark platform (code, hardware, plans for new features).
Regarding comparison with unladen swallow: I think having a point per month would be good enough for comparison purposes.
@Anonymous: Great suggestions! I'll look into these issues. In fact, things like profiling have been high on our todo list, but we should advertise that more. We could surely use someone good at PR :-)
Something seems wrong with the scale of the first plot: the speedups are marked by a first gridline at 2x, a second at 4x, and a third at 8x. Shouldn't the third one be 6x instead?
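An aside on the question above: gridlines at 2x, 4x, 8x are consistent with a logarithmic (base-2) y-axis rather than a plotting bug, assuming the plot does use a log scale (which the post doesn't state). On a log axis, equally spaced gridlines represent equal *ratios*, not equal differences. A minimal sketch of the arithmetic:

```python
import math

# Hypothetical gridline labels on a log-base-2 speedup axis.
speedups = [2, 4, 8]

# Positions on the axis are the log2 of each label.
positions = [math.log2(s) for s in speedups]  # [1.0, 2.0, 3.0]

# The gridlines are equally spaced in log space...
gaps = [b - a for a, b in zip(positions, positions[1:])]
assert gaps == [1.0, 1.0]

# ...whereas a linear scale would put the third gridline at 6x.
assert speedups[1] + (speedups[1] - speedups[0]) == 6
```

So if the axis is logarithmic, 8x is the expected third gridline; 6x would only be correct on a linear scale.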