The goal of this page is to point out some of the differences between running python with PyPy and with CPython
Pure python code works, but there are a few differences with object lifetime management. Modules that use the CPython C API will probably work, but will not achieve a speedup via the JIT. We encourage library authors to use CFFI instead.
If you are looking for how to use PyPy with the scientific python ecosystem, we encourage you to use conda, since they repackage common libraries like scikit-learn and SciPy for PyPy.
__del__, and resource use
The main difference in pure-python code that is not going to be fixed is that
not support refcounting semantics for "automatically" releasing state when
__del__ is called. The following code won't fill the
file immediately, but only after a certain period of time, when the GC
does a collection and flushes the output, since the file is only closed when
__del__ method is called:
The proper fix is
The same problem---not closing your files---can also show up if your program opens a large number of files without closing them explicitly. In that case, you can easily hit the system limit on the number of file descriptors that are allowed to be opened at the same time.
PyPy can be run with the command-line option
-X track-resources (as in,
pypy -X track-resources myprogram.py). This produces a
when the GC closes a non-closed file or socket. The traceback for the place
where the file or socket was allocated is given as well, which aids finding
close() is missing.
Similarly, remember that you must
close() a non-exhausted
generator in order to have its pending
clauses executed immediately:
def mygen(): with foo: yield 42 for x in mygen(): if x == 42: break # foo.__exit__ is not run immediately! # fixed version: gen = mygen() try: for x in gen: if x == 42: break finally: gen.close()
__del__() methods are not executed as predictively
as on CPython: they run "some time later" in PyPy (or not at all if
the program finishes running in the meantime). See more details
Why is memory usage so high?
Note that PyPy returns unused memory to the operating system only after
a madvise() system call (at least Linux, OS X, BSD) or on Windows. It is
important to realize that you may not see this in
top. The unused
pages are marked with
MADV_FREE, which tells the system "if you
need more memory at some point, grab this page". As long as memory is
RES column in
top might remains high. (Exceptions to
this rule are systems with no
MADV_FREE, where we use
MADV_DONTNEED, which forcefully lowers the
RES. This includes
Linux <= 4.4.)
A more complete list of known differences is available at our dev site.