14.14 让你的程序跑的更快-Python3 CookBook中文版

14.14 让你的程序跑的更快

问题

Your program runs too slow and you’d like to speed it up without the assistance of moreextreme solutions, such as C extensions or a just-in-time (JIT) compiler.

解决方案

While the first rule of optimization might be to “not do it,” the second rule is almostcertainly “don’t optimize the unimportant.” To that end, if your program is running slow,you might start by profiling your code as discussed in Recipe 14.13.More often than not, you’ll find that your program spends its time in a few hotspots,such as inner data processing loops. Once you’ve identified those locations, you can usethe no-nonsense techniques presented in the following sections to make your programrun faster.

Use functionsA lot of programmers start using Python as a language for writing simple scripts. Whenwriting scripts, it is easy to fall into a practice of simply writing code with very littlestructure. For example:

...def method(self):

value = self.valuefor x in s:

op(value)

Avoid gratuitous abstractionAny time you wrap up code with extra layers of processing, such as decorators, prop‐erties, or descriptors, you’re going to make it slower. As an example, consider this class:

class A:def init(self, x, y):self.x = xself.y = y
@propertydef y(self):

return self._y

@y.setterdef y(self, value):

self._y = value

Now, try a simple timing test:

>>> from timeit import timeit
>>> a = A(1,2)
>>> timeit("a.x", "from __main__ import a")
0.07817923510447145
>>> timeit("a.y", "from __main__ import a")
0.35766440676525235
>>>

As you can observe, accessing the property y is not just slightly slower than a simpleattribute x, it’s about 4.5 times slower. If this difference matters, you should ask yourselfif the definition of y as a property was really necessary. If not, simply get rid of it andgo back to using a simple attribute instead. Just because it might be common for pro‐grams in another programming language to use getter/setter functions, that doesn’tmean you should adopt that programming style for Python.

Use the built-in containersBuilt-in data types such as strings, tuples, lists, sets, and dicts are all implemented in C,and are rather fast. If you’re inclined to make your own data structures as a replacement(e.g., linked lists, balanced trees, etc.), it may be rather difficult if not impossible to matchthe speed of the built-ins. Thus, you’re often better off just using them.

Avoid making unnecessary data structures or copiesSometimes programmers get carried away with making unnecessary data structureswhen they just don’t have to. For example, someone might write code like this:

values = [x for x in sequence]squares = [x*x for x in values]

Perhaps the thinking here is to first collect a bunch of values into a list and then to startapplying operations such as list comprehensions to it. However, the first list is com‐pletely unnecessary. Simply write the code like this:

squares = [x*x for x in sequence]

Related to this, be on the lookout for code written by programmers who are overlyparanoid about Python’s sharing of values. Overuse of functions such as copy.deepcopy() may be a sign of code that’s been written by someone who doesn’t fully under‐stand or trust Python’s memory model. In such code, it may be safe to eliminate manyof the copies.

讨论

Before optimizing, it’s usually worthwhile to study the algorithms that you’re using first.You’ll get a much bigger speedup by switching to an O(n log n) algorithm than bytrying to tweak the implementation of an an O(n**2) algorithm.If you’ve decided that you still must optimize, it pays to consider the big picture. As ageneral rule, you don’t want to apply optimizations to every part of your program,because such changes are going to make the code hard to read and understand. Instead,focus only on known performance bottlenecks, such as inner loops.You need to be especially wary interpreting the results of micro-optimizations. Forexample, consider these two techniques for creating a dictionary:

a = {‘name" : ‘AAPL",‘shares" : 100,‘price" : 534.22
}

b = dict(name="AAPL", shares=100, price=534.22)

The latter choice has the benefit of less typing (you don’t need to quote the key names).However, if you put the two code fragments in a head-to-head performance battle, you’llfind that using dict() runs three times slower! With this knowledge, you might beinclined to scan your code and replace every use of dict() with its more verbose al‐ternative. However, a smart programmer will only focus on parts of a program whereit might actually matter, such as an inner loop. In other places, the speed difference justisn’t going to matter at all.If, on the other hand, your performance needs go far beyond the simple techniques inthis recipe, you might investigate the use of tools based on just-in-time (JIT) compilationtechniques. For example, the PyPy project is an alternate implementation of the Python

interpreter that analyzes the execution of your program and generates native machinecode for frequently executed parts. It can sometimes make Python programs run anorder of magnitude faster, often approaching (or even exceeding) the speed of codewritten in C. Unfortunately, as of this writing, PyPy does not yet fully support Python3. So, that is something to look for in the future. You might also consider the Numbaproject. Numba is a dynamic compiler where you annotate selected Python functionsthat you want to optimize with a decorator. Those functions are then compiled intonative machine code through the use of LLVM. It too can produce signficant perfor‐mance gains. However, like PyPy, support for Python 3 should be viewed as somewhatexperimental.Last, but not least, the words of John Ousterhout come to mind: “The best performanceimprovement is the transition from the nonworking to the working state.” Don’t worryabout optimization until you need to. Making sure your program works correctly isusually more important than making it run fast (at least initially).