Python level up series 1

--suggestions from personal experience and reading

Created by Junchao

Python

    Beautiful is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.
    Flat is better than nested.
    Sparse is better than dense.
    Readability counts.
    Special cases aren't special enough to break the rules.
    Although practicality beats purity.
    Errors should never pass silently.
    Unless explicitly silenced.
    In the face of ambiguity, refuse the temptation to guess.
    There should be one-- and preferably only one --obvious way to do it.
    Although that way may not be obvious at first unless you're Dutch.
    Now is better than never.
    Although never is often better than *right* now.
    If the implementation is hard to explain, it's a bad idea.
    If the implementation is easy to explain, it may be a good idea.
    Namespaces are one honking great idea -- let's do more of those!
  

Today's outline

  • Pythonic Thinking
  • Functions


With reference to Effective Python and 编写高质量代码:改善Python程序的91个建议 (Writing High-Quality Code: 91 Suggestions for Improving Python Programs)

Pythonic Thinking

Know which version you are using

  • 2.7 on most of the machines, but still 2.6 on some
  • there are minor changes even within 2.7, for example
    celery (amqp==2.0.3)
    will not work on <2.7.7
  • and we are embracing Python 3
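A minimal sketch of checking the interpreter version at runtime. The helper name require_python is hypothetical (it is not part of the standard library); only sys.version and sys.version_info are real.

```python
import sys

# Full version string of the running interpreter.
print(sys.version)

def require_python(minimum):
    """Raise RuntimeError if the interpreter is older than `minimum`.

    `require_python` is a hypothetical helper used for illustration only.
    """
    # sys.version_info is a tuple-like (major, minor, micro, releaselevel, serial).
    if sys.version_info[:len(minimum)] < tuple(minimum):
        raise RuntimeError(
            "Python %s or newer is required, found %s"
            % (".".join(map(str, minimum)), sys.version.split()[0])
        )

# 2.7.7 is the minimum the slide quotes for amqp==2.0.3.
require_python((2, 7, 7))
```

Failing early with a clear message is usually easier to debug than an import error deep inside a dependency.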

Know which version you are using

  • 2to3 automates most of the mechanical 2-to-3 conversion
  • check the Python docs before migrating
  • pyenv for installing and switching between Python versions
  • keep the pyenv global version set to "system" so that libraries installed for the system Python keep working
  • combine with virtualenv to keep each development environment isolated

Follow PEP 8 style guide

  • For Vim
    Plugin 'klen/python-mode'
  • For Sublime Text Editor
    Anaconda

Use list comprehension over map & filter

  • Faster (true for 2.x)
  • More Pythonic
      
  >>> import timeit
  >>> print(timeit.timeit('[i ** 2 for i in range(1000) if i % 2 == 0]', number=10000))
  0.637861013412
  >>> print(timeit.timeit('map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, range(1000)))', number=10000))
  1.21372795105
  >>> print(timeit.timeit('[i ** 2 for i in xrange(1000) if i % 2 == 0]', number=10000))
  0.5871758461
  >>> print(timeit.timeit('map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, xrange(1000)))', number=10000))
  1.16653990746
      
    

Generator for large comprehensions

  • More memory friendly

  • yield does more than build generators, though;
    we will cover that in a later section on concurrency & performance
  • [] --> () turns a list comprehension into a generator expression
  • but a generator can only be iterated once
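A short sketch of the "only once" point, written for Python 3. The variable names are illustrative.

```python
squares = (i ** 2 for i in range(5))   # generator expression: note the ()

first_pass = list(squares)    # consumes the generator
second_pass = list(squares)   # nothing is left to yield

print(first_pass)    # [0, 1, 4, 9, 16]
print(second_pass)   # []

# Aggregations can consume a generator directly, with no intermediate list.
total = sum(i ** 2 for i in range(5))
print(total)         # 30
```

If the values are needed more than once, build a list or create a fresh generator for each pass.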

Python 3: map & filter

      
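  import timeit

  # The list comprehension builds the whole list on every run.
  print(timeit.timeit('[i ** 2 for i in range(1000) if i % 2 == 0]', number=10000))
  # 2.831494385987753
  # The next two statements only create a lazy object; no element is computed
  # until something iterates over it, which is why they look so cheap.
  print(timeit.timeit('(i ** 2 for i in range(1000) if i % 2 == 0)', number=10000))
  # 0.007943934993818402
  print(timeit.timeit('map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, range(1000)))', number=10000))
  # 0.008463110993034206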
      
    
"map() and filter() return iterators"
-- What's new in Python 3.0
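A sketch of what "return iterators" means in practice on Python 3: the map object does no work until it is consumed. Wrapping each statement in list() is one way (an assumption of this sketch, not the author's benchmark) to make the three timings comparable; absolute numbers depend on the machine, so none are shown.

```python
import timeit

lazy = map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, range(10)))
print(lazy)                   # a map object, not a list
evens_squared = list(lazy)    # iterating is what triggers the computation
print(evens_squared)          # [0, 4, 16, 36, 64]

# list() forces every statement to produce all of its elements.
statements = [
    ("list comprehension", "[i ** 2 for i in range(1000) if i % 2 == 0]"),
    ("generator expression", "list(i ** 2 for i in range(1000) if i % 2 == 0)"),
    ("map + filter",
     "list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, range(1000))))"),
]
for label, stmt in statements:
    print(label, timeit.timeit(stmt, number=200))
```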

Use True in infinite loop

  • More readable; this is not C or C++
  • although slower on <3.0
      
>>> import timeit
>>> def test1():
...     idx = 1
...     while True:
...         idx += 1
...         if idx > 1000:
...             break
...
>>> def test2():
...     idx = 1
...     while 1:
...         idx += 1
...         if idx > 1000:
...             break
...
>>> timeit.timeit(test1, number=1000000)
60.8167519569397
>>> timeit.timeit(test2, number=1000000)
35.87116599082947
      
    

Use True in infinite loop

  • In Python 3, True is a keyword and cannot be reassigned
  • In 2.7, True is an ordinary built-in name (True = 1 is legal), so while True has to look the name up on every iteration
      
# In these runs Python 3 is no faster than 2 overall,
# but while 1 and while True now cost about the same
import timeit
def test1():
    idx = 0
    while 1:
        idx += 1
        if idx > 1000:
            break
def test2():
    idx = 0
    while True:
        idx += 1
        if idx > 1000:
            break
print(timeit.timeit(test1, number=1000000))
# 55.750455176021205
print(timeit.timeit(test2, number=1000000))
# 56.51068397599738
      
    

Use unpacking for swapping variables

  • Less code & more readable
  • Slightly faster
      
import timeit
def test1():
    x, y = 1, 2
    for _ in range(1000):
        temp = x
        x = y
        y = temp
def test2():
    x, y = 1, 2
    for _ in range(1000):
        y, x = x, y
print(timeit.timeit(test1, number=4000000))
# 147.59938350701123
print(timeit.timeit(test2, number=4000000))
# 133.6212830049917
      
    

Use unpacking for swapping variables

      
## test1
        >>   25 FOR_ITER                24 (to 52)
             28 STORE_FAST               2 (_)

  4          31 LOAD_FAST                0 (x)
             34 STORE_FAST               3 (temp)

  5          37 LOAD_FAST                1 (y)
             40 STORE_FAST               0 (x)

  6          43 LOAD_FAST                3 (temp)
             46 STORE_FAST               1 (y)
             49 JUMP_ABSOLUTE           25
## test2
        >>   25 FOR_ITER                19 (to 47)
             28 STORE_FAST               2 (_)

 12          31 LOAD_FAST                0 (x)
             34 LOAD_FAST                1 (y)
             37 ROT_TWO
             38 STORE_FAST               1 (y)
             41 STORE_FAST               0 (x)
             44 JUMP_ABSOLUTE           25
      
    

Careful use of else

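  • Really great with try: the try block shrinks to just the code that can raise
  • Takes time to get used to with for or while: the else runs only when the loop ends without break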
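A small sketch of both uses of else. The function names parse_age and describe are made up for illustration.

```python
def parse_age(text):
    try:
        age = int(text)            # only the call that can fail sits in try
    except ValueError:
        return "not a number: %r" % text
    else:
        # Runs only when int() raised nothing, so an error here is never
        # mistaken for a parse failure.
        return "age next year: %d" % (age + 1)

def describe(numbers):
    for n in numbers:
        if n < 0:
            message = "found negative: %d" % n
            break
    else:
        # Runs only if the loop finished without hitting break.
        message = "no negative numbers"
    return message

print(parse_age("41"))        # age next year: 42
print(parse_age("abc"))       # not a number: 'abc'
print(describe([1, -2, 3]))   # found negative: -2
print(describe([1, 2]))       # no negative numbers
```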

Pay attention to finally

  • Meant for cleanup work
  • A return or break inside the finally block silently discards any exception raised in the try block
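A sketch contrasting a finally block that returns with one that only cleans up. The function names are illustrative.

```python
def swallow():
    try:
        raise ValueError("lost")
    finally:
        # Returning here discards the in-flight ValueError.
        return "returned from finally"

def cleanup_only(log):
    try:
        raise ValueError("kept")
    finally:
        # No return here, so the exception keeps propagating after cleanup.
        log.append("cleanup ran")

print(swallow())     # returned from finally

log = []
try:
    cleanup_only(log)
except ValueError as exc:
    print("caught:", exc)   # caught: kept
print(log)           # ['cleanup ran']
```

Keeping finally free of return, break, and continue avoids hiding errors by accident.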

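Use zip to loop over iterators together

  • Iterate over 2 iterables at the same time
  • In Python 2, check itertools.izip & itertools.izip_longest; in Python 3, zip is already lazy and izip_longest is named zip_longest
  • BTW, itertools is also a very powerful library
      
example = dict(zip(ls1, ls2))  # same as dict((k, v) for k, v in zip(ls1, ls2))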
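A runnable Python 3 sketch with made-up sample data, showing how zip and itertools.zip_longest treat inputs of different lengths.

```python
from itertools import zip_longest  # named izip_longest in Python 2

names = ["ada", "bob", "cy"]
scores = [90, 85]

# zip stops at the shortest input, so "cy" is silently dropped.
paired = dict(zip(names, scores))
print(paired)    # {'ada': 90, 'bob': 85}

# zip_longest pads the shorter input with fillvalue instead.
padded = list(zip_longest(names, scores, fillvalue=0))
print(padded)    # [('ada', 90), ('bob', 85), ('cy', 0)]
```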
      
    

Functions

Prefer exceptions to returning None

  • Python is not C or Go
  • Clearer control flow
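A sketch of why None is a fragile error signal, in the spirit of the division example in Effective Python. Both function names are illustrative, and the code assumes Python 3 division.

```python
def divide_or_none(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None

def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        raise ValueError("invalid inputs: cannot divide %r by %r" % (a, b))

result = divide_or_none(0, 5)
if not result:
    # Bug: 0.0 is falsy, so a valid answer is treated as a failure.
    print("looks like an error, but the answer was", result)

try:
    divide(1, 0)
except ValueError as exc:
    # The failure cannot be confused with a legitimate return value.
    print("divide failed:", exc)
```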

Understanding closure

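  • Understand LEGB name lookup (Local, Enclosing, Global, Built-in)
  • Assigning to a name makes it local, so an outer variable cannot be rebound without a declaration
  • nonlocal in Python 3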
      
def test_non_local():
    def inner():
        nonlocal outvar
        outvar = 5

    outvar = 3
    print(outvar)
    inner()
    print(outvar)

test_non_local()
      
    

Understanding closure

  • Variables in a closure are only looked up when the function is called
      
print([func(2) for func in [lambda x: i * x for i in range(5)]])
# [8, 8, 8, 8, 8]
print([func(2) for func in [lambda x, i=i: i * x for i in range(5)]])
# [0, 2, 4, 6, 8]
# a generator is lazy: each lambda is called before i moves on to the next value
print([func(2) for func in (lambda x: i * x for i in range(5))])
# [0, 2, 4, 6, 8]
      
    

Generator over returning list

  • yield turns a function into a generator function
  • Efficiency
  • Infinite series
      
def fib_infinite():
    a, b = 0, 1
    while True:
        a, b = a + b, a
        yield a

import itertools
print(list(itertools.islice(fib_infinite(), 20)))
      
    

Prefer kwargs for clarity and extensibility

  • User knows, reader knows
  • Optional arguments
      
# version 1
def get_user(user_id):
    return User.find(id=user_id)
# version 2
def get_user(user_id, **kwargs):
    encoding = kwargs.pop('encoding', None)
    user = User.find(id=user_id)
    encoder = get_encoder(encoding)
    return encoder.encode(user)
# we can even further improve it if encoder is used by
#   many other functions as well
      
    

Use None for default argument

  • Immutability is the KEY: a default value is created once, when the function is defined, and shared by every call
      
def test_default_arg(default=None):
    default = [] if default is None else default
    print(default)

test_default_arg()
test_default_arg('')
test_default_arg(0)
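A sketch of the pitfall that the None default avoids. The function names append_bad and append_good are illustrative.

```python
def append_bad(item, bucket=[]):       # the list is created once, at def time
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):    # None is immutable; build a list per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(append_bad(1))    # [1]
print(append_bad(2))    # [1, 2]  <- state leaked from the first call
print(append_good(1))   # [1]
print(append_good(2))   # [2]
```

Testing with "is None" rather than truthiness matters here: an empty string, 0, or an empty list passed in by the caller should be kept as given.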
      
    

Decorator

  • One of the greatest features in Python
  • Memoization
  • Logging
  • Profiling
  • etc.
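A minimal memoization sketch. The memoize decorator below is hand-written for illustration and only handles hashable positional arguments; in Python 3 the standard library's functools.lru_cache covers the same need.

```python
from functools import wraps

def memoize(func):
    """Cache results keyed by the positional arguments."""
    cache = {}

    @wraps(func)
    def inner(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return inner

calls = []   # records every n that is actually computed

@memoize
def fib(n):
    calls.append(n)
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))       # 832040
print(len(calls))    # 31: each n from 0 to 30 is computed once
```

Without the cache, the same recursive definition would recompute small values an exponential number of times.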

Decorator

      
import cProfile
from functools import wraps

def profile_func(func):
    @wraps(func)
    def profiled_func(*args, **kwargs):
        profile = cProfile.Profile()
        profile.enable()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so profiling is always stopped.
            profile.disable()
            profile.print_stats()

    return profiled_func

@profile_func
def demo():
    return 'Hello, world!'

print(demo.__name__)  # demo (not profiled_func), thanks to wraps
      
    

Decorator

      
# version 3 of get_user
from functools import wraps
def encode_result(func):
    @wraps(func)
    def inner(*args, **kwargs):
        encoding = kwargs.pop('encoding', None)
        result = func(*args, **kwargs)
        return get_encoder(encoding).encode(result)
    return inner

@encode_result
def get_user(user_id, **kwargs):
    return User.find(id=user_id)
      
    

Decorator

      
# version 4 of get_user: the encoding is fixed at decoration time
from functools import wraps
def encode_result(encoding=None):
    def wrapper(func):
        @wraps(func)
        def inner(*args, **kwargs):
            result = func(*args, **kwargs)
            return get_encoder(encoding).encode(result)
        return inner
    return wrapper

@encode_result(encoding='json')
def get_user(user_id, **kwargs):
    return User.find(id=user_id)
      
    

What is next? (Proposals Only)

  • Class & Inheritance
  • Metaclass
  • Concurrency & Parallelism
  • Built-ins
  • IPython notebook
  • Python for data science
  • Django & Flask
  • Tornado
  • An in-depth introduction to Python 3

Thank you