
Code@Pig Home

[Quick Read] Effective Python - 59 Specific Ways to Write Better Python (4)

2016-03-21 09:45:22 | Category: lang_python

Item 31: Use Descriptors for Reusable @property Methods

You should first read up on what a descriptor is:
https://docs.python.org/2/howto/descriptor.html

A Chinese-language walkthrough is here:
http://www.cnblogs.com/zyobi/archive/2010/11/07/1871293.html
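
As a rough illustration of what this item is about, here is a minimal sketch. It is written in Python 3 (the snippets in this post are Python 2), the names Grade and Exam are made up for the example, and keeping the values in a WeakKeyDictionary is only one possible design. It shows a single descriptor class giving the same validation to several attributes, which with plain @property would have to be repeated for each attribute.

```python
from weakref import WeakKeyDictionary


class Grade:
    """Hypothetical descriptor: a validated 0-100 value, stored per instance."""

    def __init__(self):
        # Keyed by the owning instance so every Exam keeps its own value;
        # weak keys avoid keeping those instances alive.
        self._values = WeakKeyDictionary()

    def __get__(self, instance, instance_type):
        if instance is None:
            return self  # accessed on the class itself
        return self._values.get(instance, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value


class Exam:
    # One descriptor class reused for several attributes.
    math_grade = Grade()
    writing_grade = Grade()


first, second = Exam(), Exam()
first.math_grade = 82
first.writing_grade = 99
second.math_grade = 75
print(first.math_grade, first.writing_grade, second.math_grade)
```

Each Exam instance keeps its own values even though the Grade objects live on the class.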


Item 32: Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes

How __getattr__ works:
 <1> If normal lookup finds the attribute (for example, object.__dict__[key] exists), return its value
 <2> Otherwise, return object.__getattr__(key)

class LazyDB(object):
  def __init__(self):
    self.exists = 5

  def __getattr__(self, name):
    value = 'Value for %s' % name
    print '__getattr__(), key =', name
    setattr(self, name, value)
    return value

data = LazyDB()
print 'Before:', data.__dict__
print 'foo:   ', data.foo
print 'foo:   ', data.foo
print 'After: ', data.__dict__

>>>
Before: {'exists': 5}
__getattr__(), key = foo
foo:    Value for foo
foo:    Value for foo
After:  {'exists': 5, 'foo': 'Value for foo'}


__getattr__() is only called when the attribute is not already in __dict__. If you need a hook on every access, use __getattribute__() instead.
 <1> object.xxx always triggers object.__getattribute__(xxx)

class ValidatingDB(object):
  def __init__(self):
    self.exists = 5

  def __getattribute__(self, name):
    print 'Called __getattribute__(%s)' % name
    try:
      return super(ValidatingDB, self).__getattribute__(name)
    except AttributeError:
      value = 'Value for %s' % name
      setattr(self, name, value)
      return value

data = ValidatingDB()
print 'exists:', data.exists
print 'foo:   ', data.foo
print 'foo:   ', data.foo

>>>
Called __getattribute__(exists)
exists: 5
Called __getattribute__(foo)
foo:    Value for foo
Called __getattribute__(foo)
foo:    Value for foo

How __setattr__ works:
  <1> Every object.xxx = value assignment calls object.__setattr__(xxx, value)

class SavingDB(object):
  def __setattr__(self, name, value):
    # Save some data to the DB log
    super(SavingDB, self).__setattr__(name, value)

class LoggingSavingDB(SavingDB):
  def __setattr__(self, name, value):
    print 'Called __setattr__(%s, %r)' % (name, value)
    super(LoggingSavingDB, self).__setattr__(name, value)

data = LoggingSavingDB()
print 'Before: ', data.__dict__
data.foo = 5
print 'After:  ', data.__dict__
data.foo = 7
print 'Finally:', data.__dict__

>>>
Before:  {}
Called __setattr__(foo, 5)
After:   {'foo': 5}
Called __setattr__(foo, 7)
Finally: {'foo': 7}


Item 33: Validate Subclasses with Metaclasses

How a metaclass comes into play:
class Meta(type):
  def __new__(meta, name, bases, class_dict):
    print meta, name, bases, class_dict
    return type.__new__(meta, name, bases, class_dict)

class MyClass(object):
  __metaclass__ = Meta

  stuff = 123

  def foo(self):
    pass

o = MyClass()


>>>
<class '__main__.Meta'>
MyClass
(<type 'object'>,)
{'__module__': '__main__',
 'stuff': 123,
 '__metaclass__': <class '__main__.Meta'>,
 'foo': <function foo at 0x0284EB30>}

Python 3 uses a different syntax:
class MyClass(object, metaclass=Meta):
  ...

We can validate a class's parameters in Meta.__new__:
class ValidatePolygon(type):
  def __new__(meta, name, bases, class_dict):
    # Don't validate the abstract Polygon class
    if bases != (object,):
      if class_dict['sides'] < 3:
        raise ValueError('Polygons need 3+ sides')
    return type.__new__(meta, name, bases, class_dict)

class Polygon(object):
  __metaclass__ = ValidatePolygon
  sides = None # Specified by subclasses

class Triangle(Polygon):
  sides = 3
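
The snippet above only defines a valid subclass. The sketch below is a Python 3 port (an assumption, since the post targets Python 2) that also defines an invalid one, so you can see the error surface at class definition time. The Line class is hypothetical.

```python
class ValidatePolygon(type):
    def __new__(meta, name, bases, class_dict):
        # Skip the abstract base class. In Python 3 `bases` is empty when
        # the class statement lists no base classes.
        if bases:
            if class_dict['sides'] < 3:
                raise ValueError('Polygons need 3+ sides')
        return type.__new__(meta, name, bases, class_dict)


class Polygon(metaclass=ValidatePolygon):
    sides = None  # Specified by subclasses


class Triangle(Polygon):
    sides = 3


try:
    class Line(Polygon):  # hypothetical invalid subclass
        sides = 1
except ValueError as e:
    error = str(e)
    print('Rejected at class definition time:', error)
```

The ValueError is raised by the class statement itself, before any instance is created.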


Item 34: Register Class Existence with Metaclasses

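Let's start with a complete example: automatic object serialization, where the metaclass calls register_class() for us.
It reminds me of RTTI in C++/MFC :-), used there to serialize and deserialize whole objects.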

import json

registry = {}

def register_class(target_class):
  registry[target_class.__name__] = target_class

def deserialize(data):
  params = json.loads(data)
  name = params['class']
  target_class = registry[name]
  return target_class(*params['args'])

class Meta(type):
  def __new__(meta, name, bases, class_dict):
    cls = type.__new__(meta, name, bases, class_dict)
    register_class(cls)
    return cls

class Serializable(object):
  __metaclass__ = Meta

  def __init__(self, *args):
    self._args = args

  def serialize(self):
    return json.dumps({
      'class': self.__class__.__name__,
      'args': self._args,
    })

class Point2D(Serializable):
  def __init__(self, x, y):
    super(Point2D, self).__init__(x, y)
    self.x, self.y = x, y

  def __repr__(self):
    return 'Point2D(%d,%d)' % (self.x, self.y)


p = Point2D(10, 20)
print 'Before:    ', p
data = p.serialize()
print 'Serialized:', data
print 'After:     ', deserialize(data)

>>>
Before:     Point2D(10,20)
Serialized: {"args": [10, 20], "class": "Point2D"}
After:      Point2D(10,20)
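
The key point is that registration happens when the class statement runs, not when an instance is created. Here is a compact Python 3 sketch of the same idea, with a hypothetical Vector3D subclass, that prints the registry before any deserialization takes place.

```python
import json

registry = {}


class Meta(type):
    def __new__(meta, name, bases, class_dict):
        cls = type.__new__(meta, name, bases, class_dict)
        registry[cls.__name__] = cls  # runs once per class statement
        return cls


class Serializable(metaclass=Meta):
    def __init__(self, *args):
        self._args = args

    def serialize(self):
        return json.dumps({'class': type(self).__name__, 'args': self._args})


def deserialize(data):
    params = json.loads(data)
    return registry[params['class']](*params['args'])


class Vector3D(Serializable):  # hypothetical subclass, registered automatically
    def __init__(self, x, y, z):
        super().__init__(x, y, z)
        self.x, self.y, self.z = x, y, z


print(sorted(registry))
restored = deserialize(Vector3D(1, 2, 3).serialize())
print(type(restored).__name__, restored.x, restored.y, restored.z)
```

Nobody has to remember to call register_class() for a new subclass, which is the mistake the metaclass is there to prevent.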


Item 35: Annotate Class Attributes with Metaclasses

class Field(object):
  def __init__(self, name):
    self.name = name
    self.internal_name = '_' + self.name

  def __get__(self, instance, instance_type):
    if instance is None: return self
    return getattr(instance, self.internal_name, '')

  def __set__(self, instance, value):
    setattr(instance, self.internal_name, value)

class Customer(object):
  first_name = Field('first_name')  # class attributes
  last_name  = Field('last_name')

foo = Customer()
print 'Before:', repr(foo.first_name), foo.__dict__
foo.first_name = 'phay'
print 'After: ', repr(foo.first_name), foo.__dict__

>>>
Before: '' {}
After:  'phay' {'_first_name': 'phay'}


Here, first_name = Field('first_name') spells out the string first_name twice, which is annoying.
A metaclass can take care of that in a slightly trickier way.
class Field(object):
  def __init__(self):
    # These will be assigned by the metaclass
    self.name = None
    self.internal_name = None

  def __get__(self, instance, instance_type):
    if instance is None: return self
    return getattr(instance, self.internal_name, '')

  def __set__(self, instance, value):
    setattr(instance, self.internal_name, value)

class Meta(type):
  def __new__(meta, name, bases, class_dict):
    for key, value in class_dict.items():
      if isinstance(value, Field):
        value.name = key
        value.internal_name = '_' + key
    cls = type.__new__(meta, name, bases, class_dict)
    return cls

class Customer(object):
  __metaclass__ = Meta
  first_name = Field()
  last_name  = Field()

foo = Customer()
print 'Before:', repr(foo.first_name), foo.__dict__
foo.first_name = 'phay'
print 'After: ', repr(foo.first_name), foo.__dict__

>>>
Before: '' {}
After:  'phay' {'_first_name': 'phay'}
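
One thing to watch out for: Python 3 silently ignores a `__metaclass__ = Meta` class attribute, so the Field names would never get filled in there. Below is a sketch of the same example using the Python 3 syntax. The final print inspects the descriptors to show what the metaclass assigned.

```python
class Field:
    def __init__(self):
        # These will be assigned by the metaclass
        self.name = None
        self.internal_name = None

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')

    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)


class Meta(type):
    def __new__(meta, name, bases, class_dict):
        for key, value in class_dict.items():
            if isinstance(value, Field):
                value.name = key
                value.internal_name = '_' + key
        return type.__new__(meta, name, bases, class_dict)


class Customer(metaclass=Meta):
    first_name = Field()
    last_name = Field()


foo = Customer()
foo.first_name = 'phay'
print(Customer.first_name.name, Customer.last_name.internal_name, foo.__dict__)
```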


5. Concurrency and Parallelism

Item 36: Use subprocess to Manage Child Processes

Python has several libraries (for historical reasons?) that can spawn child processes; subprocess is the recommended one.
# sleepme.py
import sys
import time

if __name__ == '__main__':
    sleep_time = float(sys.argv[1])
    print 'begin sleep...'
    time.sleep(sleep_time)
    print 'end sleep...'

# testme.py
import subprocess
import time

start = time.time()
proc = subprocess.Popen(['python', 'sleepme.py', '5'], stdout=subprocess.PIPE)
print 'test...'
out, err = proc.communicate()
print 'costs', time.time() - start
print out.decode('utf-8')

python testme.py

>>>
test...
costs 5.03299999237
begin sleep...
end sleep...
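
For a version that runs as a single file, here is a Python 3 sketch. It assumes the child program can be passed inline with `-c` and uses `sys.executable` so that the child runs under the same interpreter; the short 0.2 second sleep is only there to keep the example fast.

```python
import subprocess
import sys
import time

# Child program passed inline with -c so the sketch needs no second file.
child_code = "import time; print('begin sleep...'); time.sleep(0.2); print('end sleep...')"

start = time.time()
proc = subprocess.Popen([sys.executable, '-c', child_code],
                        stdout=subprocess.PIPE)
# The parent is free to do other work here while the child sleeps.
out, err = proc.communicate()  # waits for exit and collects stdout
elapsed = time.time() - start

lines = out.decode('utf-8').splitlines()
print('exit code:', proc.returncode)
print('child output:', lines)
print('elapsed: %.3f seconds' % elapsed)
```

Since only stdout is piped, `err` comes back as None; pass `stderr=subprocess.PIPE` as well if you need the child's error output.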


Item 37: Use Threads for Blocking I/O, Avoid for Parallelism

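Because of the global interpreter lock (GIL), Python lets only one thread execute at any given moment.
So multithreading in Python only speeds up I/O-bound programs, not CPU-bound ones.
The example below is CPU-bound.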

Single-threaded computation
from time import time

def factorize(number):
  for i in range(1, number+1):
    if number % i == 0:
      yield i

numbers = [2139079, 1214759, 1516637, 1852285]
start = time()
for number in numbers:
  list(factorize(number))
end = time()
print 'Took %.3f seconds' % (end - start)

>>>
Took 1.040 seconds

Multithreaded computation
from threading import Thread

class FactorizeThread(Thread):
  def __init__(self, number):
    super(FactorizeThread, self).__init__()
    self.number = number

  def run(self):
    self.factors = list(factorize(self.number))

start = time()
threads = []
for number in numbers:
  thread = FactorizeThread(number)
  thread.start()
  threads.append(thread)

for thread in threads:
  thread.join()
end = time()
print 'Took %.3f seconds' % (end - start)

>>>
Took 1.061 seconds

Now let's look at an I/O-bound example.

Single-threaded I/O
import select

def slow_systemcall():
  select.select([], [], [], 0.1)

start = time()
for _ in range(5):
  slow_systemcall()
end = time()
print 'Took %.3f seconds' % (end - start)

>>>
Took 0.503 seconds

Multithreaded I/O
start = time()
threads = []
for _ in range(5):
  thread = Thread(target=slow_systemcall)
  thread.start()
  threads.append(thread)
for thread in threads:
  thread.join()
end = time()
print 'Took %.3f seconds' % (end - start)

>>>
Took 0.102 seconds
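
Calling select.select() with three empty lists is platform-dependent (it is known to work on Unix but not on Windows). The sketch below is a portable Python 3 variant that substitutes time.sleep() as a stand-in for the blocking system call. That substitution is an assumption made for illustration; like a real blocking call, sleep releases the GIL while it waits.

```python
import time
from threading import Thread


def slow_call():
    time.sleep(0.1)  # stand-in for a blocking system call; releases the GIL


# Serial: the five waits add up.
start = time.time()
for _ in range(5):
    slow_call()
serial = time.time() - start

# Threaded: the five waits overlap.
start = time.time()
threads = [Thread(target=slow_call) for _ in range(5)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
threaded = time.time() - start

print('serial %.3f s, threaded %.3f s' % (serial, threaded))
```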


Item 38: Use Lock to Prevent Data Races in Threads

Even with the GIL, Python threads are still subject to context switches in the middle of an operation, so shared data still needs a lock.

from threading import Thread

class Counter(object):
    def __init__(self):
        self.count = 0

    def increment(self, offset):
        self.count += offset

def worker(i, how_many, counter):
    for _ in range(how_many):
        counter.increment(1)

def run_threads(func, how_many, counter):
    threads = []
    for i in range(5):
        args = (i, how_many, counter)
        thread = Thread(target=func, args=args)
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()

how_many = 10**5
counter = Counter()
run_threads(worker, how_many, counter)
print 'Counter should be %d, found %d' % (5 * how_many, counter.count)

>>>
Counter should be 500000, found 220305

Adding a lock fixes it:
from threading import Lock

class Counter(object):
    def __init__(self):
        self.lock = Lock()
        self.count = 0

    def increment(self, offset):
        with self.lock:
            self.count += offset

>>>
Counter should be 500000, found 500000
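
For completeness, here is a self-contained Python 3 sketch of the locked version. The class is renamed LockingCounter and the iteration count is reduced to 10**4 so it runs quickly; both changes are for illustration only.

```python
from threading import Lock, Thread


class LockingCounter:
    def __init__(self):
        self.lock = Lock()
        self.count = 0

    def increment(self, offset):
        with self.lock:  # only one thread at a time runs the read-modify-write
            self.count += offset


def worker(how_many, counter):
    for _ in range(how_many):
        counter.increment(1)


how_many = 10**4
counter = LockingCounter()
threads = [Thread(target=worker, args=(how_many, counter)) for _ in range(5)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print('Counter should be %d, found %d' % (5 * how_many, counter.count))
```

The `with self.lock:` block matters because `self.count += offset` is a read, an add, and a write, and a thread switch between those steps is what loses updates in the unlocked version.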
