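1. pickle
The pickle module serializes and deserializes Python objects. It converts a Python object into a byte stream, which can be written to a file or kept in memory.
To serialize a Python object to an open binary file and read it back:
pickle.dump()
pickle.load()
To serialize to and from an in-memory bytes object (not a text string), use:
pickle.dumps()
pickle.loads()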
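A minimal sketch of the in-memory round trip. The settings dictionary is made-up sample data, not something from the original text:

```python
import pickle

# Hypothetical sample data, chosen only for illustration.
settings = {"name": "demo", "retries": 3, "ratio": 0.5}

blob = pickle.dumps(settings)    # serialize to a bytes object
restored = pickle.loads(blob)    # rebuild an equal but independent object

print(type(blob).__name__)       # bytes
print(restored == settings)      # True
print(restored is settings)      # False
```

The result of dumps() is binary data, so any file used to store it should be opened in 'wb' or 'rb' mode.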
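The following types can be pickled:
* None, True, and False;
* integers, floating-point numbers, and complex numbers;
* strings, bytes, and bytearrays;
* tuples, lists, sets, and dictionaries that contain only picklable objects;
* functions defined at the top level of a module (with def, not lambda);
* built-in functions defined at the top level of a module;
* classes defined at the top level of a module;
* instances of such classes whose __dict__, or the result of calling __getstate__(), is picklable.
Note: functions and classes are pickled by name (by reference), not by value, so the module that defines them must be importable when the data is unpickled.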
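The by-name rule can be illustrated with a short sketch. It uses math.sqrt from the standard library as the module-level function, and a lambda named square that is purely hypothetical:

```python
import math
import pickle

# A module-level function is pickled as a reference (module name plus
# function name), not as its code.
blob = pickle.dumps(math.sqrt)
print(b"sqrt" in blob)                   # True: the name is stored in the stream
print(pickle.loads(blob) is math.sqrt)   # True: loading looks the name up again

# A lambda has no importable name, so pickling it fails.
square = lambda x: x * x
try:
    pickle.dumps(square)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
print(lambda_pickled)                    # False
```

Because only the name is stored, renaming or moving a function or class after data has been pickled makes that data impossible to load.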
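2. How pickle works
In most cases no extra code is needed to make an object picklable. By default, pickle inspects the class and the attributes of the instance. When a class instance is unpickled, its __init__() method is normally not called; instead, an uninitialized instance is created first and the saved attributes are then restored.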
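The following sketch shows that __init__() is skipped on unpickling. The Counter class and its init_calls tally are hypothetical names introduced only for this illustration:

```python
import pickle

class Counter:
    init_calls = 0  # class-level tally of how often __init__ runs

    def __init__(self, start):
        Counter.init_calls += 1
        self.value = start

original = Counter(10)                          # __init__ runs once here
clone = pickle.loads(pickle.dumps(original))    # __init__ is not run again

print(clone.value)          # 10
print(Counter.init_calls)   # 1
```

The attribute value survives because it lives in the instance's __dict__, which is what pickle saves by default.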
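The default behavior can be changed by implementing the following methods:
object.__getstate__(): by default the instance's __dict__ is pickled; if the class defines __getstate__(), the value it returns is pickled instead.
object.__setstate__(state): if the class defines this method, it is called on unpickling with the pickled state to restore the object's attributes.
object.__getnewargs__(): if creating the instance requires arguments to __new__(), implement this method to return them as a tuple (used with protocol 2 and later).
Note: if __getstate__() returns None, __setstate__() is not called on unpickling. The documentation for older Python versions states this more broadly, for any false value.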
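A minimal sketch of __getnewargs__(), assuming a hypothetical Point class whose __new__() cannot be called without arguments:

```python
import pickle

class Point:
    """Hypothetical class whose __new__() requires arguments."""

    def __new__(cls, x, y):
        self = super().__new__(cls)
        self.x = x
        self.y = y
        return self

    def __getnewargs__(self):
        # These values are passed to __new__() when the object is unpickled.
        return (self.x, self.y)

p = pickle.loads(pickle.dumps(Point(1, 2)))
print(p.x, p.y)   # 1 2
```

Without __getnewargs__(), unpickling would call __new__() with no arguments besides the class and fail with a TypeError.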
Sometimes, for efficiency or because the three methods above are not enough, you need to implement __reduce__() instead.
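One common form of __reduce__() returns a callable plus the arguments to call it with. The sketch below assumes a hypothetical Endpoint class and is only one way to use the method:

```python
import pickle

class Endpoint:
    """Hypothetical class, used only to illustrate __reduce__()."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.address = f"{host}:{port}"   # derived value, rebuilt by __init__

    def __reduce__(self):
        # Return (callable, args): unpickling calls Endpoint(host, port),
        # so __init__() runs again and recomputes the derived attribute.
        return (Endpoint, (self.host, self.port))

e = pickle.loads(pickle.dumps(Endpoint("localhost", 8080)))
print(e.address)   # localhost:8080
```

Unlike the default mechanism, this approach does run __init__() on unpickling, because the callable returned by __reduce__() is the class itself.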
3. Example
import pickle

# An arbitrary collection of objects supported by pickle.
data = {
    'a': [1, 2.0, 3, 4+6j],
    'b': ("character string", b"byte string"),
    'c': {None, True, False}
}

with open('data.pickle', 'wb') as f:
    # Pickle the 'data' dictionary using the highest protocol available.
    pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)

with open('data.pickle', 'rb') as f:
    # The protocol version used is detected automatically, so we do not
    # have to specify it.
    data = pickle.load(f)

print(data)
4. Changing the default behavior of a picklable type
import pickle

class TextReader:
    """Print and number lines in a text file."""

    def __init__(self, filename):
        self.filename = filename
        self.file = open(filename)
        self.lineno = 0

    def readline(self):
        self.lineno += 1
        line = self.file.readline()
        if not line:
            return None
        if line.endswith('\n'):
            line = line[:-1]
        return "%i: %s" % (self.lineno, line)

    def __getstate__(self):
        # Copy the object's state from self.__dict__ which contains
        # all our instance attributes. Always use the dict.copy()
        # method to avoid modifying the original state.
        state = self.__dict__.copy()
        # Remove the unpicklable entries (an open file cannot be pickled).
        del state['file']
        return state

    def __setstate__(self, state):
        # Restore instance attributes (i.e., filename and lineno).
        self.__dict__.update(state)
        # Restore the previously opened file's state. To do so, we need to
        # reopen it and read from it until the line count is restored.
        file = open(self.filename)
        for _ in range(self.lineno):
            file.readline()
        # Finally, save the file.
        self.file = file

# Assumes hello.txt exists and contains the three lines shown in the output.
reader = TextReader("hello.txt")
print(reader.readline())
print(reader.readline())
s = pickle.dumps(reader)
# print(s)
new_reader = pickle.loads(s)
print(new_reader.readline())

# The output is:
# 1: hello
# 2: how are you
# 3: goodbye