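This module is very powerful and well worth mastering.
References: the official documentation and the PyMOTW article.
This post is essentially a translation of the documentation, with some additions.
itertools is a collection of iterator functions for efficient looping.
Overview
Infinite iterators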
| Iterator | Arguments | Results | Example |
|----------|-----------|---------|---------|
| count() | start, [step] | start, start+step, start+2*step, ... | count(10) --> 10 11 12 13 14 ... |
| cycle() | p | p0, p1, ... plast, p0, p1, ... | cycle('ABCD') --> A B C D A B C D ... |
| repeat() | elem [,n] | elem, elem, elem, ... endlessly or up to n times | repeat(10, 3) --> 10 10 10 |

Iterators that process input sequences
| Iterator | Arguments | Results | Example |
|----------|-----------|---------|---------|
| chain() | p, q, ... | p0, p1, ... plast, q0, q1, ... | chain('ABC', 'DEF') --> A B C D E F |
| compress() | data, selectors | (d[0] if s[0]), (d[1] if s[1]), ... | compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F |
| dropwhile() | pred, seq | seq[n], seq[n+1], starting when pred fails | dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1 |
| groupby() | iterable[, keyfunc] | sub-iterators grouped by value of keyfunc(v) | |
| ifilter() | pred, seq | elements of seq where pred(elem) is True | ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9 |
| ifilterfalse() | pred, seq | elements of seq where pred(elem) is False | ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8 |
| islice() | seq, [start,] stop [, step] | elements from seq[start:stop:step] | islice('ABCDEFG', 2, None) --> C D E F G |
| imap() | func, p, q, ... | func(p0, q0), func(p1, q1), ... | imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000 |
| starmap() | func, seq | func(*seq[0]), func(*seq[1]), ... | starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000 |
| tee() | it, n | it1, it2, ... itn splits one iterator into n | |
| takewhile() | pred, seq | seq[0], seq[1], until pred fails | takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4 |
| izip() | p, q, ... | (p[0], q[0]), (p[1], q[1]), ... | izip('ABCD', 'xy') --> Ax By |
| izip_longest() | p, q, ... | (p[0], q[0]), (p[1], q[1]), ... | izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D- |

Combinatoric generators
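| Iterator | Arguments | Results |
|----------|-----------|---------|
| product() | p, q, ... [repeat=1] | cartesian product, equivalent to a nested for-loop |
| permutations() | p[, r] | r-length tuples, all possible orderings, no repeated elements |
| combinations() | p, r | r-length tuples, in sorted order, no repeated elements |
| combinations_with_replacement() | p, r | r-length tuples, in sorted order, with repeated elements |

| Example | Results |
|---------|---------|
| product('ABCD', repeat=2) | AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD |
| permutations('ABCD', 2) | AB AC AD BA BC BD CA CB CD DA DB DC |
| combinations('ABCD', 2) | AB AC AD BC BD CD |
| combinations_with_replacement('ABCD', 2) | AA AB AC AD BB BC BD CC CD DD |

itertools.count(start=0, step=1)
Creates an iterator that produces consecutive integers starting from n; if n is omitted, counting starts at 0. (Note: this iterator does not support long integers.)
If sys.maxint is exceeded, the counter overflows and continues counting from -sys.maxint-1. This note describes early Python 2 releases; in Python 2.7, which also adds the step argument shown below, count() keeps counting past sys.maxint with long integers.
Definition

    def count(start=0, step=1):
        # count(10) --> 10 11 12 13 14 ...
        # count(2.5, 0.5) -> 2.5 3.0 3.5 ...
        n = start
        while True:
            yield n
            n += step

Equivalent to (start + step * i for i in count()).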
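As a quick check of the equivalence stated above, the sketch below takes a finite prefix of count() with islice() and compares it with the generator-expression form. The start and step values are arbitrary choices for illustration.

```python
from itertools import count, islice

start, step = 10, 3  # arbitrary values chosen for illustration

# count() never ends, so islice() is used to take only the first five values.
direct = list(islice(count(start, step), 5))
via_genexp = list(islice((start + step * i for i in count()), 5))

assert direct == via_genexp == [10, 13, 16, 19, 22]
```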
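Usage

    from itertools import *

    for i in izip(count(1), ['a', 'b', 'c']):
        print i

Output:

    (1, 'a')
    (2, 'b')
    (3, 'c')

itertools.cycle(iterable)
Creates an iterator that loops over the elements of iterable again and again. Internally it saves a copy of the elements of iterable, and this copy is used to return the repeated items.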
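The internal copy is what lets cycle() repeat an input that can only be consumed once. A minimal sketch, assuming a one-shot generator as the input (the name one_shot is illustrative only):

```python
from itertools import cycle, islice

# A generator can be iterated only once, yet cycle() keeps repeating its
# items because it stores each element during the first pass.
one_shot = (ch for ch in 'ABC')
first_seven = list(islice(cycle(one_shot), 7))

assert first_seven == ['A', 'B', 'C', 'A', 'B', 'C', 'A']
```

Because every element is saved, cycling over a very long input can use a significant amount of memory.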
Definition

    def cycle(iterable):
        # cycle('ABCD') --> A B C D A B C D A B C D ...
        saved = []
        for element in iterable:
            yield element
            saved.append(element)
        while saved:
            for element in saved:
                yield element

Usage

    from itertools import *

    i = 0
    for item in cycle(['a', 'b', 'c']):
        i += 1
        if i == 10:
            break
        print (i, item)

Output:

    (1, 'a')
    (2, 'b')
    (3, 'c')
    (4, 'a')
    (5, 'b')
    (6, 'c')
    (7, 'a')
    (8, 'b')
    (9, 'c')

itertools.repeat(object[, times])
Creates an iterator that yields object over and over. times, if supplied, sets the number of repetitions; if times is not supplied, the object is returned endlessly.
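A common use of the endless form is to supply a constant alongside another sequence. The sketch below is one possible illustration; it relies on zip() stopping at the shortest input, which holds in both Python 2 and Python 3.

```python
from itertools import repeat

# Bounded form: exactly three copies.
three_times = list(repeat('x', 3))

# Endless form: zip() stops when 'abc' runs out, so the infinite
# repeat(0) is only consumed as far as needed.
tagged = list(zip('abc', repeat(0)))

assert three_times == ['x', 'x', 'x']
assert tagged == [('a', 0), ('b', 0), ('c', 0)]
```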
Definition

    def repeat(object, times=None):
        # repeat(10, 3) --> 10 10 10
        if times is None:
            while True:
                yield object
        else:
            for i in xrange(times):
                yield object

Usage

    from itertools import *

    for i in repeat('over-and-over', 5):
        print i

Output:

    over-and-over
    over-and-over
    over-and-over
    over-and-over
    over-and-over

itertools.chain(*iterables)
Takes several iterators as arguments but returns a single iterator that produces the contents of all of them, as if they came from one single sequence.
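A small sketch of the two ways to call it: chain() takes the iterables as separate arguments, while chain.from_iterable() takes one iterable that yields them. The sample values are illustrative.

```python
from itertools import chain

# Separate arguments: a list followed by a string.
merged = list(chain([1, 2, 3], 'ab'))

# One argument that itself yields the iterables to join.
flat = list(chain.from_iterable([[1, 2], [3], []]))

assert merged == [1, 2, 3, 'a', 'b']
assert flat == [1, 2, 3]
```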
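Definition

    def chain(*iterables):
        # chain('ABC', 'DEF') --> A B C D E F
        for it in iterables:
            for element in it:
                yield element

Usage

    from itertools import *

    for i in chain([1, 2, 3], ['a', 'b', 'c']):
        print i

Output:

    1
    2
    3
    a
    b
    c

chain.from_iterable() can be combined with imap() to build a flatmap helper:

    import os
    from itertools import chain, imap

    def flatmap(f, items):
        return chain.from_iterable(imap(f, items))

    # dirs is assumed to be a list of directory paths defined elsewhere
    >>> list(flatmap(os.listdir, dirs))
    ['settings.py', 'wsgi.py', 'templates', 'app.py', 'templates', 'index.html', 'config.json']

itertools.compress(data, selectors)
Filters the original data using a list of selectors: an item of data is kept only when the selector at the same position is true.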
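The selectors are often computed from a second, parallel sequence. The sketch below uses made-up names and scores purely for illustration, and also shows that iteration stops when the shorter of the two inputs runs out.

```python
from itertools import compress

names = ['ann', 'bob', 'cy', 'dee']   # hypothetical sample data
scores = [82, 47, 91, 58]             # hypothetical sample data

# Keep each name whose score at the same position is at least 60.
passed = list(compress(names, [s >= 60 for s in scores]))

# With only three selectors, the items after 'C' are never considered.
short = list(compress('ABCDEF', [1, 0, 1]))

assert passed == ['ann', 'cy']
assert short == ['A', 'C']
```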
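Definition

    from itertools import izip

    def compress(data, selectors):
        # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
        return (d for d, s in izip(data, selectors) if s)

itertools.dropwhile(predicate, iterable)
Creates an iterator that discards items from iterable for as long as predicate(item) is True; once predicate returns False, it yields that item and every item after it.
In other words: from the first time the condition is false, the rest of the iterator is returned, and the predicate is not called again.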
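One way to picture this is skipping a leading block of comment lines. In the sketch below (the sample lines are invented for illustration), a later line that matches the predicate is still returned, because testing stops at the first failure.

```python
from itertools import dropwhile

lines = ['# header', '# more header', 'data', '# not dropped', 'more data']

def is_comment(line):
    return line.startswith('#')

# Only the leading run of comment lines is discarded.
body = list(dropwhile(is_comment, lines))

assert body == ['data', '# not dropped', 'more data']
```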
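Definition

    def dropwhile(predicate, iterable):
        # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
        iterable = iter(iterable)
        for x in iterable:
            if not predicate(x):
                yield x
                break
        for x in iterable:
            yield x

Usage

    from itertools import *

    def should_drop(x):
        print 'Testing:', x
        return (x<1)

    for i in dropwhile(should_drop, [ -1, 0, 1, 2, 3, 4, 1, -2 ]):
        print 'Yielding:', i

Output:

    Testing: -1
    Testing: 0
    Testing: 1
    Yielding: 1
    Yielding: 2
    Yielding: 3
    Yielding: 4
    Yielding: 1
    Yielding: -2

itertools.groupby(iterable[, key])
Returns an iterator that produces sets of values grouped by key.
A group is formed whenever iterable yields the same item on several consecutive iterations. If the function is applied to a sorted list, the groups therefore correspond to the unique items in that list. key (if supplied) is a function applied to each item; its return value, rather than the item itself, is then used for comparison with the following items. The iterator returned by this function yields (key, group) pairs, where key is the key value of the group and group is an iterator that yields all the items belonging to that group.
In other words: the elements of the sequence are grouped by the result of applying keyfunc to each of them (each group is an iterator), and an iterator over these groups is returned.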
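Because only consecutive items are grouped, the same key can show up more than once when the input is not sorted by that key. The sketch below, using an invented word list, contrasts unsorted and sorted input.

```python
from itertools import groupby

words = ['apple', 'avocado', 'banana', 'apricot']

def first_letter(word):
    return word[0]

# Unsorted input: 'a' appears twice because 'apricot' is not adjacent
# to the other 'a' words.
unsorted_groups = [(k, list(g)) for k, g in groupby(words, key=first_letter)]

# Sorting by the same key first gives one group per distinct key.
sorted_groups = [(k, list(g))
                 for k, g in groupby(sorted(words), key=first_letter)]

assert unsorted_groups == [('a', ['apple', 'avocado']),
                           ('b', ['banana']),
                           ('a', ['apricot'])]
assert sorted_groups == [('a', ['apple', 'apricot', 'avocado']),
                         ('b', ['banana'])]
```

Each group iterator shares the underlying input with groupby(), so it is safest to turn a group into a list before moving on to the next one, as done here.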
Equivalent to

    class groupby(object):
        # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
        # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
        def __init__(self, iterable, key=None):
            if key is None:
                key = lambda x: x
            self.keyfunc = key
            self.it = iter(iterable)
            self.tgtkey = self.currkey = self.currvalue = object()
        def __iter__(self):
            return self
        def next(self):
            while self.currkey == self.tgtkey:
                self.currvalue = next(self.it)    # Exit on StopIteration
                self.currkey = self.keyfunc(self.currvalue)
            self.tgtkey = self.currkey
            return (self.currkey, self._grouper(self.tgtkey))
        def _grouper(self, tgtkey):
            while self.currkey == tgtkey:
                yield self.currvalue
                self.currvalue = next(self.it)    # Exit on StopIteration
                self.currkey = self.keyfunc(self.currvalue)

Usage
    from itertools import groupby

    qs = [{'date': 1}, {'date': 2}]
    [(name, list(group)) for name, group in groupby(qs, lambda p: p['date'])]

Output:

    [(1, [{'date': 1}]), (2, [{'date': 2}])]

    >>> from itertools import *
    >>> a = ['aa', 'ab', 'abc', 'bcd', 'abcde']
    >>> for i, k in groupby(a, len):
    ...     print i, list(k)
    ...
    2 ['aa', 'ab']
    3 ['abc', 'bcd']
    5 ['abcde']

Another example

    from itertools import *
    from operator import itemgetter

    d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
    # sort by value first so that equal values are adjacent
    di = sorted(d.iteritems(), key=itemgetter(1))
    for k, g in groupby(di, key=itemgetter(1)):
        print k, map(itemgetter(0), g)

Output:

    1 ['a', 'c', 'e']
    2 ['b', 'd', 'f']
    3 ['g']

itertools.ifilter(predicate, iterable)
Returns an iterator that works like the built-in filter() does for lists: it includes only the items for which the test function returns true. Unlike dropwhile(), it tests every item.
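Creates an iterator that yields only the items of iterable for which predicate(item) is True. If predicate is None, it returns all items of iterable that evaluate to True.
In short: an iterator over the elements for which func returns true.
Definition

    def ifilter(predicate, iterable):
        # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
        if predicate is None:
            predicate = bool
        for x in iterable:
            if predicate(x):
                yield x

Usage

    from itertools import *

    def check_item(x):
        print 'Testing:', x
        return (x<1)

    for i in ifilter(check_item, [ -1, 0, 1, 2, 3, 4, 1, -2 ]):
        print 'Yielding:', i

Output:

    Testing: -1
    Yielding: -1
    Testing: 0
    Yielding: 0
    Testing: 1
    Testing: 2
    Testing: 3
    Testing: 4
    Testing: 1
    Testing: -2
    Yielding: -2

itertools.ifilterfalse(predicate, iterable)
The opposite of ifilter(): returns an iterator containing the items for which the test function returns false.
Creates an iterator that yields only the items of iterable for which predicate(item) is False. If predicate is None, it returns all items of iterable that evaluate to False. In short: an iterator over the elements for which func returns false.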
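Used together, the two functions split one input into the items that pass a test and the items that fail it. The sketch below is a minimal illustration; the import fallback is an assumption added so that the same code also runs on Python 3, where ifilter is the built-in filter and ifilterfalse is named filterfalse.

```python
try:
    from itertools import ifilter, ifilterfalse          # Python 2 names
except ImportError:
    from itertools import filterfalse as ifilterfalse    # Python 3 name
    ifilter = filter                                      # built-in is lazy

def is_odd(x):
    return x % 2 == 1

numbers = range(10)
odds = list(ifilter(is_odd, numbers))
evens = list(ifilterfalse(is_odd, numbers))

# With None as the predicate, items are kept by their own truth value.
truthy = list(ifilter(None, [0, 1, '', 'a', None]))

assert odds == [1, 3, 5, 7, 9]
assert evens == [0, 2, 4, 6, 8]
assert truthy == [1, 'a']
```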
Definition

    def ifilterfalse(predicate, iterable):
        # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
        if predicate is None:
            predicate = bool
        for x in iterable:
            if not predicate(x):
                yield x

Usage

    from itertools import *

    def check_item(x):
        print 'Testing:', x
        return (x<1)

    for i in ifilterfalse(check_item, [ -1, 0, 1, 2, 3, 4, 1, -2 ]):
        print 'Yielding:', i

Output:

    Testing: -1
    Testing: 0
    Testing: 1
    Yielding: 1
    Testing: 2
    Yielding: 2
    Testing: 3
    Yielding: 3
    Testing: 4
    Yielding: 4
    Testing: 1
    Yielding: 1
    Testing: -2

itertools.islice(iterable, start, stop[, step])
The returned iterator yields items selected from the input iterator by index.
Creates an iterator that produces items much like a slice does: iterable[start : stop : step]. The first start items are skipped, iteration stops at the position given by stop, and step sets the stride used to skip items. Unlike slices, negative values are not allowed for start, stop, or step. If start is omitted, iteration begins at 0; if step is omitted, a stride of 1 is used.
In short: returns an iterator over the elements of the sequence seq from start to stop with a stride of step.
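The sketch below illustrates both points: islice() works on inputs that do not support indexing, such as an endless generator, and a negative argument is rejected with ValueError. The squares() generator is a made-up example.

```python
from itertools import count, islice

def squares():
    """Endless generator of 0, 1, 4, 9, ... (illustrative only)."""
    n = 0
    while True:
        yield n * n
        n += 1

# Generators cannot be sliced with [..], but islice() handles them.
first_four = list(islice(squares(), 4))

# start=2, stop=10, step=3 selects the items at positions 2, 5 and 8.
every_third = list(islice(count(), 2, 10, 3))

# Negative values are not accepted, unlike ordinary slices.
try:
    islice(count(), -1)
    negative_rejected = False
except ValueError:
    negative_rejected = True

assert first_four == [0, 1, 4, 9]
assert every_third == [2, 5, 8]
assert negative_rejected
```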
Definition

    import sys

    def islice(iterable, *args):
        # islice('ABCDEFG', 2) --> A B
        # islice('ABCDEFG', 2, 4) --> C D
        # islice('ABCDEFG', 2, None) --> C D E F G
        # islice('ABCDEFG', 0, None, 2) --> A C E G
        s = slice(*args)
        it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
        nexti = next(it)
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)

Usage

    from itertools import *

    print 'Stop at 5:'
    for i in islice(count(), 5):
        print i

    print 'Start at 5, Stop at 10:'
    for i in islice(count(), 5, 10):
        print i

    print 'By tens to 100:'
    for i in islice(count(), 0, 100, 10):
        print i

Output:

    Stop at 5:
    0
    1
    2
    3
    4
    Start at 5, Stop at 10:
    5
    6
    7
    8
    9
    By tens to 100:
    0
    10
    20
    30
    40
    50
    60
    70
    80
    90

itertools.imap(function, *iterables)
Creates an iterator that yields function(i1, i2, ..., iN), where i1, i2, ... iN come from the iterators iter1, iter2 ... iterN respectively. If function is None, tuples of the form (i1, i2, ..., iN) are returned. Iteration stops as soon as one of the supplied iterators stops producing values.
In other words: it returns an iterator that calls a function on the values from the input iterators and returns the results. It is similar to the built-in map(), except that it stops when any input iterator is exhausted (instead of inserting None values to pad all the inputs).
In short: an iterator over the return values of func applied to each element of the sequence.
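A short sketch of the stop-at-shortest behaviour described above. The import fallback is an assumption added so the example also runs on Python 3, where the built-in map() already behaves like imap().

```python
try:
    from itertools import imap   # Python 2 name
except ImportError:
    imap = map                   # Python 3: built-in map is lazy

# The second input has only two items, so only two results are produced.
powers = list(imap(pow, (2, 3, 10), (5, 2)))

assert powers == [32, 9]
```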
Definition

    def imap(function, *iterables):
        # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
        iterables = map(iter, iterables)
        while True:
            args = [next(it) for it in iterables]
            if function is None:
                yield tuple(args)
            else:
                yield function(*args)

Usage

    from itertools import *

    print 'Doubles:'
    for i in imap(lambda x:2*x, xrange(5)):
        print i

    print 'Multiples:'
    for i in imap(lambda x,y:(x, y, x*y), xrange(5), xrange(5,10)):
        print '%d * %d = %d' % i

Output:

    Doubles:
    0
    2
    4
    6
    8
    Multiples:
    0 * 5 = 0
    1 * 6 = 6
    2 * 7 = 14
    3 * 8 = 24
    4 * 9 = 36

itertools.starmap(function, iterable)
Creates an iterator that yields func(*item), where item comes from iterable. This function is only useful when the items produced by iterable are suitable for being unpacked into the function's arguments in this way.
In short: each element of seq is used as the argument list for func, and an iterator over the results is returned.
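Put differently, starmap() is the lazy counterpart of calling the function with each item unpacked. A minimal sketch with illustrative values:

```python
from itertools import starmap

pairs = [(2, 5), (3, 2), (10, 3)]

# Each tuple is unpacked into the arguments of pow().
via_starmap = list(starmap(pow, pairs))

# The same computation written as an explicit list comprehension.
via_unpacking = [pow(*args) for args in pairs]

assert via_starmap == via_unpacking == [32, 9, 1000]
```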
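Definition

    def starmap(function, iterable):
        # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
        for args in iterable:
            yield function(*args)

Usage

    from itertools import *

    values = [(0, 5), (1, 6), (2, 7), (3, 8), (4, 9)]
    for i in starmap(lambda x,y:(x, y, x*y), values):
        print '%d * %d = %d' % i

Output:

    0 * 5 = 0
    1 * 6 = 6
    2 * 7 = 14
    3 * 8 = 24
    4 * 9 = 36

itertools.tee(iterable[, n=2])
Returns several independent iterators (2 by default) based on a single original input. It is semantically similar to the Unix tee utility, which repeats the values it reads from its input and writes them both to a named file and to standard output.
Creates n independent iterators from iterable and returns them as an n-tuple; n defaults to 2. The function works with any iterable object. However, in order to clone the original iterator, the items produced are cached and shared by all the newly created iterators. Be careful not to use the original iterator iterable after calling tee(), or the caching mechanism may not work correctly.
In short: splits one iterator into n iterators and returns them as a tuple; the default is two.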
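A typical use is walking the same input at two different offsets. The helper below, named adjacent_pairs here purely for illustration, advances one of the two copies by one step and zips them together.

```python
from itertools import tee

def adjacent_pairs(iterable):
    """Yield (s0, s1), (s1, s2), ... from one pass over the input."""
    a, b = tee(iterable)
    next(b, None)        # advance the second copy by one item
    return zip(a, b)

# The input is a one-shot iterator; only the tee() copies are used
# after the split, never the original.
pairs = list(adjacent_pairs(iter([1, 4, 9, 16])))

assert pairs == [(1, 4), (4, 9), (9, 16)]
```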
Definition

    import collections

    def tee(iterable, n=2):
        it = iter(iterable)
        deques = [collections.deque() for i in range(n)]
        def gen(mydeque):
            while True:
                if not mydeque:             # when the local deque is empty
                    newval = next(it)       # fetch a new value and
                    for d in deques:        # load it to all the deques
                        d.append(newval)
                yield mydeque.popleft()
        return tuple(gen(d) for d in deques)

Usage

    from itertools import *

    r = islice(count(), 5)
    i1, i2 = tee(r)

    for i in i1:
        print 'i1:', i
    for i in i2:
        print 'i2:', i

Output:

    i1: 0
    i1: 1
    i1: 2
    i1: 3
    i1: 4
    i2: 0
    i2: 1
    i2: 2
    i2: 3
    i2: 4

itertools.takewhile(predicate, iterable)
The opposite of dropwhile().
Creates an iterator that yields the items of iterable for which predicate(item) is True; as soon as predicate evaluates to False, iteration stops immediately.
In other words: items are taken from the start of the sequence until the function func fails.
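The relationship to dropwhile() can be checked directly: with the same predicate, takewhile() returns the leading run and dropwhile() returns everything after it. A minimal sketch using the values from the overview table:

```python
from itertools import dropwhile, takewhile

def small(x):
    return x < 5

data = [1, 4, 6, 4, 1]

head = list(takewhile(small, data))   # stops at the first failure (6)
tail = list(dropwhile(small, data))   # starts at the first failure (6)

assert head == [1, 4]
assert tail == [6, 4, 1]
assert head + tail == data
```

Note that the later 4 and 1 also satisfy the predicate but are not part of head, because iteration has already stopped.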
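Definition

    def takewhile(predicate, iterable):
        # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
        for x in iterable:
            if predicate(x):
                yield x
            else:
                break

Usage

    from itertools import *

    def should_take(x):
        print 'Testing:', x
        return (x<2)

    for i in takewhile(should_take, [ -1, 0, 1, 2, 3, 4, 1, -2 ]):
        print 'Yielding:', i

Output:

    Testing: -1
    Yielding: -1
    Testing: 0
    Yielding: 0
    Testing: 1
    Yielding: 1
    Testing: 2

itertools.izip(*iterables)
Returns an iterator that combines the elements of several iterators into tuples. It is similar to the built-in zip(), except that it returns an iterator rather than a list.
Creates an iterator that yields tuples (i1, i2, ... iN), where i1, i2 ... iN come from the iterators iter1, iter2 ... iterN respectively. Iteration stops as soon as one of the supplied iterators stops producing values. The values produced are the same as those of the built-in zip() function.
izip(iter1, iter2, ... iterN) returns: (it1[0], it2[0], it3[0], ..), (it1[1], it2[1], it3[1], ..) ...
Definition

    def izip(*iterables):
        # izip('ABCD', 'xy') --> Ax By
        iterators = map(iter, iterables)
        while iterators:
            yield tuple(map(next, iterators))

Usage

    from itertools import *

    for i in izip([1, 2, 3], ['a', 'b', 'c']):
        print i

Output:

    (1, 'a')
    (2, 'b')
    (3, 'c')

itertools.izip_longest(*iterables[, fillvalue])
Same as izip(), except that iteration continues until all of the input iterables iter1, iter2, etc. are exhausted. Unless the fillvalue keyword argument specifies a different value, None is used to fill in for the iterables that have already run out.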
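The sketch below shows both the default None padding and a custom fillvalue. The import fallback is an assumption added so the example also runs on Python 3, where the function is named zip_longest.

```python
try:
    from itertools import izip_longest                    # Python 2 name
except ImportError:
    from itertools import zip_longest as izip_longest     # Python 3 name

# Without fillvalue, missing positions are padded with None.
default_fill = list(izip_longest('ABC', 'x'))

# With fillvalue, the given value is used instead.
dash_fill = list(izip_longest('ABCD', 'xy', fillvalue='-'))

assert default_fill == [('A', 'x'), ('B', None), ('C', None)]
assert dash_fill == [('A', 'x'), ('B', 'y'), ('C', '-'), ('D', '-')]
```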
Definition

    from itertools import chain, repeat

    class ZipExhausted(Exception):
        pass

    def izip_longest(*args, **kwds):
        # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
        fillvalue = kwds.get('fillvalue')
        counter = [len(args) - 1]
        def sentinel():
            if not counter[0]:
                raise ZipExhausted
            counter[0] -= 1
            yield fillvalue
        fillers = repeat(fillvalue)
        iterators = [chain(it, sentinel(), fillers) for it in args]
        try:
            while iterators:
                yield tuple(map(next, iterators))
        except ZipExhausted:
            pass

Cartesian product
itertools.product(*iterables[, repeat])
Creates an iterator that yields tuples representing the Cartesian product of the items in item1, item2, etc. repeat is a keyword argument that specifies how many times the input sequences are repeated.
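The sketch below compares product() with the nested for-loop it stands for, and shows repeat as a shorthand for passing the same input several times. The inputs are illustrative.

```python
from itertools import product

# The nested loops that product('AB', (1, 2)) corresponds to.
nested = [(x, y) for x in 'AB' for y in (1, 2)]
pairs = list(product('AB', (1, 2)))

# repeat=3 behaves like product(range(2), range(2), range(2)).
bits = [''.join(str(b) for b in combo)
        for combo in product(range(2), repeat=3)]

assert pairs == nested == [('A', 1), ('A', 2), ('B', 1), ('B', 2)]
assert bits == ['000', '001', '010', '011', '100', '101', '110', '111']
```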
Definition

    def product(*args, **kwds):
        # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
        # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
        pools = map(tuple, args) * kwds.get('repeat', 1)
        result = [[]]
        for pool in pools:
            result = [x+[y] for x in result for y in pool]
        for prod in result:
            yield tuple(prod)

Example

    import itertools

    a = (1, 2, 3)
    b = ('A', 'B', 'C')
    c = itertools.product(a, b)
    for elem in c:
        print elem

Output:

    (1, 'A')
    (1, 'B')
    (1, 'C')
    (2, 'A')
    (2, 'B')
    (2, 'C')
    (3, 'A')
    (3, 'B')
    (3, 'C')

Permutations
itertools.permutations(iterable[, r])
Creates an iterator that returns all sequences of items of length r from iterable. If r is omitted, the sequences have the same length as the number of items in iterable. In short: returns an iterator over the tuples of every arrangement of r elements taken from p.
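For an input of n distinct items, the number of results is n! / (n - r)!. The sketch below checks this for a small illustrative input and shows the default when r is omitted.

```python
from itertools import permutations
from math import factorial

perms = [''.join(p) for p in permutations('ABC', 2)]

n, r = 3, 2
expected_count = factorial(n) // factorial(n - r)   # 3! / 1! = 6

# With r omitted, every result uses all the items.
full = list(permutations(range(3)))

assert perms == ['AB', 'AC', 'BA', 'BC', 'CA', 'CB']
assert len(perms) == expected_count
assert len(full) == factorial(3)
```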
Definition

    def permutations(iterable, r=None):
        # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
        # permutations(range(3)) --> 012 021 102 120 201 210
        pool = tuple(iterable)
        n = len(pool)
        r = n if r is None else r
        if r > n:
            return
        indices = range(n)
        cycles = range(n, n-r, -1)
        yield tuple(pool[i] for i in indices[:r])
        while n:
            for i in reversed(range(r)):
                cycles[i] -= 1
                if cycles[i] == 0:
                    indices[i:] = indices[i+1:] + indices[i:i+1]
                    cycles[i] = n - i
                else:
                    j = cycles[i]
                    indices[i], indices[-j] = indices[-j], indices[i]
                    yield tuple(pool[i] for i in indices[:r])
                    break
            else:
                return

It can also be implemented with product():

    def permutations(iterable, r=None):
        pool = tuple(iterable)
        n = len(pool)
        r = n if r is None else r
        for indices in product(range(n), repeat=r):
            if len(set(indices)) == r:
                yield tuple(pool[i] for i in indices)

itertools.combinations(iterable, r)
Creates an iterator that returns all subsequences of length r from iterable; the items in each returned subsequence keep the order they have in the input iterable (without repetition).
Definition

    def combinations(iterable, r):
        # combinations('ABCD', 2) --> AB AC AD BC BD CD
        # combinations(range(4), 3) --> 012 013 023 123
        pool = tuple(iterable)
        n = len(pool)
        if r > n:
            return
        indices = range(r)
        yield tuple(pool[i] for i in indices)
        while True:
            for i in reversed(range(r)):
                if indices[i] != i + n - r:
                    break
            else:
                return
            indices[i] += 1
            for j in range(i+1, r):
                indices[j] = indices[j-1] + 1
            yield tuple(pool[i] for i in indices)

Or:

    def combinations(iterable, r):
        pool = tuple(iterable)
        n = len(pool)
        for indices in permutations(range(n), r):
            if sorted(indices) == list(indices):
                yield tuple(pool[i] for i in indices)

itertools.combinations_with_replacement(iterable, r)
Creates an iterator that returns all subsequences of length r from iterable; the items in each returned subsequence keep the order they have in the input iterable (with repetition).
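The difference between the two variants is easiest to see side by side. In the sketch below (illustrative input), the version with replacement additionally allows an item to be paired with itself.

```python
from itertools import combinations, combinations_with_replacement

without_rep = [''.join(c) for c in combinations('ABC', 2)]
with_rep = [''.join(c) for c in combinations_with_replacement('ABC', 2)]

assert without_rep == ['AB', 'AC', 'BC']
assert with_rep == ['AA', 'AB', 'AC', 'BB', 'BC', 'CC']

# Every plain combination also appears in the with-replacement output.
assert set(without_rep) <= set(with_rep)
```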
Definition

    def combinations_with_replacement(iterable, r):
        # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
        pool = tuple(iterable)
        n = len(pool)
        if not n and r:
            return
        indices = [0] * r
        yield tuple(pool[i] for i in indices)
        while True:
            for i in reversed(range(r)):
                if indices[i] != n - 1:
                    break
            else:
                return
            indices[i:] = [indices[i] + 1] * (r - i)
            yield tuple(pool[i] for i in indices)

Or:

    def combinations_with_replacement(iterable, r):
        pool = tuple(iterable)
        n = len(pool)
        for indices in product(range(n), repeat=r):
            if sorted(indices) == list(indices):
                yield tuple(pool[i] for i in indices)

Extending the toolset with the existing functions
    import collections
    import operator
    from itertools import *

    def take(n, iterable):
        "Return first n items of the iterable as a list"
        return list(islice(iterable, n))

    def tabulate(function, start=0):
        "Return function(0), function(1), ..."
        return imap(function, count(start))

    def consume(iterator, n):
        "Advance the iterator n-steps ahead. If n is none, consume entirely."
        # Use functions that consume iterators at C speed.
        if n is None:
            # feed the entire iterator into a zero-length deque
            collections.deque(iterator, maxlen=0)
        else:
            # advance to the empty slice starting at position n
            next(islice(iterator, n, n), None)

    def nth(iterable, n, default=None):
        "Returns the nth item or a default value"
        return next(islice(iterable, n, None), default)

    def quantify(iterable, pred=bool):
        "Count how many times the predicate is true"
        return sum(imap(pred, iterable))

    def padnone(iterable):
        """Returns the sequence elements and then returns None indefinitely.

        Useful for emulating the behavior of the built-in map() function.
        """
        return chain(iterable, repeat(None))

    def ncycles(iterable, n):
        "Returns the sequence elements n times"
        return chain.from_iterable(repeat(tuple(iterable), n))

    def dotproduct(vec1, vec2):
        return sum(imap(operator.mul, vec1, vec2))

    def flatten(listOfLists):
        "Flatten one level of nesting"
        return chain.from_iterable(listOfLists)

    def repeatfunc(func, times=None, *args):
        """Repeat calls to func with specified arguments.

        Example: repeatfunc(random.random)
        """
        if times is None