国产探花免费观看_亚洲丰满少妇自慰呻吟_97日韩有码在线_资源在线日韩欧美_一区二区精品毛片,辰东完美世界有声小说,欢乐颂第一季,yy玄幻小说排行榜完本

首頁 > 編程 > Python > 正文

Python判斷列表是否已排序的各種方法及其性能分析

2019-11-25 16:42:47
字體:
來源:轉載
供稿:網友

聲明

本文基于Python2.7語言,給出判斷列表是否已排序的多種方法,并在作者的Windows XP主機(Pentium G630 2.7GHz主頻2GB內存)上對比和分析其性能表現。

一. 問題提出

Haskell培訓老師提出一個問題:如何判斷列表是否已經排序?

排序與否實際只是相鄰元素間的某種二元關系,即a->a->Bool。所以第一步可以把二元組列表找出來;第二步是把這個函數作用于每個元組,然后用and操作。老師給出的實現代碼如下:

pair lst = zip lst ( tail lst )sorted lst predict = and [ predict x y | (x,y) <- pair lst] 

Haskell中,等號前面是函數的名稱和參數,后面是函數的定義體。pair函數將列表lst錯位一下(tail除去列表的第一個元素)后,和原列表在zip的作用下形成前后相鄰元素二元組列表。predict函數接受兩個數字,根據大小返回True或False。and對類型為[Bool]的列表中所有元素求與,其語義類似Python的all()函數。

隨后,老師請大家思考性能問題。作者提出,若準確性要求不高,可生成三組隨機數排序后作為下標,提取原列表相應的三組元素組成三個新的子列表("采樣")。若判斷三個子列表遵循同樣的排序規則時,則認為原列表已排序。當列表很大且前段已排序時,選取適當數目的隨機數,可在保障一定準確率的同時顯著地降低運算規模。

此外,實際應用中還應考慮數據來源。例如,Python語言的os.listdir()方法在Windows系統中返回的列表條目滿足字母序,在Linux系統中則返回亂序條目。因此,若判斷系統平臺(os.platform)為win32,則條目必然已經排序。

為對比驗證隨機采樣方式的可行性,作者先在網上搜集判斷列表排序的現有方法,主要參考Stack Overflow網站上"Pythonic way to check if a list is sorted or not"問題的答案,并在本文第二節給出相關的代碼示例。注意,本文所述的"排序"不要求嚴格排序,即相鄰元素允許相等。例如,[1,2,2,3]視為升序,[3,3,2,2]視為降序。

二. 代碼實現

本節判斷列表排序的函數名格式為IsListSorted_XXX()。為簡潔起見,除代碼片段及其輸出外,一律以_XXX()指代。

2.1 guess

def IsListSorted_guess(lst):listLen = len(lst)if listLen <= 1:return True#由首個元素和末尾元素猜測可能的排序規則if lst[0] == lst[-1]: #列表元素相同for elem in lst:if elem != lst[0]: return Falseelif lst[0] < lst[-1]: #列表元素升序for i, elem in enumerate(lst[1:]):if elem < lst[i]: return Falseelse: #列表元素降序for i, elem in enumerate(lst[1:]):if elem > lst[i]: return Falsereturn True 

_guess()是最通用的實現,幾乎與語言無關。值得注意的是,該函數內會猜測給定列表可能的排序規則,因此無需外部調用者指明排序規則。

2.2 sorted

def IsListSorted_sorted(lst):return sorted(lst) == lst or sorted(lst, reverse=True) == lst 

_sorted()使用Python內置函數sorted()。由于sorted()會對未排序的列表排序,_sorted()函數主要適用于已排序列表。
若想判斷列表未排序后再對其排序,不如直接調用列表的sort()方法,因為該方法內部會判斷列表是否排序。對于已排序列表,該方法的時間復雜度為線性階O(n)――判斷為O(n)而排序為O(nlgn)。

2.3 for-loop

def IsListSorted_forloop(lst, key=lambda x, y: x <= y):for i, elem in enumerate(lst[1:]): #注意,enumerate默認迭代下標從0開始if not key(lst[i], elem): #if elem > lst[i]更快,但通用性差return Falsereturn True 

無論列表是否已排序,本函數的時間復雜度均為線性階O(n)。注意,參數key表明缺省的排序規則為升序。

2.4 all

def IsListSorted_allenumk(lst, key=lambda x, y: x <= y):return all(key(lst[i], elem) for i, elem in enumerate(lst[1:]))import operatordef IsListSorted_allenumo(lst, oCmp=operator.le):return all(oCmp(lst[i], elem) for i, elem in enumerate(lst[1:]))def IsListSorted_allenumd(lst):return all((lst[i] <= elem) for i, elem in enumerate(lst[1:]))def IsListSorted_allxran(lst, key=lambda x,y: x <= y):return all(key(lst[i],lst[i+1]) for i in xrange(len(lst)-1))def IsListSorted_allzip(lst, key=lambda x,y: x <= y):from itertools import izip #Python 3中zip返回生成器(generator),而izip被廢棄return all(key(a, b) for (a, b) in izip(lst[:-1],lst[1:])) 

lambda表達式與operator運算符速度相當,前者簡單靈活,后者略為高效(實測并不一定)。但兩者速度均不如列表元素直接比較(可能存在調用開銷)。亦即,_allenumd()快于_allenumo()快于_allenumk()。

若使用lambda表達式指示排序規則,更改規則時只需要改變x和y之間的比較運算符;若使用operator模塊指示排序規則,更改規則時需要改變對象比較方法。具體地,lt(x, y)等效于x < y,le(x, y)等效于x <= y,eq(x, y)等效于x == y,ne(x, y)等效于x != y,gt(x, y)等效于x > y,ge(x, y)等效于x >= y。例如,_allenumo()函數若要嚴格升序可設置oCmp=operator.lt。

此外,由all()函數的幫助信息可知,_allenumk()其實是_forloop()的等效形式。

2.5 numpy

def IsListSorted_numpy(arr, key=lambda dif: dif >= 0):import numpytry:if arr.dtype.kind == 'u': #無符號整數數組執行np.diff時存在underflow風險arr = numpy.int64(lst)except AttributeError:pass #無dtype屬性,非數組return (key(numpy.diff(arr))).all() #numpy.diff(x)返回相鄰數組元素的差值構成的數組 

NumPy是用于科學計算的Python基礎包,可存儲和處理大型矩陣。它包含一個強大的N維數組對象,比Python自身的嵌套列表結構(nested list structure)高效得多。第三節的實測數據表明,_numpy()處理大型列表時性能非常出色。

在Windows系統中可通過pip install numpy命令安裝NumPy包,不建議登錄官網下載文件自行安裝。

2.6 reduce

def IsListSorted_reduce(iterable, key=lambda x, y: x <= y):cmpFunc = lambda x, y: y if key(x, y) else float('inf')return reduce(cmpFunc, iterable, .0) < float('inf') 

reduce實現是all實現的變體。累加器(accumulator)中僅存儲最后一個檢查的列表元素,或者Infinity(若任一元素小于前個元素值)。

前面2.1~2.5小節涉及下標操作的函數適用于列表等可迭代對象(Iterable)。對于通用迭代器(Iterator)對象,即可以作用于next()函數或方法的對象,可使用_reduce()及后面除_rand()外各小節的函數。迭代器的計算是惰性的,只有在需要返回下一個數據時才會計算,以避免不必要的計算。而且,迭代器方式無需像列表那樣切片為兩個迭代對象。

2.7 imap

def IsListSorted_itermap(iterable, key=lambda x, y: x <= y):from itertools import imap, teea, b = tee(iterable) #為單個iterable創建兩個獨立的iteratornext(b, None)return all(imap(key, a, b)) 

2.8 izip

def IsListSorted_iterzip(iterable, key=lambda x, y: x <= y):from itertools import tee, izipa, b = tee(iterable)next(b, None)return all(key(x, y) for x, y in izip(a, b))def pairwise(iterable):from itertools import tee, izipa, b = tee(iterable)next(b, None)return izip(a, b) #"s -> (s0,s1), (s1,s2), (s2, s3), ..."def IsListSorted_iterzipf(iterable, key=lambda x, y: x <= y):return all(key(a, b) for a, b in pairwise(iterable)) 

第三節的實測數據表明,雖然存在外部函數調用,_iterzipf()卻比_iterzip()略為高效。

2.9 fast

def IsListSorted_fastd(lst):it = iter(lst)try:prev = it.next()except StopIteration:return Truefor cur in it:if prev > cur:return Falseprev = curreturn Truedef IsListSorted_fastk(lst, key=lambda x, y: x <= y):it = iter(lst)try:prev = it.next()except StopIteration:return Truefor cur in it:if not key(prev, cur):return Falseprev = curreturn True 

_fastd()和_fastk()是Stack Overflow網站回答里據稱執行最快的。實測數據表明,在列表未排序時,它們的性能表現確實優異。

2.10 random

import randomdef IsListSorted_rand(lst, randNum=3, randLen=100):listLen = len(lst)if listLen <= 1:return True#由首個元素和末尾元素猜測可能的排序規則if lst[0] < lst[-1]: #列表元素升序key = lambda dif: dif >= 0else: #列表元素降序或相等key = lambda dif: dif <= 0threshold, sortedFlag = 10000, Trueimport numpyif listLen <= threshold or listLen <= randLen*2 or not randNum:return (key(numpy.diff(numpy.array(lst)))).all()from random import samplefor i in range(randNum):sortedRandList = sorted(sample(xrange(listLen), randLen))flag = (key(numpy.diff(numpy.array([lst[x] for x in sortedRandList])))).all()sortedFlag = sortedFlag and flagreturn sortedFlag 

_rand()借助隨機采樣降低運算規模,并融入其他判斷函數的優點。例如,猜測列表可能的排序規則,并在隨機采樣不適合時使用相對快速的判斷方式,如NumPy。

通過line_profiler分析可知,第20行和第21行與randLen有關,但兩者耗時接近。因此randLen應小于listLen的一半,以抵消sorted開銷。除內部限制外,用戶可以調節隨機序列個數和長度,如定制單個但較長的序列。

注意,_rand()不適用于存在微量異常數據的長列表。因為這些數據很可能被隨機采樣遺漏,從而影響判斷結果的準確性。

三. 性能分析

本節借助Python標準庫random模塊,生成各種隨機列表,以對比和分析上節列表排序判斷函數的性能。

可通過sample()、randint()等方法生成隨機列表。例如:

>>>import random>>> random.sample(range(10), 10); random.sample(range(10), 5)[9, 1, 6, 3, 0, 8, 4, 2, 7, 5][1, 2, 5, 6, 0]>>> rand = [random.randint(1,10) for i in range(10)]; rand[7, 3, 7, 5, 7, 2, 4, 4, 9, 8]>>> random.sample(rand, 5); random.sample(rand, 5)[4, 7, 7, 9, 8][3, 9, 2, 5, 7]>>> randGen = (random.randint(1,10) for i in range(10)); randGen<generator object <genexpr> at 0x0192C878>

sample()方法從列表中隨機選擇數字,結合range()函數可生產不含重復元素的隨機列表;而randint()方法結合列表解析生成的隨機列表可能包含重復元素。Python文檔規定,首次導入random模塊時使用當前系統時間作為種子初始化隨機數生成器。因此,本文并未顯式地調用seed()方法設置種子。

為度量性能表現,定義如下計時裝飾器:

from time import clockTimeList = []def FuncTimer(repeats=1000):def decorator(func):def wrapper(*args, **kwargs):try:startTime = clock()for i in xrange(repeats):ret = func(*args, **kwargs)except Exception, e:print '%s() Error: %s!' %(func.__name__, e)ret = Nonefinally: #當目標函數發生異常時,仍舊輸出計時信息endTime = clock()timeElasped = (endTime-startTime)*1000.0msg = '%21s(): %s =>Time Elasped: %.3f msec, repeated %d time(s).' /%(func.__name__, ret, timeElasped, repeats)global TimeList; TimeList.append([timeElasped, msg])return retreturn wrapperreturn decoratordef ReportTimer():global TimeList; TimeList.sort(key=lambda x:x[0])for entry in TimeList:print entry[1]TimeList = [] #Flush

該裝飾器允許對輸出進行排序,以便更直觀地觀察性能。基于該裝飾器,下文將分別測試不同的排序場景。注意,第二節各函數頭部需添加FuncTimer()裝飾。

3.1 列表前段亂序

測試代碼如下:

def TestHeadUnorderedList():TEST_NAME = 'HeadUnorderedList'; scale = int(1e5)List = random.sample(xrange(scale), scale) + range(scale)print 'Test 1: %s, list len: %d' %(TEST_NAME, len(List))IsListSorted_guess(List)IsListSorted_sorted(List)IsListSorted_allenumk(List)IsListSorted_allenumo(List)IsListSorted_allenumd(List)IsListSorted_allxran(List)IsListSorted_allzip(List)IsListSorted_forloop(List)IsListSorted_itermap(List)IsListSorted_iterzipf(List)IsListSorted_iterzip(List)IsListSorted_reduce(List)IsListSorted_numpy(numpy.array(List)) #若不先轉換為數組,則耗時驟增IsListSorted_fastd(List)IsListSorted_fastk(List)ReportTimer()

運行輸出如下:

Test 1: HeadUnorderedList, list len: 200IsListSorted_fastd(): False =>Time Elasped: 0.757 msec, repeated 1000 time(s).IsListSorted_fastk(): False =>Time Elasped: 1.091 msec, repeated 1000 time(s).IsListSorted_forloop(): False =>Time Elasped: 2.080 msec, repeated 1000 time(s).IsListSorted_guess(): False =>Time Elasped: 2.123 msec, repeated 1000 time(s).IsListSorted_allxran(): False =>Time Elasped: 2.255 msec, repeated 1000 time(s).IsListSorted_allenumd(): False =>Time Elasped: 2.672 msec, repeated 1000 time(s).IsListSorted_allenumo(): False =>Time Elasped: 3.021 msec, repeated 1000 time(s).IsListSorted_allenumk(): False =>Time Elasped: 3.207 msec, repeated 1000 time(s).IsListSorted_itermap(): False =>Time Elasped: 5.845 msec, repeated 1000 time(s).IsListSorted_allzip(): False =>Time Elasped: 7.793 msec, repeated 1000 time(s).IsListSorted_iterzip(): False =>Time Elasped: 9.667 msec, repeated 1000 time(s).IsListSorted_iterzipf(): False =>Time Elasped: 9.969 msec, repeated 1000 time(s).IsListSorted_numpy(): False =>Time Elasped: 16.203 msec, repeated 1000 time(s).IsListSorted_sorted(): False =>Time Elasped: 45.742 msec, repeated 1000 time(s).IsListSorted_reduce(): False =>Time Elasped: 145.447 msec, repeated 1000 time(s).Test 1: HeadUnorderedList, list len: 200000IsListSorted_fastd(): False =>Time Elasped: 0.717 msec, repeated 1000 time(s).IsListSorted_fastk(): False =>Time Elasped: 0.876 msec, repeated 1000 time(s).IsListSorted_allxran(): False =>Time Elasped: 2.104 msec, repeated 1000 time(s).IsListSorted_itermap(): False =>Time Elasped: 6.062 msec, repeated 1000 time(s).IsListSorted_iterzip(): False =>Time Elasped: 7.244 msec, repeated 1000 time(s).IsListSorted_iterzipf(): False =>Time Elasped: 8.491 msec, repeated 1000 time(s).IsListSorted_numpy(): False =>Time Elasped: 801.916 msec, repeated 1000 time(s).IsListSorted_forloop(): False =>Time Elasped: 2924.755 msec, repeated 1000 time(s).IsListSorted_guess(): False =>Time Elasped: 2991.756 msec, repeated 1000 time(s).IsListSorted_allenumo(): False =>Time Elasped: 3025.864 msec, repeated 1000 time(s).IsListSorted_allenumk(): False =>Time Elasped: 3062.792 msec, repeated 1000 time(s).IsListSorted_allenumd(): False =>Time Elasped: 3190.896 msec, repeated 1000 time(s).IsListSorted_allzip(): False =>Time Elasped: 6586.183 msec, repeated 1000 time(s).IsListSorted_sorted(): False =>Time Elasped: 119974.955 msec, repeated 1000 time(s).IsListSorted_reduce(): False =>Time Elasped: 154747.659 msec, repeated 1000 time(s).

可見,對于前段亂序的列表,無論其長短_fastd()和_fastk()的表現均為最佳。對于未排序列表,_sorted()需要進行排序,故性能非常差。然而,_reduce()性能最差。

實際上除_guess()和_sorted()外,其他函數都按升序檢查列表。為安全起見,可仿照_guess()實現,先猜測排序方式,再進一步檢查。

因為短列表耗時差異大多可以忽略,后續測試將不再包含短列表(但作者確實測試過),僅關注長列表。除非特別說明,列表長度為10萬級,重復計時1000次。

3.2 列表中段亂序

測試代碼及結果如下:

def TestMiddUnorderedList():TEST_NAME = 'MiddUnorderedList'; scale = int(1e5)List = range(scale) + random.sample(xrange(scale), scale) + range(scale)print 'Test 2: %s, list len: %d' %(TEST_NAME, len(List))IsListSorted_numpy(numpy.array(List)) # 1572.295 msec IsListSorted_guess(List) # 14540.637 msecIsListSorted_itermap(List) # 21013.096 msecIsListSorted_fastk(List) # 23840.582 msecIsListSorted_allxran(List) # 31014.215 msecIsListSorted_iterzip(List) # 33386.059 msecIsListSorted_forloop(List) # 34228.006 msecIsListSorted_allenumk(List) # 34416.802 msecIsListSorted_allzip(List) # 42370.528 msecIsListSorted_sorted(List) # 142592.756 msecIsListSorted_reduce(List) # 187514.967 msecReportTimer()

為節省篇幅,已根據運行輸出調整函數的調用順序。下文也作如此處理。

3.3 列表后段亂序

測試代碼及結果如下:

def TestTailUnorderedList():TEST_NAME = 'TailUnorderedList'; scale = int(1e5)List = range(scale, 0, -1) + random.sample(xrange(scale), scale)print 'Test 3: %s, list len: %d' %(TEST_NAME, len(List))IsListSorted_numpy(numpy.array(List), key=lambda dif: dif <= 0) # 980.789 msec IsListSorted_guess(List) # 13273.862 msecIsListSorted_itermap(List, key=lambda x, y: x >= y) # 21603.315 msecIsListSorted_fastk(List, key=lambda x, y: x >= y) # 24183.548 msecIsListSorted_allxran(List, key=lambda x, y: x >= y) # 32850.254 msecIsListSorted_forloop(List, key=lambda x, y: x >= y) # 33918.848 msecIsListSorted_iterzip(List, key=lambda x, y: x >= y) # 34505.809 msecIsListSorted_allenumk(List, key=lambda x, y: x >= y) # 35631.625 msecIsListSorted_allzip(List, key=lambda x, y: x >= y) # 40076.330 msecReportTimer()

3.4 列表完全亂序

測試代碼及結果如下:

def TestUnorderedList():TEST_NAME = 'UnorderedList'; scale = int(1e5)List = random.sample(xrange(scale), scale)print 'Test 4: %s, list len: %d' %(TEST_NAME, len(List))IsListSorted_fastk(List) # 0.856 msecIsListSorted_allxran(List) # 3.438 msecIsListSorted_iterzip(List) # 7.233 msecIsListSorted_itermap(List) # 7.595 msecIsListSorted_numpy(numpy.array(List)) # 207.222 msecIsListSorted_allenumk(List) # 953.423 msecIsListSorted_guess(List) # 1023.575 msecIsListSorted_forloop(List) # 1076.986 msecIsListSorted_allzip(List) # 2062.821 msecReportTimer()

3.5 列表元素相同

測試代碼及結果如下:

```python def TestSameElemList(): TEST_NAME = 'SameElemList'; scale = int(1e5) List = [5]*scale print 'Test 5: %s, list len: %d' %(TEST_NAME, len(List)) IsListSorted_numpy(numpy.array(List)) # 209.324 msec IsListSorted_sorted(List) # 2760.139 msec IsListSorted_guess(List) # 5843.942 msec IsListSorted_itermap(List) # 20609.704 msec IsListSorted_fastk(List) # 23035.760 msec IsListSorted_forloop(List) # 29043.206 msec IsListSorted_allenumk(List) # 29553.716 msec IsListSorted_allxran(List) # 30348.549 msec IsListSorted_iterzip(List) # 32806.217 msec ReportTimer()

3.6 列表升序

測試代碼及結果如下:

def TestAscendingList():TEST_NAME = 'AscendingList'; scale = int(1e5)List = range(scale)print 'Test 6: %s, list len: %d' %(TEST_NAME, len(List))IsListSorted_numpy(numpy.array(List)) # 209.217 msecIsListSorted_sorted(List) # 2845.166 msecIsListSorted_fastd(List) # 5977.520 msecIsListSorted_guess(List) # 10408.204 msecIsListSorted_allenumd(List) # 15812.754 msecIsListSorted_itermap(List) # 21244.476 msecIsListSorted_fastk(List) # 23900.196 msecIsListSorted_allenumo(List) # 28607.724 msecIsListSorted_forloop(List) # 30075.538 msecIsListSorted_allenumk(List) # 30274.017 msecIsListSorted_allxran(List) # 31126.404 msecIsListSorted_reduce(List) # 32940.108 msecIsListSorted_iterzip(List) # 34188.312 msecIsListSorted_iterzipf(List) # 34425.097 msecIsListSorted_allzip(List) # 37967.447 msecReportTimer()

可見,列表已排序時,_sorted()的性能較占優勢。

3.7 列表降序

剔除不支持降序的函數,測試代碼及結果如下:

def TestDescendingList():TEST_NAME = 'DescendingList'; scale = int(1e2)List = range(scale, 0, -1)print 'Test 7: %s, list len: %d' %(TEST_NAME, len(List))IsListSorted_numpy(numpy.array(List), key=lambda dif: dif <= 0) # 209.318 msecIsListSorted_sorted(List) # 5707.067 msecIsListSorted_guess(List) # 10549.928 msecIsListSorted_itermap(List, key=lambda x, y: x >= y) # 21529.547 msecIsListSorted_fastk(List, key=lambda x, y: x >= y) # 24264.465 msecimport operator; IsListSorted_allenumo(List, oCmp=operator.ge) # 28093.035 msecIsListSorted_forloop(List, key=lambda x, y: x >= y) # 30745.943 msecIsListSorted_allenumk(List, key=lambda x, y: x >= y) # 32696.205 msecIsListSorted_allxran(List, key=lambda x, y: x >= y) # 33431.473 msecIsListSorted_allzip(List, key=lambda x, y: x >= y) # 34837.019 msecIsListSorted_iterzip(List, key=lambda x, y: x >= y) # 35237.475 msecIsListSorted_reduce(List, key=lambda x, y: x >= y) # 37035.270 msecReportTimer()

3.8 迭代器測試

參數為列表的函數,需要先將列表���過iter()函數轉換為迭代器。注意,當iterable參數為iterator時,只能計時一次,因為該次執行將用盡迭代器。

測試代碼如下:

def TestIter():TEST_NAME = 'Iter'; scale = int(1e7)List = range(scale) #升序# List = random.sample(xrange(scale), scale) #亂序print 'Test 8: %s, list len: %d' %(TEST_NAME, len(List))iterL = iter(List); IsListSorted_guess(list(iterL))iterL = iter(List); IsListSorted_sorted(iterL)iterL = iter(List); IsListSorted_itermap(iterL)iterL = iter(List); IsListSorted_iterzipf(iterL)iterL = iter(List); IsListSorted_iterzip(iterL)iterL = iter(List); IsListSorted_reduce(iterL)iterL = iter(List); IsListSorted_fastd(iterL)iterL = iter(List); IsListSorted_fastk(iterL, key=lambda x, y: x >= y)ReportTimer()

運行結果如下:

Test 8: Iter, list len: 10000000 ---升序IsListSorted_fastd(): True =>Time Elasped: 611.028 msec, repeated 1 time(s).IsListSorted_sorted(): False =>Time Elasped: 721.751 msec, repeated 1 time(s).IsListSorted_guess(): True =>Time Elasped: 1142.065 msec, repeated 1 time(s).IsListSorted_itermap(): True =>Time Elasped: 2097.696 msec, repeated 1 time(s).IsListSorted_fastk(): True =>Time Elasped: 2337.233 msec, repeated 1 time(s).IsListSorted_reduce(): True =>Time Elasped: 3307.361 msec, repeated 1 time(s).IsListSorted_iterzipf(): True =>Time Elasped: 3354.034 msec, repeated 1 time(s).IsListSorted_iterzip(): True =>Time Elasped: 3442.520 msec, repeated 1 time(s).Test 8: Iter, list len: 10000000 ---亂序IsListSorted_fastk(): False =>Time Elasped: 0.004 msec, repeated 1 time(s).IsListSorted_fastd(): False =>Time Elasped: 0.010 msec, repeated 1 time(s).IsListSorted_iterzip(): False =>Time Elasped: 0.015 msec, repeated 1 time(s).IsListSorted_itermap(): False =>Time Elasped: 0.055 msec, repeated 1 time(s).IsListSorted_iterzipf(): False =>Time Elasped: 0.062 msec, repeated 1 time(s).IsListSorted_guess(): False =>Time Elasped: 736.810 msec, repeated 1 time(s).IsListSorted_reduce(): False =>Time Elasped: 8919.611 msec, repeated 1 time(s).IsListSorted_sorted(): False =>Time Elasped: 12273.018 msec, repeated 1 time(s).

其中,_itermap()、_iterzip()、_iterzipf()、_reduce()、_fastd()、_fastk()可直接判斷迭代器是否已排序。其他函數需將迭代器轉換為列表后才能處理。_sorted()雖然接受迭代器參數,但sorted()返回列表,故無法正確判斷迭代器順序。

3.9 隨機采樣測試

綜合以上測試,可知_fastk()和_numpy()性能較為突出,而且_rand()內置numpy方式。因此,以_fastk()和_numpy()為參照對象,測試代碼如下(計時1次):

def TestRandList():scale = int(1e6)List = random.sample(xrange(scale), scale) + range(scale)print 'Test 1: %s, list len: %d' %('HeadUnorderedList', len(List))IsListSorted_fastk(List)IsListSorted_numpy(numpy.array(List))IsListSorted_rand(List, randNum=1)ReportTimer()List = range(scale) + random.sample(xrange(scale), scale) + range(scale)print 'Test 2: %s, list len: %d' %('MiddUnorderedList', len(List))IsListSorted_fastk(List)IsListSorted_numpy(numpy.array(List))IsListSorted_rand(List, randNum=1)ReportTimer()List = range(scale, 0, -1) + random.sample(xrange(scale), scale)print 'Test 3: %s, list len: %d' %('TailUnorderedList', len(List))IsListSorted_fastk(List, key=lambda x, y: x >= y)IsListSorted_numpy(numpy.array(List), key=lambda dif: dif <= 0)IsListSorted_rand(List, randNum=1)ReportTimer()List = [random.randint(1,scale) for i in xrange(scale)] #random.sample(xrange(scale), scale)print 'Test 4: %s, list len: %d' %('UnorderedList', len(List))IsListSorted_fastk(List)IsListSorted_numpy(numpy.array(List))IsListSorted_rand(List, randNum=1)ReportTimer()List = [5]*scaleprint 'Test 5: %s, list len: %d' %('SameElemList', len(List))IsListSorted_fastk(List)IsListSorted_numpy(numpy.array(List))IsListSorted_rand(List, randNum=1)ReportTimer()List = range(scale)print 'Test 6: %s, list len: %d' %('AscendingList', len(List))IsListSorted_fastk(List)IsListSorted_numpy(numpy.array(List))IsListSorted_rand(List, randNum=1)ReportTimer()List = range(scale, 0, -1)print 'Test 7: %s, list len: %d' %('DescendingList', len(List))IsListSorted_fastk(List, key=lambda x, y: x >= y)IsListSorted_numpy(numpy.array(List), key=lambda dif: dif <= 0)IsListSorted_rand(List, randNum=1)ReportTimer()List = range(scale, 0, -1); List[scale/2]=0print 'Test 8: %s, list len: %d' %('MiddleNotchList', len(List))IsListSorted_fastk(List, key=lambda x, y: x >= y)IsListSorted_numpy(numpy.array(List), key=lambda dif: dif <= 0)IsListSorted_rand(List, randNum=1)IsListSorted_rand(List, randNum=1, randLen=scale/2)ReportTimer()

運行輸出如下:

Test 1: HeadUnorderedList, list len: 2000000IsListSorted_fastk(): False =>Time Elasped: 0.095 msec, repeated 1 time(s).IsListSorted_rand(): False =>Time Elasped: 0.347 msec, repeated 1 time(s).IsListSorted_numpy(): False =>Time Elasped: 7.893 msec, repeated 1 time(s).Test 2: MiddUnorderedList, list len: 3000000IsListSorted_rand(): False =>Time Elasped: 0.427 msec, repeated 1 time(s).IsListSorted_numpy(): False =>Time Elasped: 11.868 msec, repeated 1 time(s).IsListSorted_fastk(): False =>Time Elasped: 210.842 msec, repeated 1 time(s).Test 3: TailUnorderedList, list len: 2000000IsListSorted_rand(): False =>Time Elasped: 0.355 msec, repeated 1 time(s).IsListSorted_numpy(): False =>Time Elasped: 7.548 msec, repeated 1 time(s).IsListSorted_fastk(): False =>Time Elasped: 280.416 msec, repeated 1 time(s).Test 4: UnorderedList, list len: 1000000IsListSorted_fastk(): False =>Time Elasped: 0.074 msec, repeated 1 time(s).IsListSorted_rand(): False =>Time Elasped: 0.388 msec, repeated 1 time(s).IsListSorted_numpy(): False =>Time Elasped: 3.757 msec, repeated 1 time(s).Test 5: SameElemList, list len: 1000000IsListSorted_rand(): True =>Time Elasped: 0.304 msec, repeated 1 time(s).IsListSorted_numpy(): True =>Time Elasped: 3.955 msec, repeated 1 time(s).IsListSorted_fastk(): True =>Time Elasped: 210.977 msec, repeated 1 time(s).Test 6: AscendingList, list len: 1000000IsListSorted_rand(): True =>Time Elasped: 0.299 msec, repeated 1 time(s).IsListSorted_numpy(): True =>Time Elasped: 4.822 msec, repeated 1 time(s).IsListSorted_fastk(): True =>Time Elasped: 214.171 msec, repeated 1 time(s).Test 7: DescendingList, list len: 1000000IsListSorted_rand(): True =>Time Elasped: 0.336 msec, repeated 1 time(s).IsListSorted_numpy(): True =>Time Elasped: 3.867 msec, repeated 1 time(s).IsListSorted_fastk(): True =>Time Elasped: 279.322 msec, repeated 1 time(s).Test 8: MiddleNotchList, list len: 1000000IsListSorted_rand(): True =>Time Elasped: 0.387 msec, repeated 1 time(s).IsListSorted_numpy(): False =>Time Elasped: 4.792 msec, repeated 1 time(s).IsListSorted_rand(): False =>Time Elasped: 78.903 msec, repeated 1 time(s).IsListSorted_fastk(): False =>Time Elasped: 110.444 msec, repeated 1 time(s).

可見,在絕大部分測試場景中,_rand()性能均為最佳,且不失正確率。注意測試8,當降序列表中間某個元素被置0(開槽)時,隨機采樣很容易遺漏該元素,導致誤判。然而,這種場景在實際使用中非常罕見。

發表評論 共有條評論
用戶名: 密碼:
驗證碼: 匿名發表
主站蜘蛛池模板: 中山市| 邵阳市| 大田县| 邢台县| 滕州市| 东丰县| 泾源县| 东源县| 通化市| 呼和浩特市| 广南县| 浮山县| 长白| 岳普湖县| 平遥县| 博湖县| 广河县| 林州市| 望都县| 泸州市| 高碑店市| 黄冈市| 呼图壁县| 武强县| 乐陵市| 隆德县| 左贡县| 大渡口区| 深水埗区| 泗洪县| 博兴县| 张家港市| 博野县| 犍为县| 徐州市| 子长县| 高尔夫| 云林县| 高尔夫| 泾阳县| 巴青县|