

A Python Script to Download and Merge SAE Logs

2019-11-25 18:00:01
Source: reprint (contributed by a reader)

For various reasons I needed the log files of a site hosted on SAE. SAE only lets you download them one day at a time, and handling the files by hand is painful, especially when there are a lot of them. Fortunately, SAE provides an API that returns download URLs for log files in bulk, so I wrote a Python script that downloads and merges these files automatically.

Calling the API to Get Download URLs

The API documentation is at http://sae.sina.com.cn/?m=devcenter&catId=281.

Set the Application and Download Parameters

The variables to set in the request are as follows:


api_url = 'http://dloadcenter.sae.sina.com.cn/interapi.php?'
appname = 'xxxxx'
from_date = '20140101'
to_date = '20140116'
url_type = 'http' # http|taskqueue|cron|mail|rdc
url_type2 = 'access' # only when type=http  access|debug|error|warning|notice|resources
secret_key = 'xxxxx'

Generating the Request URL

The official docs describe how the request URL must be built:

1. Sort the parameters
2. Build the request string, then strip the '&' characters
3. Append the access_key
4. Take the MD5 of the resulting string to form the sign
5. Append the sign to the request string
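The five steps above can be sketched standalone; this is a Python 3 rendering with placeholder parameter values and a made-up secret key, not the article's actual application settings:

```python
import collections
import hashlib

# placeholder parameters for illustration only
params = {'type2': 'access', 'act': 'log', 'appname': 'myapp',
          'from': '20140101', 'to': '20140116', 'type': 'http'}
secret_key = 'example_secret'  # not a real key

# 1. sort the parameters
ordered = collections.OrderedDict(sorted(params.items()))
# 2. build the request string (keeping a trailing '&')
request = ''.join('%s=%s&' % (k, v) for k, v in ordered.items())
# 3.-4. strip the '&'s, append the key, and MD5 the result to form the sign
sign = hashlib.md5((request.replace('&', '') + secret_key).encode('utf-8')).hexdigest()
# 5. append the sign to the request string
signed = request + 'sign=' + sign
print(signed)
```

Note that in Python 3 the string must be encoded to bytes before hashing; the article's Python 2 code below can pass the string to `md5.update` directly.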

The implementation is as follows:


params = dict()
params['act'] = 'log'
params['appname'] = appname
params['from'] = from_date
params['to'] = to_date
params['type'] = url_type

if url_type == 'http':
    params['type2'] = url_type2

params = collections.OrderedDict(sorted(params.items()))

request = ''
for k,v in params.iteritems():
    request += k+'='+v+'&'

sign = request.replace('&','')
sign += secret_key

md5 = hashlib.md5()
md5.update(sign)
sign = md5.hexdigest()

request = api_url + request + 'sign=' + sign

# call the API and parse the JSON response
response = json.loads(urllib2.urlopen(request).read())

if response['errno'] != 0:
    print '[!] '+response['errmsg']
    exit()

print '[#] request success'

Downloading the Log Files

SAE packages each day's logs as a tar.gz archive, so we just download and save each one. The files are named date.tar.gz.


log_files = list()

for down_url in response['data']:
    # the download URL contains the date, e.g. 2014-01-01
    file_name = re.compile(r'\d{4}-\d{2}-\d{2}').findall(down_url)[0] + '.tar.gz'
    log_files.append(file_name)
    data = urllib2.urlopen(down_url).read()
    with open(file_name, "wb") as f:
        f.write(data)

print '[#] you got %d log files' % len(log_files)

Merging the Files

To merge, use the tarfile library to extract each archive, then append its contents to access_log.


# concatenate these files into access_log
access_log = open('access_log', 'w')

for log_file in log_files:
    tar = tarfile.open(log_file)
    log_name = tar.getnames()[0]
    tar.extract(log_name)
    # append the extracted file to access_log
    data = open(log_name).read()
    access_log.write(data)
    os.remove(log_name)

access_log.close()

print '[#] all files have been written to access_log'

Complete Code


#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Author: Su Yan <http://yansu.org>
# @Date:   2014-01-17 12:05:19
# @Last Modified by:   Su Yan
# @Last Modified time: 2014-01-17 14:15:41

import os
import collections
import hashlib
import urllib2
import json
import re
import tarfile

# settings
# documents http://sae.sina.com.cn/?m=devcenter&catId=281
api_url = 'http://dloadcenter.sae.sina.com.cn/interapi.php?'
appname = 'yansublog'
from_date = '20140101'
to_date = '20140116'
url_type = 'http' # http|taskqueue|cron|mail|rdc
url_type2 = 'access' # only when type=http  access|debug|error|warning|notice|resources
secret_key = 'zwzim4zhk35i50003kz2lh3hyilz01m03515j0i5'

# encode request
params = dict()
params['act'] = 'log'
params['appname'] = appname
params['from'] = from_date
params['to'] = to_date
params['type'] = url_type

if url_type == 'http':
    params['type2'] = url_type2

params = collections.OrderedDict(sorted(params.items()))

request = ''
for k,v in params.iteritems():
    request += k+'='+v+'&'

sign = request.replace('&','')
sign += secret_key

md5 = hashlib.md5()
md5.update(sign)
sign = md5.hexdigest()

request = api_url + request + 'sign=' + sign

# request api
response = urllib2.urlopen(request).read()
response = json.loads(response)

if response['errno'] != 0:
    print '[!] '+response['errmsg']
    exit()

print '[#] request success'

# download and save files
log_files = list()

for down_url in response['data']:
    # the download URL contains the date, e.g. 2014-01-01
    file_name = re.compile(r'\d{4}-\d{2}-\d{2}').findall(down_url)[0] + '.tar.gz'
    log_files.append(file_name)
    data = urllib2.urlopen(down_url).read()
    with open(file_name, "wb") as f:
        f.write(data)

print '[#] you got %d log files' % len(log_files)

# concatenate these files into access_log
access_log = open('access_log', 'w')

for log_file in log_files:
    tar = tarfile.open(log_file)
    log_name = tar.getnames()[0]
    tar.extract(log_name)
    # append the extracted file to access_log
    data = open(log_name).read()
    access_log.write(data)
    os.remove(log_name)

access_log.close()

print '[#] all files have been written to access_log'
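The script above targets Python 2 (urllib2, iteritems, print statements). A Python 3 sketch of the request-building and API-call steps, assuming the endpoint, parameters, and response format are unchanged, might look like this; the function names are my own:

```python
import collections
import hashlib
import json
import urllib.request

API_URL = 'http://dloadcenter.sae.sina.com.cn/interapi.php?'

def build_request(appname, from_date, to_date, secret_key,
                  url_type='http', url_type2='access'):
    """Build the signed request URL using the same five-step scheme."""
    params = {'act': 'log', 'appname': appname, 'from': from_date,
              'to': to_date, 'type': url_type}
    if url_type == 'http':
        params['type2'] = url_type2
    ordered = collections.OrderedDict(sorted(params.items()))
    request = ''.join('%s=%s&' % (k, v) for k, v in ordered.items())
    sign_src = request.replace('&', '') + secret_key
    sign = hashlib.md5(sign_src.encode('utf-8')).hexdigest()
    return API_URL + request + 'sign=' + sign

def fetch_download_urls(request_url):
    """Call the API and return the list of log download URLs."""
    response = json.loads(urllib.request.urlopen(request_url).read())
    if response['errno'] != 0:
        raise RuntimeError(response['errmsg'])
    return response['data']
```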
