A multithreaded file download class for Python 3
Published: 2019-05-24


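Today I came across a multithreaded file download class online. I liked how it was written, but it targeted Python 2, so I ported it to Python 3. Without further ado, here is the code. One note: I test all of my code myself before posting it, so you can use it with confidence.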


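One small detail is worth pointing out: reading the file size from the HTTP response headers. It works like this:
urlHandler = urllib.request.urlopen(url)  # returns a response object
headers = urlHandler.info()  # the header information lives here

length = int(headers.get('Content-Length'))  # pulls the value out directly
I found very little material about this online. After searching for more than an hour I gave up and read the source code, which is where I finally found this method.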
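The lookup above fails with a `TypeError` if the server sends no `Content-Length` header, which can happen with chunked responses. The following is a minimal sketch of a more defensive version, not the original author's code: the names `parse_content_length` and `get_url_file_size` are hypothetical, and the use of a `HEAD` request assumes the server answers `HEAD` with the same headers it would send for `GET`.

```python
import urllib.request


def parse_content_length(headers):
    """Return Content-Length as an int, or None if it is missing or invalid.

    `headers` can be any mapping-like object with a .get() method,
    such as the object returned by response.info().
    """
    value = headers.get('Content-Length')
    if value is None:
        return None
    try:
        length = int(value)
    except ValueError:
        return None
    return length if length >= 0 else None


def get_url_file_size(url, timeout=10):
    """Ask the server for the headers only and return the file size (or None)."""
    # Assumption: the server supports HEAD; otherwise fall back to a normal GET.
    request = urllib.request.Request(url, method='HEAD')
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers)
```

A caller can then treat `None` as "size unknown" and fall back to a single-threaded download instead of crashing.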

'''
Created on 2011-11-10
@author: PaulWang
Description:
'''
'''
It is a multi-thread downloading tool

    It was developed follow axel.
        Author: volans
        E-mail: volansw [at] gmail.com
'''

import sys
import os
import time
import urllib.request
from threading import Thread

#===============================================================================
# def download(url, output, blocks=6, proxies=local_proxies)
# output: full path of the output file; without a directory it is created in the current folder
# blocks: number of blocks, one thread per block
# proxies: proxy address
#===============================================================================
local_proxies = {'http': 'http://131.139.58.200:8080'}  # proxy address


# NOTE: FancyURLopener has been deprecated since Python 3.3; this port keeps it
# to stay close to the original Python 2 code.
class AxelPython(Thread, urllib.request.FancyURLopener):
    '''Multi-thread downloading class.

        run() is a virtual method of Thread.
    '''
    def __init__(self, threadname, url, filename, ranges=0, proxies={}):
        Thread.__init__(self, name=threadname)
        urllib.request.FancyURLopener.__init__(self, proxies)
        self.name = threadname
        self.url = url
        self.filename = filename
        self.ranges = ranges  # (first_byte, last_byte), both inclusive
        self.downloaded = 0

    def run(self):
        '''virtual function in Thread'''
        try:
            # Resume: bytes already present in the temporary file
            self.downloaded = os.path.getsize(self.filename)
        except OSError:
            # never downloaded
            self.downloaded = 0

        # rebuild start point
        self.startpoint = self.ranges[0] + self.downloaded

        # This part is completed. The range is inclusive, so the block is done
        # only once the start point has moved past the last byte.
        if self.startpoint > self.ranges[1]:
            print('Part %s has been downloaded over.' % self.filename)
            return

        self.oneTimeSize = 16384  # 16 KB per read
        print('task %s will download from %d to %d' % (self.name, self.startpoint, self.ranges[1]))

        self.addheader("Range", "bytes=%d-%d" % (self.startpoint, self.ranges[1]))
        self.urlhandle = self.open(self.url)

        data = self.urlhandle.read(self.oneTimeSize)
        while data:
            with open(self.filename, 'ab+') as filehandle:
                filehandle.write(data)
            self.downloaded += len(data)
            data = self.urlhandle.read(self.oneTimeSize)


def GetUrlFileSize(url, proxies={}):
    # NOTE: as in the original, proxies is accepted but not used here
    urlHandler = urllib.request.urlopen(url)
    headers = urlHandler.info()
    length = int(headers.get('Content-Length'))
    print('Content-Length is %d' % length)
    return length


def SpliteBlocks(totalsize, blocknumber):
    # Integer division: "/" returns a float in Python 3
    blocksize = totalsize // blocknumber
    ranges = []
    for i in range(0, blocknumber - 1):
        ranges.append((i * blocksize, i * blocksize + blocksize - 1))
    # The last block takes whatever remains
    ranges.append((blocksize * (blocknumber - 1), totalsize - 1))
    return ranges


def islive(tasks):
    for task in tasks:
        # isAlive() was removed in Python 3.9; is_alive() is the current name
        if task.is_alive():
            return True
    return False


def download(url, output, blocks=6, proxies=local_proxies):
    ''' paxel
    '''
    size = GetUrlFileSize(url, proxies)
    ranges = SpliteBlocks(size, blocks)

    threadname = ["thread_%d" % i for i in range(0, blocks)]
    filename = ["tmpfile_%d" % i for i in range(0, blocks)]

    tasks = []
    for i in range(0, blocks):
        # NOTE: as in the original, proxies is not forwarded to the worker threads
        task = AxelPython(threadname[i], url, filename[i], ranges[i])
        task.daemon = True  # replaces the deprecated setDaemon(True)
        task.start()
        tasks.append(task)

    time.sleep(2)
    while islive(tasks):
        downloaded = sum([task.downloaded for task in tasks])
        process = downloaded / float(size) * 100
        show = '\rFilesize:%d Downloaded:%d Completed:%.2f%%' % (size, downloaded, process)
        sys.stdout.write(show)
        sys.stdout.flush()
        time.sleep(0.5)

    # Concatenate the temporary part files in order, then delete them
    with open(output, 'wb+') as filehandle:
        for i in filename:
            with open(i, 'rb') as f:
                filehandle.write(f.read())
            try:
                os.remove(i)
            except OSError:
                pass


if __name__ == '__main__':
    url = "http://www.pygtk.org/dist/pygtk2-tut.pdf"
    output = 'pygtk2.pdf'
    download(url, output, blocks=1, proxies={})
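The two steps at the heart of the listing above are splitting the file into inclusive byte ranges and requesting one range per thread. They can also be written without `FancyURLopener`, which has been deprecated since Python 3.3. The following is a minimal sketch rather than the original author's code: `split_blocks`, `build_range_request` and `fetch_range` are hypothetical helper names, and the sketch assumes the server honors `Range` requests by answering with status 206.

```python
import urllib.request


def split_blocks(total_size, block_count):
    """Split bytes 0..total_size-1 into block_count inclusive (start, end) ranges."""
    block_size = total_size // block_count  # integer division keeps offsets exact
    ranges = []
    for i in range(block_count - 1):
        ranges.append((i * block_size, (i + 1) * block_size - 1))
    # The last block absorbs the remainder of the division.
    ranges.append((block_size * (block_count - 1), total_size - 1))
    return ranges


def build_range_request(url, start, end):
    """Build a GET request for bytes start..end (inclusive) of url."""
    request = urllib.request.Request(url)
    request.add_header("Range", "bytes=%d-%d" % (start, end))
    return request


def fetch_range(url, start, end, chunk_size=16384):
    """Download one byte range and return it as a bytes object."""
    parts = []
    with urllib.request.urlopen(build_range_request(url, start, end)) as response:
        # 206 Partial Content means the server honored the Range header.
        if response.status != 206:
            raise RuntimeError("server did not honor the Range header")
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            parts.append(chunk)
    return b"".join(parts)
```

For a 1000-byte file and 6 blocks, `split_blocks(1000, 6)` yields five blocks of 166 bytes followed by a final block covering bytes 830 to 999, so every byte is requested exactly once.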

Reposted from: http://eggdi.baihongyu.com/
