python - Readline and threading
So I run the code below, and when I use queue.qsize() after running it, there are still 450,000 or so items in the queue, implying that lines of the text file were not read. Any idea what is going on here?
    from queue import Queue
    from threading import Thread

    lines = 660918  # int(str.split(os.popen('wc -l hgdp_finalreport_forward.txt').read())[0]) - 1
    queue = Queue()
    file = 'hgdp_finalreport_forward.txt'
    num_threads = 10
    short_file = open(file)

    class worker(Thread):
        def __init__(self, queue):
            Thread.__init__(self)
            self.queue = queue

        def run(self):
            while True:
                try:
                    self.queue.get()
                    line = short_file.readline()
                    self.queue.task_done()  # signal to the queue that the task is done
                except:
                    break

    ## this is where I should make the call to the threads

    def main():
        for i in range(num_threads):
            worker(queue).start()
        queue.join()

        for i in range(lines):  # put the range of the number of lines in the .txt file
            queue.put(i)

    main()
It's hard to know what you're trying to do here, but if each line can be processed independently, multiprocessing is a much simpler choice that will take care of all the synchronization for you. An added bonus is that you don't have to know the number of lines in advance.
Basically,

    import multiprocessing

    def process(line):
        return len(line)  # or whatever

    # create the pool after process() is defined, so forked workers can find it
    pool = multiprocessing.Pool(10)

    with open(path) as lines:
        results = pool.map(process, lines)
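Or, if you're just trying to get some kind of aggregate result from the lines, you can use reduce to lower memory usage, since only the combined value is kept afterwards (pool.map itself still builds the full list of per-line results first).

    import operator
    from functools import reduce  # built in on Python 2, needs this import on Python 3

    with open(path) as lines:
        result = reduce(operator.add, pool.map(process, lines))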
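The reduce call above still receives a complete list, because pool.map only returns once every result has been collected. If that list is the memory concern, one option is pool.imap, which yields results one at a time in input order. The following is a minimal sketch under assumptions, not the answer's own code: the names aggregate and aggregate_file are hypothetical, and chunksize=1000 is an arbitrary batch size rather than a tuned value.

```python
import multiprocessing
import operator
from functools import reduce


def process(line):
    # Placeholder per-line work, same as in the snippet above.
    return len(line)


def aggregate(pool, lines, chunksize=1000):
    # pool.imap yields results one at a time, in input order, so reduce
    # folds them into a running total instead of receiving a full list.
    # chunksize=1000 is an assumed batch size, not a tuned value.
    return reduce(operator.add, pool.imap(process, lines, chunksize), 0)


def aggregate_file(path, workers=10):
    # Hypothetical helper: owns the pool and the open file for one run.
    with multiprocessing.Pool(workers) as pool, open(path) as lines:
        return aggregate(pool, lines)
```

It would be called as aggregate_file(path). On platforms that start workers in a fresh interpreter (Windows, and macOS by default), that call should sit under an if __name__ == '__main__': guard.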