cursor - How to do this lookup more elegantly in Python and MySQL? -


i have bunch of files in directory named nameid_cityid.txt, nameid , cityid being ids of name (integer(10)) , city (integer(10)) in mydata table.

while following solution works, doing type conversions since fetchall fetches 'l' , file name tuple of nameid, cityid strings,..

if can suggest pythonic or more elegant way of doing same, awesome, me , communtiy!

what trying achieve : find files directory don't have record in database , file, parse/move/delete it.

mysql table mydata :

nameid  cityid 15633   45632 2354    76894 

python :

for pdffile in os.listdir(filepath):     cityid, nameid = pdffile.strip('.txt').split('_')[0], pdffile.strip('.txt').split('_')[1]         cursor.execute("select cityid, nameid mydata")     alreadyparsed = cursor.fetchall()     targetvalues = ((str(cityid), str(nameid)) cityid, nameid in alreadyparsed)     if (int(cityid), int(nameid)) in alreadyparsed:         print cityid, nameid, "found"     else:         print cityid, nameid, "not found" 

i'd use set quick , easy testing:

cursor.execute("select concat(nameid, '_', cityid, '.txt') mydata") present = set([r[0] r in cursor])  pdffile in os.listdir(filepath):     nameid, cityid = map(int, pdffile.rsplit('.', 1)[0].split('_'))     print nameid, cityid,     print "found" if pdffile in present else "not found" 

first, i've pulled query outside of filename loop; no point in querying same set of rows each time.

secondly, i'll let mysql generate filenames me using concat ease of collecting information set.

thirdly, because have set of filenames, testing each individual filename against set simple pdffile in present test.

and finally, i've simplified filename splitting logic 1 line.

now, if want set of filenames not present yet in database (rather enumerate ones , ones not), use set operation:

cursor.execute("select concat(nameid, '_', cityid, '.txt') mydata") present = set([r[0] r in cursor])  pdffile in (set(os.listdir(filepath)) - present):     nameid, cityid = map(int, pdffile.rsplit('.', 1)[0].split('_'))     print nameid, cityid, "found" 

here use .difference operation (with - operator) remove filenames there rows in database, in 1 simple operation.


Comments

Popular posts from this blog

django - How can I change user group without delete record -

java - Need to add SOAP security token -

java - EclipseLink JPA Object is not a known entity type -