.net - Is there a way to speed up reading the data? -

- February 15, 2015

In my program I created the following logic for reading data from the database and storing it in a List<>:

    NpgsqlCommand cmd = new NpgsqlCommand(query, conn);
    List<UserInfo> result = new List<UserInfo>();
    Npgsql.NpgsqlDataReader rdr = cmd.ExecuteReader();
    while (rdr.Read())
    {
        string userId = rdr[0].ToString();
        string sex = rdr[1].ToString();
        string strDateBirth = rdr[2].ToString();
        string zip = rdr[3].ToString();

        UserInfo userInfo = new UserInfo();
        userInfo.Msisdn = userId;
        userInfo.Gender = sex;
        try
        {
            userInfo.BirthDate = Convert.ToDateTime(strDateBirth);
        }
        catch (Exception ex)
        {
        }
        userInfo.ZipCode = zip;
        userInfo.DemographicsKnown = true;
        userInfo.AgeGroup = GetAgeGroup(strDateBirth);
        if (result.Count(x => x.Id == userId) == 0)
            result.Add(userInfo);
    }

The performance of this code is poor. There are about 2M records, and after half an hour the list of UserInfo contains only 300,000 records.

Does anybody know how to speed up reading the data from the database?

You are using .Count() when you mean .Any().
Whenever you call .Count() you are enumerating the entire collection just to see whether you have a single match.
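A minimal sketch of the difference, using a plain list of strings in place of the question's UserInfo list. The class and method names are illustrative only and do not come from the original code. Count() always visits every element, while Any() returns as soon as it finds a match:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountVersusAny
{
    // Hypothetical helper: tallies every match in the list, then compares the tally to zero.
    public static bool IsNewUsingCount(List<string> ids, string id)
    {
        return ids.Count(x => x == id) == 0;
    }

    // Hypothetical helper: stops enumerating at the first match.
    public static bool IsNewUsingAny(List<string> ids, string id)
    {
        return !ids.Any(x => x == id);
    }

    public static void Main()
    {
        var ids = new List<string> { "a", "b", "c" };
        Console.WriteLine(IsNewUsingCount(ids, "b")); // False
        Console.WriteLine(IsNewUsingAny(ids, "b"));   // False
        Console.WriteLine(IsNewUsingAny(ids, "z"));   // True
    }
}
```

Note that Any() still has to walk the whole list when there is no match, which is the common case for a user seen for the first time, so on its own it only softens the problem.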

Consider the question you're asking:
"How many rows match this condition? Is that number equal to zero?"

What you mean is:
"Do any rows match this condition?"

In this context, you should create a HashSet of the userId values. Checking for existence in a HashSet (or a Dictionary) can be much faster than checking for the same thing in a List.

Furthermore, if you already have the userId, you have read and parsed the other values for no reason. Check myHashSet.Contains(userId) first, then add.

This is the primary reason it's slow. For n rows, the work adds up to roughly the nth triangular number of element comparisons across all those enumerations of the collection!
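To put a rough number on that, assume every row has a distinct userId, so the list already holds k entries when row k+1 is checked and Count() visits all of them. Under that assumption, the total number of element comparisons for the question's 2M rows is:

```latex
\sum_{k=0}^{n-1} k = \frac{n(n-1)}{2},
\qquad
n = 2 \times 10^{6} \;\Rightarrow\; \frac{n(n-1)}{2} \approx 2 \times 10^{12}
```

With a hash set, each of the n membership checks takes expected constant time, so the total lookup work grows in proportion to n instead.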

Edit: Consider this untested change. I don't know if the reader supports typed read methods like GetString(); if it doesn't, use what you had before.

    NpgsqlCommand cmd = new NpgsqlCommand(query, conn);
    List<UserInfo> result = new List<UserInfo>();
    Npgsql.NpgsqlDataReader rdr = cmd.ExecuteReader();
    HashSet<string> userHash = new HashSet<string>(); // int?

    while (rdr.Read())
    {
        string userId = rdr.GetString(0);
        if (!userHash.Contains(userId))
        {
            string strDateBirth = rdr.GetString(2);
            UserInfo userInfo = new UserInfo();
            userInfo.Msisdn = userId;
            userInfo.Gender = rdr.GetString(1);
            DateTime parsedDate; // not used if the parse fails
            if (DateTime.TryParse(strDateBirth, out parsedDate))
            {
                userInfo.BirthDate = parsedDate;
                // userInfo.AgeGroup = GetAgeGroup(strDateBirth); // why take a string?
                // rewrite the GetAgeGroup method to take a DateTime
                userInfo.AgeGroup = GetAgeGroup(parsedDate);
            }
            userInfo.ZipCode = rdr.GetString(3);
            userInfo.DemographicsKnown = true;
            result.Add(userInfo);
            userHash.Add(userId);
        }
    }

This will keep the first instance of a user row that you find (which is what your current code does). If you want to keep the last instance instead, you can use a Dictionary and eliminate the .Contains() call altogether.
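A minimal sketch of the keep-last variant, assuming the rows arrive as (userId, zip) pairs standing in for the data reader. The pair shape and the names KeepLastByKey and KeepLast are hypothetical, chosen only for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class KeepLastByKey
{
    // Hypothetical stand-in for the reader: (userId, zip) pairs in read order.
    public static Dictionary<string, string> KeepLast(
        IEnumerable<KeyValuePair<string, string>> rows)
    {
        var byUserId = new Dictionary<string, string>();
        foreach (var row in rows)
        {
            // The indexer adds the key or overwrites the existing entry,
            // so no Contains/ContainsKey check is needed.
            byUserId[row.Key] = row.Value;
        }
        return byUserId;
    }

    public static void Main()
    {
        var rows = new[]
        {
            new KeyValuePair<string, string>("u1", "10001"),
            new KeyValuePair<string, string>("u2", "20002"),
            new KeyValuePair<string, string>("u1", "30003"),
        };
        var result = KeepLast(rows);
        Console.WriteLine(result.Count); // 2
        Console.WriteLine(result["u1"]); // 30003
    }
}
```

The trade-off is that every row is still read and parsed, including rows that are later overwritten by a newer one with the same key.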

Edit: I noticed my sample never added the userId to the hash... whoops... added it in there.
