Optimal Solr JVM/Virtual/Physical Memory Configuration
Our company has several different ways of getting leads, and several types of leads we deal with. There are slight differences between each type of lead, and some information is shared with or related to one or more other lead types. My team and I are trying to build/configure an index using Solr that handles each of these lead types and the shared data: customer data, resort data, etc. (around 1.2 million records in all). We're hosting on an Ubuntu server (12 GB RAM, 8-core Opteron), running Tomcat 6 and Solr 3.4.
I'd like the index to add records in real time when a customer submits a lead-gen form on our website (around 1500-2000 daily), and to update when employees add or modify data (around 2500-3000 times daily).
In addition, we need customers on the website and employees in house to be able to search the data with filters, facets, auto-complete, highlighting, and the other features one has come to expect from a well-built search.
This setup is functioning, but it hangs when updating records, both on the website and in our internal apps. Commits are done every 1000 documents or 5 seconds, and we optimize once daily. What are the optimal JVM, server, or Solr configurations for this type of setup? Any help is appreciated, and I can provide any information needed.
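For reference, the commit policy described above (every 1000 documents or 5 seconds) would typically be expressed in `solrconfig.xml` roughly like this. This is a sketch, not the poster's actual config; note that on Solr 3.x every hard commit also reopens the searcher, so frequent commits can themselves cause pauses:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush to disk every 1000 docs or 5 seconds,
       whichever comes first -->
  <autoCommit>
    <maxDocs>1000</maxDocs>
    <maxTime>5000</maxTime> <!-- milliseconds -->
  </autoCommit>
</updateHandler>
```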
first, you should not optimize.
There are two common errors when configuring the JVM heap size for Solr:
- giving the JVM too much memory (the OS cache won't be able to cache disk operations),
- not giving the JVM enough memory (there is a lot of pressure on the garbage collector, which is forced to run frequent stop-the-world collections; use JMX monitoring to figure out whether full GCs are being triggered).
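As an illustration of how to check the second point, here is a sketch of a Tomcat `setenv.sh` that fixes the heap size, enables GC logging, and exposes JMX so a tool like JConsole or VisualVM can watch the collector. The sizes and port are placeholders, not recommendations for this workload:

```
# Sketch of $CATALINA_HOME/bin/setenv.sh -- values are examples only.
# Leave a large share of the 12 GB box to the OS page cache; it is
# what makes Lucene's disk reads fast.
JAVA_OPTS="$JAVA_OPTS -Xms4g -Xmx4g"           # fixed heap avoids resize pauses
JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC" # a low-pause collector choice

# GC logging: check whether full GCs line up with the observed hangs
JAVA_OPTS="$JAVA_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
JAVA_OPTS="$JAVA_OPTS -Xloggc:$CATALINA_HOME/logs/gc.log"

# JMX for monitoring (unauthenticated here for brevity -- lock this down)
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.port=9010"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
export JAVA_OPTS
```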
One other reason why your application may hang is background merges. Lucene stores its index in segments, and whenever the number of segments gets higher than mergeFactor, a merge is triggered. A low value of mergeFactor might explain the hangs.
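For reference, these merge-related knobs live in `solrconfig.xml`. The values shown here are the Solr 3.x defaults, included only to show where to look, not as tuning advice:

```xml
<indexDefaults>
  <!-- RAM buffered before a new segment is flushed to disk -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <!-- Segments allowed per tier before a background merge runs -->
  <mergeFactor>10</mergeFactor>
</indexDefaults>
```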
You should give more details on your current setup so we can help you:
- JVM heap size,
- which collector you are using (G1, throughput collector, concurrent low-pause collector, ...),
- index size (on disk, not the number of documents),
- mergeFactor, ramBufferSizeMB, ...