Elasticsearch equivalent to Map-Reduce
What is the equivalent of map-reduce in Elasticsearch when processing on the client side? Is there a "streaming" client that can reduce the data as the output comes in?
Assume I need a join, or some complex filtering on the client side, of a type that might not fit in memory without a map-reduce scheme. I don't mind waiting a long time for the response, but I don't want to crush the machine (client and/or server).
How should I go about this?
For example, given these mappings:
{"book": {"properties": {
    "title": {"type": "string", "index": "analyzed"},
    "author": {"type": "string", "index": "analyzed"}
}}}
{"character": {"properties": {
    "book_id": {"type": "string", "index": "not_analyzed"},
    "name": {"type": "string", "index": "analyzed"},
    "age": {"type": "integer"},
    "catch-phrase": {"type": "string", "index": "analyzed"}
}}}
Say I want to find the books that have at least m characters whose catch phrase is no longer than n (where n is a parameter supplied on the client side).
So: get_books_with_short_phrases(m, n)
I could of course add a field such as "phrase-length" to the "character" type, but let's assume the processing on "catch-phrase" might change over time.
I'd like to stream the "characters" and "books" to the client, go over each one on the client, and output a key-value pair of <book>-<character,len(phrase)>,
then reduce further to <book>-<num_of_chars_with_short_phrase>.
If I load all the documents into client memory, it might be a disaster. If the client processes each book and reduces the k,v pairs as it goes, it might be better.
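A minimal sketch of the incremental reduction described above, in Python. It assumes the "character" documents arrive as some iterable of dicts (for example, pages fetched one at a time); how they are fetched from Elasticsearch is deliberately left out. The extra `characters` argument and the `sample_characters` generator are hypothetical, added only for illustration. Only one counter per book is held in memory, never the full document set.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Iterator, List


def count_short_phrases(characters: Iterable[Dict[str, Any]], n: int) -> Counter:
    """Map each character to its book_id and reduce to a per-book count.

    Documents are consumed one at a time, so memory use grows with the
    number of distinct books, not with the number of characters.
    """
    counts: Counter = Counter()
    for character in characters:
        phrase = character.get("catch-phrase")
        # Skip characters with no catch phrase at all.
        if phrase is not None and len(phrase) <= n:
            counts[character["book_id"]] += 1
    return counts


def get_books_with_short_phrases(
    characters: Iterable[Dict[str, Any]], m: int, n: int
) -> List[str]:
    """Return ids of books with at least m characters whose phrase length <= n."""
    counts = count_short_phrases(characters, n)
    return sorted(book_id for book_id, count in counts.items() if count >= m)


def sample_characters() -> Iterator[Dict[str, Any]]:
    """Hypothetical stand-in for documents streamed from the server."""
    yield {"book_id": "b1", "name": "Ann", "age": 30, "catch-phrase": "Hi"}
    yield {"book_id": "b1", "name": "Bob", "age": 41, "catch-phrase": "Oh no"}
    yield {"book_id": "b2", "name": "Cy", "age": 25, "catch-phrase": "Hello there, friend"}
    yield {"book_id": "b2", "name": "Di", "age": 52, "catch-phrase": "Yo"}


if __name__ == "__main__":
    print(get_books_with_short_phrases(sample_characters(), m=2, n=5))
```

Because the phrase-length test lives in one place, changing the processing on "catch-phrase" later only means changing the condition inside `count_short_phrases`.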
Am I going about this the wrong way?
Is the solution to run scripts on the server somehow, so that the server performs the map-reduce?
AFAIK you can't do streaming with ES.
As I'm sure you know, it's best to adopt a different mindset, one in which 'joins' do not exist. Instead, you denormalize and try to cover each use case with one query to ES. Of course, that doesn't always work.
In the above case, however, I invite you to take a look at the script filter, which allows complex computations (akin to SQL stored procedures) and accepts query-time parameters.
I'm pretty confident that it should give you the tools to do the query in one go on the server, although I didn't dig deep into it.
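As a rough illustration of the script filter idea, the sketch below builds a request body in Python using the legacy `filtered` query with a `script` filter and a query-time parameter `n`. The function name `build_short_phrase_query` is hypothetical, and the script expression is an assumption that should be checked against the scripting documentation linked here: it reads the phrase from `_source` because "catch-phrase" is an analyzed field, so field-data access would likely yield individual tokens rather than the whole phrase. The body only selects characters with short phrases; counting them per book against m would still be a separate step.

```python
import json
from typing import Any, Dict


def build_short_phrase_query(n: int) -> Dict[str, Any]:
    """Build a legacy-style search body with a parameterized script filter."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "script": {
                        # Assumed expression, not verified against a live cluster.
                        "script": "_source['catch-phrase'].length() <= n",
                        # n is supplied at query time, so it can vary per request.
                        "params": {"n": n},
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_short_phrase_query(10), indent=2))
```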
http://www.elasticsearch.org/guide/reference/query-dsl/script-filter/
http://www.elasticsearch.org/guide/reference/modules/scripting/