faceted search - Elasticsearch: ignoring whitespace and case for facet terms
I have made an ES schema with the fields marked as no_analysis for creating facets. It seems some of the data contains extra whitespace or differs in letter case. E.g. a field named color has the values "Black", "black", "black ", which leads to 3 different facet terms. Is there a way to handle this without making changes to the data?
You can analyze the text without tokenizing it if you use the keyword tokenizer. This means "black dog" is not split into 2 tokens, but you can still apply token filters to modify the tokens, for instance lowercasing them with the lowercase filter and trimming them with the trim token filter.
You need to create a custom analyzer in the index settings and use it in the mapping for the field you're faceting on.
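As a rough sketch of what that could look like, the Python snippet below builds the settings and the field mapping as plain dictionaries and prints them as JSON. The analyzer name facet_analyzer is made up for illustration, and the mapping assumes an older Elasticsearch version where analyzed fields use the string type; field types differ in newer versions, so adapt it to yours.

```python
import json

# Hypothetical analyzer name, chosen for this example only.
ANALYZER_NAME = "facet_analyzer"

# Custom analyzer: the keyword tokenizer keeps the whole value as one token,
# then the lowercase and trim token filters normalize that token.
index_settings = {
    "analysis": {
        "analyzer": {
            ANALYZER_NAME: {
                "type": "custom",
                "tokenizer": "keyword",
                "filter": ["lowercase", "trim"],
            }
        }
    }
}

# Assumption: "string" is the field type of older Elasticsearch versions.
color_field_mapping = {
    "color": {"type": "string", "analyzer": ANALYZER_NAME}
}

print(json.dumps({"settings": index_settings}, indent=2))
print(json.dumps({"properties": color_field_mapping}, indent=2))
```

The first document would go into the index settings at index creation time, and the second into the properties of the mapping for the type that holds the color field.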
As a result, you will index a single "black" token out of the 3 values "Black", "black" and "black " provided as input.
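To see why the three values collapse into one term, here is a small pure-Python imitation of that analysis chain. It does not call Elasticsearch at all; the function name analyze_for_facet is hypothetical, and str.lower and str.strip only approximate what the lowercase and trim filters do.

```python
from typing import List


def analyze_for_facet(value: str) -> List[str]:
    """Imitate keyword tokenizer + lowercase + trim on one field value."""
    token = value          # keyword tokenizer: the whole input is a single token
    token = token.lower()  # lowercase filter
    token = token.strip()  # trim filter: drop leading and trailing whitespace
    return [token]


values = ["Black", "black", "black "]
terms = {token for value in values for token in analyze_for_facet(value)}

print(sorted(terms))                    # ['black']
print(analyze_for_facet("Black Dog "))  # ['black dog']
```

The second call shows that a multi-word value such as "Black Dog " stays one token, which is what you want for facet terms.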