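In Elasticsearch, the results of aggregation and filtering do not depend on the order in which you write them. In other words, whether you set the filter condition before the aggregation or after it, the result is the same: Elasticsearch always filters first and then aggregates, just as a relational database applies WHERE before GROUP BY. If you want a filter condition to leave the aggregation (agg) results untouched and only change the hits, use the setPostFilter() method.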
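The difference between the two kinds of filter can be illustrated without a cluster. The following is a minimal in-memory sketch, not the Elasticsearch client: the class QueryVsPostFilterSketch, its search() method, and the Result type are hypothetical names invented for illustration. It assumes the six sample employees that carry an age field (22, 30, 19, 18, 30, 35); the document without an age is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.IntPredicate;

// In-memory sketch of query vs. post filter semantics. This is NOT the
// Elasticsearch client; every name here is hypothetical.
final class QueryVsPostFilterSketch {

    // Result of one simulated search: the matching ages (the "hits") and
    // a terms-style document count per age (the "agg").
    static final class Result {
        final List<Integer> hits = new ArrayList<>();
        final Map<Integer, Long> buckets = new TreeMap<>();
    }

    // query narrows the document set before aggregation (like WHERE before
    // GROUP BY); postFilter is applied to the hits only.
    static Result search(int[] ages, IntPredicate query, IntPredicate postFilter) {
        Result result = new Result();
        for (int age : ages) {
            if (!query.test(age)) {
                continue; // excluded from both hits and buckets
            }
            result.buckets.merge(age, 1L, Long::sum);
            if (postFilter.test(age)) {
                result.hits.add(age); // the post filter only trims the hits
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] ages = {22, 30, 19, 18, 30, 35}; // ages of the sample employees
        IntPredicate all = age -> true;
        IntPredicate range = age -> age > 30 && age < 40;

        Result asQuery = search(ages, range, all);
        Result asPostFilter = search(ages, all, range);
        System.out.println("query:       hits=" + asQuery.hits + " buckets=" + asQuery.buckets);
        System.out.println("post filter: hits=" + asPostFilter.hits + " buckets=" + asPostFilter.buckets);
    }
}
```

In both runs the only hit is the employee aged 35. With the range used as a query, the buckets shrink to that single age; with the same range used as a post filter, all five age buckets remain.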
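Example: all data, no filter. Code:
import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;

// "client" is assumed to be an already connected Elasticsearch Client instance.
SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
// terms aggregation named "agg" over the "age" field
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
response = responsebuilder
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");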
Result: aggregation only, no filtering (note the contents of hits and agg)
{
  "took":100, "timed_out":false,
  "_shards":{ "total":5, "successful":5, "failed":0 },
  "hits":{
    "total":7, "max_score":1,
    "hits":[
      { "_shard":1, "_node":"fvp3NBT5R5i6CqN3y2LU4g", "_index":"company", "_type":"employee", "_id":"5", "_score":1, "_source":{ "name":"Fresh", "age":22 }, "_explanation":Object{...} },
      { "_shard":1, "_node":"fvp3NBT5R5i6CqN3y2LU4g", "_index":"company", "_type":"employee", "_id":"10", "_score":1, "_source":{ "name":"Henrry", "age":30 }, "_explanation":Object{...} },
      { "_shard":1, "_node":"fvp3NBT5R5i6CqN3y2LU4g", "_index":"company", "_type":"employee", "_id":"9", "_score":1, "_source":{ "address":{ "country":"china", "province":"jiangsu", "city":"nanjing", "area":{ "pos":"10001" } } }, "_explanation":Object{...} },
      { "_shard":2, "_node":"fvp3NBT5R5i6CqN3y2LU4g", "_index":"company", "_type":"employee", "_id":"2", "_score":1, "_source":{ "address":{ "country":"china", "province":"jiangsu", "city":"nanjing" }, "name":"jack_1", "age":19, "join_date":"2016-01-01" }, "_explanation":Object{...} },
      { "_shard":2, "_node":"fvp3NBT5R5i6CqN3y2LU4g", "_index":"company", "_type":"employee", "_id":"4", "_score":1, "_source":{ "name":"willam", "age":18 }, "_explanation":Object{...} },
      { "_shard":2, "_node":"fvp3NBT5R5i6CqN3y2LU4g", "_index":"company", "_type":"employee", "_id":"6", "_score":1, "_source":{ "name":"Avivi", "age":30 }, "_explanation":Object{...} },
      { "_shard":4, "_node":"K7qK1ncMQUuIe0K6VSVMJA", "_index":"company", "_type":"employee", "_id":"3", "_score":1, "_source":{ "address":{ "country":"china", "province":"shanxi", "city":"xian" }, "name":"marry", "age":35, "join_date":"2015-01-01" }, "_explanation":Object{...} }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0, "sum_other_doc_count":0,
      "buckets":[
        { "key":30, "doc_count":2 },
        { "key":18, "doc_count":1 },
        { "key":19, "doc_count":1 },
        { "key":22, "doc_count":1 },
        { "key":35, "doc_count":1 }
      ]
    }
  }
}
1. setQuery() placed before the aggregation. Code:
SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
response = responsebuilder
        // the query is set before the aggregation is added
        .setQuery(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");
Result:
{
  "took":538, "timed_out":false,
  "_shards":{ "total":5, "successful":5, "failed":0 },
  "hits":{
    "total":1, "max_score":1,
    "hits":[
      { "_shard":4, "_node":"anlkGjjuQ0G6DODpZgiWrQ", "_index":"company", "_type":"employee", "_id":"3", "_score":1, "_source":{ "address":{ "country":"china", "province":"shanxi", "city":"xian" }, "name":"marry", "age":35, "join_date":"2015-01-01" }, "_explanation":Object{...} }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0, "sum_other_doc_count":0,
      "buckets":[
        { "key":35, "doc_count":1 }
      ]
    }
  }
}
2. setQuery() placed after the aggregation. Code:
SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
response = responsebuilder
        .addAggregation(aggregation)
        // the query is set after the aggregation is added
        .setQuery(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");
Result:
{
  "took":538, "timed_out":false,
  "_shards":{ "total":5, "successful":5, "failed":0 },
  "hits":{
    "total":1, "max_score":1,
    "hits":[
      { "_shard":4, "_node":"anlkGjjuQ0G6DODpZgiWrQ", "_index":"company", "_type":"employee", "_id":"3", "_score":1, "_source":{ "address":{ "country":"china", "province":"shanxi", "city":"xian" }, "name":"marry", "age":35, "join_date":"2015-01-01" }, "_explanation":Object{...} }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0, "sum_other_doc_count":0,
      "buckets":[
        { "key":35, "doc_count":1 }
      ]
    }
  }
}
3. setPostFilter() placed after the addAggregation() call. Code:
SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
response = responsebuilder
        .addAggregation(aggregation)
        // the post filter is set after the aggregation is added
        .setPostFilter(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");
Result:
{
  "took":7, "timed_out":false,
  "_shards":{ "total":5, "successful":5, "failed":0 },
  "hits":{
    "total":1, "max_score":1,
    "hits":[
      { "_shard":4, "_node":"fvp3NBT5R5i6CqN3y2LU4g", "_index":"company", "_type":"employee", "_id":"3", "_score":1, "_source":{ "address":{ "country":"china", "province":"shanxi", "city":"xian" }, "name":"marry", "age":35, "join_date":"2015-01-01" }, "_explanation":Object{...} }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0, "sum_other_doc_count":0,
      "buckets":[
        { "key":30, "doc_count":2 },
        { "key":18, "doc_count":1 },
        { "key":19, "doc_count":1 },
        { "key":22, "doc_count":1 },
        { "key":35, "doc_count":1 }
      ]
    }
  }
}
4. setPostFilter() placed before the addAggregation() call. Code:
SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
response = responsebuilder
        // the post filter is set before the aggregation is added
        .setPostFilter(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");
Result:
{
  "took":5115, "timed_out":false,
  "_shards":{ "total":5, "successful":5, "failed":0 },
  "hits":{
    "total":1, "max_score":1,
    "hits":[
      { "_shard":4, "_node":"b8cNIO5cQr2MmsnsuluoNQ", "_index":"company", "_type":"employee", "_id":"3", "_score":1, "_source":{ "address":{ "country":"china", "province":"shanxi", "city":"xian" }, "name":"marry", "age":35, "join_date":"2015-01-01" }, "_explanation":Object{...} }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0, "sum_other_doc_count":0,
      "buckets":[
        { "key":30, "doc_count":2 },
        { "key":18, "doc_count":1 },
        { "key":19, "doc_count":1 },
        { "key":22, "doc_count":1 },
        { "key":35, "doc_count":1 }
      ]
    }
  }
}
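Summary: The results show clearly that for both setPostFilter() and setQuery(), the position of the call in the chain does not affect the result. They also show that the filter condition passed to setQuery() affects both the hits and the aggregation (agg) results, whereas the one passed to setPostFilter() affects only the hits and leaves the aggregation (agg) results unchanged.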
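One way to see why call order is irrelevant is that a request builder only records each setting in its own slot, and the request is assembled from those slots afterwards. The sketch below illustrates that idea with a hypothetical mini builder. BuilderOrderSketch and its string-based settings are invented for illustration and are not the Elasticsearch SearchRequestBuilder, whose internals are assumed here to behave in a conceptually similar way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini request builder (not the Elasticsearch class): each
// setter only stores a value in a named slot, so call order is irrelevant.
final class BuilderOrderSketch {
    private String query;
    private String postFilter;
    private final List<String> aggregations = new ArrayList<>();

    BuilderOrderSketch setQuery(String q) { this.query = q; return this; }
    BuilderOrderSketch setPostFilter(String f) { this.postFilter = f; return this; }
    BuilderOrderSketch addAggregation(String a) { this.aggregations.add(a); return this; }

    // Renders the request with a fixed layout, independent of call order.
    String build() {
        return "query=" + query + ", aggs=" + aggregations + ", post_filter=" + postFilter;
    }

    public static void main(String[] args) {
        String queryFirst = new BuilderOrderSketch()
                .setQuery("30 < age < 40")
                .addAggregation("terms(age)")
                .build();
        String aggFirst = new BuilderOrderSketch()
                .addAggregation("terms(age)")
                .setQuery("30 < age < 40")
                .build();
        System.out.println(queryFirst.equals(aggFirst)); // prints true
    }
}
```

Both chains produce the same request text, which matches what experiments 1 and 2 (and likewise 3 and 4) show against a real cluster.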