4.3.1 Multiple Query Strings

The simplest multifield query to deal with is one in which we can map search terms to specific fields. If we know that War and Peace is the title and Leo Tolstoy is the author, we can write each condition as a match clause and combine the two as should clauses in a bool query:

GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title":  "War and Peace" }},
        { "match": { "author": "Leo Tolstoy"   }}
      ]
    }
  }
}




GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title":  "War and Peace" }},
        { "match": { "author": "Leo Tolstoy"   }},
        { "bool":  {
          "should": [
            { "match": { "translator": "Constance Garnett" }},
            { "match": { "translator": "Louise Maude"      }}
          ]
        }}
      ]
    }
  }
}




Prioritizing Clauses



GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { ①
            "title":  {
              "query": "War and Peace",
              "boost": 2
        }}},
        { "match": { ②
            "author":  {
              "query": "Leo Tolstoy",
              "boost": 2
        }}},
        { "bool":  { ③
            "should": [
              { "match": { "translator": "Constance Garnett" }},
              { "match": { "translator": "Louise Maude"      }}
            ]
        }}
      ]
    }
  }
}


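①,② The title and author match clauses have a boost value of 2.

③ The bool clause for the translators has a boost value of 1.

The best value for the boost parameter is easy to determine by testing: set a value, run some test queries, and repeat. A recommended range is 1 to 10, or perhaps 1 to 15. Higher values tend to have little additional effect, because relevance scores are normalized when they are calculated.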
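The trial-and-error loop for choosing a boost can be set up by generating one request body per candidate value. The following is a minimal sketch, not a definitive implementation: the helper names `boosted_match` and `build_query` and the candidate values are assumptions made for illustration, and sending each body to `/_search` and comparing the rankings is left out.

```python
def boosted_match(field, text, boost=1):
    """Build a match clause in its expanded form so it can carry a boost."""
    return {"match": {field: {"query": text, "boost": boost}}}


def build_query(title_boost, author_boost):
    """Request body with boosted title/author clauses and unboosted translators."""
    return {
        "query": {
            "bool": {
                "should": [
                    boosted_match("title", "War and Peace", title_boost),
                    boosted_match("author", "Leo Tolstoy", author_boost),
                    {"bool": {"should": [
                        {"match": {"translator": "Constance Garnett"}},
                        {"match": {"translator": "Louise Maude"}},
                    ]}},
                ]
            }
        }
    }


# One candidate body per boost value to try (values chosen arbitrarily
# from the 1 to 10 range mentioned in the text).
candidates = {boost: build_query(boost, boost) for boost in (1, 2, 5, 10)}
```

Each entry in `candidates` can then be run against the same test searches, keeping the boost whose ranking looks best.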

The bool query takes a more-matches-is-better approach: the scores from the individual match clauses are added together to produce the final _score for each document. Documents that match both the title and the author clause will score higher than documents that match just one of them.

Of course, you're not restricted to using just match clauses: the bool query can wrap any other query type, including other bool queries. The second query above does exactly this, adding a nested bool clause to say that we prefer versions of the book translated by specific translators.
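Because a bool query is itself just another clause, nesting can be expressed with two small builder functions. This is a sketch only: `match` and `bool_should` are hypothetical helpers written for this example, not part of any Elasticsearch client library.

```python
import json


def match(field, text):
    """Build a match clause for one field (hypothetical helper)."""
    return {"match": {field: text}}


def bool_should(*clauses):
    """Wrap clauses in the should list of a bool query (hypothetical helper)."""
    return {"bool": {"should": list(clauses)}}


# The translator preferences form their own bool query...
translators = bool_should(
    match("translator", "Constance Garnett"),
    match("translator", "Louise Maude"),
)

# ...which is then used as a single clause of the outer bool query.
body = {
    "query": bool_should(
        match("title", "War and Peace"),
        match("author", "Leo Tolstoy"),
        translators,
    )
}

# Serialize to the JSON request body for GET /_search.
print(json.dumps(body, indent=2))
```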

Why did we put the translator clauses inside a separate bool query? All four match queries are should clauses, so why didn’t we just put the translator clauses at the same level as the title and author clauses?

The answer lies in how the score is calculated. The bool query runs each match query, adds their scores together, then multiplies the sum by the number of matching clauses and divides by the total number of clauses. Each clause at the same level has the same weight. In the query with the nested translator clauses, the inner bool query is one of three top-level clauses, so it counts for one-third of the total score. If we had put the translator clauses at the same level as title and author, there would be four clauses at that level, and the contribution of the title and author clauses would be reduced to one-quarter each.
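The calculation described above can be sketched in a few lines. This models only the formula as stated in the text (sum of matching scores, times matching clauses, divided by total clauses); it is not the actual Lucene scoring code, and the per-clause scores are made-up numbers chosen for easy arithmetic.

```python
def bool_should_score(clause_scores):
    """Score a bool/should group using the formula described in the text.

    clause_scores has one entry per clause: a float for a clause that
    matched, or None for a clause that did not match.
    """
    matched = [s for s in clause_scores if s is not None]
    if not matched:
        return 0.0
    # Sum the matching scores, then scale by matched / total clauses.
    return sum(matched) * len(matched) / len(clause_scores)


# Hypothetical per-clause scores for one document.
title, author = 1.0, 1.0
garnett, maude = 1.0, None  # only one translator clause matches

# Nested layout: the translator bool is one of three top-level clauses.
inner = bool_should_score([garnett, maude])
nested = bool_should_score([title, author, inner])

# Flat layout: all four clauses sit at the same level.
flat = bool_should_score([title, author, garnett, maude])

print(inner, nested, flat)
```

With these numbers the nested layout scores the document higher than the flat one, because the missing translator match is penalized only inside the inner bool query instead of scaling down the title and author contributions as well.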

