Filters and Multiselect Lists

One of the most common usage scenarios for JMESPath is being able to take a complex JSON document and simplify it down. The main features at work here are filters and multiselects. In this example below, we're taking the array of people and, for any element with an age key whose value is greater than 20, we're creating a sub list of the name and age values.

Expression Input data (JSON) Result

Filters and Multiselect Hashes

In the previous example we were taking an array of hashes, and simplifying down to an array of two element arrays containing a name and an age. We're also only including list elements where the age key is greater than 20. If instead we want to create the same hash structure but only include the age and name key, we can instead say:

Expression Input data (JSON) Result

The last half of the above expression contains key value pairs which have the general form keyname: <expression>. In the above expression we're just using a field as an expression, but they can be more advanced expressions. For example:

Expression Input data (JSON) Result

Notice in the above example instead of applying a filter expression ([? <expr> ]), we're selecting all array elements via [*].

Working with Nested Data

Expression Input data (JSON)

{
  "reservations": [
    {
      "instances": [
        {
          "type": "small",
          "state": {
            "name": "running"
          },
          "tags": [
            {
              "Key": "Name",
              "Values": [
                "Web"
              ]
            },
            {
              "Key": "version",
              "Values": [
                "1"
              ]
            }
          ]
        },
        {
          "type": "large",
          "state": {
            "name": "stopped"
          },
          "tags": [
            {
              "Key": "Name",
              "Values": [
                "Web"
              ]
            },
            {
              "Key": "version",
              "Values": [
                "1"
              ]
            }
          ]
        }
      ]
    },
    {
      "instances": [
        {
          "type": "medium",
          "state": {
            "name": "terminated"
          },
          "tags": [
            {
              "Key": "Name",
              "Values": [
                "Web"
              ]
            },
            {
              "Key": "version",
              "Values": [
                "1"
              ]
            }
          ]
        },
        {
          "type": "xlarge",
          "state": {
            "name": "running"
          },
          "tags": [
            {
              "Key": "Name",
              "Values": [
                "DB"
              ]
            },
            {
              "Key": "version",
              "Values": [
                "1"
              ]
            }
          ]
        }
      ]
    }
  ]
}

Result

The above example combines several JMESPath features including the flatten operator, multiselect lists, filters, and pipes.

The input data contains a top level key, "reservations", which is a list. Within each list, there is an "instances" key, which is also a list.

The first thing we're doing here is creating a single list from multiple lists of instances. By using the flatten operator we can take the two instances from the first list and the two instances from the second list, and combine them into a single list. Try changing the above expression to just reservations[].instances[] to see what this flattened list looks like. Everything to the right of the reservations[].instances[] is about taking the flattened list and paring it down to contain only the data that we want. This expression is taking each element in the original list and transforming it into a three element sublist. The three elements are:

In the tags list, select the first element in the flattened Values list whose Key has a value of Name.
The type
The state.name of each instance.

The most interesting of those three expressions is the tags[?Key=='Name'].Values[] | [0] part. Let's examine that further.

The first thing to notice is that we're filtering down the list associated with the tags key. The tags[?Key==`Name`] tells us to only include list elements that contain a Key whose value is Name. From those filtered list elements we're going to take the Values key and flatten the list. Finally, the | [0] will take the entire list and extract the 0th element.

Filtering and Selecting Nested Data

In this example, we're going to look at how you can filter nested hashes.

Expression Input data (JSON) Result

In this example we're searching through the people array. Each element in this array contains a hash of two elements, and each value in the hash is itself a hash. We're trying to retrieve the value of the general key that contains an id key with a value of 100.

If we just had the expression people[?general.id==`100`], we'd have a result of a filtered array containing one element with the entire person object.

From there, we then uses a pipe (|) to stop projections so that we can finally select the first element ([0]). Note that we are making the assumption that there's only one hash that contains an id of 100.

Finally, it's worth mentioning there is more than one way to write this expression. In this example we've decided that after we filter the list we're going to select the value of the general key and then select the first element in that list. We could also reverse the order of those operations, we could have taken the filtered list, selected the first element, and then extracted the value associated with the general key. That expression would be:

people[?general.id==`100`] | [0].general

Both versions are equally valid.

Using Functions

JMESPath functions give you a lot of power and flexibility when working with JMESPath expressions. Below are some common expressions and functions used in JMESPath.

sort_by

Expression Input data (JSON) Result

The first interesting thing here if the use of the function sort_by. In this example we are sorting the Contents array by the value of each Date key in each element in the Contents array. The sort_by function takes two arguments. The first argument is an array, and the second argument describes the key that should be used to sort the array.

The second interesting thing in this expression is that the second argument starts with &, which creates an expression type. Think of this conceptually as a reference to an expression that can be evaluated later. If you are familiar with lambda and anonymous functions, expression types are similar. The reason we use &Date instead of Date is because if the expression is Date, it would be evaluated before calling the function, and given there's no Date key in the outer hash, the second argument would evaluate to null. Check out the specification for more information on how functions are evaluated in JMESPath. Also, note that we're taking advantage of the fact that the dates are in ISO 8601 format, which can be sorted lexicographically.

And finally, the last interesting thing in this expression is the [*] immediately after the sort_by function call. The reason for this is that we want to apply the multiselect hash, the second half of the expression, to each element in the sorted array. In order to do this we need a projection. The [*] does exactly that, it takes the input array and creates a projection such that the multiselect hash {Key: Key, Size: Size} will be applied to each element in the list.

There are other functions that take expression types that are similar to sort_by including min_by and max_by.

Pipes

Pipe expression are useful for stopping projections. They can also be used to group expressions.

Main Page

Let's look at a modified version of the expression on the JMESPath home page.

Expression Input data (JSON) Result

We can think of this JMESPath expression as having three components, each separated by the pipe character |. The first expression is familiar to us, it's similar to the first example on this page. The second part of the expression, sort(@), is similar to the sort_by function we saw in the previous section. The @ token is used to refer to the current element. The sort function takes a single parameter which is an array. If the input JSON document was a hash, and we wanted to sort the foo key, which was an array, we could just use sort(foo). In this scenario, the input JSON document is the array we want to sort. To refer to this value, we use the current element, @, to indicate this. We're also only taking a subset of the sorted array. We're using a slice ([-2:]) to indicate that we only want the last two elements in the sorted array to be passed through to the final third of this expression.

And finally, the third part of the expression, {WashingtonCities: join(', ', @)}, creates a multiselect hash. It takes as input, the list of sorted city names, and produces a hash with a single key, WashingtonCities, whose values are the input list (denoted by @) as a string separated by a comma.