Filters and Multiselect Lists
One of the most common uses of JMESPath is taking a complex JSON document and simplifying it. The main features at work here are filters and multiselects. This example takes the array of people and, for every element with an age key whose value is greater than 20, creates a sublist of the name and age values.
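As a rough illustration of the filter-plus-multiselect-list behavior described above, the sketch below pairs an expression with a plain-Python equivalent. The sample data and the exact expression string are assumptions made for illustration, since the original interactive example is not reproduced here. If the third-party jmespath package is available, jmespath.search(EXPRESSION, data) should return the same value.

```python
# Hypothetical sample data; the names and values are illustrative only.
data = {
    "people": [
        {"name": "Bob", "age": 20, "other": "foo"},
        {"name": "Fred", "age": 25, "other": "bar"},
        {"name": "George", "age": 30, "other": "baz"},
    ]
}

# Assumed expression matching the prose: a filter followed by a multiselect list.
EXPRESSION = "people[?age > `20`].[name, age]"


def filter_to_lists(doc):
    """Plain-Python equivalent of EXPRESSION for documents shaped like `data`."""
    result = []
    for person in doc.get("people", []):
        age = person.get("age")
        # Elements with a missing or non-numeric age fail the filter.
        if isinstance(age, (int, float)) and age > 20:
            result.append([person.get("name"), age])
    return result


filtered_lists = filter_to_lists(data)
print(filtered_lists)
```

Bob is excluded because the comparison is strictly greater than 20.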
Filters and Multiselect Hashes
In the previous example we took an array of hashes and simplified it
down to an array of two-element arrays, each containing a name and an age,
including only the list elements where the age key is greater than 20.
If we instead want to keep the hash structure but include only the
name and age keys, we can use a multiselect hash in place of the
multiselect list.
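A minimal sketch of the multiselect-hash variant follows. The expression string and the sample data are assumed for illustration and are not taken from the original example; the list comprehension is a plain-Python equivalent rather than a call into a JMESPath library.

```python
# Hypothetical sample data; the names and values are illustrative only.
data = {
    "people": [
        {"name": "Bob", "age": 20, "other": "foo"},
        {"name": "Fred", "age": 25, "other": "bar"},
        {"name": "George", "age": 30, "other": "baz"},
    ]
}

# Assumed expression: same filter as before, but a multiselect hash on the right.
EXPRESSION = "people[?age > `20`].{name: name, age: age}"

# Each key in the multiselect hash is paired with the value of an expression
# evaluated against the current element (here, plain field lookups).
filtered_hashes = [
    {"name": person.get("name"), "age": person.get("age")}
    for person in data["people"]
    if isinstance(person.get("age"), (int, float)) and person["age"] > 20
]
print(filtered_hashes)
```

The "other" key is dropped because the multiselect hash only names name and age.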
The last half of the above expression contains key-value pairs, which have
the general form keyname: <expression>. In the above
expression each value is just a field, but the values can be
more advanced expressions. For example:
Notice that in the above example, instead of applying a filter expression
([? <expr> ]), we're selecting all array elements via
[*].
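To make the last two points concrete, here is a hedged sketch in which one value of the multiselect hash is an index expression and every element is selected with [*]. The expression, the tags field, and the sample data are all assumptions chosen for illustration.

```python
# Hypothetical sample data; the "tags" field is an assumption for this sketch.
data = {
    "people": [
        {"name": "Bob", "age": 20, "tags": ["a", "b", "c"]},
        {"name": "Fred", "age": 25, "tags": ["d", "e", "f"]},
        {"name": "George", "age": 30, "tags": []},
    ]
}

# Assumed expression: [*] selects every element (no filter), and the second
# value in the multiselect hash is an index expression instead of a bare field.
EXPRESSION = "people[*].{name: name, tags: tags[0]}"


def first_or_none(value):
    """Mimic an index expression: out-of-range or non-list input gives None."""
    if isinstance(value, list) and value:
        return value[0]
    return None


summaries = [
    {"name": person.get("name"), "tags": first_or_none(person.get("tags"))}
    for person in data["people"]
]
print(summaries)
```

George's empty tags list yields None, which corresponds to a JSON null.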
Working with Nested Data
The above example combines several JMESPath features, including the flatten operator, multiselect lists, filters, and pipes.
The input data contains a top-level key, "reservations", which is a list. Each element of that list has an "instances" key, which is also a list.
The first thing we're doing here is creating a single list from multiple
lists of instances. By using the flatten operator we can
take the two instances from the first list and the two instances from the
second list and combine them into a single list. Try changing the above
expression to just reservations[].instances[] to see what
this flattened list looks like. Everything to the right of
reservations[].instances[] is about taking the flattened list
and paring it down to contain only the data that we want. This expression
takes each element in the flattened list and transforms it into a
three-element sublist. The three elements are:
- In the tags list, select the first element in the flattened Values list whose Key has a value of Name.
- The type.
- The state.name of each instance.
The most interesting of those three expressions is the
tags[?Key=='Name'].Values[] | [0] part. Let's examine that further.
The first thing to notice is that we're filtering down the list associated
with the tags key. The tags[?Key=='Name'] part tells
us to only include list elements that contain a Key whose
value is Name. From those filtered list elements we
take the Values key and flatten the list. Finally, the
| [0] takes the entire list and extracts the 0th element.
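The steps described in this section can be sketched in plain Python as follows. The full expression is assembled from the fragments quoted above, and the sample data (instance types, states, and tag values) is hypothetical.

```python
# Hypothetical input: two reservations with two instances each.
data = {
    "reservations": [
        {"instances": [
            {"type": "small", "state": {"name": "running"},
             "tags": [{"Key": "Name", "Values": ["Web"]},
                      {"Key": "version", "Values": ["1"]}]},
            {"type": "large", "state": {"name": "stopped"},
             "tags": [{"Key": "Name", "Values": ["Web"]},
                      {"Key": "version", "Values": ["1"]}]},
        ]},
        {"instances": [
            {"type": "medium", "state": {"name": "terminated"},
             "tags": [{"Key": "Name", "Values": ["Web"]},
                      {"Key": "version", "Values": ["1"]}]},
            {"type": "xlarge", "state": {"name": "running"},
             "tags": [{"Key": "Name", "Values": ["DB"]},
                      {"Key": "version", "Values": ["1"]}]},
        ]},
    ]
}

# Expression assembled from the fragments quoted in the text.
EXPRESSION = ("reservations[].instances[]."
              "[tags[?Key=='Name'].Values[] | [0], type, state.name]")


def first_name_tag(instance):
    """Equivalent of: tags[?Key=='Name'].Values[] | [0]"""
    values = []
    for tag in instance.get("tags", []):
        if tag.get("Key") == "Name":               # the filter
            values.extend(tag.get("Values", []))   # the flatten
    return values[0] if values else None           # the | [0]


# reservations[].instances[] : merge the per-reservation lists into one list.
flattened = [inst for res in data["reservations"] for inst in res["instances"]]

# Multiselect list: one three-element sublist per instance.
rows = [
    [first_name_tag(inst), inst.get("type"), inst.get("state", {}).get("name")]
    for inst in flattened
]
print(rows)
```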
Filtering and Selecting Nested Data
In this example, we're going to look at how you can filter nested hashes.
Here we're searching through the people array.
Each element in this array is a hash of two elements, and each value
in the hash is itself a hash. We're trying to retrieve the value of the
general key whose hash contains an id key with a value
of 100.
If we just had the expression people[?general.id==`100`], we'd
have a result of a filtered array containing one element with the entire
person object. Selecting the general key from that result gives an
array containing just the matching general hash.
From there, we then use a pipe (|) to stop projections so
that we can finally select the first element ([0]). Note that
we are making the assumption that there's only one hash that contains an
id of 100.
Finally, it's worth mentioning that there is more than one way to write this
expression. In this example we've decided that after we filter the list
we're going to select the value of the general key and then
select the first element in that list. We could also reverse the order of
those operations: take the filtered list, select the
first element, and then extract the value associated with the
general key. That expression would be:
people[?general.id==`100`] | [0].general
Both versions are equally valid.
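The two orderings can be compared with a short plain-Python sketch. The sample data is hypothetical (including the history hash), and the first expression string is inferred from the description above.

```python
# Hypothetical input: each person is a hash of two hashes.
data = {
    "people": [
        {"general": {"id": 100, "age": 20, "name": "Bob"},
         "history": {"first_login": "2014-01-01", "last_login": "2014-01-02"}},
        {"general": {"id": 101, "age": 30, "name": "Bill"},
         "history": {"first_login": "2014-05-01", "last_login": "2014-05-02"}},
    ]
}

# Select general first, then the first element (inferred from the prose).
GENERAL_THEN_FIRST = "people[?general.id==`100`].general | [0]"
# Select the first element first, then general (quoted in the text).
FIRST_THEN_GENERAL = "people[?general.id==`100`] | [0].general"

# The filter: keep whole person objects whose general.id equals 100.
matches = [p for p in data["people"] if p.get("general", {}).get("id") == 100]

# Ordering 1: project .general over the matches, then take element 0.
generals = [m["general"] for m in matches]
general_then_first = generals[0] if generals else None

# Ordering 2: take element 0 of the matches, then read its general key.
first_then_general = matches[0]["general"] if matches else None

print(general_then_first)
```

Both orderings return the same hash here because exactly one person has an id of 100.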
Using Functions
JMESPath functions give you a lot of power and flexibility when working with JMESPath expressions. Below are some common expressions and functions used in JMESPath.
sort_by
The first interesting thing here is the use of the function
sort_by. In this example we are sorting the
Contents array by the value of each Date key in
each element in the Contents array. The sort_by
function takes two arguments. The first argument is an array, and the
second argument describes the key that should be used to sort the array.
The second interesting thing in this expression is that the second argument
starts with &, which creates an expression type. Think of
this conceptually as a reference to an expression that can be evaluated
later. If you are familiar with lambda and anonymous functions, expression
types are similar. The reason we use &Date instead of
Date is because if the expression is Date, it
would be evaluated before calling the function, and given there's no
Date key in the outer hash, the second argument would evaluate
to null. Check out the specification for more information on
how functions are evaluated in JMESPath. Also, note that we're taking
advantage of the fact that the dates are in ISO 8601 format, which can be
sorted lexicographically.
And finally, the last interesting thing in this expression is the
[*] immediately after the sort_by function call.
The reason for this is that we want to apply the multiselect hash, the
second half of the expression, to each element in the sorted array. In
order to do this we need a projection. The [*] does exactly
that: it takes the input array and creates a projection such that the
multiselect hash {Key: Key, Size: Size} will be
applied to each element in the list.
Other functions similar to sort_by also take expression types,
including min_by and max_by.
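As a loose analogy for the expression type, the sketch below uses a Python key function where JMESPath would use &Date. The full expression string is inferred from the fragments in this section, and the sample Contents data is hypothetical.

```python
# Hypothetical listing; the dates are ISO 8601 strings, so plain string
# comparison orders them chronologically.
data = {
    "Contents": [
        {"Date": "2014-12-21T05:18:08.000Z", "Key": "logs/bb", "Size": 303},
        {"Date": "2014-12-20T05:19:10.000Z", "Key": "logs/aa", "Size": 308},
        {"Date": "2014-12-20T05:19:12.000Z", "Key": "logs/qux", "Size": 297},
    ]
}

# Expression inferred from the fragments in the text.
EXPRESSION = "sort_by(Contents, &Date)[*].{Key: Key, Size: Size}"

# The lambda plays the role of &Date: it is passed unevaluated and only
# applied to each element while sorting.
by_date = sorted(data["Contents"], key=lambda item: item["Date"])

# [*] followed by the multiselect hash: apply it to every sorted element.
listing = [{"Key": item["Key"], "Size": item["Size"]} for item in by_date]

# Rough analogue of max_by(Contents, &Date).
newest = max(data["Contents"], key=lambda item: item["Date"])

print(listing)
```

Passing item["Date"] directly instead of a lambda would fail for the same reason Date fails in JMESPath: it would be evaluated before the sort is called.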
Pipes
Pipe expressions are useful for stopping projections. They can also be used to group expressions.
Main Page
Let's look at a modified version of the expression on the JMESPath home page.
We can think of this JMESPath expression as having three components, each
separated by the pipe character |. The first expression is
familiar to us; it's similar to the first example on this page. The second
part of the expression, sort(@), is similar to the
sort_by function we saw in the previous section. The
@ token is used to refer to the current element. The
sort function takes a
single parameter which is an array. If the input JSON document was a hash,
and we wanted to sort the foo key, which was an array, we
could just use sort(foo). In this scenario, the input JSON
document is the array we want to sort. To refer to this value, we use the
current element, @, to indicate this. We're also only taking
a subset of the sorted array. We're using a slice ([-2:]) to
indicate that we only want the last two elements in the sorted array to be
passed through to the final third of this expression.
And finally, the third part of the expression,
{WashingtonCities: join(', ', @)}, creates a multiselect
hash. It takes as input the list of sorted city names and produces a
hash with a single key, WashingtonCities, whose value is
the input list (denoted by @) joined into a single string, with the
elements separated by a comma and a space.
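The three piped stages can be traced with a plain-Python sketch. The first stage of the expression and the sample locations are assumptions for illustration; the second and third stages are taken from the fragments quoted above.

```python
# Hypothetical input data; city names and states are illustrative only.
data = {
    "locations": [
        {"name": "Seattle", "state": "WA"},
        {"name": "New York", "state": "NY"},
        {"name": "Bellevue", "state": "WA"},
        {"name": "Olympia", "state": "WA"},
    ]
}

# The first stage is assumed; the other two are quoted in the text.
EXPRESSION = ("locations[?state == 'WA'].name"
              " | sort(@)[-2:]"
              " | {WashingtonCities: join(', ', @)}")

# Stage 1: filter by state and project the name of each match.
wa_names = [loc["name"] for loc in data["locations"] if loc.get("state") == "WA"]

# Stage 2: sort the current element (the list from stage 1), keep the last two.
last_two = sorted(wa_names)[-2:]

# Stage 3: build a one-key hash whose value is the joined string.
cities = {"WashingtonCities": ", ".join(last_two)}

print(cities)
```

Each stage receives the complete result of the previous one, which is the effect of the pipe stopping the projection.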