<img align="right" src="images/ninologo.png" width="150"/>
<img align="right" src="images/tf-small.png" width="125"/>
<img align="right" src="images/dans.png" width="150"/>

# Search

Search is an essential and convenient way to get around in the corpus.
Although the whole point of Text-Fabric is to move around in the corpus programmatically,
we show that
[template based search](https://annotation.github.io/text-fabric/tf/about/searchusage.html)
makes many tasks a lot more convenient.

Along with showing how search works, we also point to pretty ways to display your search results.
The good news is that `search` and `pretty` work well together.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from IPython.display import display, Markdown
from tf.app import use

In [3]:
A = use("Nino-cunei/uruk:clone", hoist=globals())

**Locating corpus resources ...**

Name,# of nodes,# slots/node,% coverage
tablet,6364,22.01,100
face,9456,14.1,95
column,14023,9.34,93
line,35842,3.61,92
case,9651,3.46,24
cluster,32753,1.03,24
quad,3794,2.05,6
comment,11090,1.0,8
sign,140094,1.0,100


# The basics

Here is a very simple query: we look for tablets containing a numeral sign.

In [4]:
query = """
tablet
  sign type=numeral
"""

results = A.search(query)

  0.10s 38122 results


We can display the results in a table (here are the first 5):

In [5]:
A.table(results, end=5, condenseType="line")

n,p,tablet,sign
1,P006427 obverse:2:1,P006427,3(N14)
2,P006428 obverse:3:2,P006428,3(N14)
3,P006428 obverse:3:3,P006428,1(N14)
4,P006428 obverse:3:5,P006428,1(N01)
5,P006428 obverse:3:5,P006428,1(N57)


We can combine all results that are on the same line:

In [6]:
A.table(results, condensed=True, condenseType="line", end=5)

n,p,line,sign,sign.1,sign.2
1,P006427 obverse:2:1,3(N14) X SANGA~a [...],3(N14),,
2,P006428 obverse:3:2,3(N14) X,3(N14),,
3,P006428 obverse:3:3,1(N14) SUHUR,1(N14),,
4,P006428 obverse:3:5,1(N01) |DUG~bx1(N57)|,1(N01),1(N57),
5,P448701 obverse:1:1,1(N46) 2(N19) 4(N41),2(N19),4(N41),1(N46)


And we can show them inside the face they occur in:

In [7]:
A.show(results, condenseType="face", end=2, skipCols="1")

The feature *type* is displayed because it occurs in the query.
We can make the display a bit more compact by suppressing those features:

In [8]:
A.show(results, condenseType="face", end=2, queryFeatures=False, skipCols="1")

## Finding a tablet

Suppose we have the *p-number* of a tablet.
How do we find that tablet?
Remembering from the feature docs that the p-numbers are stored in the feature
`catalogId`, we can write a *search template*.

In [9]:
query = """
tablet catalogId=P005381
"""
results = A.search(query)
A.table(results)

  0.00s 1 result


n,p,tablet
1,P005381,P005381


The function `A.table()` gives you a tabular overview of the results,
with a link to the tablet on CDLI.

But we can also get more information by using `A.show()`:

In [10]:
A.show(results)

Several things to note here:

* if you want to see the tablet on CDLI, you can click on the tablet header;
* the display matches the layout on the tablet:
  * faces and columns are delineated with red lines;
  * lines and cases are delineated with blue lines;
  * cases and subcases alternate their direction of division between horizontal and vertical:
    lines are divided horizontally into cases, cases are divided vertically into subcases,
    and subcases in turn are divided horizontally into subsubcases, etc.;
  * quads and signs are delineated with grey lines;
  * clusters are delineated with brown lines (see further on);
  * lineart is given for top-level signs and quads; those that are part of a bigger quad do not
    get lineart.

It is possible to switch off the lineart.

## More info in the results
You can show the line numbers that correspond to the ATF source files as well.
Let us also switch off the lineart.

In [11]:
query = """
tablet catalogId=P005381
"""
results = A.search(query)
A.table(results, lineNumbers=True)
A.show(results, lineNumbers=True, showGraphics=False)

  0.00s 1 result


n,p,tablet
1,P005381,@85111 P005381


There is a big quad in `obverse:2 line 1`. We want to call up the lineart for it separately.
First step: make the nodes visible.

In [12]:
query = """
tablet catalogId=P005381
"""
results = A.search(query)
A.table(results, withNodes=True)
A.show(results, withNodes=True, showGraphics=False)

  0.00s 1 result


n,p,tablet
1,P005381,148166 P005381


We read off the node number of that quad and fetch the lineart.

In [13]:
A.lineart(143015)

## Search templates
Let's highlight all numerals on the tablet.

We prefer our results to be condensed per tablet for the next few shows.

We make that the temporary default:

In [14]:
A.displaySetup(condensed=True)

In [15]:
query = """
tablet catalogId=P005381
  sign type=numeral
"""
results = A.search(query)
A.show(results, queryFeatures=False)

  0.05s 10 results


We can do the same for multiple tablets. But now we highlight the undivided lines,
just for variation.

In [16]:
query = """
tablet catalogId=P003581|P000311
  line terminal
"""
results = A.search(query)

  0.01s 11 results


In [17]:
A.table(results, showGraphics=False, withPassage=False)

n,tablet,line,line.1,line.2,line.3,line.4,line.5
1,P000311,[1(N01)] [...] IR~a,[1(N01)] ERIM2,1(N01) NIMGIR SIG7,1(N01) U2~b NAGA~a MUSZEN ZATU647 BA,1(N01) IM~a [...],[N] [...]
2,P003581,5(N01) U2~a [...],1(N01) X [...],1(N14) [...] SUHUR [...],5(N14) 1(N01) [...] U2~a,|GI&GI| GU7,


In [18]:
A.show(results, showGraphics=False, condenseType="tablet")

In another chapter of this tutorial, [steps](steps.ipynb), we encountered a grapheme with a double prime.
There is only one, and we showed the tablet on which it occurs, without highlighting the grapheme in question.
Now we can do the highlight:

In [19]:
results = A.search(
    """
sign prime=2
"""
)

  0.04s 1 result


In [20]:
A.show(results)

## Search for spatial patterns
A few words on the construction of search templates.

The idea is that you mimic the things you are looking for
in your search template.
Embedded things are mimicked by indentation.

Let's search for a line with a case in it that is not further divided,
in which there is a numeral and an ideograph.

Here is our first attempt, and we show the first tablet only.
Note that you can have comments in a search template.
Lines that start with `%` are ignored.
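
How indentation expresses embedding can be made visible with a small sketch. The helper `outline` is hypothetical and is not Text-Fabric's own parser; it only assumes that comment lines start with `%`, as in the template that follows, and that deeper indentation means "embedded in the nearest less indented line above".

```python
def outline(template):
    """Return (depth, line) pairs for the non-comment lines of a template.

    Illustration only: this is not how Text-Fabric parses templates.
    """
    rows = []
    stack = []  # indentation widths of the currently open ancestors
    for raw in template.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comment lines
        indent = len(raw) - len(raw.lstrip())
        # close all ancestors that are not less indented than this line
        while stack and stack[-1] >= indent:
            stack.pop()
        rows.append((len(stack), line))
        stack.append(indent)
    return rows


template = """
line
  case terminal=1
% order is not important
    sign type=ideograph
    sign type=numeral
"""

for depth, line in outline(template):
    print(depth, line)
```

Both `sign` lines end up at the same depth below `case`, which is why the template says nothing about their mutual order.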

In [21]:
query = """
line
  case terminal=1
% order is not important
    sign type=ideograph
    sign type=numeral
"""
results = A.search(query)

  0.14s 10673 results


First a glance at the first 2 items in tabular view.

In [22]:
A.table(results, end=2, showGraphics=False)

n,p,tablet,sign,sign.1,sign.2,line,sign.3,case,sign.4,sign.5,case.1,sign.6,sign.7,Unnamed: 14,Unnamed: 15,Unnamed: 16
1,P448702 obverse:1:2,P448702,3(N01),KASZ~a,GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,3(N01),2b'3(N01) KASZ~a GI,N,2(N14),2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,KASZ~b,NUN~a,,,
2,P471695 obverse:2:1,P471695,2b1(3(N57) PAP~a)a,1(N01),ISZ~a,1a1(N01) ISZ~a,3(N01),APIN~a,3(N57),UR4~a,1a3(N01) APIN~a 3(N57) UR4~a 1b1b1(EN~a DU ZATU759)a 1b2(BAN~b KASZ~c)a 1b3(KI@n SAG)a,2a1(N14) 2(N01) [...] 2b2b1(3(N57) PAP~a)a 2b2 (SZU KI X)a 2b3'(EN~a AN EZINU~d)a 2b4' (IDIGNA [...])a,1a1(N01) ISZ~a 1b1b1 (PAP~a GIR3~c)a,3(N57),PAP~a,1a3(N01) APIN~a 3(N57) UR4~a


Ah, we were still in condensed mode.

For this query the table is more perspicuous in normal mode, so we tell `A.table()` not to condense.

In [23]:
A.table(results, condensed=False, end=7, showGraphics=False)

n,p,line,case,sign,sign.1
1,P448702 obverse:1:2,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,KASZ~b,N
2,P448702 obverse:1:2,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,KASZ~b,2(N14)
3,P448702 obverse:1:2,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,KASZ~b,3(N01)
4,P448702 obverse:1:2,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,NUN~a,N
5,P448702 obverse:1:2,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,NUN~a,2(N14)
6,P448702 obverse:1:2,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,NUN~a,3(N01)
7,P448702 obverse:1:2,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2b'3(N01) KASZ~a GI,KASZ~a,3(N01)


Now the results on the first tablet, condensed by line.

In [24]:
A.show(results, end=1, condenseType="line")

The order between the two signs is not defined by the template,
despite the fact that the template line with the ideograph
precedes the template line with the numeral.
Results may have the numeral and the ideograph in any order.

In fact, the highlights above represent multiple results.
If a case has say 2 numerals and 3 ideographs, there are 6 possible
pairs.
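
The counting argument can be checked with a few lines of plain Python. The graphemes below are made up for illustration; the point is only that every combination of one ideograph and one numeral counts as a separate result.

```python
from itertools import product

# Hypothetical contents of one case: 2 numerals and 3 ideographs.
numerals = ["1(N01)", "2(N14)"]
ideographs = ["GI", "SAG", "NUN~a"]

# Every (ideograph, numeral) combination is a separate search result.
pairs = list(product(ideographs, numerals))

print(len(pairs))  # 3 * 2 = 6
for pair in pairs:
    print(pair)
```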

With our current display setup, results are shown in *condensed* mode.
That means that results are shown per tablet, and on the result tablets
everything that is part of some result is highlighted.

It is also possible to see the uncondensed results.
That gives you an exact picture of each real result constellation.

In order to illustrate the difference, we focus on one tablet and one case.
This case has 3 numerals and 2 ideographs, so we expect 6 results.

In [25]:
query = """
tablet catalogId=P448702
  line
    case terminal=1 number=2a
      sign type=ideograph
      sign type=numeral
"""
results = A.search(query)

  0.10s 6 results


We show them condensed (by default), so we expect 1 line with all ideographs and numerals in case `2a'` highlighted.

In [26]:
A.show(results, showGraphics=False, condenseType="line")

Now the same results in uncondensed mode. Expect 6 times the same line with
different highlighted pairs of signs.

Note that we can apply different highlight colours to different parts of the result.
The signs in the pair are members 4 and 5.

The members that we do not map will not be highlighted.
The members that we map to the empty string will be highlighted with the default color.

**NB:** Choose your colours from the
[CSS specification](https://developer.mozilla.org/en-US/docs/Web/CSS/color_value).

In [27]:
A.displaySetup(
    condensed=False,
    skipCols="1",
    colorMap={2: "", 3: "cyan", 4: "magenta"},
    showGraphics=False,
    condenseType="line",
    queryFeatures=False,
)

In [28]:
A.show(results)

Color mapping works best for uncondensed results. If you condense results, some nodes may occupy
different positions in different results. It is unpredictable which color will be used
for such nodes:

In [29]:
A.show(results, condensed=True)
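
Why the colour becomes unpredictable can be sketched without Text-Fabric. The node numbers and the colour map below are hypothetical; the sketch assumes only that a colour is chosen by the position of a node within a result.

```python
# Hypothetical results of a template with members (case, sign, sign).
results = [
    (900, 11, 12),
    (900, 12, 11),
]
# 1-based position in a result -> colour
colorMap = {2: "cyan", 3: "magenta"}

# Collect, per node, all colours it would get across the results.
coloursPerNode = {}
for result in results:
    for position, node in enumerate(result, start=1):
        if position in colorMap:
            coloursPerNode.setdefault(node, set()).add(colorMap[position])

# Nodes with more than one candidate colour are ambiguous once results are merged.
ambiguous = {
    node: sorted(colours)
    for node, colours in coloursPerNode.items()
    if len(colours) > 1
}
print(ambiguous)  # {11: ['cyan', 'magenta'], 12: ['cyan', 'magenta']}
```

In uncondensed mode each result is displayed separately, so every node has exactly one position and one colour.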

In [30]:
A.displayReset()

You can enforce order.
We modify the template a little to state a
relational condition, namely that the ideograph follows the numeral.

In [31]:
query = """
tablet catalogId=P448702
  line
    case terminal=1 number=2a
      sign type=ideograph
      > sign type=numeral
"""
results = A.search(query)
A.table(results, condensed=False, showGraphics=False)

  0.11s 6 results


n,p,tablet,line,case,sign,sign.1
1,P448702 obverse:1:2,P448702,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,KASZ~b,N
2,P448702 obverse:1:2,P448702,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,KASZ~b,2(N14)
3,P448702 obverse:1:2,P448702,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,KASZ~b,3(N01)
4,P448702 obverse:1:2,P448702,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,NUN~a,N
5,P448702 obverse:1:2,P448702,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,NUN~a,2(N14)
6,P448702 obverse:1:2,P448702,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,NUN~a,3(N01)


Still six results.
No wonder: the case first has three numerals in a row and then two ideographs, so every ideograph follows every numeral.

Do you want the ideograph and the numeral to be *adjacent* as well?
We only have to add 1 character to the template to make it happen.

In [32]:
query = """
tablet catalogId=P448702
  line
    case terminal=1 number=2a
      sign type=ideograph
      :> sign type=numeral
"""
results = A.search(query)

  0.12s 1 result


In [33]:
A.table(results, condensed=False, showGraphics=False)

n,p,tablet,line,case,sign,sign.1
1,P448702 obverse:1:2,P448702,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a 2b'3(N01) KASZ~a GI,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a,KASZ~b,3(N01)
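
The difference between `>` and `:>` can be retraced by hand. The sketch below lists the signs of case `2a'` in reading order, as they appear in the tables above, and filters the pairs in plain Python; it imitates the two relations for single signs only.

```python
# Signs of case 2a' in reading order, with their type.
signs = [
    ("N", "numeral"),
    ("2(N14)", "numeral"),
    ("3(N01)", "numeral"),
    ("KASZ~b", "ideograph"),
    ("NUN~a", "ideograph"),
]

ideographs = [(pos, g) for pos, (g, kind) in enumerate(signs) if kind == "ideograph"]
numerals = [(pos, g) for pos, (g, kind) in enumerate(signs) if kind == "numeral"]

# `>`: the ideograph comes somewhere after the numeral.
after = [
    (ideo, num)
    for (ideoPos, ideo) in ideographs
    for (numPos, num) in numerals
    if ideoPos > numPos
]

# `:>`: the ideograph comes immediately after the numeral.
adjacent = [
    (ideo, num)
    for (ideoPos, ideo) in ideographs
    for (numPos, num) in numerals
    if ideoPos == numPos + 1
]

print(len(after))  # 6
print(adjacent)    # [('KASZ~b', '3(N01)')]
```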


In [34]:
A.displaySetup(
    condensed=False,
    skipCols="1",
    colorMap={2: "", 3: "cyan", 4: "magenta"},
    showGraphics=False,
    condenseType="line",
    queryFeatures=False,
)

In [35]:
A.show(results, condensed=False)

In [36]:
A.displayReset()

By now it pays off to study the possibilities of
[search templates](https://annotation.github.io/text-fabric/tf/about/searchusage.html).

If you want a reminder of all possible spatial relationships between nodes, you can call it up
here in your notebook:

In [37]:
S.relationsLegend()

                      = left equal to right (as node)
                      # left unequal to right (as node)
                      < left before right (in canonical node ordering)
                      > left after right (in canonical node ordering)
                     == left occupies same slots as right
                     && left has overlapping slots with right
                     ## left and right do not have the same slot set
                     || left and right do not have common slots
                     [[ left embeds right
                     ]] left embedded in right
                     << left completely before right
                     >> left completely after right
                     =: left and right start at the same slot
                     := left and right end at the same slot
                     :: left and right start and end at the same slot
                     <: left immediately before right
                     :> left immediately after right
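
Most of these relations are statements about the slots that two nodes occupy. As a rough mental model, and not Text-Fabric's implementation, they can be written as small functions on sets of slot numbers. The slot sets below are hypothetical, and details such as how identical nodes are treated are ignored.

```python
# Each node is represented by the set of slot numbers it occupies.
relations = {
    "==": lambda left, right: left == right,
    "&&": lambda left, right: bool(left & right),
    "||": lambda left, right: left.isdisjoint(right),
    "[[": lambda left, right: right <= left,
    "]]": lambda left, right: left <= right,
    "<<": lambda left, right: max(left) < min(right),
    ">>": lambda left, right: min(left) > max(right),
    "=:": lambda left, right: min(left) == min(right),
    ":=": lambda left, right: max(left) == max(right),
    "<:": lambda left, right: max(left) + 1 == min(right),
    ":>": lambda left, right: min(left) == max(right) + 1,
}

# Hypothetical slot sets: a line of five signs, a case at its end, one sign.
line = {1, 2, 3, 4, 5}
case = {4, 5}
sign = {3}

print(relations["[["](line, case))  # True: the line embeds the case
print(relations[":="](line, case))  # True: both end at slot 5
print(relations["=:"](line, case))  # False: they start at different slots
print(relations["<:"](sign, case))  # True: the sign is immediately before the case
print(relations["&&"](sign, case))  # False: no common slots
```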
   

## Comparisons in templates: cases

Cases have a feature `depth` which indicates their nesting depth within a line.
It is not the depth *of* that case, but the depth *at* which that case occurs.
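
The distinction can be illustrated with a toy structure. The line below is hypothetical: case `1a` is undivided, case `1b` is divided into `1b1` and `1b2`. The helper `depthAt` is an illustration, not a Text-Fabric function.

```python
# Hypothetical line: each case maps to its subcases (empty if undivided).
line = {"1a": {}, "1b": {"1b1": {}, "1b2": {}}}


def depthAt(cases, depth=1):
    """Map every case name to the depth at which it occurs within the line."""
    found = {}
    for name, subcases in cases.items():
        found[name] = depth
        found.update(depthAt(subcases, depth + 1))
    return found


depths = depthAt(line)
print(depths)  # {'1a': 1, '1b': 1, '1b1': 2, '1b2': 2}
```

Case `1b` gets depth 1 although it contains subcases: the value says where the case sits, not how deep its own contents go.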

Comparison queries are handy to select cases of a certain minimum or maximum depth.

We'll work a lot with `condensed=False` and without lineart, so let's make that the default:

In [38]:
A.displaySetup(condensed=False, showGraphics=False)

In [39]:
query = """
case depth=3
"""
results = A.search(query)
A.table(results, end=10)

  0.00s 254 results


n,p,case
1,P003357 obverse:1:1,1b1AEN~a ZATU759 DU
2,P003357 obverse:1:1,1b1B3(N57) SU~a
3,P003537 obverse:5:4,4b1A3(N57) X SZA U4 [...] X
4,P003537 obverse:5:4,4b1BX X
5,P003537 obverse:5:4,4b2A2(N57) GAN~b SZU [...]
6,P003537 obverse:5:4,4b2BX [...]
7,P003589 obverse:1:3,3b2A|GA~a.ZATU753|
8,P003589 obverse:1:3,3b2BMUD [...]
9,P003822 obverse:1:1,1a2A[...] [...]
10,P003822 obverse:1:1,1a2B[...] PAP~a SU~a


Are there deeper cases?

In [40]:
query = """
case depth>3
"""
results = A.search(query)
A.table(results, end=10)

  0.00s 119 results


n,p,case
1,P004735 obverse:2:1,1b1B1(NAB DI |BU~a+DU6~a|)a
2,P004735 obverse:2:1,1b1B2(ZI~a#? AN)a
3,P004735 obverse:2:1,1b1B3(ANSZE~e 7(N57) DUR2 DU)a
4,P004735 obverse:2:1,1b1B4(LAL3~a#? GAR IG~b)a
5,P004735 obverse:2:2,2b2B1(GI6 KISZIK~a# URI3~a)a
6,P004735 obverse:2:2,2b2B2([...])a
7,P218054 reverse:1:1,1a1A1[...] 5(N01) [...] UDU~a
8,P218054 reverse:1:1,1a1A2[...] 7(N01) MASZ2
9,P325754 reverse:1:1,1c2b11(N01) [...]
10,P325754 reverse:1:1,1c2b21(N14) 7(N01) TUR


Still deeper?

In [41]:
query = """
case depth>4
"""
results = A.search(query)
A.table(results, end=10)

  0.00s 0 results


As a check: the cases with depth 4 should be exactly the cases with depth > 3:

In [42]:
query = """
case depth=4
"""
results = A.search(query)
A.table(results, end=10)
tc4 = len(results)

  0.01s 119 results


n,p,case
1,P004735 obverse:2:1,1b1B1(NAB DI |BU~a+DU6~a|)a
2,P004735 obverse:2:1,1b1B2(ZI~a#? AN)a
3,P004735 obverse:2:1,1b1B3(ANSZE~e 7(N57) DUR2 DU)a
4,P004735 obverse:2:1,1b1B4(LAL3~a#? GAR IG~b)a
5,P004735 obverse:2:2,2b2B1(GI6 KISZIK~a# URI3~a)a
6,P004735 obverse:2:2,2b2B2([...])a
7,P218054 reverse:1:1,1a1A1[...] 5(N01) [...] UDU~a
8,P218054 reverse:1:1,1a1A2[...] 7(N01) MASZ2
9,P325754 reverse:1:1,1c2b11(N01) [...]
10,P325754 reverse:1:1,1c2b21(N14) 7(N01) TUR


Terminal cases at depth 1 are top-level divisions of lines that are not themselves divided further.

In [43]:
query = """
case depth=1 terminal
"""
results = A.search(query)
A.table(results, end=10)
tc1 = len(results)

  0.01s 5468 results


n,p,case
1,P448702 obverse:1:2,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a
2,P448702 obverse:1:2,2b'3(N01) KASZ~a GI
3,P471695 obverse:1:1,1a3(N01) APIN~a 3(N57) UR4~a
4,P471695 obverse:1:2,2a1(N14) 2(N01) [...]
5,P471695 obverse:2:1,1a1(N01) ISZ~a
6,P482083 obverse:1:1,1a'[...] 1(N14) [...] SZE~a
7,P482083 obverse:1:1,1b'[N] TAR~a
8,P482083 obverse:1:2,2a'3(N01) SZE~a KASZ~b |U4x3(N01)|
9,P482083 obverse:1:2,2b' 1(N42~a) 1(N25) TAR~a
10,P006438 obverse:1:2,2aKU6~a BU~a


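Now let us select both the terminal cases of level 1 and 4.
The two sets are disjoint, so the counts should add up.
(We counted `tc4` without the `terminal` condition, but since there are no cases deeper than 4,
every case at depth 4 is terminal.)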

In [44]:
query = """
case depth=1|4 terminal
"""
results = A.search(query)
A.table(results, end=10)
tc14 = len(results)
print(f"{tc1} + {tc4} = {tc1 + tc4} = {tc14}")

  0.01s 5587 results


n,p,case
1,P448702 obverse:1:2,2a'[N] 2(N14) 3(N01) KASZ~b NUN~a
2,P448702 obverse:1:2,2b'3(N01) KASZ~a GI
3,P471695 obverse:1:1,1a3(N01) APIN~a 3(N57) UR4~a
4,P471695 obverse:1:2,2a1(N14) 2(N01) [...]
5,P471695 obverse:2:1,1a1(N01) ISZ~a
6,P482083 obverse:1:1,1a'[...] 1(N14) [...] SZE~a
7,P482083 obverse:1:1,1b'[N] TAR~a
8,P482083 obverse:1:2,2a'3(N01) SZE~a KASZ~b |U4x3(N01)|
9,P482083 obverse:1:2,2b' 1(N42~a) 1(N25) TAR~a
10,P006438 obverse:1:2,2aKU6~a BU~a


5468 + 119 = 5587 = 5587


## Relational patterns: quads

Quads are compositions of signs by means of *operators*, such as `.` and `x`.
The operators are coded as an *edge* feature with values. The `op`-edges are between the signs/quads that are combined,
and the values of the `op` edges are the names of the operators in question.

Which operators do we have?

In [45]:
for (op, freq) in E.op.freqList():
    print(f"{op} : {freq:>5}x")

x :  2346x
. :  1042x
& :   222x
+ :   200x


Between how many sign pairs do we have an operator?

In [46]:
query = """
sign
-op> sign
"""
results = A.search(query)

  0.06s 3642 results


Let's specifically ask for the `x` operator:

In [47]:
query = """
sign
-op=x> sign
"""
results = A.search(query)

  0.06s 2238 results


Fewer than expected?

We must not forget the combinations between quads and between quads and signs.

We write a function that gives all pairs of sign/quads connected by a specific operator.

This is a fine illustration of how you can use programming to compose search templates,
instead of writing them out yourself.

In [48]:
def getCombi(op):
    types = ("sign", "quad")
    allResults = []
    for type1 in types:
        for type2 in types:
            query = f"""
{type1}
-op{op}> {type2}
"""
            results = A.search(query, silent=True)
            print(f"{len(results):>5} {type1} {op} {type2}")
            allResults += results
    print(f"{len(allResults):>5} {op}")
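
If you want to inspect the generated templates before running them, the string building can be separated from the searching. The helper `opTemplates` below is a hypothetical variation on `getCombi`; it does not call Text-Fabric at all and only returns the template texts.

```python
def opTemplates(op, types=("sign", "quad")):
    """Build one search template per ordered pair of node types.

    `op` is the condition on the `op` edge, e.g. "=x" or "~[^a-z]".
    """
    return {
        (type1, type2): f"\n{type1}\n-op{op}> {type2}\n"
        for type1 in types
        for type2 in types
    }


templates = opTemplates("=x")
print(len(templates))  # 4
print(templates[("quad", "sign")])
```

Each value can be passed to `A.search()` in the same way as the templates that `getCombi` builds inline.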

Now we can count all combinations with `x`:

In [49]:
getCombi("=x")

 2238 sign =x sign
  105 sign =x quad
    3 quad =x sign
    0 quad =x quad
 2346 =x


In [50]:
getCombi("=.")

  985 sign =. sign
   43 sign =. quad
   14 quad =. sign
    0 quad =. quad
 1042 =.


In [51]:
getCombi("=&")

  220 sign =& sign
    1 sign =& quad
    0 quad =& sign
    1 quad =& quad
  222 =&


In [52]:
getCombi("=+")

  199 sign =+ sign
    0 sign =+ quad
    0 quad =+ sign
    1 quad =+ quad
  200 =+


In exact agreement with the results of `E.op.freqList()` above.
But we are more flexible!

We can ask for more operators at the same time.

In [53]:
getCombi("=x|+")

 2437 sign =x|+ sign
  105 sign =x|+ quad
    3 quad =x|+ sign
    1 quad =x|+ quad
 2546 =x|+


In [54]:
getCombi("~[^a-z]")

 1404 sign ~[^a-z] sign
   44 sign ~[^a-z] quad
   14 quad ~[^a-z] sign
    2 quad ~[^a-z] quad
 1464 ~[^a-z]
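
Both totals can be predicted from the operator frequencies listed earlier. The sketch below uses those numbers and Python's `re` module; it assumes that `~` in a template selects the values in which the regular expression matches somewhere.

```python
import re

# Operator frequencies as reported by E.op.freqList() above.
opFreq = {"x": 2346, ".": 1042, "&": 222, "+": 200}

# `-op=x|+>`: the operator is one of the listed values.
totalAlternatives = sum(freq for op, freq in opFreq.items() if op in {"x", "+"})

# `-op~[^a-z]>`: the operator contains a character that is not a lowercase letter.
pattern = re.compile(r"[^a-z]")
matching = [op for op in opFreq if pattern.search(op)]
totalRegex = sum(opFreq[op] for op in matching)

print(totalAlternatives)     # 2546
print(matching, totalRegex)  # ['.', '&', '+'] 1464
```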


Finally, we zoom in on the rare cases where the operator `x` is used between a quad and a sign.
We want to show the lines where they occur.

In [55]:
query = """
line
  quad
  -op=x> sign
"""
results = A.search(query)
A.show(results, withNodes=True, showGraphics=True, condenseType="line")

  0.04s 3 results


Hint: if you want to see where these lines come from, hover over the line indicator, or click on it.

Alternatively, you can set the condense type to tablet.
And note that we have set the base type to `quad`, so that the pretty display does not unravel the quads.

In [56]:
A.show(
    results, withNodes=True, showGraphics=True, condenseType="tablet", baseTypes="quad"
)

## Regular expressions in templates
We can use regular expressions in our search templates.

### Digits in graphemes
We search for non-numeral signs whose graphemes contain digits.

In [57]:
A.displaySetup(condensed=True)

In [58]:
query = """
sign type=ideograph grapheme~[0-9]
"""
results = A.search(query)
A.table(results, withNodes=True, end=5)

  0.09s 14558 results


n,p,tablet,sign,sign.1,sign.2,sign.3,sign.4
1,P448702 obverse:2:1,143892 P448702,75 U4,76 U4,,,
2,P448703 obverse:1:4,143893 P448703,97 U4,100 U4,87 U4,90 U4,93 U4
3,P471695 obverse:1:1,143894 P471695,114 ZATU759,140 GIR3~c,111 UR4~a,,
4,P482082 obverse:1:2,143895 P482082,155 ZATU694~c,,,,
5,P482083 obverse:1:2,143896 P482083,169 U4,,,,


We can add a bit more context easily:

In [59]:
query = """
tablet
  face
    column
      line
        sign type=ideograph grapheme~[0-9]
"""
results = A.search(query)
A.table(results, condensed=False, end=10)

  0.14s 14558 results


n,p,tablet,face,column,line,sign
1,P448702 obverse:2:1,P448702,obverse,P448702 obverse:2,U4 |U4x1(N01)| SAG SUKUD@h NA,U4
2,P448702 obverse:2:1,P448702,obverse,P448702 obverse:2,U4 |U4x1(N01)| SAG SUKUD@h NA,U4
3,P448703 obverse:1:1,P448703,obverse,P448703 obverse:1,|U4.1(N08)| X,U4
4,P448703 obverse:1:2,P448703,obverse,P448703 obverse:1,|U4.1(N08)| GI,U4
5,P448703 obverse:1:3,P448703,obverse,P448703 obverse:1,|U4.1(N08)| |GI&GI|,U4
6,P448703 obverse:1:4,P448703,obverse,P448703 obverse:1,|U4.1(N08)| X,U4
7,P448703 obverse:1:5,P448703,obverse,P448703 obverse:1,|U4.1(N08)| X,U4
8,P471695 obverse:1:1,P471695,obverse,P471695 obverse:1,1a3(N01) APIN~a 3(N57) UR4~a 1b1b1(EN~a DU ZATU759)a 1b2(BAN~b KASZ~c)a 1b3(KI@n SAG)a,UR4~a
9,P471695 obverse:1:1,P471695,obverse,P471695 obverse:1,1a3(N01) APIN~a 3(N57) UR4~a 1b1b1(EN~a DU ZATU759)a 1b2(BAN~b KASZ~c)a 1b3(KI@n SAG)a,ZATU759
10,P471695 obverse:2:1,P471695,obverse,P471695 obverse:2,1a1(N01) ISZ~a 1b1b1 (PAP~a GIR3~c)a,GIR3~c


### Pit numbers

The feature `excavation` gives you the number of the pit where a tablet was found.
The syntax of pit numbers is a bit involved; here are a few possible values:

```
W 20497
W 20335,3
W 19948,10
W 20493,26
W 17890,b
W 17729,o
W 15920,b5
W 17729,aq
W 19548,a + W 19548,b
W 17729,cn + W 17729,eq
W 14337,a + W 14337,b + W 14337,c + W 14337,d + W 14337,e
Ashm 1928-445b
```

Let's assume we are interested in `SZITA~a1` signs occurring in cases of depth 1.
The following query finds them all:

In [60]:
query = """
tablet
  case depth=1
    sign grapheme=SZITA variant=a1
"""
results = A.search(query)

  0.05s 78 results


Now we want to organize them by excavation number:

In [61]:
signPerPit = {}

for (tablet, case, sign) in sorted(results):
    pit = F.excavation.v(tablet) or "no pit information"
    signPerPit.setdefault(pit, []).append(sign)

for pit in sorted(signPerPit):
    print(f"{pit:<30} {len(signPerPit[pit]):>2}")

Ashm 1926,562                   1
Ashm 1926,567                   1
Ashm 1926,569                  13
Ashm 1926,695+737+741           6
Ashm 1926,716+732               1
Ashm 1926,739                   1
W 14731,z                       1
W 14777,c                       4
W 15776,i                       1
W 15785,a2                      1
W 15833,a01 + W 15833,aa04      1
W 15897,b5                      1
W 15897,c26                     1
W 20274,001                     1
W 20274,043                     1
W 20274,095                     2
W 20274,119                     1
W 20327,01                      1
W 20327,03                      1
W 20511,01                      1
W 20511,02                      6
W 21157                         1
W 21194                         1
W 21733,1                       3
W 22100,01                      4
W 22100,03                      5
W 22101,1                       1
W 23950                         1
W 23973,01                      1
W 24033,05    

We can restrict results to those on tablets found in certain pits by constraining the search template.
If we are interested in pit `20274` we can use a regular expression that matches all 4 detailed pit numbers
based on `20274`.
So, we do not say

```
excavation=20274
```
but

```
excavation~20274
```
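
The difference between the two conditions can be tried out on a handful of pit numbers taken from the list above. The sketch assumes that `=` requires the whole value to be equal and that `~` behaves like Python's `re.search`, which matches anywhere in the value.

```python
import re

# A few excavation values from the overview above.
pits = [
    "W 20274,001",
    "W 20274,043",
    "W 20274,095",
    "W 20274,119",
    "W 20327,01",
    "Ashm 1926,562",
]

# excavation=20274 : the value must be exactly "20274".
exact = [pit for pit in pits if pit == "20274"]

# excavation~20274 : the value must contain "20274" somewhere.
partial = [pit for pit in pits if re.search("20274", pit)]

# excavation~W : the value must contain a capital W.
withW = [pit for pit in pits if re.search("W", pit)]

print(exact)       # []
print(partial)     # the four W 20274,... values
print(len(withW))  # 5
```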

In [62]:
query = """
tablet excavation~20274
  case depth=1
    sign grapheme=SZITA variant=a1
"""
results = A.search(query)
A.table(results, condensed=False, showGraphics=False)

  0.05s 5 results


n,p,tablet,case,sign
1,P003617 obverse:2:2,P003617,2bSZITA~a1 BU~a,SZITA~a1
2,P003499 obverse:1:2,P003499,2aGAL~a SZITA~a1,SZITA~a1
3,P003541 obverse:2:1,P003541,1bGESZTU~b SZITA~a1 ZATU686~a,SZITA~a1
4,P003593 obverse:5:2,P003593,2a[...] GADA~a SZITA~a1 X,SZITA~a1
5,P003593 obverse:5:3,P003593,3bGESZTU~b SZITA~a1 ZATU686~a,SZITA~a1


Or if we want to restrict ourselves to pit numbers with a `W`, we can say:

In [63]:
query = """
tablet excavation~W
  case depth=1
    sign grapheme=SZITA variant=a1
"""
results = A.search(query)

  0.06s 42 results


## Quantifiers in templates

So far we have seen only very positive templates.
They express what you want to see in the result.

It is also possible to state conditions about what you do not want to see in the results.

### Tablets without case divisions

Let's find all tablets in which all lines are undivided, i.e. lines without cases.

In [64]:
query = """
tablet
/without/
  case
/-/
"""

The expression

```
/without/
template
/-/
```

is a [quantifier](https://annotation.github.io/text-fabric/tf/about/searchusage.html#quantifiers).

It poses a condition on the preceding line in the template, in this case the `tablet`.
And the condition is that the template

```
tablet
  case
```

does not have results.

In [65]:
results = A.search(query)

  0.01s 5384 results


In [66]:
A.show(results, end=2)

Now let's find cases without numerals.

In [67]:
query = """
case
/without/
  sign type=numeral
/-/
"""
results = A.search(query)

  0.06s 2833 results


We show a few.

In [68]:
A.show(results, end=2)

Now we can use this to get something more sophisticated: the tablets that do not have numerals in their cases. So only undivided lines may contain numerals.

Let's find tablets that do have cases, but just no cases with numerals.

In [69]:
query = """
tablet
/where/
  case
/have/
  /without/
    sign type=numeral
  /-/
/-/
/with/
  case
/-/
"""

In [70]:
results = A.search(query)

  0.01s 53 results
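
The logic of the quantifiers in this query can be mimicked in plain Python. The tablets below are hypothetical, and the sketch assumes that `/where/ ... /have/ ...` means "all such nodes satisfy the condition", which is trivially true when there are no cases at all. That is why the query also needs `/with/ case`.

```python
# Hypothetical tablets: each maps to its cases, each case is a list of sign types.
tablets = {
    "A": [],  # no cases at all
    "B": [["ideograph"], ["ideograph", "ideograph"]],
    "C": [["numeral", "ideograph"], ["ideograph"]],
}


def noNumeralsInCases(cases):
    # /where/ case /have/ /without/ sign type=numeral: holds for every case
    return all("numeral" not in case for case in cases)


def hasCases(cases):
    # /with/ case: at least one case exists
    return len(cases) > 0


onlyWhere = [t for t, cases in tablets.items() if noNumeralsInCases(cases)]
both = [
    t
    for t, cases in tablets.items()
    if noNumeralsInCases(cases) and hasCases(cases)
]

print(onlyWhere)  # ['A', 'B']
print(both)       # ['B']
```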


In [71]:
A.show(results, end=2)

Can we find such tablets that do have numerals on their undivided lines?

We show here a way to use the results of one query in another one:
*custom sets*.

We put the set of tablets with cases but without numerals in cases in a set called `cntablet`.

We run the query again, but now in shallow mode, so that the result is a set.

By the way: read more about custom sets and shallow mode in the description of
[`A.search()`](https://annotation.github.io/text-fabric/tf/search/search.html#tf.search.search.Search.search).
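
As a rough picture of what shallow mode delivers, assume that it keeps only the first member of every result and collects those in a set. The node numbers below are hypothetical.

```python
# Hypothetical full results of a template with members (tablet, sign).
results = [(1001, 11), (1001, 12), (1002, 25)]

# Shallow mode, as assumed here: only the first member of each result, as a set.
shallow = {result[0] for result in results}

# A custom set is just a name mapped to a set of nodes.
customSets = dict(cntablet=shallow)

print(sorted(customSets["cntablet"]))  # [1001, 1002]
```

The name `cntablet` can then be used in a template wherever a node type such as `tablet` could be used.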

In [72]:
results = A.search(query, shallow=True)
customSets = dict(cntablet=results)

  0.01s 53 results


Now we can perform a very simple query for numerals on this set: we want tablets with numerals.
By restricting ourselves to this set, we know that these numerals must occur on undivided lines.

In [73]:
query = """
cntablet
  sign type=numeral
"""
results = A.search(query, sets=customSets)

  0.06s 160 results


In [74]:
A.show(results, end=2, queryFeatures=False)

We could have found these results by one query as well.
Judge for yourself which method causes the least friction.

In [75]:
query = """
tablet
/without/
  case
    sign type=numeral
/-/
/with/
  case
/-/
  sign type=numeral
"""
results = A.search(query)
A.show(results, end=2, queryFeatures=False)

  0.06s 160 results


## More ...

The capabilities of search are endless.
Often it is the quickest way to focus on a phenomenon, quicker than hand coding all the logic
to retrieve your patterns.

That said, it is not a matter of either-or. You can use coding to craft your templates,
and you can use coding to process your results.

It's an explosive mix. A later chapter in this tutorial shows
even more [cases](cases.ipynb).

Have another look at
[the manual](https://annotation.github.io/text-fabric/tf/about/searchusage.html).

# Next

[calc](calc.ipynb)

*A tablet calculator ...*

All chapters:
[start](start.ipynb)
[imagery](imagery.ipynb)
[steps](steps.ipynb)
**search**
[calc](calc.ipynb)
[signs](signs.ipynb)
[quads](quads.ipynb)
[jumps](jumps.ipynb)
[cases](cases.ipynb)

---

CC-BY Dirk Roorda