<img align="right" src="images/tf-small.png" width="128"/>
<img align="right" src="images/etcbc.png"/>
<img align="right" src="images/dans-small.png"/>

# Tutorial

This notebook gets you started with using
[Text-Fabric](https://annotation.github.io/text-fabric/) for coding in the Hebrew Bible.

If you are totally new to Text-Fabric, it might be helpful to read about the underlying
[data model](https://annotation.github.io/text-fabric/Model/Data-Model/) first.

Short introductions to other TF datasets:

* [Dead Sea Scrolls](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/dss.ipynb)
* [Old Babylonian Letters](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/oldbabylonian.ipynb)
* [Quran](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/quran.ipynb)


# Export to Excel

In a notebook, you can perform searches, view the results in a tabular display, and zoom in on items with
pretty displays.

But there are times when you want to take your results outside Text-Fabric, outside a notebook, outside Python, and just
work with them in other programs, such as Excel.

You want to do that not only with query results, but with all kinds of lists of tuples of nodes.

There is a function for that, `A.export()`, and here we show what it can do.

In [1]:
%load_ext autoreload
%autoreload 2

# Incantation

The ins and outs of installing Text-Fabric, getting the corpus, and initializing a notebook are
explained in the [start tutorial](start.ipynb).

In [2]:
import os
from tf.app import use

In [5]:
#A = use('bhsa', hoist=globals())
A = use('bhsa:clone', checkout="clone", hoist=globals())

Using TF-app in /Users/dirk/github/annotation/app-bhsa/code:
	repo clone offline under ~/github (local github)
Using data in /Users/dirk/github/etcbc/bhsa/tf/c:
	repo clone offline under ~/github (local github)
Using data in /Users/dirk/github/etcbc/phono/tf/c:
	repo clone offline under ~/github (local github)
Using data in /Users/dirk/github/etcbc/parallels/tf/c:
	repo clone offline under ~/github (local github)
   |     0.00s Dataset without structure sections in otext:no structure functions in the T-API


# Inspect the contents of a file
We write a function that can peek into a file on your system and show its first few lines.
We'll use it to inspect the exported files that we are going to produce.

In [6]:
EXPORT_FILE = os.path.expanduser('~/Downloads/results.tsv')
UPTO = 10

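def checkout():
    """Print the first UPTO lines of the exported file."""
    with open(EXPORT_FILE, encoding='utf_16') as fh:
        for (i, line) in enumerate(fh):
            if i >= UPTO:
                break
            # each line keeps its newline, so the output is double-spaced
            print(line)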

# Encoding

Our exported `.tsv` files open in Excel without hassle, even if they contain non-Latin characters.
That is because TF writes such files in an
encoding that works well with Excel: `utf_16_le`.
You can just open them in Excel; there is no need for conversion before or after opening these files.

Should you want to process these files by means of a (Python) program, 
take care to read them with encoding `utf_16`.
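
As a minimal sketch of such processing, the snippet below reads a tab-separated file with encoding `utf_16` using Python's standard `csv` module. The sample file, its rows, and the helper `read_tsv` are made up for illustration; they are not produced by TF.

```python
import csv
import os
import tempfile

# A tiny stand-in for an exported file (made-up rows, not real TF output).
rows = [
    ["R", "NODE1", "TYPE1", "TEXT1"],
    ["1", "141547", "word", "אֶפְרָ֑יִם"],
]
path = os.path.join(tempfile.mkdtemp(), "sample.tsv")
with open(path, "w", encoding="utf_16", newline="") as fh:
    csv.writer(fh, delimiter="\t").writerows(rows)


def read_tsv(path):
    """Hypothetical helper: read a UTF-16 TSV file into a list of dicts."""
    with open(path, encoding="utf_16", newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


records = read_tsv(path)
print(records[0]["NODE1"], records[0]["TEXT1"])
```

Each record is a dictionary keyed by column name, so the values of a column can be looked up by the names in the header row.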

# Example query

We first run a query in order to export the results.

In [7]:
query = '''
book book=Samuel_I
  clause
    word sp=nmpr
'''
results = A.search(query)

  0.55s 1868 results


# Bare export

You can export the table of results to Excel.

The following command writes a tab-separated file `results.tsv` to your downloads directory.

You can specify arguments `toDir=directory` and `toFile=file name` to write to a different file.
If the directory does not exist, it will be created.

We stick to the default, however.

In [9]:
A.export(results)

Check out the contents:

In [10]:
checkout()

R	S1	S2	S3	NODE1	TYPE1	book1	NODE2	TYPE2	TEXT2	NODE3	TYPE3	TEXT3	sp3

1	1_Samuel	1	1	426592	book	Samuel_I	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 	141547	word	אֶפְרָ֑יִם 	nmpr

2	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141550	word	אֶ֠לְקָנָה 	nmpr

3	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141552	word	יְרֹחָ֧ם 	nmpr

4	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141554	word	אֱלִיה֛וּא 	nmpr

5	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141556	word	תֹּ֥חוּ 	nmpr

6	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ

You see the following columns:

* **R** the sequence number of the result tuple in the result list
* **S1 S2 S3** the section as book, chapter, verse, in separate columns
* **NODEi TYPEi** the node and its type, for each node **i** in the result tuple
* **TEXTi** the full text of node **i**, if the node type admits a concise text representation
* **sp3** the value of feature `sp` for node **3**, since our query mentions the feature `sp` on node 3
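
The numeric suffix on a column name ties it to a position in the result tuple. The following sketch groups the columns of the header above by that position. The helper `group_columns` is hypothetical, and it assumes that feature names do not themselves end in a digit.

```python
import re
from collections import defaultdict

# Header line copied from the export shown above.
header = (
    "R\tS1\tS2\tS3\tNODE1\tTYPE1\tbook1\t"
    "NODE2\tTYPE2\tTEXT2\tNODE3\tTYPE3\tTEXT3\tsp3"
)


def group_columns(header_line):
    """Hypothetical helper: map each tuple position to its column names.

    The result number (R) and the section columns (S1, S2, S3) are skipped.
    """
    groups = defaultdict(list)
    for name in header_line.split("\t"):
        if name == "R" or re.fullmatch(r"S\d+", name):
            continue
        match = re.fullmatch(r"(.+?)(\d+)", name)
        if match is None:
            continue
        groups[int(match.group(2))].append(match.group(1))
    return dict(groups)


print(group_columns(header))
# {1: ['NODE', 'TYPE', 'book'], 2: ['NODE', 'TYPE', 'TEXT'], 3: ['NODE', 'TYPE', 'TEXT', 'sp']}
```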

# Richer exports

If we want to see the clause type (feature `typ`) and the word gender (feature `gn`) as well, we must mention them
in the query. 

We can do so as follows:

In [11]:
query = '''
book book=Samuel_I
  clause typ*
    word sp=nmpr gn*
'''
results = A.search(query)

  1.03s 1868 results


The same number of results as before. 
The `*` is a trivial condition: it is always true, so it only serves to mention the feature.

We do the export again and peek at the results.

In [12]:
A.export(results)
checkout()

R	S1	S2	S3	NODE1	TYPE1	book1	NODE2	TYPE2	TEXT2	typ2	NODE3	TYPE3	TEXT3	gn3	sp3

1	1_Samuel	1	1	426592	book	Samuel_I	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 	WayX	141547	word	אֶפְרָ֑יִם 	unknown	nmpr

2	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	141550	word	אֶ֠לְקָנָה 	m	nmpr

3	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	141552	word	יְרֹחָ֧ם 	m	nmpr

4	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	141554	word	אֱלִיה֛וּא 	m	nmpr

5	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	141556	word	תֹּ֥חוּ 	m	nmpr

6	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָ

As you see, you have two extra columns: **typ2** and **gn3**.

This gives you a lot of control over the generation of spreadsheets.

# Not from queries

You can also export lists of node tuples that are not obtained by a query:

In [13]:
tuples = (
    tuple(results[0][1:3]),
    tuple(results[1][1:3]),
)

tuples

((453942, 141547), (453943, 141550))

Two rows, each with a clause node and a word node.

Let's do a bare export:

In [14]:
A.export(tuples)
checkout()

R	S1	S2	S3	NODE1	TYPE1	TEXT1	book1	NODE2	TYPE2	TEXT2	typ2

1	1_Samuel	1	1	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 		141547	word	אֶפְרָ֑יִם 	

2	1_Samuel	1	1	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 		141550	word	אֶ֠לְקָנָה 	



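Wait a minute: why are `book1` and `typ2` there?

It is because the query we ran before mentioned `book` on its first node and `typ` on its second.
Those settings linger, so they show up here as (empty) columns.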

If we do not want to be influenced by previous things we've run, we need to reset the display:

In [15]:
A.displayReset('tupleFeatures')

Again:

In [16]:
A.export(tuples)
checkout()

R	S1	S2	S3	NODE1	TYPE1	TEXT1	NODE2	TYPE2	TEXT2

1	1_Samuel	1	1	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 	141547	word	אֶפְרָ֑יִם 

2	1_Samuel	1	1	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141550	word	אֶ֠לְקָנָה 



# Display setup

We can get richer exports by means of
`A.displaySetup()`, using the parameter `tupleFeatures`:

In [17]:
A.displaySetup(tupleFeatures=(
    (0, 'typ rela'),
    (1, 'sp gn nu pdp'),
))

We assign extra features per member of the tuple.

In the above case:

* the first (`0`) member (the clause node) gets features `typ` and `rela`;
* the second (`1`) member (the word node) gets features `sp`, `gn`, `nu` and `pdp`.
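
Note that positions in `tupleFeatures` count from 0, while the column suffixes in the exports shown in this notebook count from 1. The sketch below predicts the feature column names for a given specification. The helper `feature_columns` is hypothetical and is inferred from the exported headers, not taken from the TF API.

```python
def feature_columns(tuple_features):
    """Hypothetical helper: predict feature column names for a tupleFeatures spec.

    Positions in the spec are 0-based; export column suffixes are 1-based.
    """
    columns = []
    for position, features in tuple_features:
        for feature in features.split():
            columns.append(f"{feature}{position + 1}")
    return columns


spec = (
    (0, "typ rela"),
    (1, "sp gn nu pdp"),
)
print(feature_columns(spec))
# ['typ1', 'rela1', 'sp2', 'gn2', 'nu2', 'pdp2']
```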

In [18]:
A.export(tuples)
checkout()

R	S1	S2	S3	NODE1	TYPE1	TEXT1	typ1	rela1	NODE2	TYPE2	TEXT2	sp2	gn2	nu2	pdp2

1	1_Samuel	1	1	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 	WayX	NA	141547	word	אֶפְרָ֑יִם 	nmpr	unknown	sg	nmpr

2	1_Samuel	1	1	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	NA	141550	word	אֶ֠לְקָנָה 	nmpr	m	sg	nmpr



Talking about display setup: other parameters also have an effect, e.g. the text format.

Let's change it to the phonetic representation.

In [19]:
A.export(tuples, fmt='text-phono-full')
checkout()

R	S1	S2	S3	NODE1	TYPE1	TEXT1	typ1	rela1	NODE2	TYPE2	TEXT2	sp2	gn2	nu2	pdp2

1	1_Samuel	1	1	453942	clause	wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim 	WayX	NA	141547	word	ʔefrˈāyim 	nmpr	unknown	sg	nmpr

2	1_Samuel	1	1	453943	clause	ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	NmCl	NA	141550	word	ʔelqānˌā 	nmpr	m	sg	nmpr



# Chained queries

You can chain queries like this:

In [20]:
results = (
    A.search('''
book book=Samuel_I
  chapter chapter=1
    verse verse=1
      clause
        word sp=nmpr
''')
    +
    A.search('''
book book=Samuel_I
  chapter chapter=1
    verse verse=1
      clause
        word sp=verb nu=pl
''')
)

  0.56s 6 results
  0.59s 1 result
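
Chaining with `+` only yields a sensible table when every tuple has the same shape, since columns are assigned by position. Here is a minimal sketch of a sanity check, assuming that search results behave like lists of tuples (as the use of `+` suggests). The helper `tuple_lengths` is hypothetical, and the tuple in `second` holds placeholder node numbers, not real ones.

```python
def tuple_lengths(results):
    """Hypothetical helper: the set of tuple lengths occurring in results."""
    return {len(row) for row in results}


# Two rows of (book, chapter, verse, clause, word) nodes, as exported above.
first = [
    (426592, 426856, 1421483, 453942, 141547),
    (426592, 426856, 1421483, 453943, 141550),
]
# Placeholder node numbers, for illustration only.
second = [(426592, 426856, 1421483, 1, 2)]

combined = first + second
print(len(combined), tuple_lengths(combined))
# 3 {5}
```

If the set contains more than one length, the chained queries have different templates, and a single positional `tupleFeatures` setting will not fit all rows.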


In such cases, it is better to set up the features yourself:

In [21]:
A.displaySetup(
    tupleFeatures=(
        (3, 'typ rela'),
        (4, 'sp gn vt vs'),
    ),
    fmt='text-phono-full',
)

Now we can do a fine export:

In [22]:
A.export(results)
checkout()

R	S1	S2	S3	NODE1	TYPE1	NODE2	TYPE2	NODE3	TYPE3	TEXT3	NODE4	TYPE4	TEXT4	typ4	rela4	NODE5	TYPE5	TEXT5	sp5	gn5	vt5	vs5

1	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	453942	clause	wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim 	WayX	NA	141547	word	ʔefrˈāyim 	nmpr	unknown	NA	NA

2	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	453943	clause	ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	NmCl	NA	141550	word	ʔelqānˌā 	nmpr	m	NA	NA

3	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	453943	clause	ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-

# All steps

Now you know how to escape from Text-Fabric.

We hope that this makes your stay in TF more comfortable.
It's not a *Hotel California*.

* **[start](start.ipynb)** your first step in mastering the Bible computationally
* **[display](display.ipynb)** become an expert in creating pretty displays of your text structures
* **[search](search.ipynb)** turbo charge your hand-coding with search templates
* **exportExcel** make tailor-made spreadsheets out of your results
* **[share](share.ipynb)** draw in other people's data and let them use yours
* **[export](export.ipynb)** export your dataset as an Emdros database

CC-BY Dirk Roorda