Please read the entire README before starting your assignment, or asking
for help.

// ~ Overview ~ //

Recursion is an elegant method for solving many problems, and many data
structures are actually easier to manipulate with recursive functions.
In this assignment, we will see how a binary tree (actually any tree
data structure) can utilize the power of recursion.

// ~ Learning Goals ~ //

(1) Recursion
(2) Binary (search) trees
(3) Large datasets
(4) Memory management

You must submit one zip file to blackboard. This zip file must
contain:
(1) answer09.c
(2) git.log

Please create your zip file using the following command:

 > zip pa09.zip answer09.c git.log

If your zip file does not meet the above specification, then you may
get zero for this assignment. YOU HAVE BEEN WARNED.  Following the
instructions is *part* of getting the assignment correct. So please
follow the instructions.

// ~ Determining Your Mark ~ //

The tester program is there to ensure that you have followed the
instructions correctly.

Run the tester program as follows:

 > ./tester

// ~ Binary (Search) Trees ~ //

From Wikipedia: A binary tree is a tree data structure in which each
node has at most two children (referred to as the left child and the
right child).

This is similar (and more general) to "linked lists", where nodes have
at most one child.

Linked List:
    O   head/tail   O   head        O   head        O   head
                    |               |               |
                    V               V               V
                    O   tail        O               O
                                    |               |
                                    V               V
                                    O   tail        O
                                                    |
                                                    V
                                                    O   tail

Binary Tree:
    O   root        O   root       O   root            O   root
                   / \           /   \              /     \
                  O   O         O     O          O           O
                left right     / \   / \       /   \       /   \
                              O   O O   O     O     O     O     O
                                             / \   / \   / \   / \
                                            O   O O   O O   O O   O

The above diagrams show *full* binary trees, where every node has
either zero or two children. It is perfectly acceptable to have a node
with one child somewhere in the tree:

More Binary Trees:
    O   root        O   root       O   root       O   root
                   / \            / \            / \
                  O   O          O   O          O   O
                       \        /                \   \
                        O      O                  O   O

Note that every linked list is (conceptually) also a binary tree.

What makes these structures work so well recursively is their
naturally recurring instances. If a recursive program can handle one
node, then it can handle the entire tree.

This assignment combines the power of binary search trees with a set of data
that is as close to the real world as possible: Data about local businesses.
Users of the website yelp (http://www.yelp.com) can post reviews and
recommendation about their local restaurants and share them with the world.
Yelp hosts about 57 million reviews for businesses catering to 132 million
users a month, who constantly query their servers for information about their
local businesses.

One fundamental problem for companies with huge datasets like Yelp is
retrieving information from a database. The user experience of the Yelp website
would be unacceptable if, hypothetically, all of the data describing the
businesses would be stored in a simple array or linked list: It would mean that
for every search, the server would have to compare every single entry in the
database to the search term.

Of course, this is not how it is done in the real world, and this is where tree
data structures like the binary search tree described above come in.

The dataset you'll be working with is a small fraction of data pulled from
Yelp's servers. The information available to you is:

 - the name of the business
 - the address of the business
 - an average rating of the business on a scale of 1 through 4

You will construct a binary tree from this dataset that is ordered according
to the names of the businesses. Recall that there exists a certain C library
that provides functions to compare two strings to each other. This will create
a tree data structure with the following properties:

(1) Each node's left subtree contains nodes with names "less than" "equal to"
    its own name

(2) Each node's right subtree contains nodes with names "greater than" its
    own name

(3) All left and right subtrees are binary search trees themselves.

The relevant theory for why a binary tree is more efficient than a regular
list is covered in future courses,
but for a teaser, think about this: if you have a collection of N
elements in a linked list, and you want to see if element X is in the
list, then you need to compare X to every element in the list, a total
of N comparisons. However, if you use a binary search tree, you only
need to traverse down a single branch, and on average will make around
log_2(N) comparisons. (i.e., log-base-2 of N.) What it means to
"traverse down a single branch" will become clear as you write the
code for this assignment.

// ~ The Yelp Dataset ~ //

The dataset you'll be working with is a small fraction of the Yelp Academic
dataset, which is itself a fraction of Yelp's database that was made public for
use in academia. More information and a download for the full dataset can be
found here: http://www.yelp.com/dataset_challenge

The data was converted from the JSON format to a text file filled with
tab-seperated values with the extension '.tsv'. There is a lot more information
in the full dataset, but for this assignment, you will only use the average
rating, the name, and the address. The file you're provided with, called
'yelp_businesses.tsv' has information about 42153 businesses in the United
States and should be read line by line.

One line in the .tsv file lists one business and is structured like this:

  [rating]\t[name]\t[address]\n

In order to fill the fields of one node, you need to seperate a line according
to the delimiter '\t'. Remember your explode() function from PA03? It could
come in very handy for this assignment. If you want to use your explode 
function from PA03, then you will need to copy and paste it somewhere into
your answer09.c file. Of course, you don't HAVE to use explode(...) here, as
long as you create the nodes properly and without memory leaks.

// ~ Hints ~ //

(*) Make sure you read the class notes on binary trees.

----------------------------------------------------------------------

(*) Even though a tree is not an array, it is still easy to "iterate"
over all of the elements. Iteration means you want to visit every
element once, and only once. You already know how to do this with an
array:

int myints[] = { 5, 3, 6, 7 };
for(ind = 0; ind < 4; ++ind)
   do_something_with(myints[ind]); // see, visit each element once and only once

With trees, you choose either pre-order, in-order, or post-order
traversal to do the same thing. You *must* know this for the exam.
Please look them up in the class notes.

----------------------------------------------------------------------

(*) If you get stuck getting started, then try writing just
create_node(...), and a print_tree(...) function. You should
then be able to do the following:

// Step 1 //
Create and print trees like so:

int main(int argc, char * * argv)
{
   BusinessNode * root = create_node("5.0", "random name", "random address");
   root->left = create_node("3.5", "another name", "another address");
   root->right = create_node("4.0", "yet another name", "some address");
   root->left->right = create_node("1.5", "name 3", "address 3");
   print_tree(root);
   return 0;
}

// Step 2 //
Write destroy_tree(...). You should now have no memory leaks or
errors.

// Step 3 //
Write tree_insert(...) and make sure it
always works no matter what is thrown at it.

// Step 4 //
Write load_tree_from_file(...), which calls insert in a loop.

// Step 5 //
Write tree_search_name() and try to search for some Businesses that you know
are in the tree.

At this stage you will be reasonably close to completing the assignment.

----------------------------------------------------------------------

(*) To test the load_tree_from_file(...) function, it makes sense to work with
a smaller subset of the data. To read the first 5 lines of
'yelp_businesses.tsv' into a new file called 'shortfile.tsv', use the following
command:

 > cat yelp_businesses.tsv | head -n 5 > shortfile.tsv