Please read the entire README before starting your assignment, or asking
for help.

// ~ Overview ~ //

In this assignment, you will use the Yelp review data set to create a
program that prints the reviews for a given business name.  Since the
set of reviews is very large (about 740 MB), you should not load them
all into memory.  Instead, you will use a more complex structure to
track their location on disk and access them as needed.  You must do
the same for the address information of each business location.

Definitions:  In this assignment, "business" means a unique business
name (e.g., "Meatballz"), and "location" means a specific location of a
business (e.g., "Meatballz" at "1935 E Camelback Rd., Phoenix, AZ 85016").

// ~ Learning Goals ~ //

(1) Binary (search) trees
(2) File I/O
(3) Memory management
(3) Dynamic structures

This assignment counts for the File and Dynamic Structures learning
objectives.

You must submit one zip file to blackboard. This zip file must
contain these two files:
(1) answer10.c
(2) git.log

Please create your zip file using the following command:

 > zip pa10.zip answer10.c git.log

If your zip file does not meet the above specification, then you will
get zero for this assignment.

// ~ Task ~ //

The primary objective is to support a function that retrieves all of the
reviews for a given business name, grouped by location.

The functions you need to implement are specified in answer10.h.

Locations should be sorted by state >> city >> address.  Reviews should be
sorted by star rating (descending) >> text of review.  Sorting and matching
must be case-insensitive.

You must use a binary search tree for searching the list of businesses.
For this assignment, you may need a somewhat more complex structure.
Designing that structure is part of the assignment.

There are no restrictions other than those explicitly mentioned in the
README and answer10.h files.


// ~ Data ~ //

The data files for this assignment are located at the following path,
which is accessible from ecegrid.  Please access them using those paths.
Do not copy them into your home directory.  You will not need to change
them.  It is recommended that you work from ecegrid.  You may download
them to your personal computer if you wish, but do not make copies within
the ECE cluster.

  /home/shay/a/ece264p0/share/yelp_data/businesses.tsv
  /home/shay/a/ece264p0/share/yelp_data/reviews.tsv

businesses.tsv contains one row for each location of a business, with each
of the following fields, separated by tabs:

 1) business ID (an integer)
 2) name
 3) address
 4) city
 5) state
 6) zip_code
 7) full address (address, city, state, zip code in one field)

The star rating is not included with the business records.

reviews.tsv contains one row for each review of a location.  They are guaranteed
to be sorted in order of the business ID, but there is no particular order within
reviews for a given business ID.  They contain the following fields, tab-separated.

 1) business ID
 2) star rating given with that review
 3) review rating: "funny"
 4) review rating: "useful"
 5) review rating: "cool"
 6) review text

You will not need all data fields for this assignment.  Each row of the above files
is separated by a single newline ('\n') character.

// ~ How to start ~ //

Create an answer10.c file containing each of the functions specified in answer10.h.
The most difficult part of this assignment will be to create the create_business_bst(..)
function.  There are different ways you could do this.

Your solution must not load all of the reviews into memory at once at any time.  Your
data structure will need to store the *file offset* for the review text, as well as the
address details for the business locations.  When you are reading a file, you can get
the current file offset using the ftell(..) function.  Then, you can return to
a location using the fseek(..) function.  See the man page for those functions for
the full details.

// ~ Tester ~ //

You should get used to creating test cases of your own, and using tools like Valgrind
directly to verify that there are no errors.  However, to give you some confirmation
of whether your submission meets the requirements, a test interface (search.c)
has been supplied along with test output data obtained from the solution.

This is somewhat different from some of the other assignments.  Instead of
running a tester and getting an answer, you will compile your code with
search.c, run it on several test queries, and compare the results with the test
data that has been provided.

Here's how to get started:

1.  Look at the test interface (search.c) to make sure you understand how it
    works.  *** Read the usage examples at the bottom of search.c. ***
    
	vim search.c

1.  Compile your code with search.c.
    
	gcc -o search search.c answer10.c -Wall -Wshadow

2.  Try a test query to make sure it works.

	./search Meatballz

3.  Look at the test output data.  Notice that the query (name, state, and zip_code)
    are listed at the top of each file.

	cd test
	vim 00.Boston_Cleaners.NV.89135

4.  To compare the output of your code with the test data, look at the top of the
    test data file, get the parameters (name, state, and zip_code), and use the
	diff command to compare the output.  The diff command tells you which lines are
	different between two files.  You can also compare a file with some data on
	stdout, which turns out to be convenient for this assignment.  The following
	command compares the output for the first test case.

	./search "Boston Cleaners" NV 89135 | diff - test/00.Boston_Cleaners.NV.89135

5.  If you see no output, that means there were no differences.  You also need to
	test your code using Valgrind.  It is your responsibility to find memory errors.

	valgrind --tool=memcheck ./search "Boston Cleaners" NV 89135