Please read the entire README before starting your assignment or asking for help.

// ~ Overview ~ //

In this assignment, you will use the Yelp review data set to create a program that prints the reviews for a given business name. Since the set of reviews is very large (about 740 MB), you should not load them all into memory. Instead, you will use a more complex structure to track their location on disk and access them as needed. You must do the same for the address information of each business location.

Definitions: In this assignment, "business" means a unique business name (e.g., "Meatballz"), and "location" means a specific location of a business (e.g., "Meatballz" at "1935 E Camelback Rd., Phoenix, AZ 85016").

// ~ Learning Goals ~ //

(1) Binary (search) trees
(2) File I/O
(3) Memory management
(4) Dynamic structures

This assignment counts for the File and Dynamic Structures learning objectives.

You must submit one zip file to Blackboard. This zip file must contain these two files:

(1) answer10.c
(2) git.log

Please create your zip file using the following command:

 > zip pa10.zip answer10.c git.log

If your zip file does not meet the above specification, then you will get zero for this assignment.

// ~ Task ~ //

The primary objective is to support a function that retrieves all of the reviews for a given business name, grouped by location. The functions you need to implement are specified in answer10.h.

Locations should be sorted by state >> city >> address.
Reviews should be sorted by star rating (descending) >> text of review.
Sorting and matching must be case-insensitive.

You must use a binary search tree for searching the list of businesses. For this assignment, you may need a somewhat more complex structure. Designing that structure is part of the assignment. There are no restrictions other than those explicitly mentioned in the README and answer10.h files.
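
To illustrate the case-insensitive matching requirement on a binary search tree, here is a minimal sketch. It is not the required design: the type NameNode and the functions name_bst_insert, name_bst_find, and name_bst_free are hypothetical names invented for this example, and the node holds only a name, whereas your own structure will also need to carry location and review information. It assumes a POSIX system, since strcasecmp(..) comes from <strings.h> rather than ISO C. The tree is not balanced.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>  /* strcasecmp is POSIX, not ISO C */

/* Hypothetical node type for illustration; the real layout is yours to design. */
struct NameNode {
    char *name;               /* business name, owned by the node */
    struct NameNode *left;
    struct NameNode *right;
};

/* Insert name if no case-insensitive match exists yet.
 * Returns the node holding the name, or NULL if allocation fails. */
struct NameNode *name_bst_insert(struct NameNode **root, const char *name)
{
    assert(root != NULL && name != NULL);
    while (*root != NULL) {
        int c = strcasecmp(name, (*root)->name);
        if (c == 0)
            return *root;     /* "MEATBALLZ" matches "Meatballz" */
        root = (c < 0) ? &(*root)->left : &(*root)->right;
    }
    struct NameNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->name = malloc(strlen(name) + 1);
    if (node->name == NULL) {
        free(node);
        return NULL;
    }
    strcpy(node->name, name);
    node->left = node->right = NULL;
    *root = node;             /* link the new leaf into the tree */
    return node;
}

/* Look up a name, ignoring case. Returns NULL if it is not in the tree. */
const struct NameNode *name_bst_find(const struct NameNode *root, const char *name)
{
    assert(name != NULL);
    while (root != NULL) {
        int c = strcasecmp(name, root->name);
        if (c == 0)
            return root;
        root = (c < 0) ? root->left : root->right;
    }
    return NULL;
}

/* Free every node and the name it owns (post-order traversal). */
void name_bst_free(struct NameNode *root)
{
    if (root == NULL)
        return;
    name_bst_free(root->left);
    name_bst_free(root->right);
    free(root->name);
    free(root);
}
```

Because insertion and lookup use the same comparison, a query such as "meatballz" reaches the node that was created for "Meatballz". The same idea of chaining strcasecmp(..) calls could be applied to the state >> city >> address ordering.
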

// ~ Data ~ //

The data files for this assignment are located at the following paths, which are accessible from ecegrid. Please access them using those paths. Do not copy them into your home directory. You will not need to change them. It is recommended that you work from ecegrid. You may download them to your personal computer if you wish, but do not make copies within the ECE cluster.

    /home/shay/a/ece264p0/share/yelp_data/businesses.tsv
    /home/shay/a/ece264p0/share/yelp_data/reviews.tsv

businesses.tsv contains one row for each location of a business, with each of the following fields, separated by tabs:

    1) business ID (an integer)
    2) name
    3) address
    4) city
    5) state
    6) zip_code
    7) full address (address, city, state, zip code in one field)

The star rating is not included with the business records.

reviews.tsv contains one row for each review of a location. The rows are guaranteed to be sorted by business ID, but there is no particular order among the reviews for a given business ID. Each row contains the following tab-separated fields:

    1) business ID
    2) star rating given with that review
    3) review rating: "funny"
    4) review rating: "useful"
    5) review rating: "cool"
    6) review text

You will not need all data fields for this assignment.

Each row of the above files is separated by a single newline ('\n') character.

// ~ How to start ~ //

Create an answer10.c file containing each of the functions specified in answer10.h.

The most difficult part of this assignment will be writing the create_business_bst(..) function. There are different ways you could do this. Your solution must not load all of the reviews into memory at once at any time. Your data structure will need to store the *file offset* for the review text, as well as the address details for the business locations.

When you are reading a file, you can get the current file offset using the ftell(..) function. Then, you can return to that location using the fseek(..) function.
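
The ftell(..)/fseek(..) idea described above can be sketched as follows. This is only an illustration under simplifying assumptions: the helper names index_line_offsets and read_line_at are hypothetical, the sketch records the offset of the start of each line rather than of the review text field, and it stores offsets in a caller-supplied array instead of a tree.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Record the offset at which each line of fp begins.
 * Stores at most max offsets and returns how many were stored. */
size_t index_line_offsets(FILE *fp, long *offsets, size_t max)
{
    assert(fp != NULL && offsets != NULL);
    size_t count = 0;
    rewind(fp);
    for (;;) {
        long start = ftell(fp);          /* offset before reading the line */
        int ch = fgetc(fp);
        if (start < 0 || ch == EOF || count == max)
            break;
        offsets[count++] = start;
        while (ch != EOF && ch != '\n')  /* skip the rest of the line */
            ch = fgetc(fp);
    }
    return count;
}

/* Seek to offset and read one line into a malloc'd string, without the
 * trailing newline. The caller frees the result. Returns NULL on failure. */
char *read_line_at(FILE *fp, long offset)
{
    assert(fp != NULL);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return NULL;
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    int ch;
    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (len + 1 == cap) {            /* keep one byte for the terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    buf[len] = '\0';
    return buf;
}
```

With this split, only the long offsets stay in memory between queries; the text of a line is read from disk when it is needed and freed afterwards.
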
See the man pages for those functions for the full details.

// ~ Tester ~ //

You should get used to creating test cases of your own, and using tools like Valgrind directly to verify that there are no errors. However, to give you some confirmation of whether your submission meets the requirements, a test interface (search.c) has been supplied along with test output data obtained from the solution.

This is somewhat different from some of the other assignments. Instead of running a tester and getting an answer, you will compile your code with search.c, run it on several test queries, and compare the results with the test data that has been provided.

Here's how to get started:

1. Look at the test interface (search.c) to make sure you understand how it works.
   *** Read the usage examples at the bottom of search.c. ***

       vim search.c

2. Compile your code with search.c.

       gcc -o search search.c answer10.c -Wall -Wshadow

3. Try a test query to make sure it works.

       ./search Meatballz

4. Look at the test output data. Notice that the query parameters (name, state, and zip_code) are listed at the top of each file.

       cd test
       vim 00.Boston_Cleaners.NV.89135

5. To compare the output of your code with the test data, look at the top of the test data file, get the parameters (name, state, and zip_code), and use the diff command to compare the output. The diff command tells you which lines are different between two files. You can also compare a file with data on stdin (written as "-"), which turns out to be convenient for this assignment. The following command compares the output for the first test case.

       ./search "Boston Cleaners" NV 89135 | diff - test/00.Boston_Cleaners.NV.89135

6. If you see no output, that means there were no differences.

You also need to test your code using Valgrind. It is your responsibility to find memory errors.

       valgrind --tool=memcheck ./search "Boston Cleaners" NV 89135