Project 3 - Animal Crossing Database

This project intends to provide some experience organizing csv files into a database. For this task, you are given a set of csv files that contain information about the Animal Crossing New Horizons Catalog, a popular video game.

According to the description of the dataset:

There are 30 csvs each listing various items, villagers, clothing, and other collectibles from the game. The data was collected by a dedicated group of AC fans who continue to collaborate and build this spreadsheet for public use. The database contains the original data and full list of contributors and raw data. At the time of writing, the only difference between the spreadsheet and this version is that the Kaggle version omits all columns with images of the items, but is otherwise identical.

As you will notice, there is no explicit description of the data. You will need to explore the data and understand it in order to complete the tasks below. You will also need to rely on your own understanding of the data and the relationships between the different tables. You can use the Animal Crossing Wiki to help you understand the data and the relationships between the different tables.

The goal of this project is to create a database that contains all the information in the csv files, and then answer some questions about the data using SQL queries.

Step 0. Accept the assignment

You will be using GitHub Classroom to complete this assignment. You will need to accept the assignment in order to get started. You can do this by clicking on the link below:

Accept the assignment

After clicking and accepting, you need to click in the link for your repository. Click in the green button that says “Code” and copy the link that appears.

To clone the repository, you can use the following command in your terminal:

git clone <repository_url>

Make sure to replace <repository_url> with the URL of your repository.

CD into the directory of your repository:

cd <repository_name>

Make sure to replace <repository_name> with the name of your repository.

From now on you can work using VS Code or any other IDE of your preference. Make sure to create a virtual environment (venv) if you are using Python. You can do this by running the following command in your terminal:

python3 -m venv venv

Then, activate the virtual environment by running the following command:

source venv/bin/activate

From now on, it is in your hands!

All deliverables should be submitted through GitHub Classroom. You will need to create a new repository for this assignment, and then push your code to that repository.

Step 1. Creating the database

You will need to create tables and insert data into them. You must use the sqlite3 module to create a database and tables. You will need to create a table for each csv file, and then insert the data from the csv files into the tables. There are a few ways of doing that. Feel free to use the SQLite CLI or the sqlite3 module in Python.

You can also use the pandas library to read the csv files and insert the data into the database. You can use the to_sql method of a pandas DataFrame to insert the data into the database. You can also use the csv module to read the csv files and insert the data into the database using SQL commands.

Before creating the tables, you should come up with the schema for the database. You should think about the relationships between the different tables and how to represent them in the database. You should also think about the data types of the columns in the tables.

Make sure to document well your database schema and the relationships between the tables. You should also document any assumptions you made about the data and the relationships between the tables. You can use comments in your SQL code to document your schema and the relationships between the tables.

As you will see, this dataset contains only the catalog of the elements of the game. We will keep working in this project to evolve it into a full database of the game. For now, we will only focus on the catalog of items.

The recipes table contains the materials and quantities to build the recipes. Make sure you create foreign keys to other tables so you can relate the recipes to the items. Think of other DIY elements in the other tables as connected to the recipes.

Make sure you treat the NA values as NULL values in the database.

Step 2. Answering questions

Once you have created the database and inserted the data into the tables, you should answer the following questions using SQL queries:

What is the most expensive item in the game?
What is the cheapest item in the game?
What is the most expensive item in each category?
What is the cheapest item in each category?
What is the most expensive item in each color?
What is the cheapest item in each color?
What is the most expensive item in each style?
What is the cheapest item in each style?
What is the most expensive item in the recipes table? How would you calculate that?
What is the cheapest item in the recipes table? How would you calculate that?

Deliverables

A database file with the tables and data from the csv files in SQLite (.db).
A file with the SQL queries you used to answer the questions above (.sql). Use comments to identify the questions you are answering with each query.
A file with the SQL queries you used to create the tables and insert the data into the tables (.sql). If (and only if) you used Python, please include the code you used to create the tables and insert the data into the tables (.py).
A README.md file with the documentation of the database schema and the relationships between the tables. You should also document any assumptions you made about the data and the relationships between the tables. You can use comments in your SQL code to document your schema and the relationships between the tables.

If you are not familiar with Markdown, this is a good time to start learning it. Make sure you create a folder for the figures in your report on README.md. Use DBeaver to generate the schema of your database and include it in your report. You can also use DBeaver to generate the SQL code to create the tables and insert the data into the tables. You can use the “Export” feature of DBeaver to export the schema and data to a SQL file.