DB Project Blog, Stem Success 394
After this Stem Success course, I feel that spreadsheets are incredibly limited in what they can do. Spreadsheets are good for small datasets but fall apart once you have too much data. Databases are much more robust: they can pull from multiple related tables, and SQL queries let you find the specific data you need and organize the results. Relational databases are also great because a value stored once in one table is referenced by the others, so a single update carries through everywhere it is used, whereas a spreadsheet has to be manually edited in every place the value appears. This also means changes to a database can be tracked more easily than changes to a spreadsheet.
In the Database Project, we chose the Hydropower dataset from the CORGIS website. At a surface level, the dataset in CSV form is a single table about hydropower dams: their names, when they were built, the organization responsible for them, the dimensions of each dam, the state and county where each is located, and the river each sits on. We then built a database from the entity-relationship diagram we had drawn. Our diagram had two tables, a location table and a dam table. The location table consisted of two attributes, county and state. We queried these two attributes from the original table and created a new table containing only them, with the two together forming one composite primary key. We made another table, dam, with all the other attributes from the dataset and the dam’s name as the primary key. These two tables were related through a table called developed, which used the primary keys of both tables to relate them to each other in a many-to-one relationship.
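Here is a rough sketch of what that split might look like, written in Python with the built-in sqlite3 module. It is not our actual project code. The column names (name, year, county, state), the staging table called raw, and the sample rows are all made up for illustration, and I am assuming the many-to-one relationship runs from many dams to one location.

```python
import sqlite3

# Hypothetical rows standing in for the flat CSV; the values are invented.
flat_rows = [
    ("Dam A", 1950, "County X", "State 1"),
    ("Dam B", 1962, "County X", "State 1"),
    ("Dam C", 1971, "County Y", "State 2"),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default

conn.executescript("""
-- Staging table that mirrors the single flat table (column names are assumptions).
CREATE TABLE raw (name TEXT, year INTEGER, county TEXT, state TEXT);

-- Location: county and state together form one composite primary key.
CREATE TABLE location (
    county TEXT NOT NULL,
    state  TEXT NOT NULL,
    PRIMARY KEY (county, state)
);

-- Dam: the dam's name is the primary key; other attributes would go here.
CREATE TABLE dam (
    name TEXT PRIMARY KEY,
    year INTEGER
);

-- Developed: links each dam to exactly one location (many dams, one location).
CREATE TABLE developed (
    dam_name TEXT PRIMARY KEY REFERENCES dam(name),
    county   TEXT NOT NULL,
    state    TEXT NOT NULL,
    FOREIGN KEY (county, state) REFERENCES location(county, state)
);
""")

conn.executemany("INSERT INTO raw VALUES (?, ?, ?, ?)", flat_rows)

# Query the distinct county/state pairs out of the flat table into location.
conn.execute("INSERT INTO location SELECT DISTINCT county, state FROM raw")
conn.execute("INSERT INTO dam SELECT name, year FROM raw")
conn.execute("INSERT INTO developed SELECT name, county, state FROM raw")
conn.commit()

location_count = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
dams_per_location = conn.execute(
    "SELECT county, state, COUNT(*) FROM developed "
    "GROUP BY county, state ORDER BY county"
).fetchall()
```

Making dam_name the primary key of developed is what keeps the relationship many-to-one in this sketch: a location can appear on many rows, but each dam can appear only once.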
This project taught me pretty much everything I now know about databases. I knew the word ‘database’, but if someone had asked me the difference between a database and something like a spreadsheet, I would not have known. I learned enough SQL to query data and create new tables to organize a database. I learned how to create primary and foreign keys so that relationships could be created between different tables. I even learned a bit of formatting so that the data could be presented properly, such as combining the latitude and longitude attributes into a single, properly formatted coordinates attribute, since neither is useful on its own.
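The coordinate formatting could be done roughly like this, again using Python's sqlite3 module. This is only a sketch under assumptions: the table layout, the sample dam, its coordinates, and the choice of four decimal places are all hypothetical, and SQLite's printf function is just one way to build the combined string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dam (name TEXT PRIMARY KEY, latitude REAL, longitude REAL)"
)
# Made-up sample row, not taken from the dataset.
conn.execute("INSERT INTO dam VALUES ('Example Dam', 36.5, -114.75)")

# Combine the two numeric columns into one formatted coordinates string.
rows = conn.execute(
    "SELECT name, printf('(%.4f, %.4f)', latitude, longitude) AS coordinates "
    "FROM dam"
).fetchall()
```

Keeping latitude and longitude as separate numeric columns and formatting them only in the query leaves the raw values available for sorting or calculations later.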
Databases have helped me see that there is so much more I could do with my data if I have the tools and the know-how. I am moving on to graduate school next year, and I am already thinking about possible experiments I could run with larger datasets without being afraid of how hard it would be to sift through the data. I think this skill set would also be useful in any workplace that uses a lot of data, and so could help me get ahead of the competition in finding a good job. All in all, learning about databases has given me a lot of insight into how we store and use our data.