Lab: Importing And Exploring Data


Using an INSERT statement is fine for testing tables or quick additions, but you're probably going to be working with hundreds or thousands of rows at a time. Luckily, there's a tool for that!

You can use specially formatted files, called delimited text files, to import data in bulk using SQL's COPY command like so:

    
      COPY table_name
      FROM '/path/to/file.csv'
      WITH (FORMAT CSV, HEADER);
    
  

You're going to import some census data, described here. You can download the file here. It's a heck of a lot bigger than the data sets you've been working on so far: it's got 3,143 records and 91 columns. Take a quick look at the file you just downloaded and see what you can learn about delimited text files. How are they organized? Where do the labels go? What separates the data?


Create the Table

Now that you have data, you have to create the table to import it all. And if you're making a table, you're going to have to create the columns, too. What data types will you need? Compare the some of the data in your import file to the list of Postgres data types. Don't make a complete list of all 91 columns. Instead, focus on the numeric and character types. What are some differences between the types? And what data types seem most useful to hold data in this table?

After struggling with that for a bit, here's the full CREATE TABLE query. Go ahead and run it in the analysis database. If you don't remember, look back at slides to see the last time you used a file and the psql command in the terminal. Remember to run it on the analysis database!

Then open up pgAdmin and run SELECT * FROM us_counties_2010; to see if you have the table and columns.


Import the Data

Let's import all the data at once:

    
      COPY us_counties_2010
      FROM '/path/to/us_counties_2010.csv'
      WITH (FORMAT CSV, HEADER);
    
  

You should see a success message about 3143 rows being affected. Since you don't know where pgAdmin is located, it's best to use an absolute file path to the data file.


Inspect the Data

Run that SELECT query again to view all the data. Did it get imported correctly?

If it did, congrats! Practice using SELECT with the other keywords to craft queries that might reveal questions about the data. Don't forget about the PDF linked above to understand the data that you're looking at. Have fun exploring!