Thursday, June 19, 2014

Extracting Census Data from American Factfinder to Use in ArcMap

Well, that escalated quickly.  Introductions to American Factfinder. 

I'm ready to get down to business because I sit a computer for 7.5 hours a day and I learn a lot of cool tricks and methods for getting stuff done.  Most of the time I celebrate these accomplishments with a silent fist pump at my desk.  With a blog, I can celebrate and share with the world.

Up until a year ago, The US Census American Factfinder website was a mystery to me.  I am here today to show you how to easily extract data from American Factfinder, join it to your GIS data, and view it within ArcMap.


Step 1:   Find Census TIGER/Line Shapefiles (if you don't have them already)


I'm talking tracts, block groups, blocks, etc.  These shapefiles have no meaningful attributes when downloaded, but they do have GEOIDs in their attributes which are super important to have in order to join them to the data later on.  You need these shapefiles to successfully view any kind of American Factfinder Data in your GIS.  If you already have some census GIS layers available to you, just make sure there is a GEOID column in the attribute table.  Otherwise, download from the Census site.

Do this by going to the Census website --> Click the 'Geography' tab -->
Choose 'Maps & Data' in the dropdown --> Click on the 'Tiger Products' Link

-or-

Just click here to get to the Tiger Products Page

There are lots of options when downloading TIGER files.  We are going to keep this tutorial simple and focus on the first option you see on the webpage; the TIGER/Line Shapefiles.  Go ahead and click on the link.  You will find that you can download the current year’s TIGER/Line Shapefiles or you can go back in time and download a vintage year.  Many will find that the current year’s shapefiles work for them, but if you are working with older Census data, be sure to download the correct vintage for that data.  The Census website provides some good documentation for choosing here.


Step 2: Download Census TIGER/Line Shapefiles


When you click the download button you will have two options for downloading.  Choose the ‘Web interface’ option.  From the web interface page select your area of interest.  If you’re new to Census data, ‘tracts’ are your safest bet.  However, if your geographic area of interest is a smaller area, you may want to consider ‘blocks’.  Or, you may want to look into the numerous other geographic area types the Census offers.  Once you select your geographic area type, the website will allow you to pick your area of interest. 

Choose the Web interface option when downloading TIGER/Line Shapefiles

 For this tutorial, I have chosen ‘tracts’ for ‘South Carolina’.  Download and save the zipfile.  Go to the location you saved the zip file and unzip it.

Now open up ArcGIS or whatever GIS software you use and add the shapefile in to make sure you got what you wanted.  Open the attribute data and you will see all the codes and GEOID that will be used in the next step when you join the data.

This is what the TIGER/Line Shapefile attribute table should look like when downloaded from the Census website.  Notice the GEOID column which will be used to join AFF data.

Step 3: Finding Data in American FactFinder (AFF)


Go to the U.S CensusBureau American FactFinder website.  Click on ‘Advanced Search’ and then click the ‘Show Me All’ option.

The best way to think of all the data in AFF is to think of a huge pile of unorganized papers and your job is to find the exact papers you need from that pile.  AFF is the tool you use to narrow down your options by process of elimination.  

You can eliminate the data in any order you’d like depending on what you are looking for.  I usually start by narrowing the search down the specific geographic region I am interested in.  This is done by clicking the blue ‘Geographies’ tab on the left menu.  From the ‘Select Geographies’ menu I would choose ‘Census Tract – 140’ as the geographic type (this would be different if I was using another geographic type).  Then I would select my state ‘SC’ and my county ‘Florence’ and then I can choose between all the tracts in Florence County or select just a few tracts.  Then I click the ‘Add To Your Selections’ button, this takes the query and adds it to ‘You Selections’ menu in the left menu.  Take note of the blue circle with the white ‘x’ that is your selection and it only goes away if you click on that blue circle with the white ‘x’.  So, if you wanted to get data for another geographic location you would need to first delete the other geographic area you currently have selected, otherwise you will get no results.  

You can add more than one selection, so if you are interested in a certain dataset or just want to get incomes or ages, you can narrow it down however you like.  For this tutorial I have selected ‘all Census Tracts within Florence County, South Carolina’ and narrowed it down to just the ‘2010 Redistricting Data SF (PL 94-171)’ Dataset.  

Screenshot of AFF in action.  Notice my selections in the top left corner, if I need to remove a selection I just click the blue circle with the white 'x'.  Beneath that are the blue boxes which I can drill down my selections even more.  The 7 results are tables based on my specific selections and can be downloaded to my computer.  If I need to get more information or see the data in a table before I download it I can click on the blue hyperlink or the 'i' in the about column.

I’ve got seven tables in my results to choose from.  If you click on the information button in the ‘about’ column you can get a pretty good idea of what kind of data is in that particular table.  I’m going to choose the first table, 'QT-PL' ‘Race, Hispanic or Latino, Age, and Housing Occupancy:2010'.  If I click on the blue hyperlinked table name I can actually see the attributes of the data by each tract in Florence County.  This allows me to determine if this data is going to be useful to me before I even download it.  This particular data is going to be useful to me so I can click on the download button from the ‘Actions:” menu above the table.  Clicking the download button here give me a dialogue box with options (leave them as they are).  If I were to navigate back one page, check the box beside the table I wanted, and click the ‘download’ button above it, it would not give me these extra options.  It would just automatically download (which is fine).  Make sure you save the zip file to your computer.  Navigate to that zipfile location and unzip it.  If you want, you can examine files in the folder which are in a Comma delimited (.csv) format.

Screenshot of the files included when I downloaded a table from AFF.  Notice they are in .CSV format.

 Step 4: Viewing the .CSV Data in Excel

So you have your downloaded data, but now you need to pull it into Microsoft Excel and clean it up before you can join it to you TIGER/Line Shapefiles.  

Open Excel (2007 or higher).

From the Top menu click ‘Data’ and then choose the ‘From Text’ option.  This will allow you to import the .csv file into the spreadsheet.  Navigate to the location where you saved your download.  There will most likely be two tables to select from, you need to choose the table that is the name of your dataset followed by ‘_with_ann.csv’.  For example, I would choose ‘DEC_10_PL_QTPL_with_ann.csv’ the other file is metadata (which you may need to use later on, but not now).  

Excel will now give you a dialogue box for importing this data.  Choose ‘Delimited’ and then click next.  For delimiters check only ‘Comma’.  Press next again.  Now be sure that the column data format is set to ‘General’ and then click finish.

Your table should look something like this in the image below.  Please note that you will need to remove extra header rows that include descriptions before you bring into ArcGIS, but you may want to name the headers accordingly so you can easily identify them.  Just be careful not to use any special characters so Arc will accept it and you won't spend half of your day beating your face in your desk.

Screenshot of what your data should look like after you import is from text to your excel spreadsheet.

Step 5: Reformatting the Excel table in order to Join Data to TIGER/Shapefile Lines


Notice the column titled ‘GEO.id’, this is the GEOID number you need, but it needs to be split and reformatted to a 'Text' data type in order to be able to be properly joined to your data in ArcGIS.

Insert a column to the immediate right of ‘GEO.id’ and label is GEOID.  Now highlight the ‘GEO.id’ column (click the top of the column, A), then in the top menu click the ‘Data’ tab and then choose ‘Text To Columns’.  A ‘convert text to columns wizard’ will pop up.  Choose ‘Delimited’ and click next.  Choose only the ‘Other’ option and type “S” in the box for delimiters.  Choose next and under the column data format choose ‘Text’ while being sure to click on the column with no name (the second column).
Make sure the column data format is set to 'Text'.


Click Finish.  Excel may ask if you want to replace all of the values in the column, choose OK.
Now your GEOID column is ready to be joined to your TIGER/Line Shapefiles.

Before going any further, save your work as an excel spreadsheet.  Also, please be sure that there are no invalid characters anywhere in your excel spreadsheet.  ArcGIS will not accept hyphens, periods, parentheses, brackets, or symbols (`!@#$%^&*()+=-|\{}[];:’”><,.?/) you get the point, right?  Clean it up!     
Screenshot of the cleaned up table.  Notice that the description header was deleted, and the headers have no special characters.  Be sure to save this as an excel spreadsheet.

Step 6: Joining the Data to the TIGER/Line Shapefiles in ArcMap   


Once you have everything cleaned, give your excel a final save, close it, and open up your ArcMap.  If you don’t have your TIGER/Line Shapefiles added to the map, go ahead and add them now.  Navigate to your saved excel and add the table to your ArcMap.

Open both the attribute tables, and make sure your newly created GEOID field looks similar to the GEOID field in your TIGER/Line Shapefile.  It does?  Great!  You are ready to join.

In the TOC, right click your TIGER/Line Shapefile layer.  Choose ‘joins and relates’ --> ‘join’.  In the dialogue box choose the GEOID field as the join field for both features.
Join Data dialogue box.  Choose GEOID as the joining field and choose to Validate the Join to make sure everything will join correctly.  Then press OK.

Click the ‘Validate Join’ button to make sure everything is going to work properly when you join.  Once validated, click ‘OK’ and open the attribute table of the TIGER/Line Shapefiles, you will notice that all the attributes are now attached.  This is a temporary join, if you want to make it permanent just right click your TIGER/Line Shapefile Layer in the Table of Contents and choose ‘Data” --> ‘Export Data’.  

Now you are ready to do whatever it is you wanted to do with the Census Data.  That’s another blog entry for me!

No comments:

Post a Comment