The Kyle Report

You know that Shipley’s Donuts shop down on West Center Street? The folks who work there may not even work there. They may be commuting every day to Houston. How about that Game Stop in the Target shopping center? The people who work there actually commute back and forth to Dallas every day.

I wrote an item yesterday about a table I ran across in the Community Impact Newspaper that allegedly presented “factual documentation” on “Kyle’s Commuting Problem.” It turns out that while those numbers may be factual, they are not exactly truthful. I wrote at the time that I had tried to contact the sources listed for the statistics, and I have finally heard back from one of them, the Center for Economic Studies (CES). Specifically, it was one Matthew Graham from CES who wrote me a detailed response that I will reprint in its entirety at the end of all this. One of the cogent points Mr. Graham made is that a Kyle resident working in Kyle at a place of business whose headquarters are in another city likely will be listed as working in that other city. That’s especially true if the outlet’s personnel and, above all, payroll functions reside at the headquarters location.

That means those salespeople working at the Radio Shack near the Target will likely be shown as working in Fort Worth. Anyone employed by FedEx will likely be listed as a Dallas employee, regardless of where they actually live and work. The table in the Community Impact Newspaper says 8,422 people commute daily to Austin. Now I know that number includes all those who work right here in Kyle at Seton and at the Austin Community College campus, two of Kyle’s largest employers.

That’s not to say that most of Kyle’s working population have jobs right here in Kyle. They don’t, and that fact still presents a challenge to our economic development experts. But at least I no longer have to be that concerned about all those people I feared were commuting every day from Kyle to Houston or Dallas.

Here is the complete text of Mr. Graham’s reply to me because there’s more to his reply than what I just outlined:

Thanks for your question. From the description in your email, it sounds like the data may have come from our OnTheMap web application http://onthemap.ces.census.gov/ which makes a dataset called LODES (LEHD Origin-Destination Employment Statistics) available for analysis. However, I was unable to recreate the exact numbers you mention here.

That being said, when I do run the various reports in OnTheMap for Kyle, TX, I do see a pattern similar to the one you describe in your email, and so I’ll proceed under the assumption that OnTheMap/LODES was the data source.

First, let me give a quick background on this dataset because it is helpful in understanding some of the dynamics that can appear. These public-use statistics are created primarily from several administrative record sources, the most important of which are the state’s Unemployment Insurance (UI) wage record system and the Quarterly Census of Employment and Wages (QCEW). From the UI system, we gather information on which individuals are connected to which firms and some information about earnings. From the QCEW, we get information on firm structure and establishment location (e.g. where are the individual establishments located and how many employees work at each) as well as information on industry and ownership (public or private). Additionally we get information on individuals’ residential locations from other Federal administrative sources. Finally, it’s useful to note that the confidentiality protection methodology that we use to protect individuals’ residential locations is dependent upon data from the Decennial Census (2000 and, more recently, 2010).

With that background, it appears likely that there are at least two dynamics impacting this dataset’s representation of Kyle’s resident workforce. First, Kyle clearly grew very quickly between 2000 and 2010. Our ability to accurately represent very high growth areas is limited by our confidentiality protection system. In other cases of this type we have seen growth lagged in the data behind what is actually happening in the community, often with something of a “catch up” spike once we had 2010 Decennial Census data available for the confidentiality protection system. This dynamic definitely seems to be playing out in Kyle’s residential workforce.

The second dynamic is something we refer to colloquially as a “headquartering issue.” In these cases what we see are firms either underreporting the number of establishments or not reporting any establishments outside of a “headquarters” location, which sometimes could even be something like a payroll office. In these cases and with no information otherwise, we must allocate all of the firm’s workers to the establishments that are provided to us, even if an establishment location is very far from a worker’s residence. When this happens, we see long distance “commutes” appear in the data, although we generally believe that most of the long distance relationships are not daily commutes. Certain industries can be prone to this issue, including construction, sales forces, temporary employees, drivers, and oil/gas extraction to name some. Additionally, state agencies sometimes fail to report their offices and thus state employees can appear clustered in the state capitals when they actually work all over the state.

Analysts have brought headquartering-type cases to our attention in the past. We have little ability to “fix” the data except to request that the states (from whom we get these data) ask firms to do a better job reporting. In some cases, there are fairly easy remedies that can be applied by external analysts (such as rescaling individual data items using external sources of data such as zoning regulations and office square footage datasets). In the case of Kyle, I would probably recommend at this point that you simply consider these “long distance commutes” to be administrative relationships rather than actual residence-to-workplace relationships, and in all likelihood people are doing most of the commuting closer to home (which doesn’t mean that there are not any folks out there making very long – maybe less frequent than daily – commutes). Additionally, I’d like to note that this kind of groundtruthing, in which local knowledge (or external/complementary information) is applied to data is a vital process in drawing valid conclusions, not just from these data but for any data.

To close, I’ll point you to a couple of other references if you’d like to dig into the data further. First, we recently released a design comparison between LODES and the American Community Survey (ACS) Commuting data, which are often compared to each other. That can be found here: https://ideas.repec.org/p/cen/wpaper/14-38.html. Additionally, you may find this external analysis between the two datasets helpful: http://www.camsys.com/pubs/NCHRP08-36-98.pdf. If you have further questions, please don’t hesitate to contact us at CES.OnTheMap.Feedback@census.gov. Thanks again for your interest in these data.
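
A note of my own, not part of Mr. Graham’s reply: his suggestion to treat the long-distance “commutes” as administrative relationships can be sketched in a few lines of Python. This is only a rough illustration, not anything CES or OnTheMap provides. The worker counts are invented round numbers, the city coordinates are approximate, and the 100-mile cutoff for a plausible daily commute is my own assumption.

```python
from math import asin, cos, radians, sin, sqrt

# Approximate city-center coordinates (latitude, longitude), for illustration only.
CITY_COORDS = {
    "Kyle": (29.989, -97.877),
    "Austin": (30.267, -97.743),
    "Dallas": (32.777, -96.797),
    "Houston": (29.760, -95.370),
}

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def split_flows(home, flows, max_daily_miles=100):
    """Separate plausible daily commutes from likely administrative ties.

    max_daily_miles is an assumed cutoff, not a figure from CES.
    """
    plausible, administrative = {}, {}
    for city, workers in flows.items():
        miles = haversine_miles(CITY_COORDS[home], CITY_COORDS[city])
        bucket = plausible if miles <= max_daily_miles else administrative
        bucket[city] = workers
    return plausible, administrative


# Hypothetical worker counts by reported work city (NOT the published table).
flows = {"Kyle": 1500, "Austin": 8000, "Dallas": 400, "Houston": 300}

plausible, administrative = split_flows("Kyle", flows)
print("Plausible daily commutes:", plausible)
print("Likely administrative relationships:", administrative)
```

A distance cutoff only catches the far-away cases. It cannot tell how many of the workers attributed to Austin actually work in Kyle for an employer that reports through an Austin office, which is where the local knowledge Mr. Graham mentions comes in.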