Final Report

University of Connecticut

The Third Annual State Data Conference

Is Connecticut Falling Behind?

November 15, 2001

Summary:

The Third Annual State Data Conference brought together data users, collectors and providers to share the latest developments in Connecticut and to discuss the needs of our data community. The broad objective of this conference series is to create an ongoing symposium for those involved in different stages of data generation, archiving, distribution, and use so they can exchange information on current data availability, the challenges they face, their particular needs, and their expectations (or hopes) for future developments. This third in a series of conferences continued the conversation begun at the first two. Participants in the one-day meeting came from various sectors, including government agencies, academic scholars, private enterprise and non-profit organizations.

The organizing theme of this third conference focused on the question of whether Connecticut was falling behind in its efforts to develop and provide data to data users. Ten presenters explored various facets of this question in three sessions. The first session put the 2000 Census in context, looking at who we are and how we’ve changed. The second session examined three ongoing initiatives to build access to Connecticut data. Finally, the last session explored the many benefits of digital spatial mapping. The following summarizes the challenges, current initiatives and future directions for Connecticut’s data system.

The conference theme “are we falling behind?” likely articulated a fear common among data providers and data users everywhere. There will probably always be concerns about the appropriate provision of information to a community of data users. Part of the reason is that, as with any other scarce commodity, the provision of data involves opportunity costs. Since we have to give up other goods and services to produce data, we will (quite rationally) never produce the full quantity or quality of data we are technically capable of producing. What’s more, this calculus is complicated by information’s often unpredictable value, a characteristic that makes it easy to underestimate information’s true worth. After a busy news day, for example, a newspaper may be worth far more than its price to the reader. After a slow news day it may be worth far less. Thus, for long stretches of time some information may seem worthless, and then it may suddenly assume limitless value at a critical moment. Still another reason for the perceived dearth of quality data is that information often takes on the characteristics of a public good—once produced, it becomes difficult to limit data access to paying users only. That reduces the incentive of providers to offer it in sufficient quantity or quality. Because of these challenges to the efficient operation of markets for data, data collection and dissemination is often organized through quasi-public and government agencies, as in the case of the U.S. Census.

Putting the 2000 Census in Perspective

In his talk “Shadowing the U.S. Census,” Wayne Villemez addressed many of these challenges to the task of providing quality data. By its own admission, the U.S. Bureau of the Census says that it undercounted the population of Connecticut in 1990 by more than 21,000, costing the state hundreds of millions of dollars in federal assistance tied to population counts. Undercounting errors were worse among the young than the old—children were undercounted at a rate of 1.7% while adults were undercounted at a rate of 0.3%—and worse among minorities than whites—blacks and Hispanics were undercounted at rates exceeding 5%, while whites, on average, were not undercounted at all. To many, such errors were unjustifiable.

In the 2000 Census the stakes were higher. Any potential undercount not only threatened millions of dollars in funding, it also endangered one of the state’s Congressional seats. Accordingly, the Connecticut Office of Policy and Management commissioned the University of Connecticut’s Center for Population Research (CPR) to conduct a “shadow” Census survey to check the accuracy of the official Census count and to document any significant undercount should the Census figures be challenged in court. CPR, headed by Professor Villemez, used four separate methodologies to test the Census count: a statewide sample survey, an age cohort comparison, a focused population count, and a hidden population study. The results of the four studies converged on the same conclusion: there likely was an undercount in Census 2000, but the shortfall was too small to call the official Census figures into question. The counts of some of the state’s smaller areas were, however, somewhat suspicious. The CPR studies, in short, affirmed the value of the Census count. The Bureau of the Census, with all its resource constraints, does a pretty good job at delivering an accurate count of the population.

Of the four CPR studies, one in particular offers some revealing insights. The focused population count concentrated on four especially difficult-to-count Census block groups in the state: three low-income, minority areas in Hartford, and one “college catchment” area near the University of Connecticut. CPR invested substantial resources in the effort, exploiting its connection to a popular University basketball team, using 16 times as many enumerators as Census had available, taking the time for as many as 8 call-backs, and employing enumerators who matched the population counted and who spoke the language (one even grew up right in the neighborhood). The result: convincing evidence that Census likely undercounted these hard-to-count city populations by about 2.9%. By comparison, Census itself estimates the total undercount for all of Hartford County at about 1.0%. The discrepancy is not unexpected. Vastly superior resources would be expected to produce a superior count. What is perhaps surprising is just how good the Census count is, given the resources it has at its disposal.

Robert Cromley, Director of the University of Connecticut Center for Geographic Information and Analysis, took a close look at just what that 2000 Census count revealed about Connecticut’s population in his presentation “Evolving Connecticut Communities.” Dr. Cromley explained that in the period between the 1990 and 2000 Censuses, Connecticut’s population grew, but not by much. The state’s population total rose 118,449 or 3.6%. Most towns gained population, but the state’s biggest towns, Hartford, New Haven, Groton, and Bridgeport, lost a significant number of residents, both in absolute and in percentage terms. The biggest population gains, both in absolute and percentage terms, occurred in Fairfield County and in the suburban fringe around Hartford and along the Route 2 corridor.

The state also showed signs of aging. Connecticut lost population in younger population cohorts, but gained population in older cohorts. Between the censuses, the population of 15 to 24 year-olds dropped 12.8%, while that of 25 to 34 year olds plummeted 22.6%. The biggest gain, a 36.8% jump, occurred among those 85 or older. All other population age cohorts grew as well, except for the group representing those who are just entering retirement age. The cohort of 65 to 74 year-olds shrank 9.6%.

Much has been made about the claim that young residents are leaving the state in droves for jobs elsewhere, and that’s why we’re seeing a big drop in the number of young people in the state. But for the most part, Cromley explained, the drop in the number of young people reflects the natural aging of the population. In general, younger cohorts were simply smaller than the older cohorts they grew to replace. Thus, there were fewer 25 to 34 year-olds in 2000 than in 1990 because the population of 15 to 24 year-olds in 1990, the group that aged into that bracket over the decade, was relatively small.

Besides this natural aging process, the only other way for population cohorts to shrink is through net out-migration. Net migration is measured as the change in population over the period, less births plus deaths; a negative result indicates net out-migration. Indeed, Cromley’s statistics show that out-migration accounted for fully 85% of the decline in population in the oldest population-losing cohort, the 65 to 74 year-olds. But the population lost among 15 to 24 year-olds as a result of out-migration was a surprisingly small 14% of the total. And migration actually added to the total of 25 to 34 year-olds in the state over the period. So there appears to be very little evidence of a “brain drain” at work in Connecticut.

That’s not to say that migration patterns haven’t had a differential effect on many Connecticut towns. By and large, towns that have lost population, both in absolute and percentage terms, have also experienced a net out-migration of population, often of a magnitude greater than the net population decline. In Hartford, New Haven and Bridgeport, for example, out-migration far exceeded the population declines, so net births (births in excess of deaths) are clearly masking important demographic shifts. Areas with growing population totals also tend to be areas with significant levels of net in-migration, as, for example, with the towns along Route 2. One interesting exception: Stamford. This town in Fairfield County combined an increase in population with a net out-migration of residents.
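The accounting behind these migration figures can be sketched in a few lines of Python. This is only an illustration of the residual definition described above (population change, less births, plus deaths), not the method Cromley actually used. The function names and the town figures are hypothetical and were invented for the example.

```python
def net_migration(pop_start, pop_end, births, deaths):
    """Residual net migration: population change, less births, plus deaths.

    A negative result means net out-migration; a positive one, net in-migration.
    """
    return (pop_end - pop_start) - births + deaths


def migration_share_of_decline(pop_start, pop_end, births, deaths):
    """Net out-migration as a multiple of the population decline."""
    decline = pop_start - pop_end
    if decline <= 0:
        raise ValueError("population did not decline over the period")
    return -net_migration(pop_start, pop_end, births, deaths) / decline


# Hypothetical town (invented figures, not Census data): the population
# falls by 5,000 even though births exceed deaths by 8,000.
town = dict(pop_start=140_000, pop_end=135_000, births=20_000, deaths=12_000)

print(net_migration(**town))               # -13000, a net out-migration of 13,000
print(migration_share_of_decline(**town))  # 2.6 times the population decline
```

In this made-up case the town lost 13,000 residents to migration but only 5,000 in total population, because the excess of births over deaths offset part of the loss. That is the pattern the report describes for Hartford, New Haven and Bridgeport.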

As enlightening as these statistics may be, the Census Bureau promises to offer an even richer data set in the not-too-distant future, as explained by Ana Maria Garcia in her presentation “Charting the Process: The American Community Survey.” The Bureau is now completing the development and testing of this new survey instrument, designed to replace the Census long form. When fully implemented in fiscal year 2003, the ACS will collect detailed economic, demographic and housing data traditionally collected on the decennial census long form from 3 million households a year, from every county in the country. These data will provide detailed characteristics about the nation updated every year, rather than only once every ten years, making it an invaluable resource for researchers and policy makers. Full implementation of the ACS will enable the 2010 Census to collect only short-form information.

To demonstrate the operational feasibility of collecting long-form information at the same time as, but in a separate process from, the decennial census, the Census Bureau conducted the Census 2000 Supplementary Survey. The survey was completed on time, within budget, and with a response rate of over 96 percent. Data files, tabulated files, and associated documentation will be available on CD-ROM, as well as on the Census ACS web site.

Besides conducting its own demographic surveys, Census also commissions data services from outside groups. In 1987 it approached the Massachusetts Institute for Social and Economic Research (MISER) to develop an origin-of-manufacturing-export data series. MISER is an interdisciplinary research institute at the University of Massachusetts, Amherst, founded by Stephen Coelen. Dr. Coelen discussed this trade data series in his presentation “Links with the World: Deconstructing Trade Data,” explaining how the series has changed over time, and what its measures show for Connecticut.

As originally developed, the MISER trade data series measured exports by the origin of the manufacturer (OM). But in 1993, Census complicated the picture by requesting a series that measured trade by the location of the exporter (EL). The thought was that, with agricultural products in particular, the origin of manufacturing series was simply not getting at where the product originated. Indeed, under the EL series, Connecticut’s export sector appears 64% larger than it does if measured using the OM series, so the EL series certainly helps raise the profile of the export sector. Why, then, bother reporting data using OM? As Coelen explains, the OM series identifies where the manufacturing process occurs and so it captures the biggest share of the product’s value-added and pinpoints where there are the largest multiplier effects on other parts of the economy.

Another recent development that complicated the trade data series was Census’s request that the series be reported using the new North American Industry Classification System (NAICS) and also by harmonized system commodity coding. The switch from the Standard Industrial Classification (SIC) system to NAICS significantly rearranged commodity groupings but it didn’t really result in any greater reporting specificity. Under SIC there were 34 reportable 2-digit categories and under NAICS there are 33 3-digit groupings. In fact, the NAICS system creates its own peculiarities by, for example, putting Connecticut’s number five SIC export, medical instruments, in the NAICS miscellaneous category along with jewelry and sporting goods. The addition of harmonized system reporting does, by contrast, offer more specificity (there are 99 2-digit harmonized codes), but even here, the true source of economic activity can become obscured. Under the 2-digit harmonized system Connecticut’s number one export is machinery, but it’s not until the data are examined by 4-digit code that the transportation-related nature of the chief commodities produced in the state (turbojets and turbo propellers) becomes clear. The key advantage to reporting data under systems in use outside the U.S., whether it be NAICS or the harmonized code, is that we are now better able to compare our activity with that of the rest of the world.
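To illustrate how the level of coding detail can hide or reveal what is actually being exported, the following minimal Python sketch rolls 4-digit harmonized codes up to their 2-digit chapters by taking the leading digits. The dollar values are invented for illustration and are not MISER or Census figures, and the function names are hypothetical. Heading 8411 covers turbojets, turbopropellers and other gas turbines, and it falls within chapter 84, the machinery chapter.

```python
from collections import defaultdict

# Hypothetical export values (millions of dollars) keyed by 4-digit
# harmonized system heading. These numbers are invented for illustration.
EXPORTS_BY_HEADING = {
    "8411": 900,  # turbojets, turbopropellers and other gas turbines
    "8471": 150,  # automatic data processing machines
    "9018": 300,  # medical, surgical and dental instruments
}


def rollup(exports_by_code, digits):
    """Aggregate export values to the first `digits` digits of each code."""
    totals = defaultdict(int)
    for code, value in exports_by_code.items():
        totals[code[:digits]] += value
    return dict(totals)


def top_category(totals):
    """Return the (code, value) pair with the largest export value."""
    return max(totals.items(), key=lambda item: item[1])


by_chapter = rollup(EXPORTS_BY_HEADING, 2)  # 2-digit view
by_heading = rollup(EXPORTS_BY_HEADING, 4)  # 4-digit view

print(top_category(by_chapter))  # ('84', 1050): reads only as "machinery"
print(top_category(by_heading))  # ('8411', 900): turbine engines stand out
```

At two digits the leading category is simply chapter 84. Only the 4-digit view shows that most of that total comes from turbine engines rather than from machinery in general.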

Building Access to Data

A profusion of high quality data is worthless unless data users can gain access to the information quickly and efficiently. A second set of speakers at the conference explored alternative systems of data delivery.

As many data users know, Connecticut already has a website, ConneCT, which connects state residents to various state agencies. What the state does not have, but is currently developing, explains Robert Mitchell, Information Technology Administrator at the Department of Information Technology (DoIT), is a government portal like those found in other states, notably California. Mr. Mitchell’s presentation, “Initiative to Create a State Data Portal,” explained the distinction and apprised conference participants of the State’s progress.

A website, like ConneCT, is organized by structure of government and provides links to key agencies but lacks integration or a common look or feel from one agency site to another. What’s more, the site provides users with few opportunities to transact business. The purpose of a portal, by contrast, is to move residents out of waiting lines at state agencies and put them “on-line” at the portal. A portal offers users a single point of access to government information and services that is open 24 hours and is organized according to the interests and needs of consumers. Thus, functional categories such as “business,” “living,” and “health,” will serve as “virtual agencies” and guide residents through the steps they need to take to get what they’re after. And the site can be personalized to provide weather, traffic information, e-mail, notices, events and other announcements. Right now, the Connecticut portal is still under development, but it has passed some major hurdles on the way to its eventual introduction.

Connecticut is also an active participant in the development of the Federated Electronic Research, Review, Extraction and Tabulation Tool (FERRETT) cooperative data program. FERRETT, developed by the Census Bureau and Bureau of Labor Statistics, is a generalized search system for extracting and tabulating data across heterogeneous statistical data sources. The FERRETT program supports metadata searches across surveys (metadata is basically a summary of information about the data itself, for example what types of data fields will be in the file), on-the-fly variable recoding, complex tabulations, on-line statistical analysis and graphics. Census and BLS work with statistical agencies at the federal, state and local levels of government to improve the capabilities of the FERRETT system.

Patrick McGlamery, map librarian at the University of Connecticut Libraries Map and Geographic Information Center (MAGIC), discussed the University’s participation in the FERRETT program in his presentation “Searching Through the UConn Portal.” UConn’s participation in FERRETT began after the first Connecticut State Data Conference in May 1999, when conference participants identified a serious shortage of complete, readily available and easily accessible data for the state of Connecticut. McGlamery and UCCGIA Director Robert Cromley teamed with Census and BLS to build a Connecticut link to federal, state and local data that is accessible through FERRETT.

The association, according to McGlamery, has been a productive one. Visitors now have access to 25,000 data sets at the UConn site. And Connecticut’s participation in the program has helped to extend FERRETT’s capabilities. The primacy of town-level as opposed to county-level data in Connecticut has required modifying the FERRETT metadata and interface structures to enable the program to work with town data, a task administered by Ann Green at Yale University. And the geographic and mapping resources of MAGIC at UConn are helping to strengthen FERRETT’s graphical capabilities by developing a system that allows the Internet cartographic display and distribution of demographic information. Work on this project should be completed early in 2002.

The Connecticut Economic Resource Center (CERC) is also set to roll out a new subscription-based state data service called DataFinder. According to CERC’s Vice President of Research, Jeff Blodgett, who introduced this new web-based resource in a presentation entitled “A Comprehensive State Data Base,” DataFinder is a sophisticated, menu-driven data dissemination and mapping tool. Users begin by selecting a geographic area, which includes customizable geographies such as drive-time areas, and then, with a few clicks of a mouse, can choose from among a number of data reporting options such as themed maps, survey reports, comparison reports, rank reports and business lists. DataFinder allows users to produce cross-comparisons from among a host of demographic, economic, business, spatial and local government fiscal information. DataFinder also incorporates several database components that are unique to Connecticut, including 180 separate town database series, geocoded real estate locations, and a list of more than 122,000 businesses together with their employment ranges, SIC codes and geocoded locations.

Digital Spatial Mapping

The value of having a rich data depository is often uncertain until a moment of crisis arises. One such moment occurred on September 11, 2001.

The City of New York had, for some 20 years, been building a GIS capability, explained Richard Goodden, VP of PlanGraphics’ Eastern Region. But until just recently, the system remained a set of unconnected islands of GIS, with over 20 City agencies developing their own internal mapping capabilities. Then the City’s Department of Information Technology and Telecommunications (DoITT) set out to establish a GIS Utility or New York City Supermap to provide the basis for inter-agency mapping, GIS coordination and data sharing. In 1996 and 1997, DEP funded the first photo flyover to produce digital orthophotos and planimetric mapping of the city. These photos are the base map against which the other maps with block and parcel data, zoning, water systems, utilities, and the like can be calibrated.

DoITT’s GIS Utility serves as the primary City agent for inter-agency mapping, GIS coordination and data sharing. DoITT itself works to facilitate the widespread use of its base map; it collects, standardizes, documents and publishes key geographic data sets of interest to several City agencies and commercial entities; and it provides technical support to small agencies that lack internal GIS capabilities. The GIS Utility, or Supermap, resides on an IBM mainframe with data stored in the Oracle spatial format.

This integrated database with its “canned” digitized maps proved invaluable in the early days of the WTC crisis. Pre-attack orthophotos and street maps helped emergency personnel get their bearings amid the smoke-filled rubble. Vector maps of roads, subways, bridges, and electric, water and telephone utilities, along with post-attack photos, aided the rescue effort. And as the effort moved from rescue to recovery, thermal and lidar data became key, as did the modeling and analysis of substructure utility and transportation datasets, the coordination of efforts with FEMA, and the development of a secure web access site for the FDNY. As critical as the GIS capability was to managing the WTC disaster, the event made plain one undeniable truth: to wait until the need arises to develop an information capability is to wait too long.

The City of Boston is also busy turning out an integrated digital mapping system as Martin von Wyss, of the Boston Atlas, explained. According to von Wyss, the City knew it had an information problem when it found itself disseminating maps that were as many as 30 years old. So the Boston Redevelopment Authority teamed with the founder of a digital media technology company, a software programmer and city data providers to develop the Boston Atlas, BRA’s on-line mapping application. The Boston Atlas uses recent GIS data, including planimetric, land use, and ward and precinct maps, as well as aerial photos from various city organizations, and is an invaluable resource for planners, architects, developers, and students. The Atlas allows users to find information about city addresses, make and print high-quality maps, and download data for use in GIS and CAD applications.

Despite the successes of the New York Supermap and the Boston Atlas, Connecticut has failed to take the necessary first step toward creating a similar “data infrastructure” for the state, according to Deborah Dumin, a GIS research specialist at the Connecticut Department of Environmental Protection. As the New York and Boston examples show, a Connecticut Supermap could provide the data infrastructure needed to help manage the state’s physical infrastructure and serve as an invaluable tool for municipal planning, tracking land use, emergency response and damage assessment, to name just a few applications. Currently, however, as Dumin explains, Connecticut remains “data rich” and “information poor.” Many separate towns and state agencies collect valuable information, but because we lack a common shared base of facts in the form of a detailed base map of the state, it is very difficult to coordinate and integrate these disparate projects. Since these separate activities would have far more value were they linked together through a common base mapping system, such a map would help the towns and the state better leverage their other investments.

In 2000, the Department of Environmental Protection tried to get its digital mapping initiative off the ground by producing a digital prototype of a base map for North Branford. Aircraft flying at 6,000 feet took photos of the ground that were scanned at a resolution of eight-tenths of a foot and viewable at a scale of 1:40, making the map suitable for municipal planning purposes. But before the project could expand to other areas of the state, and despite the Governor’s support for the program, the state legislature pulled the project’s funding. Some towns are doing aerial surveys of their own, and DEP continues to work with other mapping formats, such as airborne lasers and color infrared, oblique and conventional photography with a plan to eventually make such products available to state residents on the Internet. But until the legislature acts to support a digital mapping initiative, it is unlikely Connecticut will have the same sophisticated data infrastructure that New York and Boston now benefit from.