Methodology

HOUSES Index

The HOUSES composite index is derived from individual housing features by linking address information to enumerated real property data that is available from local government assessors' offices.

The program applies principal component factor analysis to real property data features of housing and neighborhood socioeconomic status (SES) items. The factor analysis results are then pared down to the following four real property variables:

  • Housing value.
  • Square footage of housing unit.
  • Number of bedrooms.
  • Number of bathrooms.

To formulate the HOUSES Index, individuals' addresses at the time of query are geocoded. Geocoding allows users to match study addresses to geographic reference data and to the real property data of a housing unit. Each of the four property variables listed above is standardized into a z-score for the individual's address, and the four z-scores are aggregated into an overall z-score, such that a higher HOUSES score indicates higher SES.
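The standardize-and-aggregate step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the HOUSES Program's implementation: the field names (value, sqft, bedrooms, bathrooms), the function name houses_scores, the sample records, and the choice to aggregate by summing the four z-scores are all assumptions made for the example.

```python
from statistics import mean, pstdev

# Hypothetical field names for the four real property variables.
FEATURES = ("value", "sqft", "bedrooms", "bathrooms")


def houses_scores(properties):
    """Return one composite score per property.

    Each feature is standardized into a z-score against the reference
    set passed in, and the four z-scores are summed (summing is an
    assumption; the text only says the z-scores are aggregated).
    """
    # Mean and population standard deviation of each feature.
    stats = {}
    for feature in FEATURES:
        column = [p[feature] for p in properties]
        stats[feature] = (mean(column), pstdev(column))

    scores = []
    for p in properties:
        total = 0.0
        for feature in FEATURES:
            mu, sigma = stats[feature]
            # A feature with no variation contributes nothing to the score.
            total += (p[feature] - mu) / sigma if sigma > 0 else 0.0
        scores.append(total)
    return scores


# Made-up sample records, for illustration only.
sample = [
    {"value": 150_000, "sqft": 900, "bedrooms": 2, "bathrooms": 1},
    {"value": 250_000, "sqft": 1500, "bedrooms": 3, "bathrooms": 2},
    {"value": 400_000, "sqft": 2400, "bedrooms": 4, "bathrooms": 3},
]
scores = houses_scores(sample)
```

In this sketch the property with the largest values on all four variables receives the highest score, which matches the convention that a higher HOUSES score indicates higher SES.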

HOUSES is standardized within each county based on the real property data available for a given year, because real property data is collected and updated by the county assessor's office for tax purposes. The HOUSES z-score can then be converted to percentiles, quartiles and deciles, if needed. For example, some studies use HOUSES in quartiles with Q1 representing the underserved population with the lowest SES and Q4 representing the population with the highest SES.
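As a rough illustration of the quartile conversion, the following Python sketch ranks scores within each county and assigns Q1 to the lowest quarter and Q4 to the highest. The function name assign_quartiles, the rank-based cut rule, the county labels and the sample scores are assumptions for the example; the program's actual cut points are not described here.

```python
from collections import defaultdict


def assign_quartiles(records):
    """Map (county, score) pairs to quartiles 1..4, ranked within county.

    Quartile 1 holds the lowest scores in a county and quartile 4 the
    highest. Ties are broken by input order, which is a simplification.
    """
    by_county = defaultdict(list)
    for idx, (county, score) in enumerate(records):
        by_county[county].append((score, idx))

    quartiles = [None] * len(records)
    for items in by_county.values():
        items.sort()
        n = len(items)
        for rank, (_, idx) in enumerate(items):
            # rank runs 0..n-1, so this yields a value from 1 to 4.
            quartiles[idx] = rank * 4 // n + 1
    return quartiles


# Made-up scores for two hypothetical counties, "A" and "B".
records = [
    ("A", -2.0), ("A", 0.5), ("A", -0.3), ("A", 1.9),
    ("B", 3.0), ("B", -1.0), ("B", 0.0), ("B", 1.0),
]
quartiles = assign_quartiles(records)
```

Because ranking happens separately for each county, a score of 1.0 falls in Q3 in county "B" here, regardless of how it would rank among the scores in county "A".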

HOUSES can overcome the paucity of conventional SES measures in commonly used datasets, such as administrative datasets derived from electronic health records. The lack of access to SES measures in commonly used datasets has been an important obstacle to health disparities research.

HOUSES Cloud for the United States

To make HOUSES scalable, an automated cloud-based system was developed through a National Institute on Aging grant: NIH R21AG065639. The cloud-based system automates the HOUSES formulation process using the established algorithms. Preliminary data showed a nearly perfect match with manual formulation of the HOUSES Index.

For example, after an iterative training process with data from Olmsted County, Minnesota, a HOUSES Index was formulated for Ramsey County, Minnesota, which includes the capital city of St. Paul. The HOUSES Index z-scores formulated by the HOUSES Cloud were then compared against gold standard human-formulated data. The average difference between the HOUSES Index from the HOUSES Cloud and the gold standard data was negligible. Because property data is generally updated once a year, HOUSES can be automatically updated annually so that it more accurately reflects the current socioeconomic status of people living at a given address.
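A comparison of this kind, between automated and manually formulated z-scores, can be summarized with a mean difference and a mean absolute difference. The Python sketch below shows the arithmetic only; the function name mean_differences and the sample values are invented for illustration and are not HOUSES results.

```python
def mean_differences(cloud, manual):
    """Return (mean difference, mean absolute difference) of paired scores."""
    if not cloud or len(cloud) != len(manual):
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [c - m for c, m in zip(cloud, manual)]
    mean_diff = sum(diffs) / len(diffs)
    mean_abs_diff = sum(abs(d) for d in diffs) / len(diffs)
    return mean_diff, mean_abs_diff


# Invented paired z-scores for the same three addresses.
cloud_scores = [0.5, -1.0, 2.0]
manual_scores = [0.5, -1.25, 2.25]
mean_diff, mean_abs_diff = mean_differences(cloud_scores, manual_scores)
```

Reporting both figures is a design choice in this sketch: positive and negative differences can cancel in the plain mean, while the mean absolute difference shows how far apart the paired scores typically are.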

With the availability of the HOUSES Cloud, the HOUSES Index can be calculated for all counties in all 50 states of the U.S. Users authenticate and access the web application to perform a HOUSES lookup via search and small batch uploads. Application programming interface (API) clients, such as those with electronic health records, authenticate directly to the API server to perform HOUSES lookup requests. All HOUSES Program services and requests are secure and private.

Request services

To request services, complete the HOUSES Request Form.