Personally Identifiable Financial Information (PIFI)

PIFI is defined in Securities and Exchange Commission (SEC), Final Rule: Privacy of Consumer Financial Information (Regulation S-P) 17 CFR Part 248 (2000). Available at:

Both the GLBA and the regulations define nonpublic personal information (NPI) in terms of PIFI.
The GLBA does not define PIFI, but the FTC regulations define the term to mean any information:
(i) A consumer provides to you [the financial institution] to obtain a financial product or service from you;
(ii) About a consumer resulting from any transaction involving a financial product or service between you and a consumer; or
(iii) You otherwise obtain about a consumer in connection with providing a financial product or service to that consumer.

Medical Data Exploitation

Administration, Census Bureau, and Department of Veterans’ Affairs all maintain extensive collections of genetic data. Since May 1998, sex offenders have been required to surrender DNA samples to federal databases, and today every state maintains its own DNA database that contains the DNA profiles of felons—and of others, including people merely suspected of crimes or even of innocent people rounded up in DNA sweeps. The samples of 450,000 convicts are stored with identifiers, such as the person’s name, description, criminal record, Social Security number, and image. The government has also sponsored the creation of national databases, such as the FBI’s Combined DNA Index System (CODIS), which stores DNA samples, most without identifying information. CODIS went online in 1998 with samples from 8,000 convicted child molesters, and by 2001, it contained the profiles of 1.5 million felons. In 2002, the U.S. Attorney General ordered the FBI to expand CODIS to 50 million profiles, and by 2004, CODIS stored 2.6 million samples containing the DNA of people convicted of almost any crime. In October 2005, the Senate Judiciary Committee approved a law, which was pending when this book went to print, to force anyone who is merely detained by federal authorities to provide DNA, and in August 2006 the database contained more than 3.5 million samples. The FBI predicts that CODIS will accommodate 50 million samples “in the near future.”

Besides harboring the markers for four thousand disease risks, DNA also contains information about the health and identity of one’s forebears and descendants. With a sample of your DNA, a person can predict certain disease and disorder probabilities for you and for your children. George Annas, a law professor and bioethicist at Boston University, has referred to one’s DNA profile as a “future coded diary,” and with the completion of the Human Genome Project, the code has essentially been broken. Therefore, taking the fingerprints of an arrestee and taking a sample of his DNA are not comparable acts; the latter is far more intrusive and revealing—but far less likely to yield a uniquely definitive identification.

Medical Apartheid: The Dark History of Medical Experimentation on Black Americans from Colonial Times to the Present by Harriet A. Washington

Why a Muslim Registry is a Bad Idea

Originally posted on Quora in answer to the question “What is so bad about a Muslim registry?”

I am going to provide an IT perspective on this question.

Yes, that’s right, an Information Technology, computers and the-people-who-deal-with-the-machines-collecting-and-crunching-the-data-perspective.

Why? Because when someone decides to ‘create a registry’ someone (similar to myself) is tasked with the job of creating a database AND reports generated by that database.

As illustrated by the wonderful commentary generated by the Y2Gay database discussions, IT has an important perspective on these things: Gay marriage: the database engineering perspective

Creating Databases Means Identifying Key Data

When a database is created, the first thing that must be done is simply this:

  1. identify the key data being collected
  2. identify the reports and other deliverables created by the database

While you might think the second item could be restated as ‘identify the reason for the database,’ nothing could be further from the truth. When dealing with non-computer people it is not unusual to have someone demand that a database be created to gather information “that creates a positive customer service experience for our customers!”…or something equally unclear yet very pep-rally appropriate. Then, after talking to multiple people and FINALLY getting them to explain what, exactly, they are going to DO with the data, the unofficial and IT-specific purpose changes to: “create a mailing list.”

This is one of those near-universal experiences people in IT like to laugh and complain about. It applies to government and private sector equally.
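To make the two steps concrete for the mailing-list case, here is a minimal sketch. The `Customer` fields and the `mailing_list` function are hypothetical names chosen for illustration, not anything from a real system; the point is only that the deliverable (step 2) dictates which key data (step 1) is worth collecting.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Step 1: the key data. Only the fields the report needs (hypothetical).
    name: str
    email: str
    opted_in: bool


def mailing_list(customers):
    """Step 2: the deliverable, a sorted list of opted-in email addresses."""
    return sorted(c.email for c in customers if c.opted_in)


customers = [
    Customer("Ana", "ana@example.com", True),
    Customer("Ben", "ben@example.com", False),
    Customer("Cy", "cy@example.com", True),
]
print(mailing_list(customers))
```

Once the real deliverable is known, anything that does not feed it has no obvious reason to be collected.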

So, in the case of a Muslim registry, the first step (key data) is partially addressed in the notes included with this question:

I am shocked there is not already a registry of ALL citizens with info such as race, gender, religion, languages etc. At least a Muslim registry is a step in the right direction, seeing as a great threat to America happens to belong to a single religion (I know most Muslims are not terrorists).

As noted in other answers, many of these elements are already gathered through existing databases, like the US Census and ID Cards.

Don’t Make Me Fill Out ANOTHER Form!

When data is already being collected and reported, the individuals responsible for that data tend to get cranky when someone comes in and asks them to fill out another form, create another report, and generally re-enter the same stuff AGAIN. There’s also the possibility of introducing errors into the data source when it’s being created/generated/imported/modified multiple times by multiple people.

Of the items listed, everything except religion and language is already included on driver’s licenses and state IDs. The data collected by the DMV is free and available to the general public (personally, I do NOT agree with this massive dumping of personal information…but I digress) so an enterprising database designer could…potentially…import the DMV data and connect it to the missing elements: religion and language.

With the right connections and political power, it could also connect to the state and federal databases containing anyone and everyone who has ever been arrested for any reason (including those cleared as innocent) AND the databases maintained by the Department of Homeland Security, the no-fly list, and even the records maintained by public schools. Several of these databases INCLUDE religion and race.

In short, we COULD create a complete profile on every person residing within the United States neatly coordinated within a single location just by importing already existing data.

Explain to me…again…why we are doing this?

That brings us to the second question – what, specifically, is going to be DONE with this data?

Since the Muslim registry enters into the network of existing information specifically for the purpose of:

  1. collecting religion and language
  2. identifying terrorists
  3. focusing specifically on Muslims as potential terrorists

Then the database being created is more like a report-generating app that connects all existing data and spits out lists of known Muslims, their home addresses, the schools they attend, the languages they speak, connections to known terrorist groups, their places of worship, and anything else that might be deemed important.

Presumably, this information would be provided to people in the field, who would add information to individual files, as needed.

As an IT person, I’m thinking: soooo…you want to re-create the Department of Homeland Security?

As noted above, all of this information already exists and it is a well-known fact that federal agencies have made a concerted effort to connect and share data. I guarantee you, this sort of thing already exists – along with similar reports on every religion, hate group, environmentalist group, activist community and whatever else someone in ANY federal level policing agency (or state level or whatever) might deem important to know…for whatever reason.

In fact, if human behavior remains consistent (and it usually does) there are probably databases and reports that focus on individuals, groups and concerns going back to the beginning of data collection – and people working in all levels of law enforcement who occasionally stumble across these things and scratch their heads wondering why in holy hades do they even HAVE this?

Duplicate with different purpose…and the reason is what?

So, again, why are we building this?

Now we are getting down to brass tacks. The key term here is registry.

A registry managed by the government contains data on people that is made publicly available (Public Records Online). A category-specific registry is usually (always?) focused on presenting information about people who are deemed dangerous enough to warn the general public on a permanent basis.

For example:

Therefore, this isn’t data collection, this is data distribution to the general public.

What’s Wrong With a Muslim Registry?

Creating a database of all individuals who associate with a specific religion and making it publicly available for the express purpose of warning all individuals NOT associated with that religion to be wary of interaction due to potential terrorism…

Yeah, that’s a problem.

Why? Because that’s not purpose-driven data collection, that’s propaganda.

I suggest reading any of the other posts that focus on the registries maintained by the Nazis or the crimes committed against the Japanese here in the USA during WWII. I’m sure there are other equally powerful examples and all of them come down to the same thing: when the government ostracizes a group of people and generates a marketing campaign that vilifies all members of said group…and a registry would achieve that goal (and ONLY that goal)…then bad things happen.

Really bad things.

We don’t need that here in the United States.

Data Analysis: Who Are The Homeless (2016)?

Trying to place ‘most of the homeless’ into different categories of deserving and undeserving poor is a common element in virtually every conversation or debate about homelessness, poverty and poverty survivors. The numbers are often sliced, diced and presented in a dozen different ways, making comparative analysis and logical conclusions difficult (at best).

This is my attempt to collect current data (available freely through the internet) and present the numbers in a reasonably easy-to-understand manner.

The primary question being answered: Who are the homeless?

Data Differences

I’ve included a list of links to key sources of data on Homelessness at the end of this post. All of these resources are academically respected and frequently cited in articles and other forms of research. Unfortunately, the data presented often contains inconsistencies that must be identified and addressed before completing a truly effective analysis or a clear presentation of high-level data. These inconsistencies do not negate the quality of the data or the effectiveness of the research, they are simply the natural outcome of a data-collection survey that (literally) examines millions of people.

For the purposes of this post, I have decided to focus on presenting the data contained within a single data source: HUD Exchange.

Step 1: Analysis of Data Collection Techniques

It’s important to begin the analysis by getting an understanding of the methods used during collection. An examination of the HUD Point in Time (PIT) Count Implementation Tools provides the following key details:

Data Collection Personnel consist of average people (volunteers), professionals in the ‘Helping Services’ (e.g.: homeless shelter workers), formerly homeless people and currently homeless people. According to the Tips For Including People Experiencing Homelessness (PDF) reference sheet, formerly and currently homeless are employed as Subject Matter Experts (SME) and may or may not be paid for their assistance.

Pre-Selected Data Categories outlined in the PIT Count Planning Worksheet (PDF) are detailed, extensive and specific. The data that collection personnel are expecting to find and, therefore, seeking out is clearly defined. The data outlined in the Sub-population Crosswalk (PDF) survey instrument is limited and specific. In this instrument, the data that collection personnel are expecting to find and, therefore, seeking out is both clearly defined and restrictive, presenting the possibility of missed data points (e.g.: people not included in the count because they are not ‘real’ homeless) or inflated/inaccurate data points (e.g.: placing people in non-applicable categories for the sole purpose of including the data somewhere). The data points contained within the Sub-population Crosswalk (PDF) survey instrument are as follows:

  • Chronically Homeless Individuals or Families (based on family head of household)
  • Veteran
  • Adults with a Serious Mental Illness
  • Adults with a Substance Use Disorder
  • Adults with HIV/AIDS
  • Victims of Domestic Violence

Database Affected Data Points are a possibility due to the nature of the data contained within the Homeless Management Information System (HMIS), as outlined in the Sheltered PIT Count and HMIS Data Element Crosswalk (PDF) guide. Based on the categories and subcategories of data contained within the database, the expectations surrounding the situations of all homeless people are clearly defined. This presents the possibility of inaccurate or inflated data points resulting from workers trying to find a way to enter data into the database.

Flexibility of Toolkit in Data Collection suggests that more extensive and (potentially) accurate data is being collected than may (or may not) be found within the databases. The Point In Time (PIT) Survey Tools (HTML) include forms that specifically address situations where the individuals conducting the survey are unable to talk to the individuals being counted and the answers include ‘unsure’ or ‘unknown.’ In other words, a data point may consist of a family that is found sleeping outside, but the collector is unable to access the location or communicate with the individuals in question (e.g.: does not want to wake them up, cannot speak their language, etc.) so it is not possible to verify whether the family is truly homeless or dealing with some other situation.

Inherent Data Collection Problems are centered around evaluating the entirety of a poverty survivor’s situation through distant observation or a single face-to-face interaction. Accurately identifying an age or race can be extremely difficult under these circumstances. Correctly evaluating mental health and assessing whether or not an individual meets the ‘chronically homeless’ definition are nearly impossible.

Inherent Data Collection Strengths are in the total count of human bodies. The PIT count provides a total number of people who are living on the street or in shelters during a specific period of time. While exact ages are difficult to pinpoint, total numbers of individuals falling within pre-defined age ranges are reasonably reliable. Total numbers of family groups and children, teens, adults and the elderly who are trying to survive the streets alone are also reasonably reliable. Therefore, the strength is in the reasonable reliability of the high-level total counts.

Step 2: General Examination of Raw Data

Positive: The HUD data is clear and easy to decipher. It provides total counts for high-level categories, divided by state and geographic region.

Negative: The revisions tab lists changes to historic data that have occurred since 2007 (earliest available data). The changes listed are significant. However, the changes are also limited to select portions of historic data and do not indicate that equally significant changes will be made throughout all bodies of data.

Step 3: Data Analysis

Based on my analysis of the data collection methods, I focused on the strongest data points available.

Percentages: All percentages are a comparison to the total number of homeless people in the United States during the 2016 PIT count. Because the totals change, depending on the data being presented, the specific totals used to generate the percentages are included at the top of each chart followed by 100%.
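As a minimal sketch of the percentage calculation described above, the snippet below divides each count by the total at the top of its chart. The counts are the 2016 PIT figures used in this post; the function name `pct_of` and the two-decimal rounding are my own assumptions about how the figures were produced.

```python
# Counts are the 2016 PIT figures presented in this post.
TOTAL_HOMELESS = 549_928
SHELTERED = 373_571
UNSHELTERED = 176_357


def pct_of(count: int, total: int) -> str:
    """Format count as a percentage of total, rounded to two decimals."""
    return f"{count / total * 100:.2f}%"


rows = [
    ("Total Homeless, 2016", TOTAL_HOMELESS),
    ("Sheltered Homeless, 2016", SHELTERED),
    ("Unsheltered Homeless, 2016", UNSHELTERED),
]

# The first row is the chart total, so it always comes out as 100.00%.
for label, count in rows:
    print(f"{label:<28}{count:>9,}{pct_of(count, TOTAL_HOMELESS):>10}")
```

Swapping the denominator (for example, using the sheltered total instead of the overall total) is what makes the same count show up with different percentages from chart to chart.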

Children and Youth: The data provided by HUD does not include a high-level total count of all children and/or youth included in the PIT count. There are several subcategories focusing on children and youth and I have summed these categories to create a rough total, but I suspect this number represents BOTH overlap in data categories and a significantly deflated total. Without knowing the total number of children and youth that are included in the total number of people ‘In Families’ it’s impossible to calculate the total number of children and youth.

Total Homeless, 2016
Category                      Count  Percentage
Total Homeless, 2016        549,928     100.00%
Sheltered Homeless, 2016    373,571      67.93%
Unsheltered Homeless, 2016  176,357      32.07%


Sheltered Homeless, 2016
Category                                       Count  Percentage
Sheltered Homeless, 2016                     373,571     100.00%
Sheltered Homeless Individuals, 2016         198,008      53.00%
Sheltered Homeless People in Families, 2016  175,563      47.00%


Unsheltered Homeless, 2016
Category                                         Count  Percentage
Unsheltered Homeless, 2016                     176,357     100.00%
Unsheltered Homeless Individuals, 2016         157,204      89.14%
Unsheltered Homeless People in Families, 2016   19,153      10.86%


Families Vs Individuals, 2016
Category                             Count  Percentage
Total Homeless, 2016               549,928     100.00%
Homeless Individuals, 2016         355,212      64.59%
Homeless People in Families, 2016  194,716      35.41%


Homeless Subcategories, 2016
Category                                                 Count  Percentage
Total Homeless, 2016                                   549,928     100.00%
Total Subcategories                                    248,419      45.17%
Total Youth and Children                               104,474      19.00%
Homeless Unaccompanied Youth (Under 25), 2016           35,686       6.49%
Homeless Unaccompanied Children (Under 18), 2016         3,824       0.70%
Homeless Unaccompanied Young Adults (Age 18-24), 2016   31,862       5.79%
Parenting Youth (Under 25), 2016                         9,892       1.80%
Parenting Youth Under 18, 2016                              92       0.02%
Parenting Youth Age 18-24, 2016                          9,800       1.78%
Children of Parenting Youth, 2016                       13,318       2.42%
Homeless Veterans, 2016                                 39,471       7.18%
Chronically Homeless, 2016                              86,132      15.66%
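
The overlap mentioned in the Children and Youth note can be checked directly against the subcategory counts above. This is a rough sketch, and it rests on an assumption the HUD data appears to support but that I have not confirmed: that the ‘Under 18’ and ‘Age 18-24’ rows are components of the matching ‘Under 25’ rows. The variable names are mine.

```python
# 2016 PIT subcategory counts as presented in this post.
unaccompanied_youth_under_25 = 35_686
unaccompanied_children_under_18 = 3_824
unaccompanied_young_adults_18_24 = 31_862
parenting_youth_under_25 = 9_892
parenting_youth_under_18 = 92
parenting_youth_18_24 = 9_800
children_of_parenting_youth = 13_318

# The age-band rows add up exactly to their "Under 25" parent rows,
# which suggests the parent rows already include them.
assert (unaccompanied_children_under_18
        + unaccompanied_young_adults_18_24) == unaccompanied_youth_under_25
assert parenting_youth_under_18 + parenting_youth_18_24 == parenting_youth_under_25

# Summing all seven rows reproduces "Total Youth and Children".
naive_total = (unaccompanied_youth_under_25 + unaccompanied_children_under_18
               + unaccompanied_young_adults_18_24 + parenting_youth_under_25
               + parenting_youth_under_18 + parenting_youth_18_24
               + children_of_parenting_youth)

# Counting only the parent rows removes that double counting (assumption:
# the three remaining categories do not overlap with each other).
deduplicated_total = (unaccompanied_youth_under_25 + parenting_youth_under_25
                      + children_of_parenting_youth)

print(f"Naive sum:        {naive_total:,}")
print(f"Without overlap:  {deduplicated_total:,}")
```

Even the smaller figure leaves out children living in families that are not headed by a parenting youth, so it is still a deflated count of all children and youth.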

Data Sources

Data Provided by Homeless Services Providers (smaller scale)