Collect personal information from users, such as their names or addresses, in a single field rather than multiple fields.
The formats of these data are too diverse to split reliably into separate fields, and catering to the 90% alienates those who do not match our concept of normality.
We cannot decide for our users the formats of their data.
Teller doesn’t have a last name
You have probably created a form that looks a lot like this one, splitting discrete data into more than one abstract input.
That’s okay, we will forgive you and move on.
The concept that a person is always identified by a combination of first and last names is a Western invention.
We can extend our applications’ potential user bases to include those who cannot truthfully fill in two fields for their name.
This is good business sense: we are always trying to get more registrations, and allowing people to specify their own names without any constraints means more of them can use our site.
Edward Codd’s first normal form for relational databases states that “values… are required to be atomic with respect to the DBMS.”1
While this may have been understood to mean that each atomic unit of data such as names and addresses should be separated, it is important to remember that these values should only be atomic within our use case.
For our purposes, multiple fields for data that we only use in clumps end up being more of a burden than a benefit.
As Chris Date points out, “the notion of atomicity has no absolute meaning.”2 This means that data should be considered as atomic when it is in the smallest unit we will need for querying the database. Values such as we discuss here are therefore atomic for our purposes.
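As a minimal sketch of what "atomic for our purposes" could look like in storage, here is a table with one free-text column each for name and address. It uses Python's built-in sqlite3 module; the table name, the column names, and the two helper functions are assumptions made for illustration, not anything prescribed by Codd or Date.

```python
import sqlite3

# In-memory database for the example; a real application would use a file or server.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: the whole name and the whole address are each one value,
# because this application only ever reads them back as a unit.
conn.execute(
    """
    CREATE TABLE users (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,  -- stored exactly as the user typed it
        address TEXT NOT NULL   -- free text, may span several lines
    )
    """
)


def add_user(name, address):
    """Insert a user and return the id of the new row."""
    cur = conn.execute(
        "INSERT INTO users (name, address) VALUES (?, ?)", (name, address)
    )
    conn.commit()
    return cur.lastrowid


def get_user(user_id):
    """Return (name, address) for the given id, or None if there is no such row."""
    return conn.execute(
        "SELECT name, address FROM users WHERE id = ?", (user_id,)
    ).fetchone()
```

If the application later needs to query by some part of the value, such as a postal code, that part becomes a unit we query on, and only then would a separate column be justified.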
These are people
Different cultures tend to vary in their ideas about the order of given names and surnames.
In many prominent cultures, names are organized such that the surname or family name is first and the given name is last, which reflects the importance of duty to family and respect for one’s ancestors. In others it is common to go by first name only.
This isn’t just a rockstar/celebrity thing. It is also a cultural norm.
Consider what it would be like if others always addressed you by your last name: formal and impersonal.
Typically when using only a first name, we would be trying to impart a more friendly feeling, but this does not translate well across cultures.
It is best to always address someone either as they have asked us, or by their full name.
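The rule above can be sketched in a few lines of Python. The `User` class, the optional `preferred_name` field (think of a "What should we call you?" input), and the `salutation` function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str                             # full name, exactly as entered
    preferred_name: Optional[str] = None  # hypothetical optional field


def salutation(user: User) -> str:
    """Return the name the user asked for, falling back to the full name.

    The full name is never split on whitespace to guess at a "first name",
    since we cannot know which part, if any, the person goes by.
    """
    preferred = (user.preferred_name or "").strip()
    return preferred if preferred else user.name.strip()
```

A greeting such as `"Hello, " + salutation(user)` then uses either what the person asked for or their full name, never a fragment we guessed at.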
Middle of Nowhere, Alaska
Even in the US, there are addresses that don’t match the typical four fields websites like to collect.
In many of Alaska’s small villages and towns, there is no need for something like a street address. Sometimes, villages are as small as a dozen people.
“Sourdough Sam, Cantwell, Alaska” would be a perfectly serviceable address in a place like that.
I once received mail addressed to “FNBA Main Branch Attn: Caleb Thompson, Downtown Anchorage, Alaska”.
A text field for an address is the best way to capture the information.
Validation of address formats is what we refer to as a core competency: entire companies are built around that functionality, and it is not easy to duplicate.
Remember: if a user would like you to have their address so that you can ship them something, they will make sure the address is correct themselves.
If they do not want you to have that information, 1600 Pennsylvania Avenue, Washington, DC, 20500 is a perfectly valid address and keeps their own information private from you.
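If we accept that the user knows their own address best, the server-side handling of a free-text field can stay very small. The following is a sketch under that assumption; the function name `clean_free_text` and the error message are made up for illustration. It tidies whitespace and rejects only empty input, with no attempt to validate the format.

```python
def clean_free_text(value: str) -> str:
    """Normalize line endings and trim surrounding whitespace.

    The only input rejected is input that is empty after trimming. The
    structure of the name or address is deliberately left alone.
    """
    cleaned = value.replace("\r\n", "\n").replace("\r", "\n").strip()
    if not cleaned:
        raise ValueError("This field is required.")
    return cleaned
```

Interior line breaks are kept, so a multi-line address entered in a textarea comes back out the way the user wrote it.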
A field guide for users
A better form might look more like this one:
This form represents names and addresses as text fields. We have disabused ourselves of some unnecessary and harmful complexity. We allow the user to give us information more meaningful than we might otherwise get.
Besides all that, it is also a much less intimidating form to display to users, who now have two fields to fill rather than seven. That's a more than 70% improvement!
The W3C has put together a document on personal names. Of particular interest is the section on the implications of this naming diversity for field design.
For a more light-hearted explanation of basically the same point, I regularly enjoy Patrick McKenzie’s article, Falsehoods Programmers Believe About Names.
- Codd, E. F. The Relational Model for Database Management Version 2 (Addison-Wesley, 1990).
- Date, C. J. "What First Normal Form Really Means" in Date on Database: Writings 2000-2006 (Springer-Verlag, 2006), p. 112.