Methodology and Sources

Our data for first names in the United States comes from the Social Security Administration. You can read more about their data here, but keep a few important caveats in mind.

  1. To protect privacy, they only show names where at least 5 babies were born with that name and sex in a given year.
  2. They ignore punctuation and replace accented letters. Thus, Mary-Jane becomes Maryjane and José becomes Jose.
  3. They ignore names shorter than 2 letters.
  4. They truncate names longer than 15 letters.
  5. Their data is based on Social Security card applications and is incomplete for people born before 1937.
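
To make caveats 1 through 4 concrete, here is a minimal Python sketch of how a name might end up looking in the data. It is only an illustration of the rules listed above, not the Social Security Administration's actual procedure; the function names and constants are hypothetical, and we assume names are stored with only the first letter capitalized, as in the Maryjane example.

```python
import unicodedata

# Thresholds taken from the caveats above; the constant names are our own.
MIN_LETTERS = 2   # names shorter than this are ignored
MAX_LETTERS = 15  # names longer than this are truncated
MIN_COUNT = 5     # minimum babies per name, sex, and year to be shown


def normalize_name(raw):
    """Return the name roughly as it would appear in the data, or None if ignored."""
    # Split accented letters into a base letter plus a combining mark (é -> e + ´).
    decomposed = unicodedata.normalize("NFKD", raw)
    # Keep plain ASCII letters only: this drops the combining marks, hyphens,
    # apostrophes, and spaces. Letters with no decomposition (such as ø) are
    # simply dropped here, which is a limitation of this sketch.
    letters = "".join(c for c in decomposed if c.isascii() and c.isalpha())
    if len(letters) < MIN_LETTERS:
        return None
    return letters[:MAX_LETTERS].capitalize()


def is_shown(count):
    """Apply the privacy threshold from caveat 1."""
    return count >= MIN_COUNT
```

For example, normalize_name("Mary-Jane") gives "Maryjane" and normalize_name("José") gives "Jose", matching the examples in caveat 2.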

To estimate the average age of people who have a given name, we use actuarial data provided here. For example, we can see that, as of 2020, 68.617% of men and 78.935% of women who were born in 1950 are still alive.
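
As a rough sketch of how the two sources can be combined, the Python below weights each birth year by the number of people with that name who are estimated to still be alive. This is one plausible way to do the calculation, not necessarily the exact method we use. The function name and the sample numbers are hypothetical, except for the 68.617% survival figure for men born in 1950 quoted above. Because survival rates differ by sex, the function would be run separately for men and women.

```python
def average_age(births_by_year, survival_by_year, as_of_year=2020):
    """Estimate the mean age of living people with a given name.

    births_by_year:   {birth year: number of babies given the name}
    survival_by_year: {birth year: fraction of that cohort still alive}
    """
    total_living = 0.0
    total_age = 0.0
    for year, births in births_by_year.items():
        # Estimated number of people from this birth year who are still alive.
        living = births * survival_by_year[year]
        total_living += living
        total_age += living * (as_of_year - year)
    if total_living == 0:
        return None  # nobody with this name is estimated to be alive
    return total_age / total_living


# Hypothetical birth counts; the 2000 survival fraction is also made up.
births = {1950: 1000, 2000: 500}
survival = {1950: 0.68617, 2000: 0.98}
estimate = average_age(births, survival)
```

Note that older birth years count for less than their raw birth numbers suggest, because fewer of those people are still alive.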

If you have more specific questions about our data, don't hesitate to ask.