- Let me guess why you’re here
- What’s the problem
- Another way of thinking about the problem
- What’s the short term solution
- Wow, that formula is hard to read
- Up Next
Someone just gave you an Excel workbook where one of the columns is ZIP codes and many of them are 4 instead of 5 digits. The cities and states next to ZIP codes are all from New England,
and probably New Jersey and a few other places too. You no longer have the original text file and are stuck with the workbook.
The problem is that Excel is not very good at preserving data types. A five digit ZIP code (which despite the postal service introducing nine digit ones with a dash between the 5th and 6th digit people still generally use) looks like an integer to Excel. Excel treats it as such and eliminates any zeros in the front. However, it is not an integer, it is a code. Specifically, it is a postal code, which you want to use to store as a text string. This isn’t a problem as long as your address is west of Philadelphia, but it becomes a real headache with New England, New Jersey, and Caribbean ZIP codes.
For example, the ZIP code for the neighborhood of Back Bay in Boston should be 02116. However, Excel assumes that it is an integer and gives you 2116.
A good way of thinking about this is to look at the Postal Codes from other countries, which is what a ZIP Code really is. Here are a couple of examples what they look like in Canada and the United Kingdom, where they use alphanumeric postal codes.
|Country||Example Postal Code|
|United Kingdom||W11 2BQ|
You would never store these codes as integers. You would store them as strings. Excel would not mistake them for integers, even if they started with zeros.
What’s the short term solution:
This solution assumes that all of your ZIP codes are in the original 5 digits, and not the new format of 5 digits + 4 digits.
The short term solution is fairly easy. Let’s say the zip codes are in column E. Insert a column to the right of the zip code column E. Next, in the first row of the new column F, type the following formula:
Replicate the formula down the column.
In plain english, what the formula says is:
"If the length of the value in cell E2 is less than the 5 digits in a ZIP code, then concatenate the correct number of zeros to the front of the value of E2 and convert it to a text string. Else, just convert the value in E2 to a text string."
Let’s break it up into its components.
It is all nested in an Excel
=IF(condition, do if true, do if false)
LEN(E2) < 5 – is the ZIP code less than 5 digits.
Do if true clause
REPT(“0”,5-LEN(E2)) - repeat the character "0" (NOT the number 0, the text string value "0"). You want to make this a text code, not an integer. You want 5 minus the length of the less than 5
digit numbers because this is the exact number of zeros missing.
&TEXT(E2,”#”) – convert from an integer to a string
and concatenate the 4 or less character ZIP code to the string of zeros you just generated
Do if false clause
TEXT(E2,"#")) – you found a 5 character ZIP code, convert it to a string from digits.
Let’s say you had the original text file. You could prevent this
That will be discussed in The New England ZIP Code Problem: Part II