
CLDR Ticket #2600 (accepted defect)

Opened 6 years ago

Last modified 4 months ago

Problem with UK postcode Regex

Reported by: lloyd.watkin@…
Owned by: mark
Component: unknown

Description

UK postcodes tend to have the format:

AA1 1ZZ
A1 2ZZ
A11 2ZZ
AA11 2ZZ

http://bit.ly/bDiHTh and http://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Format

The regex in your standards does not allow for this space, which makes systems built upon your regex validation not very useful.

Would it be possible to correct this bug? The regex change is trivial.
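
As a rough illustration of how small such a change could be, here is a minimal Python sketch. The pattern and the `normalise` helper are hypothetical and deliberately loose: they only cover the four shapes listed above and are not the CLDR data. The point is that accepting an optional space takes a single ` ?`, after which a compact postcode can be rewritten into the spaced form.

```python
import re
from typing import Optional

# Hypothetical, deliberately loose pattern: one or two letters, one or two
# digits, an optional single space, then one digit and two letters.
LOOSE_POSTCODE = re.compile(r"([A-Z]{1,2}[0-9]{1,2}) ?([0-9][A-Z]{2})")


def normalise(postcode: str) -> Optional[str]:
    """Return the postcode in spaced form, or None if the shape is wrong."""
    match = LOOSE_POSTCODE.fullmatch(postcode.strip().upper())
    if match is None:
        return None
    # The last three characters are always emitted after a single space.
    return f"{match.group(1)} {match.group(2)}"


print(normalise("AA11ZZ"))   # AA1 1ZZ
print(normalise("a1 2zz"))   # A1 2ZZ
print(normalise("AA1 1Z"))   # None
```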

Change History

comment:1 Changed 6 years ago by yoshito

  • Owner changed from somebody to mark
  • Status changed from new to assigned

comment:2 in reply to: ↑ description Changed 6 years ago by dave@…

It's also worth noting the current regex

GIR[ ]?0AA|((AB|AL|B|BA|BB|BD|BH|BL|BN|BR|BS|BT|CA|CB|CF|CH|CM|CO|CR|CT|CV|CW|DA|DD|DE|DG|DH|DL|DN|DT|DY|E|EC|EH|EN|EX|FK|FY|G|GL|GY|GU|HA|HD|HG|HP|HR|HS|HU|HX|IG|IM|IP|IV|JE|KA|KT|KW|KY|L|LA|LD|LE|LL|LN|LS|LU|M|ME|MK|ML|N|NE|NG|NN|NP|NR|NW|OL|OX|PA|PE|PH|PL|PO|PR|RG|RH|RM|S|SA|SE|SG|SK|SL|SM|SN|SO|SP|SR|SS|ST|SW|SY|TA|TD|TF|TN|TQ|TR|TS|TW|UB|W|WA|WC|WD|WF|WN|WR|WS|WV|YO|ZE)(\d[\dA-Z]?[]?\d[ABD-HJLN-UW-Z]{2}))|BFPO[ ]?\d{1,4}

matches "SO452H", which isn't a valid postcode. It should match "SO452HL", but it doesn't: it reports a match ending 1 letter before the end of the string, which can be confirmed by adding ^ and $ to the regex.

The problem lies within "\d[ABD-HJLN-UW-Z]{2}": it matches 1 digit and any 1 of the letters listed, when it should match 1 digit and any 2 of the letters listed. Changing this to "[0-9][ABD-HJLN-UW-Z]{2}" resolves that issue.

And, as noted above by Lloyd, it fails to match "SO45 2HL" (note the space), which is the correct locale formatting for postcodes.
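
A possible explanation for both symptoms, offered as a sketch rather than a confirmed diagnosis: if the `[]?` in the quoted regex is literal (brackets with no space between them), then Python's `re` module treats the `]` directly after `[` as an ordinary character. In that case `[]?\d[ABD-HJLN-UW-Z]` is read as one character class, and `{2}` applies to that class as a whole. The snippet below keeps only the "SO" area for brevity. The names are illustrative, and the engine the reporters used is not stated in this ticket, so other engines may behave differently.

```python
import re

# Abbreviated form of the quoted pattern: only the "SO" area is kept, and
# the "[]?" is copied exactly as it appears in the comment.
AS_QUOTED = re.compile(r"SO(\d[\dA-Z]?[]?\d[ABD-HJLN-UW-Z]{2})")

# Same fragment with an explicit space inside the brackets. That this was
# the intended form is an assumption.
WITH_SPACE = re.compile(r"SO(\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2})")


def full(pattern, text):
    """True if the whole string matches, as if anchored with ^ and $."""
    return pattern.fullmatch(text) is not None


for text in ("SO452H", "SO452HL", "SO45 2HL"):
    print(text, full(AS_QUOTED, text), full(WITH_SPACE, text))
# SO452H True False
# SO452HL False True
# SO45 2HL False True

# Unanchored, the quoted form stops one letter short of the end.
print(AS_QUOTED.match("SO452HL").group(0))  # SO452H
```

Under these assumptions, putting the space back into the brackets is enough to fix both cases, and the `{2}` quantifier itself behaves as expected.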

This also fails to match overseas territories:
http://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Overseas_Territories
e.g. "AI-2640" or "PCRN 1ZZ".
I guess one could argue about which locale these belong to.

Official Standards:
http://www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards/address/bs7666_address.aspx

The standards adopt a "validate structure" approach rather than attempting to validate the postcode itself. They delegate that task to the PAF database, which isn't currently freely available, although there is movement to make it so (with equal opposition).

The structure validation listed in the official standard, BS7666, is:

(GIR 0AA)|((([A-Z-[QVX]][0-9][0-9]?)|(([A-Z-[QVX]][A-Z-[IJZ]][0-9][0-9]?)|(([A-Z-[QVX]][0-9][A-HJKSTUW])|([A-Z-[QVX]][A-Z-[IJZ]][0-9][ABEHMNPRVWXY])))) [0-9][A-Z-[CIKMOV]]{2}))

http://www.cabinetoffice.gov.uk/media/291293/bs7666-v2-0.xml

You will note it contains a space.
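
The `[A-Z-[QVX]]` notation in that pattern is XML Schema character class subtraction, which Python's `re` module does not support. The sketch below is a hand translation with the subtractions expanded into explicit classes. The constant and function names are made up for illustration. The pattern as quoted appears to have one more closing parenthesis than opening ones, so the sketch assumes the last one is stray.

```python
import re

FIRST = "[A-PR-UWYZ]"       # [A-Z-[QVX]]: A-Z without Q, V, X
SECOND = "[A-HK-Y]"         # [A-Z-[IJZ]]: A-Z without I, J, Z
UNIT = "[ABD-HJLNP-UW-Z]"   # [A-Z-[CIKMOV]]: A-Z without C, I, K, M, O, V

# The four outward-code shapes from the quoted pattern: A9/A99, AA9/AA99,
# A9A and AA9A, each with its own restricted letter set.
OUTWARD = "|".join([
    FIRST + "[0-9][0-9]?",
    FIRST + SECOND + "[0-9][0-9]?",
    FIRST + "[0-9][A-HJKSTUW]",
    FIRST + SECOND + "[0-9][ABEHMNPRVWXY]",
])

# Either the special case "GIR 0AA", or an outward code, one mandatory
# space, then a digit and two unit letters.
BS7666_STRUCTURE = re.compile(
    "GIR 0AA|(?:" + OUTWARD + ") [0-9]" + UNIT + "{2}"
)


def has_valid_structure(postcode):
    """Check the structure only; this says nothing about PAF validity."""
    return BS7666_STRUCTURE.fullmatch(postcode) is not None


print(has_valid_structure("SO45 2HL"))  # True
print(has_valid_structure("SO452HL"))   # False, the space is required
print(has_valid_structure("QA1 1AA"))   # False, Q is excluded as first letter
```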

comment:3 Changed 5 years ago by g@…

9 months and no progress?

comment:4 Changed 5 years ago by pedberg

  • Milestone set to UNSCH

Blank milestone -> UNSCH per cldrbug 3400:

comment:5 Changed 3 years ago by mark

  • Milestone changed from UNSCH to future

comment:6 Changed 3 years ago by kent.karlsson14@…

comment:7 Changed 19 months ago by emmons

  • Milestone changed from future to UNSCH

merging future and UNSCH

comment:8 Changed 4 months ago by srl

  • Status changed from assigned to accepted

comment:9 Changed 4 months ago by markus

  • version 1.7 deleted