Re: [Evolution] Extracting address info for contacts



Personally I think the whole idea of automatically extracting this info
is bunk, but e-lahey disagrees.

It would probably have been instructive to give an example of the
addresses this works with (a full address).

An Australian address usually conforms to something like:

Person
[company]
Street Address or PO box number
Town State Postcode
[country]

But other valid endings include:
Town state
postcode

Town postcode

Town
postcode

Town
State postcode

Also, "town" might be just the city, or the city plus suburb (but I
guess it's just a string to a computer program).
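
To make the variants above concrete, here is a rough Python sketch that
reads all of those endings the same way. It assumes four-digit postcodes
and the usual state abbreviations, neither of which is spelled out in
this thread, and parse_au_ending is a made-up name, not anything in
Evolution.

```python
import re

# Assumption: four-digit postcodes and abbreviated state names.
STATES = {"NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT"}
POSTCODE = re.compile(r"^\d{4}$")


def parse_au_ending(lines):
    """Split the trailing town/state/postcode lines into three fields.

    `lines` holds only the ending (one or two lines). Line breaks are
    ignored, so every ending layout parses the same way.
    """
    words = " ".join(lines).replace(",", " ").split()
    postcode = state = ""
    # Peel the postcode off the end first, then the state if present.
    if words and POSTCODE.match(words[-1]):
        postcode = words.pop()
    if words and words[-1].upper() in STATES:
        state = words.pop().upper()
    # Whatever is left is the town (city, or city plus suburb).
    return {"town": " ".join(words), "state": state, "postcode": postcode}
```

A state written out in full ("Victoria") would end up as part of the
town here, so this is only a starting point.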

Handwritten envelopes are not supposed to contain any punctuation, and
the postcode is always entered in a separate spot on the envelope
(printed labels don't need this).  And you only ever put the country on
for international letters.

Your rules will work with some of those, but I suspect the existing ones
will too.

I don't see what's wrong with just having entry boxes for each separate
unit myself.   The only reason you want freeform entry is if you want to
print it exactly as typed.  Does the free-form text currently get
remembered for printing?
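
For what it's worth, a minimal sketch of the separate-boxes approach
might look like the following. The field names and label_lines are
hypothetical, picked to match the layout above; this is not Evolution's
actual card schema.

```python
from dataclasses import dataclass


@dataclass
class AddressFields:
    # Hypothetical field names for illustration only.
    person: str
    street: str
    town: str
    company: str = ""
    state: str = ""
    postcode: str = ""
    country: str = ""

    def label_lines(self, international=False):
        """Return the printable lines, skipping empty fields."""
        ending = " ".join(p for p in (self.town, self.state, self.postcode) if p)
        lines = [self.person, self.company, self.street, ending]
        # The country is only printed on international mail.
        if international:
            lines.append(self.country)
        return [line for line in lines if line]
```

The catch is that the printed layout then comes from the program, not
from whatever line breaks the user typed.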

 !Z

On 08 Apr 2001 03:43:36 -0700, Richard Zach wrote:
I noticed that when you enter an address into the free-form text field
in the card edit dialog, Ev doesn't do a very good job of extracting
the various fields (city, state, zip, etc) unless it's a US address.
I posted a bug about it with a proposal for an algorithm, in bug 2133
<http://bugzilla.ximian.com/show_bug.cgi?id=2133>, which I include
below. Maybe y'all have comments--I don't know what addresses might look
like, really, outside USA/Canada/Europe.

-Richard

- set <address> to the text in the first line
- from bottom, find first line that contains any numbers (this contains
a zip code)
- the zip code is either the maximum trailing or the maximum leading
string which contains any numbers (it's either "city state zip" or "city
zip" or "state zip" or "zip city", possibly with commas somewhere). I
believe even in UK/Canadian post codes each part contains at least one
number, e.g. "WC1 1TN".  Note that you could have something like "D
57654 Berlin" (in which case "D 57654" is the ZIP), or something like
"75004 Paris cedex 05" or "CZ-5432 Prague 8" (so numbers on both ends
mean the initial part is the ZIP). So I suppose do this:

From right, search for the first all-alpha word. The part to its right
is X. Then match all alpha-only words going left; that part is Y. The
part remaining on the left is Z. If Z is empty, zip = X, continue
processing Y; otherwise zip = Z, continue processing YX.

- Y is either "city, state" or "city" or "state". So if there's a comma
with strings not containing numbers on either side we know it's city and
state. Otherwise: if there are two lines below, they are <state> and
<country>. If there's only one line below, it could be either the
state/province or the country.
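
The zip-line search and the Z / Y / X split described above can be
sketched roughly as follows in Python. This is only an illustration of
the proposal, not Evolution's code; the function names are made up, and
treating a trailing comma as part of an alphabetic word is an
assumption.

```python
def find_zip_line(lines):
    """Return the index of the last line containing a digit, or None."""
    for i in range(len(lines) - 1, -1, -1):
        if any(ch.isdigit() for ch in lines[i]):
            return i
    return None


def is_alpha_word(word):
    # Assumption: commas are ignored, so "Springfield," counts as alphabetic.
    return word.strip(",").isalpha()


def split_zip(line):
    """Split a line into (zip, rest) using the Z / Y / X rule."""
    words = line.split()
    end = len(words)
    # X: trailing words that are not purely alphabetic.
    while end > 0 and not is_alpha_word(words[end - 1]):
        end -= 1
    start = end
    # Y: the run of alphabetic words immediately to the left of X.
    while start > 0 and is_alpha_word(words[start - 1]):
        start -= 1
    z, y, x = words[:start], words[start:end], words[end:]
    if z:
        # Something precedes Y, so the leading part is the zip.
        return " ".join(z), " ".join(y + x)
    return " ".join(x), " ".join(y)
```

With this, split_zip("D 57654 Berlin") gives ("D 57654", "Berlin") and
split_zip("75004 Paris cedex 05") gives ("75004", "Paris cedex 05").
Deciding whether the remainder is a city, a state, or both is left to
the comma rule above.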

This obviously won't work for all cases, but would probably be better
than what it does now.  In particular, it will screw up in Canada, where
you could have any one of the following:

Montreal HF3 HT3
Quebec

Montreal
Quebec HF3 HT3 

Montreal
Quebec
Canada HF3 HT3







