Geocoding (5min)

Marc Tobias Metten

Lokku Ltd.

Overview

Free Trade Wharf, 340 The Highway, Wapping/Limehouse London
Fountain Road (Flat 2), Edgbaston, UK
15, ASTLEY HOUSE, LONDON
Albert Villas, Gilbert Mews, LEIGHTON BUZZARD, Bedfordshire, LU7 1NF
Lake Lock Drive, Wakefield West Yorkshire
Church Lane, London

nestoria.co.uk

www.nestoria.co.uk/wimbledon/

listing data flow

http://api.nestoria.co.uk/api?action=search_listings&place_name=soho&listing_type=rent
...
"request" : {
  "action" : "search_listings",
  "country" : "uk",
  "place_name" : "soho",
  "listing_type" : "rent",
  "number_of_results" : 20
},
"response" : {
  "created_unix" : 1176295820,
  "link_to_url" : "http://www.nestoria.co.uk/soho/property/rent/results-20",
  "listings" : [
    {
      "title" : "Flat to let, Richmond Mews",
      "summary" : "Set in the heart of vibrant Soho this funky...",
      "keywords" : "Mews, Unfurnished, Flat, Reception",
      "latitude" : 51.513,
      "longitude" : -0.133674,
      "lister_name" : "Ludlow Thompson",
      "price" : 475,
      "price_formatted" : "475 GBP per week",
      "lister_url" : "http://rd.nestoria.co.uk/rd?l...",
      "thumb_url" : "http://limg.nestoria.co.uk/6/0/...",
      ...
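
A request URL like the one above could be assembled as follows. This is a minimal sketch, assuming only the three query parameters visible in the example URL; `build_search_url` is a hypothetical helper name, and the code only builds the string without sending a request.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL on the slide.
API_ENDPOINT = "http://api.nestoria.co.uk/api"


def build_search_url(place_name, listing_type):
    """Build a search_listings URL (hypothetical helper, not an official client)."""
    params = [
        ("action", "search_listings"),
        ("place_name", place_name),
        ("listing_type", listing_type),
    ]
    # urlencode escapes values, so place names with spaces stay valid.
    return API_ENDPOINT + "?" + urlencode(params)


print(build_search_url("soho", "rent"))
```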
    

choosing a geocoder

geocoder - base data

geocoder - added data

 Soho = area(W1F,W1D,WC2H)+20%
 Clapham Park = center(SW4 8)+1km
 Isle of White = Isle of Wight
 Hyde Park Square = Hyde Park
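
The two name corrections above could be kept in a simple alias table. This is only a sketch: it reads the `=` as "treat as", and `PLACE_ALIASES` and `canonical_place_name` are hypothetical names, not part of any existing geocoder.

```python
# Alias table seeded with the two corrections listed above.
# Keys are lower-cased so the lookup is case-insensitive.
PLACE_ALIASES = {
    "isle of white": "Isle of Wight",
    "hyde park square": "Hyde Park",
}


def canonical_place_name(name):
    """Return the canonical name for a known alias, else the cleaned-up input."""
    cleaned = " ".join(name.split())  # collapse repeated whitespace
    return PLACE_ALIASES.get(cleaned.lower(), cleaned)


print(canonical_place_name("Isle of White"))
```

Names that are not in the table pass through unchanged, so the table can grow one entry at a time as new misspellings turn up.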
    

praise regular expressions

UK postcodes
(((^[BEGLMNS][1-9]\d?) | (^W[2-9] ) | ( ^( A[BL] | B[ABDHLNRST] |
C[ABFHMORTVW] | D[ADEGHLNTY] | E[HNX] | F[KY] | G[LUY] | H[ADGPRSUX] |
I[GMPV] | JE | K[ATWY] | L[ADELNSU] | M[EKL] | N[EGNPRW] | O[LX] |
P[AEHLOR] | R[GHM] | S[AEGKL-PRSTWY] | T[ADFNQRSW] | UB | W[ADFNRSV] |
YO | ZE ) \d\d?) | (^W1[A-HJKSTUW0-9]) | (( (^WC[1-2]) | (^EC[1-4]) |
(^SW1) ) [ABEHMNPRVWXY] ) ) (\s*)? ([0-9][ABD-HJLNP-UW-Z]{2}))
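
The pattern above is written with embedded whitespace, so it needs an "extended" regex mode. A minimal sketch in Python, assuming `re.VERBOSE` is an acceptable equivalent: because every alternative is anchored with `^`, the pattern only matches at the start of a string, so this sketch tries each comma-separated part of an address in turn. `extract_postcode` is a hypothetical helper name.

```python
import re

# Pattern copied from the slide; re.VERBOSE makes the regex engine
# ignore the whitespace used for layout.
UK_POSTCODE = re.compile(r"""
(((^[BEGLMNS][1-9]\d?) | (^W[2-9] ) | ( ^( A[BL] | B[ABDHLNRST] |
C[ABFHMORTVW] | D[ADEGHLNTY] | E[HNX] | F[KY] | G[LUY] | H[ADGPRSUX] |
I[GMPV] | JE | K[ATWY] | L[ADELNSU] | M[EKL] | N[EGNPRW] | O[LX] |
P[AEHLOR] | R[GHM] | S[AEGKL-PRSTWY] | T[ADFNQRSW] | UB | W[ADFNRSV] |
YO | ZE ) \d\d?) | (^W1[A-HJKSTUW0-9]) | (( (^WC[1-2]) | (^EC[1-4]) |
(^SW1) ) [ABEHMNPRVWXY] ) ) (\s*)? ([0-9][ABD-HJLNP-UW-Z]{2}))
""", re.VERBOSE)


def extract_postcode(address):
    """Return the first comma-separated part that starts with a postcode, or None."""
    for part in address.split(","):
        match = UK_POSTCODE.match(part.strip())
        if match:
            return match.group(0)
    return None


print(extract_postcode(
    "Albert Villas, Gilbert Mews, LEIGHTON BUZZARD, Bedfordshire, LU7 1NF"))
```

The pattern is case-sensitive, so lower-case input would need to be upper-cased first.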

house numbers

\b(\d+[\-\/]?\d*[\-]?[A-F]*[\-\/]?\w*)\b
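
As a sketch of how this pattern behaves on the sample addresses from the overview, the snippet below returns the first match in a string. `first_house_number` is a hypothetical helper name.

```python
import re

# Pattern copied from the slide.
HOUSE_NUMBER = re.compile(r"\b(\d+[\-\/]?\d*[\-]?[A-F]*[\-\/]?\w*)\b")


def first_house_number(address):
    """Return the first house-number-like token in the address, or None."""
    match = HOUSE_NUMBER.search(address)
    return match.group(1) if match else None


for sample in (
    "15, ASTLEY HOUSE, LONDON",
    "Free Trade Wharf, 340 The Highway, Wapping/Limehouse London",
    "Church Lane, London",
):
    print(sample, "->", first_house_number(sample))
```

The pattern has no notion of context: for "Fountain Road (Flat 2), Edgbaston, UK" it picks up the flat number "2", so a match still has to be checked before it is treated as a house number.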

Challenges

Measure confidence

Google geocoder confidence [1]

0 - Unknown location
1 - Country
2 - Region (state, province, prefecture, etc.)
3 - Sub-region (county, municipality, etc.)
4 - Town (city, village)
5 - Post code (zip code) <= in UK this is 10
6 - Street
7 - Intersection
8 - Address
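
One way to act on these levels is to keep them as an ordered enumeration and reject results below a chosen threshold. This is a sketch only: the class name, the member names, and the default threshold of "Street" are assumptions for illustration, not part of any geocoder's API.

```python
from enum import IntEnum


class Accuracy(IntEnum):
    """Confidence levels 0-8 as listed above (member names are illustrative)."""
    UNKNOWN = 0
    COUNTRY = 1
    REGION = 2
    SUB_REGION = 3
    TOWN = 4
    POST_CODE = 5
    STREET = 6
    INTERSECTION = 7
    ADDRESS = 8


def is_precise_enough(level, minimum=Accuracy.STREET):
    """True if a geocoding result is at least as precise as `minimum`."""
    return Accuracy(level) >= minimum


print(is_precise_enough(8))  # address-level result
print(is_precise_enough(4))  # town-level result
```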

Questions?