Posted by: Karlo Bartels | 2010/12/14

Batch Geocode Library

Geocoding is “the process of finding associated geographic coordinates (often expressed as latitude and longitude) from other geographic data, such as street addresses, or postal codes.” Reverse geocoding is the opposite process: finding the street address that corresponds to a pair of geographic coordinates.

To make batch geocoding (processing large sets of geographic data) easier, I created a batch geocode library that uses the Bing Spatial Data Services API. This API processes large data sets by creating a Geocode Dataflow job on the server, which handles the geocode request and returns the geocoded data.

To use the Geocode Dataflow API you need to:

  1. Create a Geocode job and upload the data
  2. Get the status of the geocode job
  3. Download the geocode job results

Each step requires creating HttpWebRequest/HttpWebResponse objects to upload the requests and download the results.

GeoCode Library

The library I am presenting here does all the plumbing for you and can be easily incorporated into your own projects. It exposes two main classes:

  • GeoCodeJob: creates a batch geocode job based on an input file or a GeocodeEntity array.
  • GeoCodeResult: retrieves the status of a geocode job and downloads the job results.

In addition to these two classes, the library incorporates the classes defined in the Spatial Data Services schema, so it can work with and expose them directly. The following picture illustrates the relationships between the classes in the schema:

[Figure: geocodelib-1]

GeoCodeJob

This class contains an overloaded method named CreateJob. The method allows you to generate a geocode job from either a data file or a GeocodeEntity array. Either overload returns the geocode job ID that you need in order to check the job status and download the results. Please note that before you can call any of these methods you need to specify a Bing Maps application ID. For information about getting a Bing Maps Key, see Getting a Bing Maps Key.

The following code snippet shows how a data file is processed:

// Read file contents, build URI, create job and get job ID
using (FileStream dataStream = File.OpenRead(dataFile))
{
  Uri requestURI = BuildRequestURI(inputFormat, jobDescription);
  HttpWebRequest request = BuildWebRequestFromStream(requestURI,
                           dataStream);
  jobID = GetJobID(request);
}
 

The BuildRequestURI method builds the request URI for the Spatial Data Services REST call, while the BuildWebRequestFromStream method creates an HttpWebRequest from the supplied stream and URI.
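
The library's own BuildRequestURI is not listed in this post, but a minimal sketch of what such a method might look like follows. It assumes the Geocode Dataflow endpoint that also appears in the job response further down, and the input, description and key query parameters of the REST call. The class name and the explicit bingMapsKey parameter are made up for this illustration and are not part of the library.

```csharp
using System;

public static class DataflowUriSketch
{
    // Assumption: base address of the Geocode Dataflow REST endpoint,
    // taken from the Link URLs in the job response.
    private const string BaseAddress =
        "https://spatial.virtualearth.net/REST/v1/dataflows/Geocode";

    // Builds the URI used to create a geocode job.
    // inputFormat is the format of the uploaded data (for example "xml").
    public static Uri BuildRequestUri(string inputFormat,
                                      string jobDescription,
                                      string bingMapsKey)
    {
        UriBuilder builder = new UriBuilder(BaseAddress);

        // Escape every value so spaces and special characters
        // cannot break the query string.
        builder.Query =
            "input=" + Uri.EscapeDataString(inputFormat) +
            "&description=" + Uri.EscapeDataString(jobDescription) +
            "&key=" + Uri.EscapeDataString(bingMapsKey);

        return builder.Uri;
    }
}
```

Escaping the values matters mostly for the job description, which is free text and will often contain spaces.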

The GetJobID method is listed below:

private string GetJobID(HttpWebRequest request)
{
  string jobID = String.Empty;
  string locationHeader = String.Empty;
  using (HttpWebResponse response =
        (HttpWebResponse)request.GetResponse())
  {
    if (response.StatusCode != HttpStatusCode.Created)
      throw new WebException("Unexpected status code: " +
                             response.StatusCode.ToString());

    // Check for presence of Location header
    locationHeader = response.GetResponseHeader("Location");
    if (String.IsNullOrEmpty(locationHeader))
      throw new WebException
            ("Missing 'Location' header in the job response.");

    // Return job ID from Location header
    if (locationHeader.IndexOf("/") > 0)
      jobID = locationHeader.Substring
             (locationHeader.LastIndexOf("/") + 1);
    else
      throw new Exception
         ("Invalid 'Location' header: ID could not be parsed.");
  }
  return jobID;
}
 

The GetJobID method returns the geocode job ID (‘5bf10c37df944083b1879fbb0556e67e’ for example) that should be used to retrieve the job status and its results. The GeoCodeResult class contains methods to do just that. 

GeoCodeResult

This class contains the GetJobStatus and GetJobResponse methods to retrieve the process status of a geocode job and download the results. To perform these tasks, you could write code like this:

// Check job status
GeoCodeResult res = new GeoCodeResult();
res.BingMapsKey = _bingMapsKey;
GeoCodeJobStatus result = res.GetJobStatus(jobID);
if (result == GeoCodeJobStatus.Completed)
{
   // Job complete, get results
   DataflowJob job = res.JobDetails;
   GeocodeFeed feed = res.GetJobResponse(job);
}
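
The snippet above checks the status only once, while a large job may take a while to finish. One way to deal with that is a small polling helper like the sketch below. JobPoller and WaitUntil are hypothetical names and not part of the library; the helper takes a delegate so that it does not depend on the library types.

```csharp
using System;
using System.Threading;

public static class JobPoller
{
    // Calls isDone repeatedly until it returns true or the timeout
    // would be exceeded. Returns true when isDone succeeded in time.
    public static bool WaitUntil(Func<bool> isDone,
                                 TimeSpan interval,
                                 TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            if (isDone())
                return true;

            // Give up if the next wait would go past the deadline
            if (DateTime.UtcNow + interval > deadline)
                return false;

            Thread.Sleep(interval);
        }
    }
}

// Possible usage with the library (not compiled here):
// bool completed = JobPoller.WaitUntil(
//     () => res.GetJobStatus(jobID) == GeoCodeJobStatus.Completed,
//     TimeSpan.FromSeconds(30), TimeSpan.FromMinutes(30));
```

Note that this sketch only looks for the Completed status, so a job that fails on the server keeps being polled until the timeout expires. In real code you would probably stop early for such a status.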
 

The JobDetails property of the GeoCodeResult class returns the DataflowJob object that is deserialized from the response you receive every time you call the GetJobStatus method. The following snippet shows the DataflowJob section of the job response:

<DataflowJob>
  <Id>5bf10c37df944083b1879fbb0556e67e</Id>
  <Link role="self">https://spatial.virtualearth.net
                   /REST/v1/dataflows/Geocode/
                   5bf10c37df944083b1879fbb0556e67e</Link>
  <Link role="output" name="succeeded">
             https://spatial.virtualearth.net/REST/v1/dataflows/
             Geocode/5bf10c37df944083b1879fbb0556e67e/
             output/succeeded</Link>
  <Description>Xml</Description>
  <Status>Completed</Status>
  <CreatedDate>2010-05-10T13:22:35.0553408-07:00</CreatedDate>
  <CompletedDate>2010-05-10T13:23:49.1959658-07:00</CompletedDate>
  <TotalEntityCount>12</TotalEntityCount>
  <ProcessedEntityCount>12</ProcessedEntityCount>
  <FailedEntityCount>0</FailedEntityCount>
</DataflowJob>
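
The library deserializes this fragment into a DataflowJob object for you. If you only need the download URL, a sketch using LINQ to XML could look like the following. The class and method names are hypothetical. Elements are matched by local name on the assumption that the real response may carry an XML namespace, which the fragment above does not show.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DataflowJobXml
{
    // Returns the URL of the <Link role="output" name="succeeded"> node,
    // or null when the job has not produced such a link yet.
    public static string FindSucceededLink(string dataflowJobXml)
    {
        XElement job = XElement.Parse(dataflowJobXml);

        XElement link = job.Elements().FirstOrDefault(e =>
            e.Name.LocalName == "Link" &&
            (string)e.Attribute("role") == "output" &&
            (string)e.Attribute("name") == "succeeded");

        if (link == null)
            return null;

        // Remove line breaks and indentation inside the element value
        return new string(
            link.Value.Where(c => !char.IsWhiteSpace(c)).ToArray());
    }
}
```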
 

As soon as the geocode job has completed, a Link node (role="output") appears in the DataflowJob; it contains the download URL. The results can be downloaded by passing the DataflowJob object to the GetJobResponse method. The following code snippet shows how:

HttpWebRequest request = (HttpWebRequest)
                         WebRequest.Create(uriBuilder.Uri);
request.ContentType = "application/xml";
using (HttpWebResponse response =
      (HttpWebResponse)request.GetResponse())
{
  if (response.StatusCode == HttpStatusCode.OK)
  {
    using (StreamReader reader = new StreamReader
          (response.GetResponseStream()))
    {
      // Deserialize the XML stream into a GeocodeFeed object
      XmlSerializer serializer = new XmlSerializer
                                 (typeof(GeocodeFeed));
      feed = (GeocodeFeed)serializer.Deserialize(reader);
    }
  }
  else
  {
    throw new WebException("Unexpected status code: " +
                            response.StatusCode.ToString());
  }
}
 

The GetJobResponse method deserializes the XML response available at the download URL in the DataflowJob into a GeocodeFeed object. This object is the top-level object in the Spatial Data Services schema (see the class diagram above).

Next, you can use the GeocodeFeed object to programmatically retrieve the Geocode job results like so:

GeocodeResponse response = feed.GeocodeEntity[0].GeocodeResponse;
MessageBox.Show("Lat/long coordinates of the first item: " +
  response.InterpolatedLocation.Latitude.ToString() + " " +
  response.InterpolatedLocation.Longitude.ToString());
 

or serialize it back to XML:

// FileMode.Create truncates an existing file, so no stale data remains
using (FileStream fs = new FileStream("geocode result.xml",
                           FileMode.Create))
{
  XmlSerializer serializer = new XmlSerializer
                            (typeof(GeocodeFeed));
  serializer.Serialize(fs, feed);
}
 

Wrap-up

As you can see, the Batch Geocode Library allows you to perform all tasks necessary to start batch geocoding your data without too much trouble. To get started quickly, I have included a small test application:

[Screenshot: test application]

Simply download the source code here, open the Main form, view the code and enter your Bing Maps application ID in the top line:

private string _bingMapsKey = "<Your Bing Maps application ID>";
 

… and press F5 to run the application.

Happy geocoding!


Responses

  1. This is great! Appreciate you sharing!
    I assume you used xsd.exe to create the xsd and cs file? What was the exact command you entered? I was having trouble getting the xsd created properly.

    Christie

  2. I created the XSD file by copying the XML schema content in the Geocode Dataflow Data Schema section of the Bing Spatial Data Services help file into an empty XSD file, which I saved as SpatialDataServices.xsd. Next, I ran the xsd.exe tool to generate the resulting SpatialDataServices.cs class file. Please note the comments at the top of the class about re-creating it.

  3. Hi there,

    I’m still hunting for solutions, but I have a number of addresses I want to plot on a bing map. Since I don’t have the longitude/latitude, I’m realizing that I need that. So I learned of the REST API, and this seems like as good of approach as any. The problem is, I need this process automated – rather than, as in your application, having a physical user response, I need to be able to automate this process. That is, query my SQL 2008 db every hour and see if any addresses don’t have long/lat entries, pull those entries, format as XML, then make a geo-code request, get the response, and put the long/lat into the database.

    Is there a way to automate this script you’ve got, or are there any resources out there that might be of use to me in automating the Bing Maps geocoding process?

    On the other hand, is there an easier solution even? Can I somehow simply read an XML document with addresses and plot the points on the Bing Map?

    Thanks!

  4. Hi Kevin,

    You can use my geocode library in an automated approach. The Windows Forms application I included is for demonstration purposes only. You could create another library project that queries your database and calls into the library I created to geocode the results. You do not have to create an XML file to create a job request; you could also use the geocoding object model included in the library to create such a request. Have a look at the “Create Job From Feed” option in the sample application. It generally boils down to using the GeocodeEntityBuilder object to create a collection of geocode requests, which you subsequently pass to the CreateJob method of the GeoCodeJob class. Now that I think about it, this could be the subject for another blog post 😉

    The other option you suggested involves calling the geocode REST service to first geocode the addresses and then calling the imagery REST service to create a map image. You would have to create a class which calls the REST services and return the result as a Bitmap. My project uses the SOAP version of the Bing Maps services, but I won’t hold it against you if you decide to use the REST services 😉

    Regards,
    Karlo

  5. Hi Karlo,

    I geocoded several data sets last year with your post here. I really appreciate it.

    At the moment, I need to geocode a new set of addresses but there is an error that I couldn’t solve. I checked the codes and the code works until the line “GeocodeFeed feed = res.GetJobResponse(job);” and at this line it gives the error message “There is an error in XML document (2,2).” The input xml file is as follows:

    – – – – – – –

    Can you help me with this problem?

    Thanks in advance!

    Erkan

  6. Hi Erkan,

    So if I understand it correctly you did manage to create a job and get a job id in return? If that is the case then the problem probably lies in the returned XML structure. The Geocode dataflow schema was updated to version 2.0 recently, so my guess is you need to recreate the xsd to reflect those changes. See one of my previous comments on how to do that. You can find the new schema here (in the XML Schema section of the page): http://msdn.microsoft.com/en-us/library/jj735477.aspx

    Regards,
    Karlo

  7. Hi Karlo,

    Thanks a lot for the prompt reply! Indeed, I get the job id in return. I will check your comments

    Cheers,

    Erkan

