Thursday, November 6, 2014

geocodeHERE 0.1 is on CRAN

In my previous blog post, I detailed how I created my first R package, geocodeHERE. This package is a convenient wrapper for Nokia's HERE geocoding API. The cool thing about this API is that it allows for bulk geocoding. So, instead of making n API calls to geocode n addresses, you can do it with just a couple of API calls. Also, you can run 10,000 API calls per day vs. Google's 2,500.

Anyhow, I went through the process of submitting the package to CRAN and it was accepted. It took me three attempts, but I did it, and it should be replicated across the mirrors by now. You can check it out on CRAN here.

Here's how it works, starting with downloading and installing the package (you'll need the httr package installed)…

install.packages("geocodeHERE", repos="http://cran.rstudio.com",
                 dependencies=TRUE)  
## Installing package into ‘/home/c/R/x86_64-pc-linux-gnu-library/3.1’
## (as ‘lib’ is unspecified)
## trying URL 'http://cran.rstudio.com/src/contrib/geocodeHERE_0.1.tar.gz'
## ...
## * DONE (geocodeHERE)
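
One reader (see the comments at the end of this post) hit an error in the batch functions with httr 0.4 and reported that upgrading to 0.6.0 fixed it. Here's a rough sketch of a version check you could run before going further. The helper name `version_at_least` is my own invention, not part of geocodeHERE, and the 0.6.0 threshold comes only from that one report, so treat it as a guess rather than a documented requirement.

```r
# Hypothetical helper (not part of geocodeHERE): compare two version strings.
version_at_least <- function(installed, minimum) {
  package_version(installed) >= package_version(minimum)
}

# 0.6.0 is the version one commenter reported as working; it is an assumption,
# not an official minimum.
if (requireNamespace("httr", quietly = TRUE)) {
  if (!version_at_least(as.character(packageVersion("httr")), "0.6.0")) {
    message("httr looks old; try install.packages(\"httr\")")
  }
}
```

If the message shows up, updating httr and restarting R is probably the first thing to try.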

Now we can try some simple queries. Given an address or place name, the function returns the latitude and longitude, or NA if it can't find anything.

library(geocodeHERE)  
geocodeHERE_simple("wrigley field chicago IL")
## $Latitude
## [1] 41.95
## 
## $Longitude
## [1] -87.65
geocodeHERE_simple("navy pier chicago IL")
## $Latitude
## [1] 41.94
## 
## $Longitude
## [1] -87.7
geocodeHERE_simple("the bean chicago IL")
## [1] NA
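
Since a miss comes back as a bare NA rather than a list, it can help to normalize the results before stacking them up. This is a minimal sketch, assuming the return value is either a list with `Latitude` and `Longitude` elements (as the printed output above suggests) or NA. The helper `result_to_row` is hypothetical and not part of the package.

```r
# Hypothetical helper: turn one geocodeHERE_simple()-style result into a
# one-row data frame, using NA coordinates when nothing was found.
result_to_row <- function(query, result) {
  found <- is.list(result) &&
    !is.null(result$Latitude) && !is.null(result$Longitude)
  data.frame(
    query = query,
    lat = if (found) result$Latitude else NA_real_,
    lng = if (found) result$Longitude else NA_real_,
    stringsAsFactors = FALSE
  )
}

# Results copied from the output above, so this runs without calling the API.
rows <- rbind(
  result_to_row("wrigley field chicago IL",
                list(Latitude = 41.95, Longitude = -87.65)),
  result_to_row("the bean chicago IL", NA)
)
rows
```

With the real function you would pass `geocodeHERE_simple(query)` as the second argument.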

That's kinda boring though. There are functions in other packages that do the same thing, namely ggmap::geocode(). What's interesting about Nokia HERE's API is that you can geocode many addresses in bulk.

data(chicago_landmarks)
addresses <- chicago_landmarks[,"Address"]
# tack on "chicago IL" to the end of these addresses
addresses <- paste(addresses, "chicago IL")
# make a dataframe with an id column so you can match the lat, lngs back
addresses_df <- data.frame(id=seq_along(addresses), addresses=addresses)
str(addresses_df)
## 'data.frame':    375 obs. of  2 variables:
##  $ id       : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ addresses: Factor w/ 374 levels "1001 W. Belmont Avenue chicago IL",..: 127 149 160 161 345 290 312 206 318 203 ...
address_str <- df_to_string(addresses_df)
request_id <- geocodeHERE_batch_upload(address_string = address_str,
                                       email_address = "test@test.com")
# wait about 15 seconds... status should go from "running" to "completed"
Sys.sleep(15)
geocodeHERE_batch_status(request_id)
## [1] "completed"
# download the data
geocode_data <- geocodeHERE_batch_get_data(request_id)
# match it back to your original addresses dataframe
addresses_df <- merge(addresses_df, geocode_data, by.x="id", by.y="recId", all.x=TRUE)
str(addresses_df)
## 'data.frame':    375 obs. of  16 variables:
##  $ id              : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ addresses       : Factor w/ 374 levels "1001 W. Belmont Avenue chicago IL",..: 127 149 160 161 345 290 312 206 318 203 ...
##  $ SeqNumber       : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ seqLength       : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ displayLatitude : num  41.9 41.9 41.9 41.9 41.8 ...
##  $ displayLongitude: num  -87.6 -87.6 -87.6 -87.6 -87.6 ...
##  $ houseNumber     : int  300 333 35 3600 NA 6901 860 4605 9326 4550 ...
##  $ street          : chr  "W Adams St" "N Michigan Ave" "E Wacker Dr" "N Halsted St" ...
##  $ district        : chr  "Loop" "Loop" "Loop" "Lake View" ...
##  $ city            : chr  "Chicago" "Chicago" "Chicago" "Chicago" ...
##  $ postalCode      : int  60606 60601 60601 60613 60637 60649 60611 60640 60643 60640 ...
##  $ county          : chr  "Cook" "Cook" "Cook" "Cook" ...
##  $ state           : chr  "IL" "IL" "IL" "IL" ...
##  $ country         : chr  "USA" "USA" "USA" "USA" ...
##  $ matchLevel      : chr  "houseNumber" "houseNumber" "houseNumber" "houseNumber" ...
##  $ relevance       : num  1 1 1 1 0.63 1 0.99 1 1 1 ...
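
The fixed Sys.sleep(15) in the batch example worked for this job, but bigger jobs may take longer. One option is to poll the status until it reads "completed". Here's a rough sketch, not something the package provides: `wait_until_completed` is a hypothetical helper, and the status function is faked so the example runs offline.

```r
# Hypothetical helper (not part of geocodeHERE): call check_status() until it
# returns "completed" or we run out of attempts. Returns TRUE on success.
wait_until_completed <- function(check_status, max_tries = 20, pause_secs = 5) {
  for (i in seq_len(max_tries)) {
    status <- check_status()
    if (identical(status, "completed")) {
      return(TRUE)
    }
    Sys.sleep(pause_secs)
  }
  FALSE
}

# Stand-in for geocodeHERE_batch_status(request_id) so the sketch runs offline:
# it reports "running" twice, then "completed".
fake_statuses <- c("running", "running", "completed")
calls <- 0
fake_status <- function() {
  calls <<- calls + 1
  fake_statuses[min(calls, length(fake_statuses))]
}

done <- wait_until_completed(fake_status, max_tries = 5, pause_secs = 0)
done
```

Against the real API you would pass something like `function() geocodeHERE_batch_status(request_id)` and only download the data when the helper returns TRUE.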

All of the previous commands were done under the DEMO license keys. If you are going to use this a lot, it's probably best to get your own API keys. You can do that here.

If you have any questions, hit me up on Twitter or shoot me an email at corynissen_AT_gmail.com.

5 comments:

  1. Hi,
    Looks like a neat and useful package, thanks.

    I was testing the code you provided, and when I got to "request_id <- geocodeHERE_batch_upload(...)" I got an error: "Error in response$Response : $ operator is invalid for atomic vectors". Any idea what the problem is?

    Thanks so much

    1. I tested the code in the help document and it seemed to work fine for me. Can you please provide the entire code that failed for you?

  2. Hi,
    I'm using Rstudio Version 0.98.501,
    R version 3.1.2 (2014-10-31) "Pumpkin Helmet"
    Platform: i686-pc-linux-gnu (32-bit)
    on Ubuntu 12.04 (32 bit)

    I used the following code:
    addresses <- chicago_landmarks[,"Address"]
    addresses <- paste(addresses, "chicago IL")
    addresses_df <- data.frame(id=1:length(addresses), addresses=addresses)
    address_str <- df_to_string(addresses_df)

    request_id <- geocodeHERE_batch_upload(address_string = address_str,
    email_address = "wgaul@hotmail.com")

    and got:
    ## Error in response$Response : $ operator is invalid for atomic vectors

    I also tried running it with email_address = "test@test.com" as in the blog post but it gets the same error.

    I registered an app to get API keys to see if that would help, but when I run
    the code with my App_id and App_code specified, I get:
    ## Error in response$Response :
    ## object of type 'externalptr' is not subsettable
    ## In addition: Warning message:
    ## XML content does not seem to be XML: '{"error":"Forbidden","error_description":"
    ## These credentials do not authorize access. Please contract your customer
    ## representative or email locationapi@here.com to discuss upgrading your account."}'

    1. I tried installing an older version of httr... version .3 and got the same error. Can you update the httr package and see if it resolves your issue? The email address isn't a big deal... you can put gibberish in there if you want and it will still work

  3. That solved it. I had httr version 0.4 installed but have now upgraded to version 0.6.0 and things are working. Thanks so much!
