How to extract ONLY lat, lon values for node “osm_type”:“node” in a Nominatim response using Google Refine


I'm using Google Refine to geocode addresses with requests to the Nominatim API, as suggested in this great post: https://opensas.wordpress.com/2013/06/30/using-openrefine-to-geocode-your-data-using-google-and-openstreetmap-api/.
Everything works fine. Here are two samples:
http://open.mapquestapi.com/nominatim/v1/search.php?format=json&q=Via%20Pietro%20Paleocapa%2073,Alzano%20Lombardo,Italia
produces
[{"place_id":"55017260","licence":"Data \u00a9 OpenStreetMap contributors, ODbL 1.0. http:\/\/www.openstreetmap.org\/copyright","osm_type":"way","osm_id":"22565087","boundingbox":["45.7324335","45.736092","9.7222512","9.7235157"],"lat":"45.7343899","lon":"9.7231855","display_name":"Via Pietro Paleocapa, Alzano Lombardo, BG, Lombardy, 24027, Italy","class":"highway","type":"unclassified","importance":0.6}]
and
http://open.mapquestapi.com/nominatim/v1/search.php?format=json&q=Via%20Cernaia%2020,%20Torino%20,%20Italia
produces
[{"place_id":"24085209","licence":"Data \u00a9 OpenStreetMap contributors, ODbL 1.0. http:\/\/www.openstreetmap.org\/copyright","osm_type":"node","osm_id":"2334729647","boundingbox":["45.0715728","45.0715728","7.6742348","7.6742348"],"lat":"45.0715728","lon":"7.6742348","display_name":"20, Via Cernaia, Quadrilatero Romano, Circoscrizione 1, Turin, TO, Piemont, 10122, Italy","class":"place","type":"house","importance":0.201}]
The difference is that the first response has "osm_type":"way" while the second has "osm_type":"node".
I'm interested ONLY in responses with "osm_type":"node", and for those I'd like to extract the lat and lon values.
I don't know how to extract them using GREL in Google Refine. Any suggestions?
If it would be useful, I can also obtain the responses in XML. Here are the requests:
http://open.mapquestapi.com/nominatim/v1/search.php?format=json&q=Via%20Pietro%20Paleocapa%2073,Alzano%20Lombardo,Italia&format=xml
http://open.mapquestapi.com/nominatim/v1/search.php?format=json&q=Via%20Cernaia%2020,%20Torino%20,%20Italia&format=xml
You can approach this in several ways, but the basic step is to extract the osm_type. Given the JSON you've posted here, the GREL would be:
value.parseJson()[0].osm_type
One approach would be to create a column based on this value, then use a Facet to filter to those where the value in this new column is 'node'.
Alternatively you could combine the steps in a single GREL statement using 'if':
if(value.parseJson()[0].osm_type=="node",value.parseJson()[0].lat,"")
This extracts the latitude if osm_type is equal to 'node' and otherwise puts an empty string in the cell.
A slight tweak on Owen's formula can remove some of the redundancy:
with(value.parseJson()[0], place, if(place.osm_type=='node',place.lat,''))
It's not a big savings here, but it's a good technique to know about when the expressions get longer and more complex. The with control function assigns a value to a variable that you can use later.
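The GREL expressions above return only the latitude and assume the response always contains at least one result. As a rough sketch of the same check that returns both values, here is a small Python function. The name node_lat_lon and the "lat,lon" output format are my own choices for illustration, not anything from Refine or Nominatim. OpenRefine's expression editor also offers Python/Jython as a language, so something like this could probably be adapted there by ending the expression with `return node_lat_lon(value)`, assuming the json module is available in the bundled Jython.

```python
import json


def node_lat_lon(response_text):
    """Return "lat,lon" if the first result is an OSM node, else "".

    Hypothetical helper: the name and the comma-separated output
    format are illustrative assumptions.
    """
    try:
        results = json.loads(response_text)
    except (TypeError, ValueError):
        # Blank cell (None) or a response that is not valid JSON.
        return ""
    # An empty list or an unexpected top-level type yields a blank cell.
    if not isinstance(results, list) or not results:
        return ""
    place = results[0]
    if not isinstance(place, dict) or place.get("osm_type") != "node":
        return ""
    return "%s,%s" % (place.get("lat", ""), place.get("lon", ""))


# Shortened versions of the two responses from the question.
node_response = '[{"osm_type":"node","lat":"45.0715728","lon":"7.6742348"}]'
way_response = '[{"osm_type":"way","lat":"45.7343899","lon":"9.7231855"}]'

print(node_lat_lon(node_response))  # 45.0715728,7.6742348
print(repr(node_lat_lon(way_response)))  # ''
```

Returning a single "lat,lon" string keeps everything in one new column, which can then be split on the comma. If you prefer separate columns, the same check can be run twice, once returning lat and once returning lon.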
