How to extract ONLY lat, lon values for node “osm_type”:“node” in a Nominatim response using Google Refine


I'm using Google Refine to geocode addresses with requests to the Nominatim API, as suggested in this great post: https://opensas.wordpress.com/2013/06/30/using-openrefine-to-geocode-your-data-using-google-and-openstreetmap-api/
Everything works fine. Here are two samples:
http://open.mapquestapi.com/nominatim/v1/search.php?format=json&q=Via%20Pietro%20Paleocapa%2073,Alzano%20Lombardo,Italia
produces
[{"place_id":"55017260","licence":"Data \u00a9 OpenStreetMap contributors, ODbL 1.0. http:\/\/www.openstreetmap.org\/copyright","osm_type":"way","osm_id":"22565087","boundingbox":["45.7324335","45.736092","9.7222512","9.7235157"],"lat":"45.7343899","lon":"9.7231855","display_name":"Via Pietro Paleocapa, Alzano Lombardo, BG, Lombardy, 24027, Italy","class":"highway","type":"unclassified","importance":0.6}]
and
http://open.mapquestapi.com/nominatim/v1/search.php?format=json&q=Via%20Cernaia%2020,%20Torino%20,%20Italia
produces
[{"place_id":"24085209","licence":"Data \u00a9 OpenStreetMap contributors, ODbL 1.0. http:\/\/www.openstreetmap.org\/copyright","osm_type":"node","osm_id":"2334729647","boundingbox":["45.0715728","45.0715728","7.6742348","7.6742348"],"lat":"45.0715728","lon":"7.6742348","display_name":"20, Via Cernaia, Quadrilatero Romano, Circoscrizione 1, Turin, TO, Piemont, 10122, Italy","class":"place","type":"house","importance":0.201}]
The difference is that the first response has "osm_type":"way" and the second has "osm_type":"node".
I'm interested ONLY in responses with "osm_type":"node", and for those I'd like to extract the lat and lon values.
I don't know how to extract them using GREL in Google Refine. Any suggestions?
If it would be useful, I can also obtain the responses in XML. Here are the requests:
http://open.mapquestapi.com/nominatim/v1/search.php?format=json&q=Via%20Pietro%20Paleocapa%2073,Alzano%20Lombardo,Italia&format=xml
http://open.mapquestapi.com/nominatim/v1/search.php?format=json&q=Via%20Cernaia%2020,%20Torino%20,%20Italia&format=xml
You can approach this in several ways, but the basic step is to extract the osm_type. Given the JSON you've posted here, where the response is an array and [0] selects its first result, the GREL would be:
value.parseJson()[0].osm_type
One approach would be to create a column based on this value, then use a Facet to filter to those where the value in this new column is 'node'.
Alternatively you could combine the steps in a single GREL statement using 'if':
if(value.parseJson()[0].osm_type=="node",value.parseJson()[0].lat,"")
This extracts the latitude if osm_type is equal to 'node' and otherwise puts an empty string in the cell.
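The expression above returns only the latitude, while the question asks for both lat and lon. As a rough sketch of the same check written in Python (OpenRefine also accepts Jython expressions, though this version is written to run standalone), the function below returns "lat,lon" for the first result when it is a node and an empty string otherwise. The function name `node_lat_lon`, the comma-separated output, and the trimmed sample strings are my own assumptions for illustration, not something from the original post.

```python
import json


def node_lat_lon(response_text):
    """Return "lat,lon" for the first result if it is an OSM node, else ""."""
    try:
        results = json.loads(response_text)
    except (TypeError, ValueError):
        return ""  # empty cell or a response that is not valid JSON
    if not isinstance(results, list) or not results:
        return ""  # the geocoder found no match
    place = results[0]  # same as [0] in the GREL expression
    if not isinstance(place, dict) or place.get("osm_type") != "node":
        return ""
    lat, lon = place.get("lat"), place.get("lon")
    if lat is None or lon is None:
        return ""
    return "{},{}".format(lat, lon)


# Trimmed versions of the two responses from the question (assumed samples).
node_sample = '[{"osm_type":"node","lat":"45.0715728","lon":"7.6742348"}]'
way_sample = '[{"osm_type":"way","lat":"45.7343899","lon":"9.7231855"}]'

print(node_lat_lon(node_sample))
print(node_lat_lon(way_sample))
```

Returning a single "lat,lon" string keeps the result in one cell, which can then be split into two columns if needed.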
A slight tweak on Owen's formula can remove some of the redundancy:
with(value.parseJson()[0], place, if(place.osm_type=='node',place.lat,''))
It's not a big saving here, but it's a good technique to know about when expressions get longer and more complex. The with control binds the result of an expression to a variable name (here, place), so the JSON is parsed once and the variable can be reused in the rest of the expression.
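Both formulas only look at the first result, which matches the JSON posted in the question. If a response ever contains several results, the same "bind once, reuse" idea extends to scanning all of them. Here is a minimal Python sketch of that, assuming the response is always a JSON array of objects. The name `first_node_coords`, the float conversion, and the two-result sample are hypothetical choices for illustration.

```python
import json


def first_node_coords(response_text):
    """Return (lat, lon) as floats for the first "node" result, or None."""
    results = json.loads(response_text)  # parse once, then reuse
    for place in results:  # each result is bound to `place`, as in with(...)
        if place.get("osm_type") == "node":
            return float(place["lat"]), float(place["lon"])
    return None  # no node among the results


# Assumed sample: a way followed by a node, built from the question's values.
mixed = (
    '[{"osm_type":"way","lat":"45.7343899","lon":"9.7231855"},'
    '{"osm_type":"node","lat":"45.0715728","lon":"7.6742348"}]'
)

print(first_node_coords(mixed))
```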
