openrefine


OpenRefine not working as expected


I'm very new to OpenRefine, so please bear with me if I have made a simple mistake.
I'm parsing an HTML website to gather some data.
Everything went fine with fetching the individual pages, but now the parsing of the HTML fails.
I'm creating a new column based on the one holding each page's HTML, and I'm trying to get to the data in one specific DIV, the one at index 20.
In the "create column based on this column" window, the preview for value.parseHtml().select("DIV")[20] shows exactly what I need, but executing it gives me nothing but blank cells.
It even tells me that it is "filling 0 rows with grel:value.parseHtml().select("DIV")[20]".
Any clue what I'm doing wrong here?
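You just need to finish the expression with .toString() so that the HTML element returned by select() is output as a string. The preview can display the element, but a cell can only store a plain value such as a string, so without the conversion no rows get filled.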
This is explained on our wiki here: https://github.com/OpenRefine/OpenRefine/wiki/StrippingHTML#extract-html-attributes-text-links-with-integrated-grel-commands
I also updated the select() function with that example: https://github.com/OpenRefine/OpenRefine/wiki/GREL-Other-Functions#selectelement-e-string-s
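
As a minimal sketch of that fix, here is the expression from the question with the conversion appended. The second line is an alternative that assumes you only want the text inside the DIV rather than its full markup; htmlText() is a standard GREL function, but whether it suits your data is an assumption on my part.

```
value.parseHtml().select("DIV")[20].toString()
value.parseHtml().select("DIV")[20].htmlText()
```

Either one goes into the same expression box of the column dialog. Both return a string, so the new column should be filled for every row whose page actually has a DIV at index 20.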
