Use Google Refine on a CSV without headers and with a varying number of columns per record
I'm trying to import into OpenRefine a CSV extracted from a NoSQL database (Cassandra). The file has no headers, and the number of columns varies per record. Fields are comma-separated and could look like this:

1 - userid:100456, type:specific, status:read, feedback:valid
2 - userid:100456, status:notread, message:"some random stuff here but with quotation marks", language:french

There is a maximum number of columns, and no cleansing is required on their names. How can I build one big Excel file that I could mine using a pivot table?
If you can get JSON instead, Refine will ingest it directly. If that's not a possibility, I'd probably do something along the lines of:

- import as lines of text
- split into two columns containing row ID and fields
- split multi-valued cells on the fields column, using comma as the separator
- split the fields column into two columns, using colon as the separator
- use key/value on these two columns to unfold into columns
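If you'd rather reshape the file before it ever reaches Refine, the same unfolding can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: each input line holds just the key:value pairs (no leading "1 - " numbering), keys contain no colons or commas, and a quoted value may contain commas. The names `parse_record` and `to_wide_csv` and the `row_id` column are made up for illustration.

```python
import csv
import io
import re

# One key:value pair. The value is either a double-quoted string
# (which may contain commas) or everything up to the next comma.
PAIR = re.compile(r'\s*([^:,\s]+)\s*:\s*("(?:[^"\\]|\\.)*"|[^,]*)\s*(?:,|$)')


def parse_record(line):
    """Turn 'k1:v1, k2:"v, 2"' into {'k1': 'v1', 'k2': 'v, 2'}."""
    record = {}
    for key, value in PAIR.findall(line):
        value = value.strip()
        # Drop the surrounding quotation marks, if any.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        record[key] = value
    return record


def to_wide_csv(lines):
    """Return CSV text with one column per distinct key (hypothetical helper)."""
    records = [parse_record(line) for line in lines if line.strip()]

    # Column order follows first appearance across all records.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    out = io.StringIO()
    # restval="" leaves a blank cell where a record lacks a key.
    writer = csv.DictWriter(out, fieldnames=["row_id"] + columns, restval="")
    writer.writeheader()
    for row_id, record in enumerate(records, start=1):
        writer.writerow({"row_id": row_id, **record})
    return out.getvalue()


if __name__ == "__main__":
    sample = [
        "userid:100456, type:specific, status:read, feedback:valid",
        'userid:100456, status:notread, message:"some, random stuff", language:french',
    ]
    print(to_wide_csv(sample))
```

The resulting text has a header row and blank cells wherever a record is missing a key, so it should open directly in Excel or import into Refine as an ordinary CSV.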