Use google-refine on a CSV without headers and with a varying number of columns per record
I'm attempting to import into OpenRefine a CSV extracted from a NoSQL database (Cassandra). It has no headers and a different number of columns per record. Fields are comma separated and could look like this:

1 - userid:100456, type:specific, status:read, feedback:valid
2 - userid:100456, status:notread, message:"some random stuff here but with quotation marks", language:french

There's a maximum number of columns, and no cleansing is required on their names. How can I build one big Excel file that I could mine using a pivot table?
If you can get JSON instead, Refine will ingest it directly. If that's not a possibility, I'd probably do something along the lines of:

- import as lines of text
- split into two columns containing row ID and fields
- split multi-valued cells on the fields column using comma as a separator
- split the fields column into two columns using colon as a separator
- use key/value on these two columns to unfold into columns
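If it is easier to pre-process the export before opening it in Refine, the same split-then-unfold idea can be sketched in Python. This is only an illustration under some assumptions: the function names `parse_record` and `records_to_csv` are made up for this example, each input line is assumed to hold only the `key:value` pairs (no leading `1 - ` prefix), and quoted values are assumed not to contain embedded double quotes.

```python
import csv
import io
import re

# One key:value pair, terminated by a comma or the end of the line.
# A value is either a double-quoted string (which may contain commas)
# or a run of characters up to the next comma.
PAIR = re.compile(r'\s*([^:,\s]+)\s*:\s*("[^"]*"|[^,]*)\s*(?:,|$)')


def parse_record(line):
    """Turn 'k1:v1, k2:"v, 2"' into {'k1': 'v1', 'k2': 'v, 2'}."""
    record = {}
    for key, value in PAIR.findall(line):
        value = value.strip()
        # Drop the surrounding quotes of a quoted value.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        record[key] = value
    return record


def records_to_csv(lines):
    """Unfold key:value records into a rectangular CSV with a header row."""
    rows = [parse_record(line) for line in lines if line.strip()]
    # Header = every key seen, in order of first appearance.
    header = []
    for row in rows:
        for key in row:
            if key not in header:
                header.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=header, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a record become empty cells
    return out.getvalue()


SAMPLE = [
    'userid:100456, type:specific, status:read, feedback:valid',
    'userid:100456, status:notread, message:"some random stuff here '
    'but with quotation marks", language:french',
]

if __name__ == "__main__":
    print(records_to_csv(SAMPLE))
```

The resulting CSV has one column per distinct key and an empty cell wherever a record lacks that key, so it can be imported into Refine or opened in Excel for a pivot table.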