

Is there an equivalent of a data frame in OCaml?


I have been on the R side for some years. I don't do any hardcore statistics, but rather use R as a sophisticated CSV-file manipulator. Nevertheless, I do need to process a huge amount of data, in a distributed way.
I found that R is not fast enough for my application anymore, and I am now investigating other languages.
The first choice is Python with pandas, which is faster. Also, I read that OCaml could be 10x faster than Python, which sounds very attractive to me.
However, I found that the standard libraries of OCaml seem to be quite low-level. I cannot find any high-level containers like R's data frame.
How do you guys represent data frames in OCaml? Do you use a list of tuples? Can anyone share a bit of knowledge here?
Thanks!
I had to google for data frames in R, not being familiar with R, but it seems like you're looking for records, or perhaps a list of records. Or as you suggest, maybe a list of tuples would have similar properties to R data frames if you add some functions to access the data in the tuples more easily. But I think records would be closer as you can refer to the name of a field in the record.
See the chapter on Records in Real World OCaml.
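To make the list-of-records idea concrete, here is a minimal sketch using only the standard library. The `row` type, its field names, and the sample values are made up for illustration; a real data set would have its own columns.

```ocaml
(* Hypothetical row type: the field names are illustrative only. *)
type row = { name : string; age : int; score : float }

(* A small "data frame" as a list of records (made-up sample data). *)
let rows = [
  { name = "alice"; age = 31; score = 88.5 };
  { name = "bob"; age = 25; score = 72.0 };
  { name = "carol"; age = 42; score = 95.0 };
]

(* Keep only the rows matching a condition, like subsetting in R. *)
let over_30 = List.filter (fun r -> r.age > 30) rows

(* Extract a single column by field name. *)
let names = List.map (fun r -> r.name) over_30

(* Aggregate one column: the mean of the score field. *)
let mean_score rs =
  let total = List.fold_left (fun acc r -> acc +. r.score) 0.0 rs in
  total /. float_of_int (List.length rs)

let () =
  List.iter print_endline names;
  Printf.printf "mean score: %.2f\n" (mean_score rows)
```

Compared with tuples, the field names play the role of column names, and the compiler checks that every row has the same columns with the same types.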
I am actually working on a dataframe class for OCaml right now. Hopefully I will have it finished in a few weeks. My progress so far is on GitHub. (Note: the current version on GitHub is not yet 100% functional.)
https://github.com/PamExx/TimeSeries/blob/master/TimeSeries.ml
As indicated in Thomas's answer, such a rich data structure would be provided by a specialized library. You can start with either an array of records or a record of arrays. If your rows are not floating-point numbers only, a record of arrays might be slightly preferable. But perhaps it is more important for cache locality whether you work across rows (then an array of records) or across columns (then a record of arrays). Beware that you might want to base computations on low-level libraries such as LACAML or Stream Processing with OCaml; you should study their APIs to get inspiration for how to implement your high-level data structure. It would be nice if someone provided the actual high-level library! You can also try to work with both OCaml and R using OCaml-R.
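The two layouts mentioned above could be sketched roughly as follows. This is only an illustration with the standard library: the `row` and `frame` types, their field names, and the helper functions are hypothetical, not part of LACAML, OCaml-R, or any existing data frame library.

```ocaml
(* Row-oriented layout: an array of records (hypothetical fields). *)
type row = { id : int; label : string; value : float }

(* Column-oriented layout: a record of arrays.
   Assumption: all arrays are kept at the same length. *)
type frame = {
  ids : int array;
  labels : string array;
  values : float array;
}

(* Convert from the row layout to the column layout. *)
let of_rows (rows : row array) : frame = {
  ids = Array.map (fun r -> r.id) rows;
  labels = Array.map (fun r -> r.label) rows;
  values = Array.map (fun r -> r.value) rows;
}

(* Rebuild a single row from the column layout. *)
let get_row (f : frame) (i : int) : row =
  { id = f.ids.(i); label = f.labels.(i); value = f.values.(i) }

(* A column-wise computation only touches one array. *)
let column_sum (f : frame) : float =
  Array.fold_left ( +. ) 0.0 f.values

(* Made-up sample data. *)
let rows = [|
  { id = 1; label = "a"; value = 1.5 };
  { id = 2; label = "b"; value = 2.5 };
  { id = 3; label = "c"; value = 4.0 };
|]

let frame = of_rows rows

let () = Printf.printf "sum of values: %.1f\n" (column_sum frame)
```

In the column layout, `values` is a `float array`, which OCaml stores unboxed, so a pass over one numeric column reads contiguous memory. In the row layout, a column-wise pass has to visit every record, while fetching a whole row is a single array access.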
