pandas


Is there an equivalent of a data frame in OCaml?


I have been on the R side for some years. I don't do any hardcore statistics, but rather use R as a sophisticated CSV-file manipulator. Nevertheless, I do need to process a huge amount of data, in a distributed way.
I found that R is not fast enough for my application anymore, and I am now investigating other languages.
The first choice is Python with pandas, which is faster. Also, I read that OCaml could be 10x faster than Python, which sounds very attractive to me.
However, I found that the standard libraries of OCaml seem to be quite low-level. I cannot find any high-level containers like R's data frame.
How do you guys represent data frames in OCaml? Do you use a list of tuples? Can anyone share a bit of knowledge here?
Thanks!
I had to google for data frames in R, not being familiar with R, but it seems like you're looking for records, or perhaps a list of records. Or as you suggest, maybe a list of tuples would have similar properties to R data frames if you add some functions to access the data in the tuples more easily. But I think records would be closer as you can refer to the name of a field in the record.
See the chapter on Records in Real World OCaml.
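To make the list-of-records idea concrete, here is a minimal sketch using only the standard library. The `row` type, its field names, and the sample values are made up for illustration; they are not taken from the question.

```ocaml
(* Hypothetical row type: the field names are illustrative only. *)
type row = { name : string; age : int; score : float }

(* A tiny "data frame" as a list of records, one record per row. *)
let rows = [
  { name = "alice"; age = 30; score = 91.5 };
  { name = "bob";   age = 25; score = 78.0 };
  { name = "carol"; age = 35; score = 84.25 };
]

(* Select rows by a predicate on a named field. *)
let older = List.filter (fun r -> r.age > 28) rows

(* Extract one "column" from the selected rows. *)
let names = List.map (fun r -> r.name) older

(* Aggregate a column: mean score over all rows. *)
let mean_score =
  List.fold_left (fun acc r -> acc +. r.score) 0.0 rows
  /. float_of_int (List.length rows)
```

Because fields are accessed by name (`r.age`, `r.score`), this reads closer to column access in R than a list of tuples would, although the set of columns is fixed at compile time.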
I am actually working on a dataframe class for OCaml right now. Hopefully I will have it finished in a few weeks. My progress so far is on GitHub. (Note: the current version on GitHub is not yet fully functional.)
https://github.com/PamExx/TimeSeries/blob/master/TimeSeries.ml
As indicated in Thomas's answer, such a rich data structure would be provided by a specialized library. You can start with either an array of records or a record of arrays. If your rows are not floating-point numbers only, a record of arrays might be slightly preferable. But perhaps it is more important for cache locality whether you work across rows (then use an array of records) or across columns (then use a record of arrays). Beware that you might want to base computations on low-level libraries such as LACAML or Stream Processing with OCaml; you should study their APIs to get inspiration on how to implement your high-level data structure. It would be nice if someone provided the actual high-level library! You can also try to work with both OCaml and R using OCaml-R.
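The two layouts mentioned above can be sketched side by side as follows. This is only an illustration with the standard library: the `trade` and `frame` types, their fields, and the helper `row_of_frame` are hypothetical names, not part of any existing library.

```ocaml
(* Array of records: one record per row, convenient for row-wise work. *)
type trade = { id : int; price : float }

let aos : trade array =
  [| { id = 1; price = 2.0 }; { id = 2; price = 3.5 }; { id = 3; price = 4.5 } |]

(* Record of arrays: one array per column, convenient for column-wise work.
   A float array is stored unboxed in OCaml, so scanning a column
   touches contiguous memory. *)
type frame = { ids : int array; prices : float array }

let soa : frame =
  { ids = Array.map (fun t -> t.id) aos;
    prices = Array.map (fun t -> t.price) aos }

(* Column-wise aggregate on each layout. *)
let sum_prices_soa = Array.fold_left ( +. ) 0.0 soa.prices
let sum_prices_aos = Array.fold_left (fun acc t -> acc +. t.price) 0.0 aos

(* Rebuild a single row from the column layout (hypothetical helper). *)
let row_of_frame (f : frame) (i : int) : trade =
  { id = f.ids.(i); price = f.prices.(i) }
```

Both sums give the same result; the difference is which access pattern stays contiguous in memory. The record-of-arrays version assumes all columns are kept at the same length, which a real library would need to enforce.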
