Is there an equivalent of a data frame in OCaml?


I have been on the R side for some years. I don't do any hardcore statistics, but rather use R as a sophisticated CSV file manipulator. Nevertheless, I do need to process a huge amount of data, in a distributed way.
I found that R is not fast enough for my application anymore, and I am now investigating other languages.
The first choice is Python with pandas, which is faster. Also, I read that OCaml could be 10x faster than Python, which sounds very attractive to me.
However, I found that the standard libraries of OCaml seem to be quite low-level. I cannot find any high-level containers like R's data frame.
How do you guys represent data frames in OCaml? Do you use a list of tuples? Can anyone share a bit of knowledge here?
Thanks!
I had to google for data frames in R, not being familiar with R, but it seems like you're looking for records, or perhaps a list of records. Or as you suggest, maybe a list of tuples would have similar properties to R data frames if you add some functions to access the data in the tuples more easily. But I think records would be closer as you can refer to the name of a field in the record.
See the chapter on Records in Real World OCaml.
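To illustrate the list-of-records idea, here is a minimal sketch. The row type and its field names (name, age, score) are made up for the example and do not come from any library; only the standard List module is used.

```ocaml
(* Hypothetical row type: the field names are illustrative only. *)
type row = { name : string; age : int; score : float }

(* A small "data frame" represented as a plain list of records. *)
let table = [
  { name = "alice"; age = 31; score = 4.5 };
  { name = "bob"; age = 25; score = 3.0 };
  { name = "carol"; age = 42; score = 5.0 };
]

(* Select rows with a predicate on a named field, similar to subsetting in R. *)
let older_than n rows = List.filter (fun r -> r.age > n) rows

(* Extract one column as a list. *)
let names rows = List.map (fun r -> r.name) rows

(* Mean of the score column; evaluates to nan for an empty table. *)
let mean_score rows =
  let total = List.fold_left (fun acc r -> acc +. r.score) 0.0 rows in
  total /. float_of_int (List.length rows)
```

Something like names (older_than 30 table) then plays the role of a filtered column lookup. Unlike an R data frame, the column set is fixed at compile time by the record type.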
I am actually working on a dataframe class right now for OCaml. Hopefully I will have it finished in a few weeks. My progress so far is on GitHub. (Note: the current version on GitHub is not yet fully functional.)
https://github.com/PamExx/TimeSeries/blob/master/TimeSeries.ml
As indicated in Thomas's answer, such a rich data structure would be provided by a specialized library. You can start with either an array of records or a record of arrays. If your rows do not consist only of floating-point numbers, a record of arrays might be slightly preferable. But perhaps what matters more for cache locality is whether you work across rows (then use an array of records) or across columns (then use a record of arrays). Beware that you might want to base computations on low-level libraries such as LACAML or Stream Processing with OCaml. You should study their APIs for inspiration on how to implement your high-level data structure. It would be nice if someone provided the actual high-level library! You can also try to work with both OCaml and R using OCaml-R.
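The two layouts mentioned above might look roughly like this. It is only a sketch: the types and field names (id, price, qty and their column counterparts) are hypothetical, and only the standard Array module is used. One relevant detail is that OCaml stores a float array unboxed, whereas a record mixing int and float fields holds its floats boxed, which is part of why the column layout can pay off for mixed rows.

```ocaml
(* Row-oriented layout (array of records): one record per row.
   Field names are illustrative only. *)
type row = { id : int; price : float; qty : float }
type row_table = row array

(* Column-oriented layout (record of arrays): one array per column. *)
type col_table = { ids : int array; prices : float array; qtys : float array }

(* Convert from the row layout to the column layout. *)
let to_columns (rows : row_table) : col_table = {
  ids = Array.map (fun r -> r.id) rows;
  prices = Array.map (fun r -> r.price) rows;
  qtys = Array.map (fun r -> r.qty) rows;
}

(* Work across a row: uses several fields of each record. *)
let row_totals (rows : row_table) =
  Array.map (fun r -> r.price *. r.qty) rows

(* Work down a column: scans a single contiguous float array. *)
let column_sum (xs : float array) = Array.fold_left ( +. ) 0.0 xs
```

With the row layout, row_totals reads each record once. With the column layout, column_sum cols.prices touches only the prices array and never loads the other columns.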
