Why Pandas feels clunky when coming from R
For even simpler data management in R, the author may want to have a look at the `data.table` package [1]. His example would then become the following, which is also much faster to run:

    library(data.table)
    dt <- fread("purchases.csv")
    dt[, .(total = sum(amount - discount)), by = "country"]
A number of aspects of the design of Pandas make more sense if you know its background. It originated in a quantitative finance environment, where it's common to be working with time series data. It grew from that into a more general-purpose data manipulation and analysis library.
This is fine and all (although I'm not impressed by the quality of the Python code), but the examples don't show how to do metaprogramming.
I often don't want to write out column names by hand; I want to specify them programmatically. The same goes for a lot of the other things in these examples: I don't want to configure them manually.
I haven't seen examples of that higher-level programming in these various R vs. Python comparisons. It's always tediously manual examples.
The examples usually feel like manual, analyst-query-type tasks. Even the tone strongly reinforces that, with text like "oh and Maria asked me to xyz".
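To make that concrete, here is a rough sketch of what "programmatic" could look like on the Pandas side. The `summarize` helper, the sample data, and the `<column>_<agg>` naming scheme are all made up for illustration; only `groupby`, `agg`, `pd.NamedAgg`, and `select_dtypes` are actual Pandas calls. The point is that the column names arrive as ordinary lists rather than being typed into the query.

```python
import pandas as pd


def summarize(df, group_cols, value_cols, aggs=("sum", "mean")):
    """Hypothetical helper: aggregate every value column with every
    aggregation, with all column names passed in as plain data."""
    # Build the aggregation spec from the lists instead of writing it by hand.
    spec = {
        f"{col}_{agg}": pd.NamedAgg(column=col, aggfunc=agg)
        for col in value_cols
        for agg in aggs
    }
    return df.groupby(list(group_cols), as_index=False).agg(**spec)


# Small made-up dataset standing in for the purchases file.
purchases = pd.DataFrame({
    "country": ["US", "US", "DE"],
    "amount": [10.0, 20.0, 5.0],
    "discount": [1.0, 2.0, 0.5],
})

# Pick the value columns by dtype rather than by name.
numeric_cols = purchases.select_dtypes("number").columns.tolist()
result = summarize(purchases, ["country"], numeric_cols)
print(result)
```

Because the spec is just a dict, the same function works unchanged if the table gains more numeric columns or if the grouping keys come from a config file.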
Are the "puRHcases" in the headings intentional or a typo?