Hacker News Clone

Why Pandas feels clunky when coming from R

by braza on 2/20/2024, 6:08 AM with 6 comments

by karencarits on 2/20/2024, 10:30 PM
For even simpler data management in R, the author may want to have a look at the `data.table` package [1]. The example would then be
```
    library(data.table)
    dt <- fread("purchases.csv")
    dt[, .(total = sum(amount - discount)), by = "country"]
```
and it is much faster to run.
[1] https://rdatatable.gitlab.io/data.table/
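For comparison, here is a rough pandas sketch of the same grouped sum. The column names `country`, `amount` and `discount` are taken from the snippet above; the inline CSV rows are made up so that it runs without the real `purchases.csv`.

```python
import io

import pandas as pd

# Hypothetical stand-in for purchases.csv; the real file is not shown here.
csv_text = """country,amount,discount
US,100,10
US,50,5
NO,200,20
"""
df = pd.read_csv(io.StringIO(csv_text))

# Net amount per row, then summed within each country.
total = (
    df.assign(total=df["amount"] - df["discount"])
      .groupby("country", as_index=False)["total"]
      .sum()
)
print(total)
```

With a real file, `pd.read_csv("purchases.csv")` would replace the `io.StringIO` line.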
by FateOfNations on 2/21/2024, 12:56 AM
A number of aspects of the design of Pandas make more sense if you know its background. It originated in a quantitative finance environment where it's common to be working with time series data. It grew from that into a more general-purpose data manipulation and analysis library.
by mint2 on 2/21/2024, 7:34 PM
This is fine and all (although I’m not impressed by the quality of the Python code), but the examples don’t show how to do metaprogramming.
I often don’t want to write out column names by hand but to specify them programmatically, and the same goes for a lot of the other steps in these examples. I don’t want to configure them manually.
I haven’t seen examples of that higher-level programming in these various R vs. Python comparisons. It’s always tediously manual examples.
The examples usually feel like manual, analyst-style query tasks. Even the tone strongly reinforces that with text like “oh and Maria asked me to xyz”.
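To make concrete what I mean, here is a minimal sketch on the pandas side, assuming a toy frame with the same `country`, `amount` and `discount` columns. The `summarize` helper and the data are hypothetical and not from the article. The columns to aggregate are discovered from the dtypes, and the aggregation spec is built as a dict instead of being typed out.

```python
import pandas as pd

# Toy data; values are invented for illustration.
df = pd.DataFrame({
    "country": ["US", "US", "NO"],
    "amount": [100.0, 50.0, 200.0],
    "discount": [10.0, 5.0, 20.0],
})


def summarize(frame, group_col, funcs=("sum", "mean")):
    """Aggregate every numeric column except group_col with each function in funcs."""
    # Pick the columns by dtype rather than listing them by name.
    numeric_cols = (
        frame.select_dtypes(include="number")
        .columns.drop(group_col, errors="ignore")
    )
    # Named aggregation: output column name -> (input column, function).
    spec = {f"{col}_{fn}": (col, fn) for col in numeric_cols for fn in funcs}
    return frame.groupby(group_col).agg(**spec)


result = summarize(df, "country")
print(result)
```

Adding a new numeric column to the frame, or another function name to `funcs`, changes the output without touching the aggregation code.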
by efilife on 2/21/2024, 12:59 PM
Are the "puRHcases" in the headings intentional or a typo?