CS109a: Introduction to Data Science – Resources
These notes might be a great source for what they cover, but as a whole I find this to be a good example of what is currently wrong with data science education. While the syllabus has bullet points that include "1. data collection", "2. data management", and "5. communication", the content and schedule have a 90%+ overlap with a standard machine learning course. They even use a statistical learning textbook (a good one, but still).
Statistics departments keep trying to latch on to the excitement (and money) around data science by changing superficial things like department names and course titles without actually adjusting what they teach. I would love to see a version of this that actually engages at a non-superficial level with topics such as database design, theory(ies) of data visualization, methods for storytelling with data, and interactive design.
This seems to be a tremendous amount of material to cover - with associated programming exercises to boot - for a course that requires only intro courses in CS and Statistics as prerequisites. So one does wonder how superficial it might be and how much students adhere to the warning about Google usage. Or perhaps Harvard students truly are so smart and hard-working that they can manage to go deep into all this material while managing the rest of a full course load!
https://harvard-iacs.github.io/2021-CS109A/pages/syllabus.ht... this is the newer version
Video recordings of the lectures seem to require access to Harvard's Canvas platform? Is it possible for outsiders to watch them?
I'm new to this field, and one thing I have a hard time understanding is how to apply all these ML algorithms, Python libraries, etc. to very large data (e.g. how to deal with data that does not fit in memory). If someone could point me to links and/or hands-on courses I would really appreciate it.
Is there a similar class online with PyTorch?
Couldn’t find the videos. Has anybody had luck with that?
Keras... Yikes