Practical Data Science in Python

  • In case it's useful to anyone unfamiliar with it: this blog post was written in an IPython Notebook, where you can code, plot, and then render to HTML, all in the browser. Most of my data science work is done in this format. If Python isn't your language of choice, there are lots of plugins for IPython Notebook that effectively give you an in-browser REPL with plotting and documentation: http://ipython.org/notebook.html

    It's changed the way I work (and blog).

  • One thought keeps bugging me: Python is popular among data scientists, but it is also (roughly speaking) quite a slow language compared with the likes of Java or Go. Hypothetically, wouldn't it be more beneficial to use something like Rust instead?

  • For anyone looking for data science in Python, this is great. Thanks for sharing!

  • If you're coming from web development and are used to virtualenv, Anaconda has environment management too. Run $(conda install conda-env). You can still pip install things into conda environments. You'll probably also want to $(conda install binstar) and use it to search for packages that don't come with stock Anaconda. For example, you can $(conda install --javascript node)

  • I gave a strikingly/humorously similar talk at a meetup in Boston ~1.5 years ago:

    http://nbviewer.ipython.org/github/mmautner/email_classifier...