The Django Project Debates User Tracking

  • Why does it not suffice to examine 'pip install Django' metrics from PyPI? That would be a reliable indicator of the relative popularity of the package against other packages on a level playing field.

    While it would overcount the number of true installations of projects using Django, judging by the number of times I spin up a VM for testing, I would still argue that would be a better metric than a custom GA integration for which you'd have no relevant point of comparison. Even if they were to make this opt-out, what would they compare it to?

    A: "Based on our custom GA developer tracking, we count 400,000 new Django projects this month."

    B: "Django is the 4th most frequently installed third-party Python package, based on the Python Package Index."

    Personally I'd trust statement B more than A. No one can independently verify statement A.

  • Before reading the article I was very much against it. After reading it, the idea seems a little more reasonable, but I would strongly prefer two things if this ever gets implemented: 1) Don't use Google. There must be services that provide analytics pro bono or very cheaply for open source projects and non-profits, and that aren't tied to a company with a privacy track record as terrible as Google's. 2) Make it abundantly clear that this will happen, and give explicit opt-out instructions the first time (if it is indeed opt-out). As in, "We are enabling user tracking for better usage statistics. If you would like to opt out, please type <...>." I rarely read changelogs, and anything not presented to me at installation time would probably sneak in, whether through the fault of the Django team or not. My worry is that notifying users directly like this isn't easily possible through pip installations/setup.py.

    However, would it be very useful statistically compared to the PyPI installation numbers? Sure, Python is different from npm, because npm almost always installs packages locally whereas Python installs globally by default, but the numbers must still be high, as Django is likely one of the most-installed packages on PyPI and in Python-land in general. And as czep points out, because they would only be tracking themselves, it would be hard to compare the numbers to anything. They would be useful as a total, but not for comparing against other packages, because the kind of data would be different: Django would have usage statistics, whereas PyPI has installation/download statistics.

    I'm also surprised this is even necessary, since the main purpose is supposedly to be able to talk to potential investors for the DSF with concrete numbers. Is Django being familiar to basically every Python developer not enough? Before I'm more open to the proposal, I'd want to know whether investors have explicitly said they want usage data, rather than relying on the nebulous idea that it may make it easier to raise money.

  • I use Django professionally, and if tracking usage helps guide development or attract sponsors to achieve higher quality -- I'm all for it.

    There is a problem to be solved (how to make OSS sustainable), and I'm both interested in solving that problem and trying different approaches to solve it.

    (edited for less use of the phrase "I'm all for it")

  • Someone proposed tracking Django developers using the Django command line? What a ludicrous and creepy idea.

    edit: why downvote? that's what it says:

    > the developer commands: startproject, startapp, runserver

  • So even if we do have an accurate usage count, say 10 million, so what? What's the Foundation's plan to get funding?

    I think they should run an annual campaign like Mozilla and Wikipedia. How the money is spent should be 100% transparent. I am not really sure why we need a Foundation. I get the hosting costs, and rewarding people for working on very difficult features and enhancements, but what else? Conference costs and scholarships? What else?

  • I do not support user tracking. I'd fork it at that point.

  • I use Django and don't mind being tracked if it helps development. However, the proposed tracking sounds like hit tracking, which doesn't give you any meaningful numbers, only trends. So I think tracking pip installs would give you the same trends.

  • The best part:

    "It is encouraging to see that a community can discuss such issues without heating up too much and shows great maturity for the Django project."

  • jezdez's proposal seems rather reasonable: just force the user to explicitly select yes or no. That gets over the objection that people will be too lazy to opt in, since the effort is the same either way. It also removes another source of bias, namely redistributors like Debian disabling the tracking, since the user provides explicit permission.
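
    A forced yes/no choice like the one described above could look roughly like the sketch below. This is only an illustration: the function name `ask_tracking_consent`, its parameters, and the prompt wording are all made up here, and nothing like it exists in Django. Treating a missing answer as "no" is likewise an assumption.

```python
def ask_tracking_consent(read_answer=input, max_attempts=3):
    """Ask for an explicit yes or no; pressing Enter is not accepted.

    Hypothetical helper, not part of Django. `read_answer` defaults to
    the built-in input() and can be swapped out for testing.
    """
    prompt = "Send anonymous usage statistics? [yes/no]: "
    for _ in range(max_attempts):
        answer = read_answer(prompt).strip().lower()
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            return False
        # Anything else (including an empty line) asks again, so that
        # saying yes and saying no cost the user the same effort.
    # Assumption: with no usable answer (for example a scripted,
    # non-interactive install), fall back to not tracking.
    return False
```

    Since neither answer is the default, laziness no longer biases the result in either direction.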

  • The threat to add user tracking could be the best incentive for me to donate more to the project.

  • I think this is a strong idea, and I don't see any issues with the proposed implementation using Google Analytics.

    Certainly it seems more practical than any of the alternatives suggested here. (E.g., micropayments. Come on, that's not even plausible...)

  • I allow both Eclipse and Firefox DE to collect usage and bug information during my use of those systems ... I feel there are a few keys to making this decision for both platforms:

    - I can opt out if I want to

    - I can see what's sent if I want to

    - The information is anonymized and aggregated

    I would assume that Django developers would feel the same way if these guarantees were in place: it's also in my interest for the software to improve.
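
    To make the first two guarantees concrete, here is a minimal sketch of how a client could honor an opt-out and show exactly what would be sent. Everything in it is hypothetical (`build_report`, `preview_report`, the field names); the only detail taken from the discussion is the list of developer commands quoted in this thread. Aggregation would happen on the receiving side and is not shown.

```python
import json

# The developer commands mentioned in the proposal quoted in this thread.
ALLOWED_COMMANDS = {"startproject", "startapp", "runserver"}


def build_report(command, framework_version, opted_out):
    """Return the payload that would be sent, or None if nothing is sent.

    Hypothetical helper, not part of Django.
    """
    if opted_out or command not in ALLOWED_COMMANDS:
        return None
    # Only coarse fields: no paths, hostnames, project names, or user IDs.
    return {"command": command, "version": framework_version}


def preview_report(report):
    """Render the exact payload so the user can inspect it."""
    return json.dumps(report, indent=2, sort_keys=True)
```

    Because the preview serializes the same dictionary that would be transmitted, what the user sees cannot drift from what is sent.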

  • I think they did a very good job in discussing it openly instead of going the Homebrew way.

  • What if, instead of tracking, they added micropayments? Have a very simple way to donate $1 every time you run startapp or something like that, and boom, profit.

  • Have they tested charging for Django? I'd pay a reasonable fee to use it. I mean, it's the least I could do (aside from donating sporadically).

  • Oh well, Django had a good run for me, but I don't use spyware. I guess something similar can be built up using Flask.

  • I'm really not sure posting to HN is what LWN subscriber links are for.

  • I use Django quite a bit, and would immediately disable any such tracking mechanism, even going to the extent of maintaining my own fork if necessary.

    Having this on tools (like brew) is sort of OK because you can disable it and not risk having it deployed to production. Having it on a library is senseless, risky in many regards and likely to get it banned from, say, public contracts.

    It is also a likely hook for exploitation, but I'll need to see an implementation first. Which I sure hope won't happen.

  • The problem is that Django as a framework, and Python as a slow infrastructure for it, are showing their age. I love Django, but it grows too restrictive as projects get more complicated (the ORM and template rendering, for example), not to mention the slow performance compared to newer languages like Go and Elixir, which is actually Python's responsibility, not Django's.

    Django is a monolithic framework that wants to do everything, while there are good and even superior alternatives (SQLAlchemy, Jinja2, WTForms), which makes things harder for its developers.