Tag Archives: django

Monetizing the Text-Processing API with Mashape

This is a short story about the text-processing.com API, and how it became a profitable side-project, thanks to Mashape.

Text-Processing API

When I first created text-processing.com, in the summer of 2010, my initial intention was to provide an online demo of NLTK’s capabilities. I trained a bunch of models on various NLTK corpora using nltk-trainer, then started making some simple Django forms to display the results. But as I was doing this, I realized I could fairly easily create an API based on these models. Instead of rendering HTML, I could just return the results as JSON.

I wasn’t sure if anyone would actually use the API, but I knew the best way to find out was to just put it out there. So I did, initially making it completely open, with a rate limit of 1000 calls per day per IP address. I figured at the very least, I might get some PHP or Ruby users that wanted the power of NLTK without having to interface with Python. Within a month, people were regularly exceeding that limit, and I quietly increased it to 5000 calls/day, while I started searching for the simplest way to monetize the API. I didn’t like what I found.

Monetizing APIs

Before Mashape, your options for monetizing APIs were either building a custom solution for authentication, billing, and tracking, or paying thousands of dollars a month for an “enterprise” solution from Mashery or Apigee. While I have no doubt Mashery & Apigee provide quality services, they are not in the price range for most developers. And building a custom solution is far more work than I wanted to put into it. Even now, when companies like Stripe exist to make billing easier, you’d still have to do authentication & call tracking. But Stripe didn’t exist 2 years ago, and the best billing option I could find was Paypal, whose API documentation is great at inducing headaches. Lucky for me, Mashape was just opening up for beta testing, and appeared to be in the process of solving all of my problems 🙂

Mashape

Mashape was just what I needed to monetize the text-processing API, and it’s improved tremendously since I started using it. They handle all the necessary details, like integrated billing, plus a lot more, such as usage charts, latency & uptime measurements, and automatic client library generation. This last is one of my favorite features, because the client libraries are generated using your API documentation, which provides a great incentive to accurately document the ins & outs of your API. Once you’ve documented your API, downloadable libraries in 5 different programming languages are immediately available, making it that much easier for new users to consume your API. As of this writing, those languages are Java, PHP, Python, Ruby, and Objective C.

Here’s a little history for the curious: Mashape originally did authentication and tracking by exchanging tokens thru an API call. So you had to write some code to call their token API on every one of your API calls, then check the results to see if the call was valid, or if the caller had reached their limit. They didn’t have all of the nice charts they have now, and their billing solution was the CEO manually handling Paypal payments. But none of that mattered, because it worked, and from conversations with them, I knew they were focused on more important things: building up their infrastructure and positioning themselves as a kind of app-store for APIs.

Mashape has been out of beta for a while now, with automated billing, and a custom proxy server for authenticating, routing, and tracking all API calls. They’re releasing new features on a regular basis, and sponsoring events like MusicHackDay. I’m very impressed with everything they’re doing, and on top of that, they’re good hard-working people. I’ve been over to their “hacker house” in San Francisco a few times, and they’re very friendly and accommodating. And if you’re ever in the neighborhood, I’m sure they’d be open to a visit.

Profit

Once I had integrated Mashape, which was maybe 20 lines of code, the money started rolling in :). Just kidding, but using the typical definition of profit, when income exceeds costs, the text-processing API was profitable within a few months, and has remained so ever since. My only monetary cost is a single Linode server, so as long as people keep paying for the API, text-processing.com will remain online. And while it has a very nice profit margin, total monthly income barely approaches the cost of living in San Francisco. But what really matters to me is that text-processing.com has become a self-sustaining excuse for me to experiment with natural language processing techniques & data sets, test my models against the market, and provide developers with a simple way to integrate NLP into their own projects.

So if you’ve got an idea for an API, especially if it’s something you could charge money for, I encourage you to build it and put it up on Mashape. All you need is a working API, a unique image & name, and a Paypal account for receiving payments. Like other app stores, Mashape takes a 20% cut of all revenue, but I think it’s well worth it compared to the cost of replicating everything they provide. And unlike some app stores, you’re not locked in. Many of the APIs on Mashape also provide alternative usage options (including text-processing), but they’re on Mashape because of the increased exposure, distribution, and additional features, like client library generation. SaaS APIs are becoming a significant part of modern computing infrastructure, and Mashape provides a great platform for getting started.

Python 3 Web Development Review

The problem with Python 3 Web Development Beginner’s Guide, by Michel Anders, is one of expectations (disclaimer: I received a free eBook from Packt for review). Let’s start with the title… First we have Python 3 Web Development. This immediately sets the wrong expectations because:

  1. There’s almost as much jQuery & Javascript as there is Python.
  2. Most of the Python code is not Python 3 specific, and the code that is could easily be translated to Python 2.
  3. Much of the Python code either uses CherryPy or is for generating HTML. This is not immediately obvious, but becomes apparent in Chapter 3 (which is available as a free PDF download: Chapter 3 – Tasklist I Persistence).

Second, this book is also supposed to be a Beginner’s Guide, but that is definitely not the case. To really grasp what’s going on, you need to already know the basics of HTML, jQuery interaction, and how HTTP works. Chapter 1 is an excellent introduction to HTTP and web application development, but the book as a whole is not beginner material. I think that anything that uses Python metaclasses automatically becomes at least intermediate level, if not expert, and the main thrust of Chapter 7 is refactoring all your straightforward database code to use complicated metaclasses.

However, if you mentally rewrite the title to be “Web Framework Development from scratch using CherryPy and jQuery”, then you’ve got the right idea. The book steps you through web app development with CherryPy, database models with sqlite3, and plenty of HTML and jQuery for interface generation and interaction. While creating example applications, you slowly build up a re-usable framework. It’s an interesting approach, but unfortunately it gets muddied up with inline HTML rendering. I never thought a language as simple and elegant as Python could be reduced to the ugliness of common PHP, but generating HTML with string interpolation inside the same functions that are accessing the database gets pretty close. I kept expecting the author to introduce template rendering, which is a major part of most modern web development frameworks, but it never happened, despite the plethora of excellent Python templating libraries.

While reading this book, I often had the recurring thought “I’m so glad I use Django”. If your aim is rapid application development, this is not the book for you. However, if you’re interested in creating your own web development framework, or would at least like to understand how a framework like Django could be created, then buy a copy of Python 3 Web Development.

Django Application Conventions

A Django application is really just a python package with a few conventionally named modules. Most apps will not need all of the modules described below, but it’s important to follow the naming conventions and code organization because it will make your application easier to use. Following these conventions gives you a common model for understanding and building the various pieces of a Django application. It also makes it possible for others who share the same common model to quickly understand your code, or at least have an idea of where certain parts of code are located and how everything fits together. This is especially important for reusable applications. For examples, I highly recommend browsing through the code of applications in django.contrib, as they all (mostly) follow the same conventional code organization.

models.py

models.py is the only module that’s required by Django, even if you don’t have any code in it. But chances are that you’ll have at least 1 database model, signal handler, or perhaps an API connection object. models.py is the best place to put these because it is the one app module that is guaranteed to be imported early. This also makes it a good location for connection objects to NoSQL databases such as Redis or MongoDB. Generally, any code that deals with data access or storage should go in models.py, except for simple lookups and queries.

managers.py

Model managers are sometimes placed in a separate managers.py module. This is optional, and often overkill, as it usually makes more sense to define custom managers in models.py. However, if there’s a lot going on in your custom manager, or if you have a ton of models, it might make sense to separate the manager classes for clarity’s sake.

admin.py

To make your models viewable within Django’s Admin system, create an admin.py module with ModelAdmin objects for each necessary model. These models can then be autodiscovered if you use the admin.autodiscover() call in your top level urls.py.

views.py

View functions (or classes) have 3 responsibilities:

  1. request handling
  2. form processing
  3. template rendering

If a view function is doing anything else, then you’re doing it wrong. There are many things that fall under request handling, such as session management and authentication, but any code that does not directly use the request object, or that will not be used to render a template, does not belong here. One valid exception is sending signals, but I’d argue that a form or models.py is a better location. View functions should be short & simple, and any data access should be primarily read-only. Code that updates data in a database should either be in models.py or the save() method of a form.

Keep your view functions short & simple – this will make it clear how a specific request will produce a corresponding response, and where potential bottlenecks are. Speed has business value, and the easiest way to speed up code is to make it simpler. Do less, and move the complexity elsewhere, such as forms.py.

Use decorators generously for validating requests. require_GET, require_POST, or require_http_methods should go first. Next, use login_required or permission_required as necessary. Finally, use ajax_request or render_to from django-annoying so that your view can simply return a dict of data that will be translated into a JSON response or a RequestContext. It’s not unheard of to have view functions with more decorators than lines of code, and that’s ok because the process flow is still clear, since each decorator has a specific purpose. However, if you’re distributing a pluggable app, then do not use render_to. Instead, use a template_name keyword argument, which will allow developers to override the default template name if they wish. This template name should be prefixed by an appropriate subdirectory. For example, django.contrib.auth.views uses the template subdirectory registration/ for all its templates. This encourages template organization to mirror application organization.

If you have lots of views that can be grouped into separate functionality, such as account management vs everything else, then you can create separate view modules. A good way to do this is to create a views subpackage with separate modules within it. The comments contrib app organizes its views this way, with the user facing comments views in views/comments.py, and the moderator facing moderation views in views/moderation.py.

decorators.py

Before you write your own decorators, check out the http decorators, admin.views.decorators, auth.decorators, and annoying.decorators. What you want may already be implemented, but if not, you’ll at least get to see a bunch of good examples for how to write useful decorators.

If you do decide to write your own decorators, put them in decorators.py. This module should contain functions that take a function as an argument and return a new function, making them higher order functions. This enables you to attach many decorators to a single view function, since each decorator wraps the function returned from the next decorator, until the final view function is reached.

You can also create functions that take arguments, then return a decorator. So instead of being a decorator itself, this kind of function generates and returns a decorator based on the arguments provided. render_to is such a higher order function: it takes a template name as an argument, then returns a decorator that renders that template.
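To make the two kinds of higher order function concrete, here is a minimal sketch in plain Python. Both uppercase_title and with_template are hypothetical names invented for illustration; with_template only imitates the shape of render_to by returning a (template name, context) tuple, and it does no actual rendering.

```python
from functools import wraps


def uppercase_title(view):
    """Plain decorator: takes a view function and returns a wrapped one."""
    @wraps(view)
    def wrapper(request, *args, **kwargs):
        context = view(request, *args, **kwargs)
        context['title'] = context['title'].upper()
        return context
    return wrapper


def with_template(template_name):
    """Decorator factory: takes arguments and returns a decorator."""
    def decorator(view):
        @wraps(view)
        def wrapper(request, *args, **kwargs):
            context = view(request, *args, **kwargs)
            # Stand-in for rendering: pair the template name with the context.
            return template_name, context
        return wrapper
    return decorator


# Decorators apply bottom-up: uppercase_title wraps detail first,
# then with_template wraps the result.
@with_template('myapp/detail.html')
@uppercase_title
def detail(request):
    return {'title': 'hello'}
```

Calling detail(request) runs the outermost wrapper first, which calls the next wrapper, which finally calls the original view. functools.wraps keeps the view's name and docstring intact, which helps when debugging a stack of decorators.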

middleware.py

Any custom request/response middleware should go in middleware.py. Two commonly used middleware classes are AuthenticationMiddleware and SessionMiddleware. You can think of middleware as global view decorators, in that a middleware class can pre-process every request or post-process every response, no matter what view is used.
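As a rough sketch of what such a class can look like, the following uses the process_request and process_response hooks that Django middleware classes of this era implement. TimingMiddleware, the _start_time attribute, and the X-Elapsed-Seconds header are all hypothetical names chosen for the example, and the class is written without Django imports so it can be read on its own.

```python
import time


class TimingMiddleware(object):
    """Hypothetical middleware that records how long each request took."""

    def process_request(self, request):
        # Runs before the view. Returning None lets processing continue.
        request._start_time = time.time()
        return None

    def process_response(self, request, response):
        # Runs after the view, for every response, whatever view produced it.
        start = getattr(request, '_start_time', None)
        if start is not None:
            # Response objects accept header assignment by key.
            response['X-Elapsed-Seconds'] = '%.3f' % (time.time() - start)
        return response
```

To take effect, a class like this would need to be listed in the project's middleware setting (MIDDLEWARE_CLASSES in the Django versions this post was written against).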

urls.py

It’s good practice to define urls for all your application’s views in their own urls.py. This way, these urls can be included in the top level urls.py with a simple include call. Naming your urls is also a good idea – see django.contrib.comments.urls for an example.

forms.py

Custom forms should go in forms.py. These might be model forms, formsets, or any kind of data validation & transformation that needs to happen before storing or passing on request data. The incoming data will generally come from a request QueryDict, such as request.GET or request.POST, though it could also come from url parameters or view keyword arguments. The main job of forms.py is to transform that incoming data into a form suitable for storage, or for passing on to another API.

You could have this code in a view function, but then you’d be mixing data validation & transformation in with request processing & template rendering, which just makes your code confusing and more deeply nested. So the secondary job of forms.py is to contain complexity that would otherwise be in a view function. Since form validation is often naturally complicated, this is appropriate, and keeps the complexity confined to a well defined area. So if you have a view function that’s accessing more than one variable in request.GET or request.POST, strongly consider using a form instead – that’s what they’re for!

Forms often save data, and the convention is to use a save method that can be called after validation. This is how model forms behave, but you can do the same thing in your own non-model forms. For example, let’s say you want to update a list in Redis based on incoming request data. Instead of putting the code in a view function, create a Form with the necessary fields, and implement a save() method that updates the list in redis based on the cleaned form data. Now your view simply has to validate the form and call save() if the data is valid.

There should generally be no template rendering in forms.py, except for sending emails. All other template rendering belongs in views.py. Email template rendering & sending should also be implemented in a save() method. If you’re creating a pluggable app, then the template name should be a keyword argument so that developers can override it if they want. The PasswordResetForm in django.contrib.auth.forms provides a good example of how to do this.

tests.py

Tests are always a good idea (even if you’re not doing TDD), especially for reusable apps. There are 2 places that Django’s test runner looks for tests:

  1. doctests in models.py
  2. unit tests or doctests in tests.py

You can put doctests elsewhere, but then you have to define your own test runner to run them. It’s often easier to just put all non-model tests into tests.py, either in doctest or unittest form. If you’re testing views, be sure to use Django’s TestCase, as it provides easy access to the test client, making view testing quite simple. For a complete account of testing Django, see Django Testing and Debugging.

backends.py

If you need custom authentication backends, such as using an email address instead of a username, put these in backends.py. Then include them in the AUTHENTICATION_BACKENDS setting.
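A settings fragment for this might look like the sketch below. The myapp.backends.EmailBackend path is a hypothetical name for your own backend class; django.contrib.auth.backends.ModelBackend is Django's standard username backend, kept here as a fallback.

```python
# settings.py (fragment)
# Backends are tried in order until one of them authenticates the user.
AUTHENTICATION_BACKENDS = (
    'myapp.backends.EmailBackend',                # hypothetical custom backend
    'django.contrib.auth.backends.ModelBackend',  # Django's default backend
)
```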

signals.py

If your app is defining signals that others can connect to, signals.py is where they should go. If you look at django.contrib.comments.signals, you’ll see it’s just a few lines of code with many more lines of comments explaining when each signal is sent. This is about right, as signals are essentially just global objects, and what’s important is how they are used, and in what context they are sent.

management.py

The post_syncdb signal is a management signal that can only be connected to within a module named management.py. So if you need to connect to the post_syncdb signal, management.py is the only place to do it.

feeds.py

To define your own syndication feeds, put the subclasses in feeds.py, then import them in urls.py.

sitemaps.py

Custom Sitemap classes should go in sitemaps.py. Much like the classes in admin.py, Sitemap subclasses are often fairly simple. Ideally, you can just use GenericSitemap and bypass custom Sitemap objects altogether.

context_processors.py

If you need to write custom template context processors, put them in context_processors.py. A good case for a custom context processor is to expose a setting to every template. Context processors are generally very simple, as they only return a dict with no more than a few key-values. And don’t forget to add them to the TEMPLATE_CONTEXT_PROCESSORS setting.
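Here is a minimal sketch of how small a context processor can be. The analytics function and the GOOGLE_ANALYTICS_ID name are assumptions made up for illustration. In a real project the value would be read from django.conf.settings; a module-level constant stands in for it here so the sketch runs without Django.

```python
# context_processors.py (sketch)

# Stand-in for a value that would normally live in settings.py.
GOOGLE_ANALYTICS_ID = 'UA-XXXXX-X'


def analytics(request):
    """Expose one setting to every template rendered with a RequestContext."""
    return {'GOOGLE_ANALYTICS_ID': GOOGLE_ANALYTICS_ID}
```

A template could then refer to {{ GOOGLE_ANALYTICS_ID }} without any view having to pass it in explicitly.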

templatetags

The templatetags subpackage is necessary when you want to provide custom template tags or filters. If you’re only creating one templatetag module, give it the same name as your app. This is what django.contrib.humanize does, among others. If you have more than one templatetag module, then you can namespace them by prefixing each module with the name of your app followed by an underscore. And be sure to create __init__.py in templatetags/, so python knows it’s a proper subpackage.

management/commands

If you want to provide custom management commands that can be used through manage.py or django-admin.py, these must be modules within the commands/ subdirectory of a management/ subdirectory. Both of these subdirectories must have __init__.py to make them python subpackages. Each command should be a separate module whose name will be the name of the command. This module should contain a single class named Command, which must inherit from BaseCommand or a BaseCommand subclass. For example, django.contrib.auth provides 2 custom management commands: changepassword and createsuperuser. Both of these commands are modules of the same name within django.contrib.auth.management.commands. For more details, see creating Django management commands.

jQuery Validation with Django Forms

Django has everything you need to do server-side validation, but it’s also a good idea to do client-side validation. Here’s how you can integrate the jQuery Validation plugin with your Django Forms.

jQuery Validation Rules

jQuery validation works by assigning validation rules to each element in your form. These rules can be assigned in a few different ways:

  1. Class Rules
  2. Metadata Rules
  3. Rules Object

Django Form Class Rules

The simplest validation rules, such as required, can be assigned as classes on your form elements. To do this in Django, you can specify custom widget attributes.

from django import forms
from django.forms import widgets

class MyForm(forms.Form):
    title = forms.CharField(required=True, widget=widgets.TextInput(attrs={
        'class': 'required'
    }))

In Django 1.2, there’s support for a required css class, but you can still use the technique above to specify other validation rules.

Django Form Metadata Rules

For validation methods that require arguments, such as minlength and maxlength, you can create metadata in the class attribute. You’ll have to include the jQuery metadata plugin for this style of rules.

from django import forms
from django.forms import widgets

class MyForm(forms.Form):
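    # Django's CharField arguments are min_length/max_length; the jQuery rule names below stay minlength/maxlength
    title = forms.CharField(required=True, min_length=2, max_length=100, widget=widgets.TextInput(attrs={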
        'class': '{required:true, minlength:2, maxlength:100}'
    }))

jQuery Validate Rules Object

If your validation requirements are more complex, or you don’t want to use the metadata plugin or class based rules, you can create a rules object to pass as an option to the validate method. This object can be generated in your template like so:

<script type="text/javascript">
FORM_RULES = {
    '{{ form.title.name }}': 'required'
};

$(document).ready(function() {
    $('form').validate({
        rules: FORM_RULES
    });
});
</script>

The reason I suggest generating the rules object in your template is to avoid hardcoding the field name in your javascript. A rules object can also be used in conjunction with class and metadata rules, so you could have some rules assigned in individual element classes or metadata, and other rules in your rules object.

Error Messages

If you want to keep the client-side validation error messages consistent with Django’s validation error messages, you’ll need to copy Django’s error messages and specify them in the metadata or in a messages object.

Metadata Messages

Messages must be specified per-field, and per-rule. Here’s an example where I specify the minlength message for the title field.

from django import forms
from django.forms import widgets

class MyForm(forms.Form):
    # min_length is the Django argument; minlength inside the class string is the jQuery rule
    title = forms.CharField(min_length=2, widget=widgets.TextInput(attrs={
        'class': '{minlength:2, messages:{minlength:"Ensure this value has at least 2 characters"}}'
    }))

Messages Object

Messages can also be specified in a javascript object, like so:

<script type="text/javascript">
FORM_RULES = {
    '{{ form.title.name }}': 'required'
};

FORM_MESSAGES = {
    '{{ form.title.name }}': 'This field is required'
};

$(document).ready(function() {
    $('form').validate({
        rules: FORM_RULES,
        messages: FORM_MESSAGES
    });
});
</script>

Just like with validation rules, messages in element metadata can be used in conjunction with a global messages object. Note: if an element has a title attribute, then the title will be used as the default error message, unless you specify ignoreTitle: true in the jQuery validate options.

Error Labels vs Errorlist

Django’s default error output is an error list, while the default for jQuery Validation errors is a label with class="error". So in order to unify your validation errors, there are 2 options:

  1. make jQuery Validation output an error list
  2. output error labels instead of an error list in the template

Personally, I prefer the simple error labels produced by jQuery validation. To make Django generate those instead of an error list, you can do the following in your templates:

{{ field }}
{% if field.errors %}
{# NOTE: must use id_NAME for jquery.validation to overwrite error label #}
<label class='error' for='id_{{ field.name }}' generated="true">{{ field.errors|join:". " }}</label>
{% endif %}

You could also create your own error_class for outputting the error labels, but then you’d lose the ability to specify the for attribute.

If you want to try to make jQuery validation produce an error list, that’s a bit harder. You can specify a combination of jQuery validation options and get a list, but there’s not an obvious way to get the errorlist class on the ul.

$('form').validate({
    errorElement: 'li',
    wrapper: 'ul'
});

Other options you can look into are errorLabelContainer, errorContainer, and a highlight function.

Final Recommendations

I find it’s easiest to specify class and metadata rules in custom widget attributes 90% of the time, and use a rules object only when absolutely necessary. For example, if I want to require only the first elements in a formset, but not the rest, then I may use a rules object in addition to class and metadata rules. For error messages, I generally use a field template like the above example that I include for each field:

{% with form.title as field %}{% include "field.html" %}{% endwith %}

Or if the form is really simple, I do

{% for field in form %}{% include "field.html" %}{% endfor %}

Django Model Formsets

Django model formsets provide a way to edit multiple model instances within a single form. This is especially useful for editing related models inline. Below is some knowledge I’ve collected on some of the lesser documented and undocumented features of Django’s model formsets.

Model Formset Factory Methods

Django Model Formsets are generally created using a factory method. The default is modelformset_factory, which wraps formset_factory to create Model Forms. You can also create inline formsets to edit related objects, using inlineformset_factory. inlineformset_factory wraps modelformset_factory to restrict the queryset and set the initial data to the instance’s related objects.

Adding Fields to a Model Formset

Just like with a normal Django formset, you can add additional fields to a model formset by creating a base formset class with an add_fields method, then passing it in to the factory method. The only difference is the class you inherit from. For inlineformset_factory, you should inherit from BaseInlineFormSet.

If you’re using modelformset_factory, then you should import and inherit from BaseModelFormSet instead. Also remember that form.instance may be used to set initial data for the fields you’re adding. Just check to make sure form.instance is not None before you try to access any properties.

from django.forms.models import BaseInlineFormSet, inlineformset_factory

class BaseFormSet(BaseInlineFormSet):
    def add_fields(self, form, index):
        super(BaseFormSet, self).add_fields(form, index)
        # add fields to the form

FormSet = inlineformset_factory(MyModel, MyRelatedModel, formset=BaseFormSet)

Changing the Default Form Field

If you’d like to customize one or more of the form fields within your model formset, you can create a formfield_callback function and pass it to the formset factory. For example, if you want to set required=False on all fields, you can do the following.

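from django.forms.models import modelformset_factory

def custom_field_callback(field):
    # field is the model field; formfield() builds its default form field
    return field.formfield(required=False)

FormSet = modelformset_factory(model, formfield_callback=custom_field_callback)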

field.formfield() will create the default form field with whatever arguments you pass in. You can also create different fields, and use field.name to do field specific customization. Here’s a more advanced example.

from django.forms import IntegerField, Textarea

def custom_field_callback(field):
    if field.name == 'optional':
        return field.formfield(required=False)
    elif field.name == 'text':
        return field.formfield(widget=Textarea)
    elif field.name == 'integer':
        # ignore the model field's default and return a different form field
        return IntegerField()
    else:
        return field.formfield()

Deleting Models in a Formset

Pass can_delete=True to your factory method, and you’ll be able to delete the models in your formsets. Note that inlineformset_factory defaults to can_delete=True, while modelformset_factory defaults to can_delete=False.

Creating New Models with Extra Forms

As with normal formsets, you can pass an extra argument to your formset factory to create extra empty forms. These empty forms can then be used to create new models. Note that when you have extra empty forms in the formset, you’ll get an equal number of None results when you call formset.save(), so you may need to filter those out if you’re doing any post-processing on the saved objects.
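Filtering out those None results takes one line. The sketch below uses a plain list as a stand-in for the return value of formset.save(), so it runs without Django. The saved_instances helper is a hypothetical name, not part of Django.

```python
def saved_instances(results):
    """Drop the None placeholders left by extra forms that were never filled in."""
    return [obj for obj in results if obj is not None]


# Stand-in for what formset.save() might return with 2 unused extra forms.
results = ['plan item 1', None, 'plan item 2', None]
saved = saved_instances(results)
```

Comparing against None explicitly, rather than relying on truthiness, avoids accidentally dropping an instance whose class defines its own boolean behavior.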

If you want to set an upper limit on the number of extra forms, you can use the max_num argument to restrict the maximum number of forms. For example, if you want up to 6 forms in the formset, do the following:

MyFormSet = inlineformset_factory(MyModel, MyRelatedModel, extra=6, max_num=6)

Saving Django Model Formsets

Model formsets have a save method, just like with model forms, but in this case, you’ll get a list of all modified instances instead of a single instance. Unmodified instances will not be returned. As mentioned above, if you have any extra empty forms, then those list elements will be None.

If you want to create custom save behavior, you can override 2 methods in your BaseFormSet class: save_new and save_existing. These methods look like this:

from django.forms.models import BaseInlineFormSet

class BaseFormSet(BaseInlineFormSet):
    def save_new(self, form, commit=True):
        # custom save behavior for new objects, form is a ModelForm
        return super(BaseFormSet, self).save_new(form, commit=commit)

    def save_existing(self, form, instance, commit=True):
        # custom save behavior for existing objects
        # instance is the existing object, and form has the updated data
        return super(BaseFormSet, self).save_existing(form, instance, commit=commit)

Inline Model Admin

Django’s Admin Site includes the ability to specify InlineModelAdmin objects. Subclasses of InlineModelAdmin can use all the arguments of inlineformset_factory, plus some admin specific arguments. Everything mentioned above applies equally to InlineModelAdmin arguments: you can specify the number of extra forms, the maximum number of inline forms, and even your own formset with custom save behavior.

Far Future Expires Header with django-storages S3Storage

One way to decrease your site’s load time is to set a far future Expires header on all your static content. This doesn’t help first-time visitors, but can greatly improve the experience of returning visitors. And you get to decrease your bandwidth needs at the same time, because all your static content will be cached by their browser.

S3

weotta puts all of its awesome plan images in Amazon’s S3 using the django-storages S3Storage backend, which by default does not set any Expires header. To remedy this, I set AWS_HEADERS in settings.py like so:

from datetime import date, timedelta
tenyrs = date.today() + timedelta(days=365*10)
# Expires 10 years in the future at 8PM GMT
AWS_HEADERS = {
	'Expires': tenyrs.strftime('%a, %d %b %Y 20:00:00 GMT')
}

Now every uploaded file gets an Expires header set to 10 years in the future.
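One caveat with the snippet above is that strftime's %a and %b codes follow the process locale, while HTTP dates need English day and month names. As an alternative sketch, and not what weotta actually ran, the standard library's email.utils.formatdate always produces the English form. The far_future_expires helper and its parameters are hypothetical names for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import formatdate


def far_future_expires(years=10, now=None):
    """Return an HTTP-date string roughly `years` from now, at 20:00:00 GMT."""
    now = now or datetime.now(timezone.utc)
    expires = (now + timedelta(days=365 * years)).replace(
        hour=20, minute=0, second=0, microsecond=0)
    # usegmt=True makes formatdate emit "GMT" instead of a numeric offset.
    return formatdate(expires.timestamp(), usegmt=True)


AWS_HEADERS = {
    'Expires': far_future_expires(),
}
```

The now parameter exists only so the function can be checked against a fixed date. In settings.py you would call it with no arguments.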

upload_to

One potential drawback to using a far future Expires header is that if you change the file content without also changing the file name, no one will notice because they’ll keep using the old cached version of the file. Luckily, Django makes it easy to create (mostly) unique new file names by letting you include strftime formatting codes in a FileField or ImageField upload_to path, such as upload_to='images/%Y/%m/%d'. This way, every uploaded file automatically gets stored by date, which means it would take some deliberate effort to change the contents of a file without also changing the file name.
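To see what the formatting codes turn into, here is a small sketch that expands an upload_to path with plain datetime.strftime. The expand_upload_to helper is a hypothetical stand-in for illustration and is not Django's own code, though my understanding is that Django applies strftime in much the same way when a file is saved.

```python
from datetime import datetime


def expand_upload_to(upload_to, when):
    """Expand strftime codes in an upload_to path for a given moment."""
    return when.strftime(upload_to)


# A file uploaded on 15 August 2010 lands in a directory for that day.
path = expand_upload_to('images/%Y/%m/%d', datetime(2010, 8, 15))
print(path)  # images/2010/08/15
```

Two uploads with the same file name on different days end up under different paths, which is what keeps stale cached copies from being served for new content.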

Django Tools and Links

Using Django
Social Apps
Forms
Notifications
Geolocation
Misc

Django IA: Registration-Activation

django-registration is a pluggable Django app that implements a common registration-activation flow. This flow is quite similar to the password reset flow, but slightly simpler with only 3 views:

  1. register
  2. registration_complete
  3. activate

The basic idea is that an anonymous user can create a new account, but cannot log in until they activate their account by clicking a link they’ll receive in an activation email. It’s a way to automatically verify that the new user has a valid email address, which is generally an acceptable proxy for proving that they’re human. Here’s an Information Architecture diagram, again using jjg’s visual vocabulary.

Django Registration IA

Here’s a more in-depth walk-thru with our fictional user named Bob:

  1. Bob encounters a section of the site that requires an account, and is redirected to the login page.
  2. But Bob does not have an account, so he goes to the registration page where he fills out a registration form.
  3. After submitting the registration form, Bob is taken to a page telling him that he needs to activate his account by clicking a link in an email that he should be receiving shortly.
  4. Bob checks his email, finds the activation email, and clicks the activation link.
  5. Bob is taken to a page that tells him his account is active, and he can now log in.

As with password reset, I think the last step is unnecessary, and Bob should be automatically logged in when his account is activated. But to do that, you’ll have to write your own custom activate view. Luckily, this isn’t very hard. If you take a look at the code for registration.views.activate, the core code is actually quite simple:

from registration.models import RegistrationProfile

def activate(request, activation_key):
    # activate_user returns the activated User, or a falsy value for a bad key
    user = RegistrationProfile.objects.activate_user(activation_key.lower())

    if not user:
        # handle invalid activation key
        pass
    else:
        # do stuff with the user, such as automatically log them in, then redirect
        pass
The rest of the custom activate view is up to you.
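To make the flow concrete without depending on Django, here is a tiny in-memory sketch of the register and activate steps. It is an illustration only, not django-registration's code: the ActivationRegistry class and its attributes are hypothetical, and a real app would store users in the database and send the key by email.

```python
import secrets


class ActivationRegistry:
    """In-memory stand-in for the register/activate flow (illustration only)."""

    def __init__(self):
        self.active = {}   # username -> is_active flag
        self.pending = {}  # activation key -> username

    def register(self, username):
        # New accounts start inactive; the key would go into the emailed link.
        key = secrets.token_hex(20)
        self.active[username] = False
        self.pending[key] = username
        return key

    def activate(self, key):
        # Keys are single-use: pop() removes the key once it has been redeemed.
        username = self.pending.pop(key.lower(), None)
        if username is None:
            return False  # unknown or already-used key
        self.active[username] = True
        return username
```

Like the view skeleton above, activate returns a falsy value for a bad key, so the caller can branch on `if not user`.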

Django IA: Auth Password Reset

Django comes with a lot of great built-in functionality. One of the most useful contrib apps is authentication, which (among other things) provides views for login, logout, and password reset. Login & logout are self-explanatory, but resetting a password is, by nature, somewhat complicated. Because it’s a really bad idea to store passwords as plaintext, you can’t just send a user their password when they forget it. Instead, you have to provide a secure mechanism for users to change their password themselves, even if they can’t remember their original password. Lucky for us, Django auth provides this functionality out of the box. All you need to do is create the templates and hook up the views. The code you need to write to make this happen is pretty simple, but it can be a bit tricky to understand how it all works together. There are actually 4 separate view functions that together provide a complete password reset mechanism. These view functions are

  1. password_reset
  2. password_reset_done
  3. password_reset_confirm
  4. password_reset_complete

Here’s an Information Architecture diagram showing how these views fit together, using Jesse James Garrett’s Visual Vocabulary. The 2 black dots are starting points, and the circled black dot is an end point.

Django Auth Password Reset IA

Here’s a more in-depth walk-thru of what’s going on, with a fictional user named Bob:

  1. Bob tries to log in and fails, probably a couple times. Bob clicks a “Forgot your password?” link, which takes him to the password_reset view.
  2. Bob enters his email address, which is then used to find his User account.
  3. If Bob’s User account is found, a password reset email is sent, and Bob is redirected to the password_reset_done view, which should tell him to check his email.
  4. Bob leaves the site to check his email. He finds the password reset email, and clicks the password reset link.
  5. Bob is taken to the password_reset_confirm view, which first validates that he can reset his password (this is handled with a hashed link token). If the token is valid, Bob is allowed to enter a new password. Once a new password is submitted, Bob is redirected to the password_reset_complete view.
  6. Bob can now log in to your site with his new password.

This final step is the one minor issue I have with Django’s auth password reset. The user just changed their password, so why do they have to enter it again to log in? Why can’t we eliminate step 6 altogether, and automatically log the user in after they reset their password? In fact, you can eliminate step 6 with a bit of hacking on your own authentication backend, but that’s a topic for another post.
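Step 5 mentions a hashed link token. Here is a simplified sketch of the general idea, not Django's actual implementation: the token is an HMAC over the user's id, their current password hash, and a timestamp, so it stops validating once the password changes or the token gets too old. SECRET_KEY, MAX_AGE, and both function names are assumptions made for this example.

```python
import hashlib
import hmac
import time

SECRET_KEY = b'change-me'    # hypothetical secret, for illustration only
MAX_AGE = 3 * 24 * 60 * 60   # assumed validity window: 3 days, in seconds


def make_token(user_id, password_hash, timestamp=None):
    """Build '<timestamp>-<hexdigest>' for the given user state."""
    ts = int(time.time() if timestamp is None else timestamp)
    msg = '{}:{}:{}'.format(user_id, password_hash, ts).encode('utf-8')
    digest = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return '{}-{}'.format(ts, digest)


def check_token(token, user_id, password_hash, now=None):
    """Return True only if the token matches the user's current state and age."""
    try:
        ts_text, _ = token.split('-', 1)
        ts = int(ts_text)
    except ValueError:
        return False  # malformed token
    # Recompute the token from current user state and compare in constant time.
    expected = make_token(user_id, password_hash, ts)
    if not hmac.compare_digest(expected.encode('utf-8'), token.encode('utf-8')):
        return False
    now = time.time() if now is None else now
    return 0 <= now - ts <= MAX_AGE
```

Because the password hash is part of the signed message, a reset link becomes useless as soon as the new password is saved, which is what makes it safe to send in an email.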