Mnesia Records to MongoDB Documents
I recently migrated about 50k records from mnesia to MongoDB using my fork of emongo, which adds supervisors with transparent connection restarting, for reasons I'll explain below.
Why Mongo instead of Mnesia
mnesia is great for a number of reasons, but here's why I decided to move weotta's place data into MongoDB:
- easy to access from python and other languages
- schema-less data, so you're not constrained to records, and will never have to do mnesia:transform_table ever again
- don't have to keep everything in memory (or only on disk as the case may be)
- simple & flexible indexing & querying
Converting Records to Docs and vice versa
First, I needed to convert records to documents. In erlang, mongo documents are basically proplists. Keys going into emongo can be atoms, strings, or binaries, but keys coming out will always be binaries. Here's a simple example of record to document conversion:
record_to_doc(Record, Attrs) ->
    % tl will drop record name
    lists:zip(Attrs, tl(tuple_to_list(Record))).
This would be called like record_to_doc(MyRecord, record_info(fields, my_record)). If you have nested dicts then you'll have to flatten them using dict:to_list. Also note that list values coming out of emongo are treated like yaws JSON arrays, i.e. [{key, {array, [val]}}]. For more examples, check out the emongo docs.
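Going the other way (document back to record) can be sketched as follows. This is a minimal sketch, not emongo's API: the module name doc_convert and the function doc_to_record/3 are hypothetical, and it assumes the document keys are binaries (as described above) and the record fields are atoms. It does not unwrap {array, List} values.

```erlang
-module(doc_convert).
-export([record_to_doc/2, doc_to_record/3]).

%% Record -> proplist document; tl/1 drops the record name.
record_to_doc(Record, Attrs) ->
    lists:zip(Attrs, tl(tuple_to_list(Record))).

%% Proplist document -> record tuple (hypothetical helper).
%% Assumes document keys are binaries; a missing key becomes 'undefined'.
doc_to_record(Doc, RecName, Attrs) ->
    Values = [proplists:get_value(atom_to_binary(A, utf8), Doc) || A <- Attrs],
    list_to_tuple([RecName | Values]).
```

It would be called like doc_convert:doc_to_record(Doc, my_record, record_info(fields, my_record)).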
Heavy Write Load
To do the migration, I used etable:foreach to insert each document. Bulk insertion would probably be more efficient, but etable makes single record iteration very easy.
I started using the original emongo with a pool size of 10, but it was crashy when I dumped records as fast as possible. So initially I slowed it down with timer:sleep(200), but after adding supervised connections, I was able to dump with no delay. I'm not exactly sure what I fixed in this case, but I think the lesson is that using supervised gen_servers will give you reliability with little effort.
Read Performance
Now that I had data in mongo to play with, I compared the read performance to mnesia. Using timer:tc, I found that mnesia:dirty_read takes about 21 microseconds, whereas emongo:find_one can take anywhere from 600 to 1200 microseconds, querying on an indexed field. Without an index, read performance ranged from 900 to 2000 microseconds. I also tested only requesting specific fields, as recommended on the MongoDB Optimization page, but with small documents (<10 fields) that did not seem to have any effect. So while mongodb queries are pretty fast at 1ms, mnesia is about 50 times faster. Further inspection with fprof showed that nearly half of the cpu time of emongo:find is taken by BSON decoding.
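A timing comparison like this can be reproduced with a small wrapper around timer:tc. The sketch below is an illustration rather than the exact code used for the numbers above; the module name read_bench and the function avg_micros/2 are hypothetical.

```erlang
-module(read_bench).
-export([avg_micros/2]).

%% Run Fun N times and return the mean time per call in microseconds.
avg_micros(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    {Total, ok} = timer:tc(fun() -> repeat(Fun, N) end),
    Total / N.

%% Call Fun N times, ignoring its result.
repeat(_Fun, 0) -> ok;
repeat(Fun, N) -> Fun(), repeat(Fun, N - 1).
```

For example, read_bench:avg_micros(fun() -> mnesia:dirty_read(Tab, Key) end, 1000) averages over 1000 reads, and the emongo:find_one call can be wrapped in a fun the same way.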
Heavy Read Load
Under heavy read load (thousands of find_one calls in less than a second), emongo_conn would get into a locked state. Somehow the process had accumulated unparsable data and wouldn't reply. This problem went away when I increased the pool size to 100, but that's a ridiculous number of connections to keep open permanently. So instead I added some code to kill the connection on timeout and retry the find call. This was the main reason I added supervision. Now, every pool is locally registered as a simple_one_for_one supervisor that supervises every emongo_server connection. This pool is in turn supervised by emongo_sup, with dynamically added child specs. All this supervision allowed me to lower the pool size back to 10, and made it easy to kill and restart emongo_server connections as needed.
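The kill-on-timeout-and-retry idea can be sketched generically. This is not the code in my emongo fork; the module retry_sketch, the function call_with_retry/3, and the GetConn fun are all hypothetical. It assumes the connection is a gen_server that a supervisor will restart, that the caller is not linked to it, and that GetConn returns a live connection pid on each call.

```erlang
-module(retry_sketch).
-export([call_with_retry/3]).

%% Try a gen_server call; on timeout, kill the stuck connection and
%% retry with whatever pid GetConn returns next (the restarted one).
call_with_retry(_GetConn, _Request, 0) ->
    {error, timeout};
call_with_retry(GetConn, Request, Retries) when Retries > 0 ->
    Pid = GetConn(),
    try
        gen_server:call(Pid, Request, 5000)
    catch
        exit:{timeout, _} ->
            exit(Pid, kill),   % the supervisor restarts the connection
            call_with_retry(GetConn, Request, Retries - 1)
    end.
```

Because the supervisor handles the restart, the caller only has to decide how many retries are acceptable.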
Why you may want to stick with Mnesia
Now that I have experience with both MongoDB and mnesia, here are some reasons you may want to stick with mnesia:
- very fast in-memory reads
- transactional
- simple master-master replication
- great for distributed read-heavy applications
Despite all that, I'm very happy with MongoDB. Installation and setup were a breeze, and schema-less data storage is very nice when you have variable fields and a high probability of adding and/or removing fields in the future. It's simple, scalable, and as mentioned above, it's very easy to access from many different languages. emongo isn't perfect, but it's open source and will hopefully benefit from more exposure.
Erlang Release Handling with Fab and Reltools
You've already got a first target system installed, and now you've written some new code and want to deploy it. This article will show you how to set up make and fab commands that use reltools to build & install new releases.
Appup
Your code should be part of an OTP application structure. Additionally, you will need an appup file in the ebin/ directory for each application you want to upgrade. There's a lot you can do in an appup file:
- reload a module
- add or delete a module
- update a running process
- and lots more. Refer to the Appup Cookbook and appup reference manual for more details.
Once you've updated app files with the newest version and configuration and created appup files with all the necessary commands, you're ready to create a new release.
Note: The app configuration will always be updated to the newest version, even if you have no appup commands.
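As an illustration, an appup file that upgrades a hypothetical app1 from version 1.0 to 1.1 might look like the sketch below. The application, module names, and versions are made up; the instructions shown (load_module, add_module, delete_module) are standard appup instructions.

```erlang
%% ebin/app1.appup (hypothetical application and modules)
{"1.1",
 %% upgrade instructions, from 1.0 to 1.1
 [{"1.0", [{load_module, module1},      % reload a changed module
           {add_module, module2}]}],    % load a newly added module
 %% downgrade instructions, from 1.1 back to 1.0
 [{"1.0", [{load_module, module1},
           {delete_module, module2}]}]}.
```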
Release
To create a new release, you'll need a new rel file, which I'll refer to as NAME-VSN.rel. VSN should be greater than your previous release version. My usual technique is to copy my latest rel file to NAME-VSN.rel, then update the release VSN and all the application versions.
Note: reltools assumes that the rel file will be in $ROOTDIR/releases/, where $ROOTDIR defaults to code:root_dir(). This path is also used below in the make and fab commands. You can pass a different value for $ROOTDIR, but releases/ is hard coded. This may change in the future, but for now your rel files must be in $ROOTDIR/releases/ if you want to use reltools.
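For reference, a rel file has the shape sketched below. Every name and version string here is a placeholder to be replaced with the values for your own system; app1 is a hypothetical application, and sasl is listed because it provides the release_handler.

```erlang
%% releases/NAME-VSN.rel (all names and version strings are placeholders)
{release,
 {"NAME", "VSN"},            % release name and the new release version
 {erts, "ERTS-VSN"},         % erts version of the target system
 [{kernel, "KERNEL-VSN"},
  {stdlib, "STDLIB-VSN"},
  {sasl, "SASL-VSN"},
  {app1, "APP1-VSN"}]}.      % your own application(s) at their new versions
```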
Reltools
Before you finalize the new release, make sure reltools is in your code path. There are 2 ways to do this:
- Make a copy of reltools and add it to your application.
- Clone elib and add it to your code path with erl -pa PATH/TO/elib/ebin.
If you choose option 1, be sure to include reltools in your app modules, and add it to your appup file with {add_module, reltools}.
But I'll assume you want option 2 because it provides cleaner code separation and easier release handling. Keeping elib external means you can easily pull new code, and only need to add the elib application to your rel file with the latest vsn.
Make Upgrade
Now that you have a new release defined, and elib is in your code path, you're ready to build release upgrade packages. Below is the make command I use to call reltools:make_upgrade("NAME-VSN"). Be sure to update PATH/TO/ to your particular code paths.
# use default erl command, but can override path on command line
ERL=erl

src: FORCE
	@$(ERL) -pa lib/*/ebin -make # requires an Emakefile

# Runs erlang with no shell, includes your local code repo, elib, and the
# local erlang libs in the code path, runs reltools:make_upgrade, then
# stops the emulator when finished.
upgrade: src
	@$(ERL) -noshell \
		-pa lib/*/ebin \
		-pa PATH/TO/elib/ebin \
		-pa PATH/TO/erlang/lib/*/ebin \
		-run reltools make_upgrade $(RELEASE) \
		-s init stop

# empty rule to force run of erl -make
FORCE:
Using the above make rules, you can do make upgrade RELEASE=PATH/TO/releases/NAME-VSN to build a release upgrade package. Once you can do this locally, you can use fab to do remote release builds and installs. But in order to build a release remotely, you need to get the code onto the server. There are various ways to do this, the simplest being to clone your repo on the remote server(s), and push your updates to each one.
fab release build install
Below is an example fabfile.py for building and installing releases remotely using fab. Add your own hosts and roles as needed.
PATH/TO/TARGET should be the path to your first target system.
release is a separate command so that you are only asked for NAME-VSN once, no matter how many hosts you build and install on.
build will run make upgrade RELEASE=releases/NAME-VSN on the remote system, using the target system's copy of erl. Theoretically, you could build a release package once, then distribute it to each target system's releases/ directory. But that requires each target system to be exactly the same, with all the same releases and applications installed. If that's the case, modify this recipe to run build on a single build server, have it put the release package into every other node's releases/ directory, then run install on each node.
install uses _rpccall to run rpc:call(NODE@HOST, reltools, install_release, ["NAME-VSN"]). I've kept _rpccall separate so you can see how to define your own fab commands by setting env.mfa.
from fabric.api import env, prompt, require, run

env.erl = 'PATH/TO/TARGET/bin/erl'

def release():
    '''Prompt for release NAME-VSN. rel file must be in releases/.'''
    prompt('Specify release as NAME-VERSION:', 'release',
        validate=r'^\w+-\d+(\.\d+)*$')

def build():
    '''Build upgrade release package.'''
    require('release')
    run('cd PATH/TO/REPO && hg up && make upgrade ERL=%s RELEASE=releases/%s' % (env.erl, env.release))

def install():
    '''Install release to running node.'''
    require('release')
    env.mfa = 'reltools,install_release,["%s"]' % env.release
    _rpccall()

def _rpccall():
    require('mfa')
    evalstr = 'io:format("~p~n", [rpc:call(NODE@%s, %s)])' % (env.host, env.mfa)
    # NOTE: local user must have same ~/.erlang.cookie as running nodes
    run("%s -noshell -sname fab -eval '%s' -s init stop" % (env.erl, evalstr))
Workflow
Once you've updated your Makefile and created fabfile.py, your workflow can be something like this:
- Write new application code.
- Update the app and appup files for each application to upgrade.
- Create a new rel file as releases/NAME-VSN.rel.
- Commit and push your changes.
- Run fab release build install.
- Enter NAME-VSN for your new release.
- Watch your system hot upgrade in real-time.

Troubleshooting
Sometimes reltools:install_release("NAME-VSN") can fail, usually when the release_handler can't find an older version of your code. In this case, your new release will be unpacked but not installed. You can see the state of all the known releases using release_handler:which_releases(). This can usually be fixed by removing old releases and trying again. Shell into your target system and do something like this (where OLDVSN is the VSN of a release marked as old):
release_handler:remove_release("OLDVSN"). % repeat as necessary
release_handler:install_release("VSN").
release_handler:make_permanent("VSN").
See the release_handler manual for more information.
How to Create an Erlang First Target System
One of the neatest aspects of erlang is the OTP release system, which allows you to do real-time upgrades of your application code. But before you can take advantage of it, you need to create an embedded first target system. Unfortunately, the documentation can be quite hard to follow, so this is my attempt at clearly explaining how to create your own first target system. At 12 steps, it's definitely not simple, but you only have to get it right once.
Assumptions
- You're running linux/unix, probably Ubuntu.
- You already have the desired version of erlang installed. I'll refer to the install dir as $ERLDIR, which should be the same as code:root_dir(). The latest release, as of 6/1/2009, is R13B01, with erts-5.7.2.
- You have your own application code that you want to include in the target system. These apps are located in $REPODIR/lib/, follow the OTP directory structure, and have app files in ebin/.
Steps
1. Create the initial release resource file, which I'll refer to as FIRST.rel. I'll also assume the release version is 1.0. The rel file should include your own applications as well as any OTP applications your code depends on. Put FIRST.rel in the directory you want to use for creating your target system, such as /tmp/build/. Warning: do not put this file in $REPODIR/releases/. Otherwise step 5 will not work because systools will have issues creating the package.
2. Optional: Create sys.config in the same directory as FIRST.rel. sys.config can be used to override the default application configuration for any application included in the release.
3. Open an erlang console in the same directory as FIRST.rel. This directory is where the target system will be created.
4. Call systools:make_script("FIRST", [no_module_tests, {path, ["$REPODIR/lib/*/ebin"]}]). This will create a boot script for the target system. The script file must be created for the next step to work.
5. Call systools:make_tar("FIRST", [no_module_tests, {path, ["$REPODIR/lib/*/ebin"]}, {dirs, [include, src]}, {erts, "$ERLDIR"}]). This will create a release package containing your code and include files, plus all the .beam files for the included OTP applications. Note: no_module_tests will ignore errors that don't matter, such as missing src code, which is common for OTP apps.
6. Exit the console. You should find FIRST-1.0.tar.gz in your current directory. Ideally, this would be the last step, but more likely, you'll need to do the customizations covered below. Unpack the tarball into your target directory and cd into it. For a different take on these first steps, check out An Introduction to Releases with Erlybank.
7. Copy erts-5.7.2/bin/start into bin/ (if bin/ doesn't exist, create it). Edit bin/start and set the ROOTDIR to your target directory (which should also be your current directory). This is the same $ROOTDIR referred to below. Also copy erts-5.7.2/bin/run_erl and erts-5.7.2/bin/start_erl into bin/, then do mkdir log (or change the paths at the bottom of bin/start). At this point, you may also want to add your own emulator flags, such as -sname NODE -smp auto -setcookie MYCOOKIE +A 128.
8. Copy erts-5.7.2/bin/erl into bin/ and set the same ROOTDIR as you did in bin/start.
9. Copy $ERLDIR/bin/start_clean.boot or $ERLDIR/bin/start_sasl.boot to bin/start.boot. I like using start_sasl.boot since it provides more logging. But if you don't want extra logging, use start_clean.boot.
10. Run echo "5.7.2 1.0" > releases/start_erl.data. This tells erlang which version of erts to run, and which release version to use at startup.
11. Run bin/erl and call release_handler:create_RELEASES("$ROOTDIR", "$ROOTDIR/releases/", "$ROOTDIR/releases/FIRST.rel", []). Exit the console, and there should be a file releases/RELEASES containing a spec.
12. That's it, you're done! At this point you should be able to run bin/start, then use bin/to_erl to get the console (Ctrl-D to exit). If you want to deploy to other nodes, you can repack the target system, distribute it to each node, then unpack it and run bin/start. If you do distribute to other nodes, make sure to unpack in the same location on each node, otherwise you'll have to go back to step 7 and modify ROOTDIR.
Fin
At this point you should have a customized, self-contained erlang target system that you can distribute and run on all your nodes. Now you can finally take advantage of release handling with hot code swapping. In an upcoming article, I'll cover how to deploy release upgrades using reltools and fab.
Unit Testing with Erlang's Common Test Framework
One of the first things people look for when getting started with Erlang is a unit testing framework, and EUnit tends to be the framework of choice. But I always had trouble getting EUnit to play nice with my code since it does parse transforms, which screws up the handling of include files and record definitions. And because Erlang has pattern matching, there's really no reason for assert macros. So I looked around for alternatives and found that a testing framework called common_test has been included since Erlang/OTP-R12B. common_test (and test_server) are much more heavy duty than EUnit, but don't let that scare you away. Once you've set everything up, writing and running unit tests is quite painless.
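To make the pattern matching point concrete, here is a minimal sketch of a test case that asserts by matching. The module and function names are hypothetical; a failed match raises a badmatch error, which the test framework reports as a failed test case.

```erlang
-module(match_assert_sketch).
-export([test_reverse/1]).

%% Each line asserts an expected value by matching against it;
%% any mismatch crashes the test case with badmatch.
test_reverse(_Config) ->
    [3, 2, 1] = lists:reverse([1, 2, 3]),
    {ok, 5} = {ok, 2 + 3},
    ok.
```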
Directory Setup
I'm going to assume an OTP compliant directory setup, specifically:
- a top level directory we'll call project/
- a lib/ directory containing your applications at project/lib/
- application directories inside lib/, such as project/lib/app1/
- code files are in app1/src/ and beam files are in app1/ebin/
So we end up with a directory structure like this:
project/
    lib/
        app1/
            src/
            ebin/
Test Suites
Inside the app1/ directory, create a directory called test/. This is where your test suites will go. Generally, you'll have 1 test suite per code module, so if you have app1/src/module1.erl, then you'll create app1/test/module1_SUITE.erl for all your module1 unit tests. Each test suite should look something like this: (unfortunately, wordpress doesn't do syntax highlighting for erlang, so it looks kinda crappy)
-module(module1_SUITE).

% easier than exporting by name
-compile(export_all).

% required for common_test to work
-include("ct.hrl").

%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% common test callbacks %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Specify a list of all unit test functions
all() -> [test1, test2].

% required, but can just return Config. this is a suite level setup function.
init_per_suite(Config) ->
    % do custom per suite setup here
    Config.

% required, but can just return Config. this is a suite level tear down function.
end_per_suite(Config) ->
    % do custom per suite cleanup here
    Config.

% optional, can do function level setup for all functions,
% or for individual functions by matching on TestCase.
init_per_testcase(TestCase, Config) ->
    % do custom test case setup here
    Config.

% optional, can do function level tear down for all functions,
% or for individual functions by matching on TestCase.
end_per_testcase(TestCase, Config) ->
    % do custom test case cleanup here
    Config.

%%%%%%%%%%%%%%%%
%% test cases %%
%%%%%%%%%%%%%%%%

test1(Config) ->
    % write standard erlang code to test whatever you want
    % use pattern matching to specify expected return values
    ok.

test2(Config) -> ok.
Test Specification
Now that we have a test suite at project/lib/app1/test/module1_SUITE.erl, we can make a test specification so common_test knows where to find the test suites, and which suites to run. Something I found out the hard way is that common_test requires absolute paths in its test specifications. So instead of creating a file called test.spec, we'll create a file called test.spec.in, and use make to generate the test.spec file with absolute paths.
test.spec.in
{logdir, "@PATH@/log"}.
{alias, app1, "@PATH@/lib/app1"}.
{suites, app1, [module1_SUITE]}.
Makefile
src:
	erl -pa lib/*/ebin -make

test.spec: test.spec.in
	cat test.spec.in | sed -e "s,@PATH@,$(PWD)," > $(PWD)/test.spec

test: test.spec src
	run_test -pa $(PWD)/lib/*/ebin -spec test.spec
Running the Tests
As you can see above, I also use the Makefile for running the tests with the command make test. For this command to work, run_test must be installed in your PATH. To do so, you need to run /usr/lib/erlang/lib/common_test-VERSION/install.sh (where VERSION is whatever version number you currently have). See the common_test installation instructions for more information. I'm also assuming you have an Emakefile for compiling the code in lib/app1/src/ with the make src command.
Final Thoughts
So there you have it, an example test suite, a test specification, and a Makefile for running the tests. The final file and directory structure should look something like this:
project/
    Emakefile
    Makefile
    test.spec.in
    lib/
        app1/
            src/
                module1.erl
            ebin/
            test/
                module1_SUITE.erl
Now all you need to do is write your unit tests in the form of test suites and add those suites to test.spec.in. There's a lot more you can get out of common_test, such as code coverage analysis, HTML logging, and large scale testing. I'll be covering some of those topics in the future, but for now I'll end with some parting thoughts from the Common Test User's Guide:
It's not possible to prove that a program is correct by testing. On the contrary, it has been formally proven that it is impossible to prove programs in general by testing.
There are many kinds of test suites. Some concentrate on calling every function or command... Some other do the same, but uses all kinds of illegal parameters.
Aim for finding bugs. Write whatever test that has the highest probability of finding a bug, now or in the future. Concentrate more on the critical parts. Bugs in critical subsystems are a lot more expensive than others.
Aim for functionality testing rather than implementation details. Implementation details change quite often, and the test suites should be long lived.
Aim for testing everything once, no less, no more




