Importing Amazon Kindle Paperwhite to Croatia

Every year I try to surprise my wife with some extraordinary birthday gift. Last year I bought her a nice-looking designer dress and an e-book, and ordered a Kindle so she could read Ender’s Game once more. I opted for the newest Kindle, the Paperwhite. Ordering this kind of stuff to Croatia is not that easy, though. We did enter the EU, but for some reason I couldn’t order it from the UK, Germany or Italy. Go figure.

As I was a bit late with organizing the delivery, I didn’t have the time to pass the Kindle through four people as if it were reactor-grade plutonium. So it had to be the States. I placed the order on Wednesday and got charged $160 + $20 shipping. Amazon said that it would take five days to deliver. Two days later, I got an email from the local DHL office saying that the package had arrived and that I should reply with my EORI number.

What is an EORI number?

An EORI number is a unique number throughout the European Community, assigned by a Customs Authority in a Member State to operators (businesses) or persons, known as Economic Operators (EO). By registering for Customs Purposes in one Member State, an EO is able to obtain an EORI number which is valid throughout the European Community. The EO will then use this number in all communications with any EC Customs Authorities for appropriate customs declarations.

OK, time to find a customs office and request the number. Things in Croatia work a bit differently than in more western parts of the world. Namely, to request the EORI number you have to fill in a form. And send it via mail. Snail mail. The mail that takes three days to travel from the post office to its destination 6 km away. The mail that my Kindle, which is freezing in some cargo area, derides, having traveled from Nevada via Frankfurt to Zagreb in only two days.

Customs receives my request the next Wednesday, and I wait a day to get my EORI number. Needless to say, it is Thursday, my wife’s birthday, and all she got was an e-book she can’t read :). Now, EORI replies usually return the same way, via snail mail, but I decide to be assertive and hurry the procedure along. Later on I realize that the EORI number is just your PIN prefixed with the country code, and all they needed to do was run an INSERT statement against The Great Database Of People That Import Stuff™. Great. Now to pay for the cargo “handling”.

DHL, after receiving my EORI number, charges me $57 (ouch) and delivers the Kindle on Friday.

Reading on the Kindle is great, my wife and I share it, buying books is fun and less expensive (money- and space-wise) and I don’t regret a single buck spent! Just, what should I do with this EORI now :)

Don’t Do Now What You Can Put Off Until Later

In this post I will try to address the performance issue of heavy calculations, data processing and other tasks that happen at run time but could be put off until later.

How Stuff Works

Sites usually function quite well: there are a couple of SQL queries, hopefully some memcached / Redis hits and a bit of string manipulation to put data into templates and serve them to the user. But from time to time a new user registers, and that results in processing and sending the registration email, possibly collecting his Facebook image and storing it locally, getting all his Facebook friends or his last dozen tweets, and all this takes some time. Time the user spends looking at the spinner, wondering if something has gone wrong and whether he should close the site.

Diagram of user registration w/ sending email and collecting Facebook data

Another example is multiple image uploads happening at the same time. The application can take the entire server down if it processes images immediately. If, say, 100 people decide to upload an image at the same time, you have just created 100 processes (or more if you’re creating multiple variations of each image) that are using as much of the CPU as possible.

What Can Be Done

To offload the process, you can put it in a queue. The simplest one is a cron job: after a user registers, you store the data in the database with the flag “send_email=1”. Now, when the cron job runs (every minute? every hour?), it collects all the new users who need to receive the email and sends them one. Easy, right?

Let’s stop for a second and take a closer look. What happens if there are 10 or 1000 tasks that happen in one minute? High volume traffic sites with lots of image uploads, for example. How long do we need to wait for our image to show up if 100 images are already waiting to be processed, and cron sits there, doing nothing for one whole minute?

What happens if the script run by cron has a bug and never proceeds to the next task, or if there is an exception that is not being handled? The script will die and all those people will not get their email. Bad, huh? Cron tasks are OK for regular maintenance, but choose the right tool for the job.
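
To illustrate the last point, here is a minimal sketch of a cron script that at least survives a bad row. It is not taken from any real project: the $pendingUsers array stands in for the rows selected with send_email=1, and sendRegistrationEmail() and processPendingEmails() are hypothetical names made up for this example.

```php
<?php
// Hypothetical stand-in for the rows selected with "WHERE send_email = 1".
$pendingUsers = array(
    array('id' => 1, 'email' => 'ann@example.com'),
    array('id' => 2, 'email' => ''),                 // broken row
    array('id' => 3, 'email' => 'bob@example.com'),
);

// Hypothetical mailer; a real one would call mail() or an SMTP library.
function sendRegistrationEmail(array $user)
{
    if ($user['email'] === '') {
        throw new InvalidArgumentException('User ' . $user['id'] . ' has no email address');
    }
    return true;
}

// Process every pending user; a failure is recorded instead of ending the run.
function processPendingEmails(array $users)
{
    $result = array('sent' => array(), 'failed' => array());
    foreach ($users as $user) {
        try {
            sendRegistrationEmail($user);
            // The real script would set send_email = 0 for this user here.
            $result['sent'][] = $user['id'];
        } catch (Exception $e) {
            // Log and move on, so one bad row does not block the rest.
            error_log($e->getMessage());
            $result['failed'][] = $user['id'];
        }
    }
    return $result;
}

$result = processPendingEmails($pendingUsers);
```

Even with the try/catch in place, the waiting time between cron runs stays, which is the bigger reason to look at a proper queue.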

Gearman, Beanstalkd, Resque & QuTee

There are a couple of solutions that allow us to do the heavy lifting in the background.

Diagram of user registration w/ queuing tasks to be processed via background worker

Gearman is one of the most popular job queue processors. It provides APIs for a lot of popular platforms (Java, Python, PHP, Perl). You can use it for background jobs as well as for asking another service to do the work at run time (say, your PHP application asks a program written in C that runs on another server to do some task better suited to C…).

Another popular queuing system is Beanstalkd, created originally to power the backend for the ‘Causes’ Facebook app. The list of supported client libraries is impressive, and almost every major language is supported.

Resque is a Ruby library, created by GitHub, and ported to PHP by Chris Boulton. Resque stores the queue in a Redis server, and provides a web interface to monitor the tasks (Ruby only).

As a side project, I have started to work on QuTee, a queue manager and task processor for PHP. To be feature complete, I have set these goals:

  • it has to have a good API to be easy to use,
  • it has to have as few dependencies as possible,
  • it has to be easily installed / configured, and with multiple backends in mind, it has to solve the background task processing on dedicated machines as well as shared hostings,
  • it has to provide some kind of interface to monitor the tasks status and start/stop workers.

Currently only Redis is supported, and supervisord is advised for maintaining worker processes (since I didn’t want to go into forking and adding another dependency).

Adding a task to the queue is really easy. After creating and configuring the queue, adding a task can be a one-liner:

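// Both classes are assumed to live in the Qutee namespace, like the persistor below
use Qutee\Queue;
use Qutee\Task;
 
// Define queue (usually in bootstrap / DIC of the application)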
$redisParams    = array(
    'host'  => '127.0.0.1',
    'port'  => 6379
);
$queuePersistor = new Qutee\Persistor\Redis($redisParams);
 
$queue          = new Queue();
$queue->setPersistor($queuePersistor);
 
// Create task
Task::create('Acme/SendRegistrationEmail', array('user_id' => 123456), Task::PRIORITY_HIGH);

In the above example, we have a class SendRegistrationEmail in the Acme namespace that implements the TaskInterface. When the worker instantiates the task, it knows how to ->run() it. The task class is not required to implement TaskInterface, though; in that case we need to specify which method to run:

$task = new Task();
$task
    ->setName('Acme/DeleteFolder')
    ->setData(array('/usr/'))
    ->setMethodName('doDeleteTheFolder');
 
$queue->addTask($task);

Tasks can have a unique identifier, so the same task will not run multiple times. Workers are even easier: they only listen to queues (or to one specific queue) and run tasks, nothing else:

$worker = new Worker;
$worker
    ->setInterval(30)
    ->setPriority(Task::PRIORITY_HIGH)
    ->run();

This example creates a worker that polls the queue every 30 seconds, and is interested only in tasks from the high priority queue.

Since QuTee is 0.7.0 right now, it lacks some functionality (task status web interface / logging / more backends) but can be used for background job processing.

Caveats

When code that a task or worker depends on changes, the worker needs to be restarted, because a long-running worker keeps the old code loaded. This is a thing to keep in mind when pushing a fix to production. To get around this problem, one could exit from the worker every hour or so, letting supervisord restart it.
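
The periodic exit could look roughly like this. This is a sketch and not QuTee’s API: runWorkerLoop() and its parameters are hypothetical, and the tiny lifetime at the bottom is only there so the example finishes quickly. In production the lifetime would be something like 3600 seconds.

```php
<?php
/**
 * Call $pollOnce repeatedly until $maxLifetime seconds have passed, then
 * return so the script can exit and the process manager can start a fresh
 * copy with the current code. The clock is only checked between polls,
 * so a running task is never cut off halfway.
 */
function runWorkerLoop($pollOnce, $maxLifetime, $interval)
{
    $startedAt  = microtime(true);
    $iterations = 0;

    while (microtime(true) - $startedAt < $maxLifetime) {
        call_user_func($pollOnce);
        $iterations++;
        // Sleep between polls; usleep() expects microseconds
        usleep((int) ($interval * 1000000));
    }

    return $iterations;
}

// Demo: a fake "poll" that only counts how often it was called
$processed  = 0;
$iterations = runWorkerLoop(function () use (&$processed) {
    $processed++;
}, 0.2, 0.05);

echo "Worker exiting after {$iterations} polls", PHP_EOL;
```

The script simply ends after the loop returns; restarting it is left entirely to supervisord.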

Another thing to keep in mind about background workers is race conditions and how to avoid them. Let’s say we have 100 workers, and have created tasks to send a newsletter to 1000 users, each with a unique coupon. If each task selects the first coupon from the database at the same time, there is a good chance that a good number of users will get the same coupon (say you didn’t implement read locking). For that reason it is a good practice to fetch all the necessary data when creating the task, so the task is as stateless as it can be and the coupon is sent with the rest of the data.
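
As a rough sketch of that practice, the coupons can be handed out once, in the single process that creates the tasks. Everything here is made up for illustration: buildNewsletterTasks(), the Acme/SendNewsletter task name and the plain arrays standing in for the database and the queue.

```php
<?php
// Hypothetical data; in a real application both lists come from the database.
$coupons = array('AAA-111', 'BBB-222', 'CCC-333');
$userIds = array(10, 11, 12);

/**
 * Pair every user with a coupon while building the task payloads.
 * Only this one process touches the coupon list, so no two tasks
 * can end up with the same code.
 */
function buildNewsletterTasks(array $userIds, array $coupons)
{
    if (count($coupons) < count($userIds)) {
        throw new RuntimeException('Not enough coupons for all users');
    }

    $tasks = array();
    foreach ($userIds as $userId) {
        $tasks[] = array(
            'name' => 'Acme/SendNewsletter', // hypothetical task class
            'data' => array(
                'user_id' => $userId,
                'coupon'  => array_shift($coupons), // take the next unused coupon
            ),
        );
    }

    return $tasks;
}

$tasks = buildNewsletterTasks($userIds, $coupons);
```

Each payload now carries everything the worker needs, so the worker never has to ask the database for a coupon.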

One last thing that comes to mind is multiple database connections. When you start the worker, keep in mind that, if your tasks need a database connection, each task will connect to the database, perform the job and quit. Remember to mysqli_close() the connection since PHP’s garbage collection isn’t that good.

Conclusion

Background workers and job queues greatly improve user experience and help reduce overall server load. If you haven’t used background workers up until now because of the hassle of setting everything up, I hope this article has given you some idea of where to start and will motivate you to consider QuTee because of its quick setup. More features are on the way, so go ahead and fork it :)

Quick File Opening in NetBeans

NetBeans is an irreplaceable tool in my everyday work. Yes, it’s written in boring, slow Java, but the feature set is great and it helps me get my tasks done without getting in my way. There are some things that could be better or more responsive, code scanning and opening files among them. I like how Sublime Text 2 does file opening (Go to Anything), and this is how to achieve something similar in NetBeans.

First, without any plugins, you can press ALT + SHIFT + O, which brings up the Go to File dialogue. It is OK, but it doesn’t provide fuzzy search (you can emulate it by putting * between letters). To get fuzzy searching, we need the help of a plug-in called Open File Fast. The plug-in was last reported to run with NetBeans 6.9, but I’m running it with the latest version, 7.2. To install and configure it, follow these steps:

  • Go to http://plugins.netbeans.org/plugin/16495/open-file-fast and download the latest version
  • In NetBeans, go to Tools → Plugins → Downloaded → Add Plugins, select the downloaded file and click Install (http://wiki.netbeans.org/FaqPluginInstall)
  • Restart NetBeans (it is not needed to run the plug-in, but without a restart I couldn’t set a keyboard shortcut for Open File Fast)
  • Now for the shortcut. Go to Tools → Options → Keymap, search for “open file fast” → Assign shortcut

I’ve assigned it CTRL + P. It still isn’t nearly as fast as Sublime Text, but it gets the job done.

Doctrine2 CLI under Silex application

I got it running with Doctrine 2.3.1-DEV. Get the Doctrine provider using Composer:

{
    "minimum-stability": "dev",
    "require": {
        "silex/silex": "1.0.*",
        "taluu/doctrine-orm-provider" : "*"
    },
 
    "autoload": {
        "psr-0": { "Entity": "app/" }
    }
}

Create doctrine.php and put the following content inside. I like to put it in bin/doctrine.php:

#!/usr/bin/env php5
<?php
// Load your bootstrap or instantiate the application and set up your config here.
// APP_ROOT, APP_PATH and $config are expected to be defined by that bootstrap.
 
require_once APP_ROOT .'/vendor/autoload.php';
 
$app        = new Silex\Application();
 
// Doctrine DBAL
$app->register(new Silex\Provider\DoctrineServiceProvider(), array(
    'db.options' => $config['db']
));
 
// Doctrine ORM, I like the defaults, so I've only modified the proxies dir and entities path / namespace
$app->register(new Nutwerk\Provider\DoctrineORMServiceProvider(), array(
    'db.orm.entities'              => array(array(
        'type'      => 'annotation',
        'path'      => APP_PATH .'/Entity',
        'namespace' => 'Entity',
    )),
    'db.orm.proxies_dir'           => APP_ROOT .'/var/cache/doctrine/Proxy',
));
 
use Doctrine\DBAL\Tools\Console\Helper\ConnectionHelper;
use Doctrine\ORM\Tools\Console\Helper\EntityManagerHelper;
use Doctrine\ORM\Tools\Console\ConsoleRunner;
use Symfony\Component\Console\Helper\HelperSet;
 
$helperSet = new HelperSet(array(
    'db' => new ConnectionHelper($app['db.orm.em']->getConnection()),
    'em' => new EntityManagerHelper($app['db.orm.em'])
));
 
ConsoleRunner::run($helperSet);

And that’s it :)

Pretty HTML5 multiple file upload with Bootstrap, jQuery, Twig and Silex

There are a number of ways to achieve multiple file upload functionality, but I like the HTML5 way of doing it, and it will be supported across all major browsers when IE10 ships. Also, Twitter’s Bootstrap helped me achieve the look without problems. I used a bit of jQuery for help with events. Alongside vanilla HTML, I will show the Twig form syntax for the same markup, together with the Symfony2 Form component for the server side.

Demo: Pretty File Boilerplate

HTML

<span class="prettyFile">
    <input type="file" name="form[files][]" multiple="multiple">
    <div class="input-append">
       <input class="input-large" type="text">
       <a href="#" class="btn">Browse</a>
    </div>
</span>

You can add required or accept attributes as needed.

CSS

.prettyFile > input { display: none !important; }
/*  The rest is from Twitter Bootstrap */
input,
.input-append { display: inline-block; vertical-align: middle; }
 
.input-large {
    border: 1px solid rgba(82, 168, 236, 0.8);
    box-shadow: inset 0 1px 1px rgba(0, 0, 0, .075), 0 0 8px rgba(82, 168, 236, .6);
    border-radius: 3px 0 0 3px;
    font-size: 14px;
    height: 20px;
    color: #555;
    padding: 4px 6px;
    margin-right: -4px;
    width: 210px;
}
.btn {
    background-image: -webkit-linear-gradient(top, white, #E6E6E6);
    background-repeat: repeat-x;
    border: 1px solid rgba(0, 0, 0, 0.14902);
    box-shadow: rgba(255, 255, 255, 0.2) 0px 1px 0px 0px inset, rgba(0, 0, 0, 0.0470588) 0px 1px 2px 0px;
    color: #333;
    display: inline-block;
    font-family: Tahoma, sans-serif;
    font-size: 14px;
    margin: 0 0 0 -1px;
    padding: 4px 14px;
    height: 20px;
    line-height: 20px;
    text-align: center;
    text-decoration: none;
    text-shadow: rgba(255, 255, 255, 0.74902) 0px 1px 1px;
    vertical-align: top;
    width: 47px;
}

Twig

<form action="/path/" {{ form_enctype(form) }} method="post" >
 
    {{ form_errors(form) }}
 
    <div>
        {{ form_label(form.files) }}
        {{ form_errors(form.files) }}
        {# Append [] to the full name of the form name - this is still an issue: https://github.com/symfony/symfony/issues/1400 #}
        {{ form_widget(form.files, { 'full_name': form.files.get('full_name') ~ '[]' }) }}
    </div>
 
    {{ form_rest(form) }}
 
    <input type="submit" />
</form>

Javascript

// Pretty file
if ($('.prettyFile').length) {
    $('.prettyFile').each(function() {
        var pF          = $(this),
            fileInput   = pF.find('input[type="file"]');
 
        fileInput.change(function() {
            // When original file input changes, get its value, show it in the fake input
            var files = fileInput[0].files,
                info  = '';
            if (files.length > 1) {
                // Display number of selected files instead of filenames
                info     = files.length + ' files selected';
            } else {
                // Display filename (without fake path)
                var path = fileInput.val().split('\\');
                info     = path[path.length - 1];
            }
 
            pF.find('.input-append input').val(info);
        });
 
        pF.find('.input-append').click(function(e) {
            e.preventDefault();
            // Behave as if the real input was clicked
            fileInput.click();
        });
    });
}

PHP

namespace Acme;
 
use Silex\Application;
use Symfony\Component\HttpFoundation\Request;
 
class Controller
{
    // Silex injects both arguments based on their type hints
    public function uploadAction(Application $app, Request $request)
    {
        $form    = $app['form.factory']->createBuilder('form')
                    ->add('files', 'file', array(
                        'label'     => 'Files',
                        'attr' => array(
                            'multiple'  => 'multiple',
                            // 'accept'    => 'image/*'
                        )
                    ))
                    ->getForm();
 
        if ($request->getMethod() == 'POST') {
            $form->bind($request);
 
            if ($form->isValid()) {
                $data  = $request->files->get($form->getName());
                foreach ($data['files'] as $file) {
                    // Do whatever with the file
                }
            }
 
        }
    }
}

I have used Silex / Symfony here, but anything will do.
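
For the “anything will do” case, here is a small sketch of what plain PHP has to deal with. For a multiple file input, $_FILES groups the values by attribute (all names, then all types and so on) rather than by file, so it is handy to regroup them first. The sketch assumes the input is simply named files[] (not form[files][] as above), normalizeFiles() is a hypothetical helper, and the sample array imitates what $_FILES['files'] would contain.

```php
<?php
/**
 * Turn PHP's "one array per attribute" layout of a multiple file input
 * into one array per uploaded file.
 */
function normalizeFiles(array $field)
{
    $files = array();
    foreach ($field['name'] as $i => $name) {
        $files[] = array(
            'name'     => $name,
            'type'     => $field['type'][$i],
            'tmp_name' => $field['tmp_name'][$i],
            'error'    => $field['error'][$i],
            'size'     => $field['size'][$i],
        );
    }
    return $files;
}

// Imitation of $_FILES['files'] after two files were selected (made-up values)
$sample = array(
    'name'     => array('cat.jpg', 'dog.png'),
    'type'     => array('image/jpeg', 'image/png'),
    'tmp_name' => array('/tmp/phpA1', '/tmp/phpB2'),
    'error'    => array(UPLOAD_ERR_OK, UPLOAD_ERR_OK),
    'size'     => array(1024, 2048),
);

$files = normalizeFiles($sample);

foreach ($files as $file) {
    if ($file['error'] === UPLOAD_ERR_OK) {
        // Do whatever with the file, e.g. move_uploaded_file($file['tmp_name'], ...)
        echo $file['name'], ' (', $file['size'], ' bytes)', PHP_EOL;
    }
}
```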

Todo

  • HTML5 XHR drag and drop upload
  • cross-browser testing
  • vanilla javascript implementation

Linux and accessing the Windows shares

I had a problem accessing the Windows shares on my network: I’m using Ubuntu, and browsing into the workgroup directory didn’t show any machines. I found a solution which is pretty simple:

1. Samba config

# edit /etc/samba/smb.conf
# set the name of your work group (although, I have 2 different work 
# groups in my network, and I can access them both after this configs)
workgroup = WORKGROUP
 
# setup the netbios name. Use whatever name your machine is called. 
# From samba.org: This sets the NetBIOS name by which a Samba server is known. 
# By default it is the same as the first component of the host's DNS name. 
# If a machine is a browse server or logon server this name (or the first 
# component of the hosts DNS name) will be the name that these services are 
# advertised under.
netbios name = ubuntu
 
# Reorder the naming services
name resolve order = lmhosts wins bcast host

2. Add wins to Name Service Switch config

# edit /etc/nsswitch.conf or wherever NSS stores its config
 
# Add wins to hosts:
hosts:          files mdns4_minimal [NOTFOUND=return] wins dns mdns4

3. Restart Samba

sudo /etc/init.d/smbd restart

And that’s it, you should be able to access all of the shares across all of the workgroups.

Development environment

This is a quick post on how to set up your own DNS server with a custom TLD so you can get started on your next project more easily and quickly. I do my programming on a Linux machine (Ubuntu to be Precise :)). The idea behind this set-up is to avoid ever having to modify your /etc/hosts file. It is even possible to skip creating an Apache VirtualHost directive and restarting the web server. Onward with the How-To.

Disclaimers:

  • I use Ubuntu, so substitute apt-get with yum or whatever you use
  • Anywhere you see the IP 192.168.1.253, replace with your own
  • I haven’t set up any forwarders in named.conf.options

Install and configure DNS (BIND9)

sudo apt-get install bind9

Edit these files

/etc/bind/named.conf.local:

zone "dev" {
    type master;
    file "/etc/bind/db.dev";
};

zone "1.168.192.in-addr.arpa" {
    type master;
    file "/etc/bind/db.192.168.1";
};

/etc/bind/db.dev:

$TTL	604800
@		IN		SOA		dev. root.dev. (
	             2012042301		; Serial
			 604800		; Refresh
			  86400		; Retry
			2419200		; Expire
			 604800 )	; Negative Cache TTL
;
@		IN		NS	dev.
@		IN		A	192.168.1.253
*.dev.	14400 	IN 		A	192.168.1.253

/etc/bind/db.192.168.1:

$TTL	604800
@		IN		SOA		dev. root.dev. (
	     2012042301		; Serial
			 604800		; Refresh
			  86400		; Retry
			2419200		; Expire
			 604800 )	; Negative Cache TTL
;
@		IN		NS		dev.
253		IN 		PTR		dev.

Be careful to replace 253 in your files with your own last IP octet. Also, the filename should reflect your IP.

DNS servers setup…

OK, now that we have this set up, we need to tell our system to use the local DNS server before going to the ISP and beyond. To achieve this, use Network Manager in Ubuntu; here’s what mine looks like. The final goal is for /etc/resolv.conf to read: nameserver 127.0.0.1.

…and finishing up

Now that everything is set up, restart bind:

sudo /etc/init.d/bind9 restart

Test your setup by pinging anything.dev. If you get the response from your server, all is working great.

Apache Virtual Document Root

If your projects have a similar / identical directory structure (i.e. a public directory for publicly available files), then you can go a step further and set up the Apache Virtual Document Root. In doing so, you will be able to create a new directory in your projects root and have it magically served at http://newdirectory.dev.

<IfModule vhost_alias_module>
    <VirtualHost *>
        UseCanonicalName Off
        VirtualDocumentRoot "/path/to/projects/%1/public"

        ServerName projects.dev
        ServerAlias *.dev

        SetEnv APPLICATION_ENV development
    </VirtualHost>
</IfModule>

# Enable mod_vhost_alias apache module
sudo a2enmod vhost_alias
# Restart server
sudo /etc/init.d/apache2 restart
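
To make the %1 placeholder a bit less magical: mod_vhost_alias replaces it with the first dot-separated part of the requested host name. The sketch below mimics that mapping in PHP, for illustration only. resolveDocumentRoot() is a hypothetical helper, not something Apache or PHP provides, and it handles nothing but %1.

```php
<?php
/**
 * Mimic how "%1" in VirtualDocumentRoot is filled in: take the first
 * dot-separated part of the host name and put it into the template.
 */
function resolveDocumentRoot($host, $template)
{
    // Drop an optional port ("newdirectory.dev:8080") and normalize the case,
    // since host names are case-insensitive
    $name  = strtolower(preg_replace('/:\d+$/', '', $host));
    $parts = explode('.', $name);

    return str_replace('%1', $parts[0], $template);
}

$root = resolveDocumentRoot('newdirectory.dev', '/path/to/projects/%1/public');
echo $root, PHP_EOL; // prints /path/to/projects/newdirectory/public
```

So a request for http://newdirectory.dev ends up in the public directory of the newdirectory project, with no VirtualHost written for it.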

I don’t have this enabled for myself, but it does work, although not well tested. For further info on this topic, check the following links:

P.S. Yes, I got carried away while creating the featured image :)

Comment driven development

There are quite a few programming techniques and principles out there: TDD, BDD, YAGNI, DRY, to name a few. This post is about something many people might already be doing without knowing it has a name: comment-driven development, or comment programming.

CDD is helpful for:

  • prototyping,
  • spitting out your thoughts in code editor, so you don’t forget anything later (good for brainstorming sessions),
  • explaining what needs to be done if someone else is going to be writing the code itself,
  • commenting the code :). Comments could remain, so your code is documented from the get-go

I often start a new PHP file, class or even method with the layout in comments. Here’s an example:

// Get the needed models
// Collect todo items
// Get lists that the todo items belong to
// Send to View

Ok, everything is clear. After the real code sets in it looks like this:

public function someAction()
{
    // Get the needed models
    $todoTable  = new TodoTable();
    $listTable  = new ListTable();
 
    // Collect todo items
    $todos      = $todoTable->findByAuthor($author);
 
    // Get lists that the todo items belong to
    foreach ($todos as &$todo) {
        $todo['lists'] = $listTable->find($todo['list_id']);
    }
    unset($todo); // break the reference left over from the loop
 
    // Send to View
    DIC::get('View')->todos = $todos;
}

As you can see, the comments can stay in place. Even for this simple example, it is good practice to document your code.

The Wikipedia article states:

In comment programming the comment tags are not used to describe what a certain piece of code is doing, but rather to stop some parts of the code from being executed. The aim is to have the commented code at the developer’s disposal at any time he might need it.

And later on:

However, comment programming is used instead of a real implementation. The idea is that many functions can be written like this, and then the design can be reworked and revisited without having to refactor a lot of source code.

So the Wikipedia article somewhat contradicts itself. The aspect of comment programming I am writing about here is the “comments instead of a real implementation” part.

Do you write your code with a comments-first approach? Do you use some other technique?

MySQL Workbench 5.2.35 on Ubuntu 11.10 64bit (Oneiric Ocelot)

Well, if no one is gonna do it, you have to do it yourself. I needed the latest MySQL Workbench, 5.2.35, but I had also upgraded to the latest Ubuntu, Oneiric Ocelot (11.10). As with most new things, I couldn’t get it to work out of the box, so a little compiling session was in order. If you need the package, you can download the deb. It is for the 64bit (amd64) architecture. This is the first time I have created a deb package, so I apologize in advance if I didn’t follow some basic rules.

MySQL Workbench 5.2.35 for Ubuntu 11.10 (64bit)

Custom file input can’t be that hard?

Ok, this is a quick post. I wanted to style the input for file upload, and spent some time fine tuning. In order to have this for future use, I made a tutorial for myself :). If you need to style file input, see my demonstration on how to fake file type input or download the file. This is in no way final or tested. If you have suggestions, please leave a comment.

Fake file input