Re-indexing a doctrine object with a symfony task

Yesterday I explained how to add the Searchable behaviour to a Doctrine object, so today I’ll cover how to create a symfony task to re-index all the existing objects. The basic idea of the task is to retrieve all the objects from your database, mark them as dirty and re-save them. If you don’t mark them as dirty, nothing will be updated when you save, because Doctrine skips records it believes are unchanged.


First let’s create the task with the generate:task command. I like to namespace my symfony tasks under the name of the project I’m working on, so that all the tasks appear together in the list when I run the ./symfony script without any arguments. For this example we’ll use the namespace myTools, and we’ll make the task re-save all Article models, which will trigger the Searchable behaviour we added to them yesterday.

./symfony generate:task myTools:reIndexArticles

This will create a new task under [project root]/lib/task/myToolsReIndexArticlesTask.class.php

The contents of the file will just be the basic template created by the symfony tools. I’ll leave you to update the briefDescription and detailedDescription yourself, as that’s more for your own documentation than for this example.
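For reference, here is roughly what the configure() method of the generated task could look like once the descriptions are filled in. Treat it as a sketch: the option list is what I’d expect from the symfony 1.x task template, so check it against your own generated file, and the description text is only a placeholder I made up for illustration.

```php
// Goes inside the generated task class (which extends sfBaseTask).
protected function configure()
{
    // Options as I'd expect the task template to define them;
    // your generated file may differ slightly between symfony versions.
    $this->addOptions(array(
        new sfCommandOption('application', null, sfCommandOption::PARAMETER_REQUIRED, 'The application name'),
        new sfCommandOption('env', null, sfCommandOption::PARAMETER_REQUIRED, 'The environment', 'dev'),
        new sfCommandOption('connection', null, sfCommandOption::PARAMETER_REQUIRED, 'The connection name', 'doctrine'),
    ));

    $this->namespace = 'myTools';
    $this->name = 'reIndexArticles';

    // Placeholder descriptions, purely illustrative.
    $this->briefDescription = 'Re-saves every Article so the search index is rebuilt';
    $this->detailedDescription = <<<EOF
The [myTools:reIndexArticles|INFO] task marks every Article as dirty and re-saves it.
Call it with:

  [php symfony myTools:reIndexArticles|INFO]
EOF;
}
```

The connection option matters here, because the execute() method reads $options['connection'] to open the database connection.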

Doctrine only performs a full save on an object whose state is Doctrine_Record::STATE_DIRTY. This state is normally set when a field is changed during an edit. To set it without changing any fields, you can pass the state to the state() method on the object. This means that we can update our task to retrieve all of the Article models, set their state to Doctrine_Record::STATE_DIRTY and then save them.

protected function execute($arguments = array(), $options = array())
{
    // initialize the database connection
    $databaseManager = new sfDatabaseManager($this->configuration);
    $connection = $databaseManager->getDatabase($options['connection'])->getConnection();

    // createQuery() takes only the alias; the table supplies the model name
    $articles = ArticleTable::getInstance()->createQuery('a')->execute();

    foreach ($articles as $article) {
        // mark the article as dirty for a re-save
        $article->state(Doctrine_Record::STATE_DIRTY);
        $article->save();
    }
}

Now this is a very basic example: it pulls all the articles from the database in one go, so if you have thousands of them it will more than likely run out of memory. To improve this you should process the articles in batches, which is an improvement I’ll leave you to work out.
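If you’d like a starting point for the batching, here is a rough sketch of how the loop inside execute() might look. It assumes the Article model has a primary key column called id to give a stable ordering, and the batch size of 100 is an arbitrary number I picked, so tune it to your own data.

```php
// Replaces the single query and foreach loop inside execute().
// Assumption: Article has an "id" primary key to order by.
$batchSize = 100; // arbitrary; adjust to suit your memory limit
$offset = 0;

do {
    // fetch one page of articles at a time
    $articles = ArticleTable::getInstance()
        ->createQuery('a')
        ->orderBy('a.id')
        ->limit($batchSize)
        ->offset($offset)
        ->execute();

    foreach ($articles as $article) {
        // mark the article as dirty for a re-save
        $article->state(Doctrine_Record::STATE_DIRTY);
        $article->save();
    }

    $count = count($articles);

    // release the records in this batch before loading the next one
    $articles->free(true);

    $offset += $batchSize;
} while ($count === $batchSize); // a short page means we reached the end
```

Ordering by the primary key keeps the pages stable between queries, so re-saving a record does not cause it to show up again in a later batch.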

Now that you have your model indexed you need to know how to search this new index for keywords and build a search result, but I’ll leave that for tomorrow, if I get time.
