Creating a search tool with the Searchable behaviour in Doctrine

Once you’ve set up the Searchable behaviour on a Doctrine model and indexed all of its keywords, you’ll want to build a search tool. First, create a simple symfony form.

class SearchForm extends sfForm
{
  public function configure()
  {
    $this->setWidgets(array('query' => new sfWidgetFormInput(array(), array())));
    $this->widgetSchema->setNameFormat('search[%s]');
    $this->setValidators(array('query' => new sfValidatorPass(array())));
  }
}

Obviously, you can make your form a little nicer, but it should serve the purpose of this example.
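Because of the search[%s] name format, the field is submitted as search[query], so PHP decodes it into a nested array. Here is a small stand-alone sketch of that decoding, using parse_str() and a made-up query string rather than a real symfony request:

```php
<?php
// Simulates how PHP decodes a submitted query string; the values are made up.
$queryString = 'search%5Bquery%5D=Doctrine+search&page=2';

parse_str($queryString, $params);

// $params['search'] is an array keyed by field name, which is the shape
// that $request->getParameter('search', array()) hands back in an action.
$searchData = isset($params['search']) ? $params['search'] : array();
$searchQuery = array_key_exists('query', $searchData) ? $searchData['query'] : '';

var_dump($searchQuery); // the decoded text typed into the field
```

This is why an action can ask for the 'search' parameter and then look for a 'query' key inside it.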

Now we want to create a new action to process this form. Let’s assume that the search form is handled by a component somewhere else, and all we care about is the information in the request.

public function executeResults(sfWebRequest $request)
{
  $searchData = $request->getParameter('search', array());

  $this->searchQuery = array_key_exists('query', $searchData) ? $searchData['query'] : '';

  // split searchQuery into keywords
  $keywords = str_word_count(strtolower($this->searchQuery), 1);

  // ignore stop words
  $keywords = self::removeStopWordsFromArray($keywords);

  $this->pager = new sfDoctrinePager('Article', 10);
  $this->pager->setQuery(ArticleTable::getInstance()->searchByKeywords($keywords));
  $this->pager->setPage($request->getParameter('page', 1));
  $this->pager->init();
}

We also need to strip stop words from the query. They only add noise, and because the search below requires every keyword to match, a stop word that isn’t in the index would rule out every article. The action above calls removeStopWordsFromArray() to do this.

public static function removeStopWordsFromArray($keywords)
{
  $stop_words = array(
    'i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', 'your', 'yours',
    'yourself', 'yourselves', 'he', 'him', 'his', 'himself', 'she', 'her', 'hers',
    'herself', 'it', 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves',
    'what', 'which', 'who', 'whom', 'this', 'that', 'these', 'those', 'am', 'is', 'are',
    'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 'do', 'does',
    'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until',
    'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between', 'into',
    'through', 'during', 'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down',
    'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here',
    'there', 'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more',
    'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so',
    'than', 'too', 'very',
  );
  return array_diff($keywords, $stop_words);
}
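To see what the action ends up passing to the table class, here is the split-and-filter step run on its own. This is only a sketch: extractKeywords() is a hypothetical helper that I have made up for the example, and the stop-word list is cut right down.

```php
<?php
// Hypothetical helper mirroring the two steps in executeResults();
// the stop-word list is deliberately shortened for the example.
function extractKeywords($searchQuery, array $stopWords)
{
    // Lower-case the text, then split it into words.
    $keywords = str_word_count(strtolower($searchQuery), 1);

    // array_diff() keeps the original keys, so re-index the result.
    return array_values(array_diff($keywords, $stopWords));
}

$stopWords = array('the', 'to', 'and', 'a', 'of');
$keywords = extractKeywords('The quick guide to Doctrine and symfony', $stopWords);

print_r($keywords); // quick, guide, doctrine, symfony
```

One thing to keep in mind is that str_word_count() only treats letters, apostrophes and hyphens as part of a word by default, so digits in a query are dropped.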

Now we need to look at the ArticleTable class, as this is where the main work of the search is done.

public function searchByKeywords($keywords, Doctrine_Query $q = null)
{
  $article_ids = $this->_getArticleIdsForKeywords($keywords);

  if (is_null($q)) {
    $q = Doctrine_Query::create()
      ->from('Article a');
  }

  $q->whereIn($q->getRootAlias() . '.id', $article_ids);

  return $this->addPublishedArticlesQuery($q);
}

private function _getArticleIdsForKeywords($keywords)
{
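  if (!is_array($keywords)) {
    // Return the placeholder rather than an empty array,
    // because whereIn() ignores an empty list.
    return array('-1');
  }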

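  // A word repeated in the query would otherwise inflate the HAVING threshold.
  $keywords = array_unique($keywords);

  $q = Doctrine_Query::create()
    ->from('ArticleIndex')
    ->select('DISTINCT id')
    ->addSelect('keyword')
    ->addSelect('field')
    // Count each keyword once per field, however often it occurs there.
    ->addSelect('COUNT(DISTINCT keyword) AS nb');

  // Match index rows for any of the keywords.
  foreach ($keywords as $keyword) {
    $q->orWhere('keyword = ?', $keyword);
  }

  // Keep only the fields that contain every keyword.
  $q->addGroupBy('CONCAT(id,field)')
    ->having('nb >= ?', count($keywords))
    ->orderBy('nb DESC');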

  $article_ids = array();
  foreach($q->fetchArray() as $row) {
    $article_ids[$row['id']] = $row['id'];
  }
  if (!count($article_ids)) {
    $article_ids[] = '-1';
  }
  return $article_ids;
}

It is important to note that _getArticleIdsForKeywords() returns an array with at least one value. Doctrine ignores a whereIn() call with an empty array, so without the '-1' placeholder a search that matches nothing would return every article instead of none. That caught me out when I was building this.
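If the grouping is hard to picture, here is a rough plain-PHP version of the rule the query is meant to apply: an article matches when a single indexed field contains every keyword. Everything in it is an assumption made for illustration. findMatchingIds() is a made-up function, $indexRows is invented sample data shaped like index rows, and each keyword is counted once per field.

```php
<?php
// Plain-PHP sketch of the matching rule; not the Doctrine query itself.
function findMatchingIds(array $indexRows, array $keywords)
{
    $keywords = array_unique($keywords);
    $found = array(); // id => field => set of matched keywords

    foreach ($indexRows as $row) {
        if (in_array($row['keyword'], $keywords, true)) {
            $found[$row['id']][$row['field']][$row['keyword']] = true;
        }
    }

    $ids = array();
    foreach ($found as $id => $fields) {
        foreach ($fields as $matchedKeywords) {
            // Keep the article when one field holds every keyword.
            if (count($matchedKeywords) >= count($keywords)) {
                $ids[$id] = $id;
                break;
            }
        }
    }

    // Same placeholder idea as above: never return an empty list.
    return count($ids) ? $ids : array('-1');
}

// Invented sample rows shaped like the index table (id, field, keyword).
$indexRows = array(
    array('id' => 1, 'field' => 'title', 'keyword' => 'doctrine'),
    array('id' => 1, 'field' => 'title', 'keyword' => 'search'),
    array('id' => 2, 'field' => 'title', 'keyword' => 'doctrine'),
    array('id' => 2, 'field' => 'body',  'keyword' => 'search'),
    array('id' => 3, 'field' => 'body',  'keyword' => 'doctrine'),
    array('id' => 3, 'field' => 'body',  'keyword' => 'doctrine'),
);

print_r(findMatchingIds($indexRows, array('doctrine', 'search'))); // only article 1
print_r(findMatchingIds($indexRows, array('symfony')));            // the '-1' placeholder
```

Article 2 is left out because its two keywords sit in different fields, and article 3 is left out because repeating one keyword does not make up for a missing one.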

I’ll leave the template for you to finish off. It is a standard sfDoctrinePager, and if you’re reading this you probably know how to use one of those already.
