Solr for Latin Languages

When configuring Solr for non-English languages, in this case Portuguese, one usually wants to split the text into words and to remove stop words.

Stemming words is also easy. One should be careful with stemming, though, since it may produce too many false positives.
To split text into words, use StandardTokenizerFactory (which also splits on punctuation) or WhitespaceTokenizerFactory.
To remove stop words, use StopFilterFactory.
To stem, use SnowballPorterFilterFactory.
Here’s a snippet of a Portuguese schema.xml:
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="Portuguese" />
        <filter class="solr.ASCIIFoldingFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="Portuguese" />
        <filter class="solr.ASCIIFoldingFilterFactory"/>
      </analyzer>
    </fieldType>
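
To get a feel for what the analyzer chain above does to a piece of text, here is a rough Python sketch of the same steps in the same order. It is only an approximation and not how Solr implements them: the regular expression is a crude stand-in for StandardTokenizerFactory, the `STOP_WORDS` set is a made-up sample rather than the contents of stopwords.txt, and the synonym and stemming steps are left out because they need external data or a Snowball implementation.

```python
import re
import unicodedata
from typing import List

# Hypothetical sample; a real setup would read stopwords.txt instead.
STOP_WORDS = {"a", "o", "de", "da", "do", "e", "em", "para"}


def fold_ascii(token: str) -> str:
    """Drop diacritics, roughly what ASCIIFoldingFilterFactory does for Latin text."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text: str) -> List[str]:
    """Approximate the analyzer chain: tokenize, remove stop words, lowercase, fold."""
    tokens = re.findall(r"\w+", text)  # crude stand-in for the tokenizer
    # Stop word removal comes first and ignores case (ignoreCase="true").
    tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    tokens = [t.lower() for t in tokens]  # LowerCaseFilterFactory
    # The Snowball stemming step would go here; it is omitted in this sketch.
    return [fold_ascii(t) for t in tokens]


print(analyze("A Educação de João"))  # ['educacao', 'joao']
```

Note that in this chain the stop word filter runs before the ASCII folding, so the stop word list has to contain the accented forms of the words.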

Drupal showing only the front page

Trocaqui had a problem and it started showing only the front page.
It was working fine and, suddenly, it started to show the front page, no matter what link was selected or what address was typed in the address bar…
This was so critical that it was even impossible to log in to the system to put it in maintenance mode while fixing the problem…
After some research, it looked to be an Apache mod_rewrite problem.
It started to look like a clean URL problem, so I decided to try the non-clean URL format and voilà, Drupal started to respond correctly…
Using the /?q= format I was able to log in, set the system in maintenance mode and disable the clean URL format.
At that point, I was at least able to manage the site.
When in a similar situation, just replace the clean URL with the standard Drupal format. It’s actually quite easy: insert a ?q= between the first / after the host name and the character that follows it.
Here’s an example: http://www.trocaqui.com/forum is equivalent to http://www.trocaqui.com/?q=forum.
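
The same rewrite can be sketched in a few lines of Python, which is handy when a whole list of links has to be converted. This is just an illustration: the function name `to_unclean_url` is made up, and it assumes Drupal is installed at the root of the site (not in a subdirectory).

```python
from urllib.parse import urlsplit, urlunsplit


def to_unclean_url(url: str) -> str:
    """Turn a clean URL into the ?q= form, assuming Drupal lives at the site root."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")
    if not path:
        return url  # the front page has nothing to rewrite
    query = "q=" + path
    if parts.query:
        # Keep any existing query string after the q= parameter.
        query += "&" + parts.query
    return urlunsplit((parts.scheme, parts.netloc, "/", query, parts.fragment))


print(to_unclean_url("http://www.trocaqui.com/forum"))
# http://www.trocaqui.com/?q=forum
```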

While this was good enough as a starting point, it was not the best solution, so I continued the quest for the real one.

I cloned the production site to my testing environment and was able to reproduce the problem, which is a great first step.
While analyzing the problem, I found out that my .htaccess was broken and tried to fix it by following the steps of others who had .htaccess and clean URL problems.
Once the clean URLs were disabled, I was unable to turn them back on, since Drupal tests clean URLs and only allows them to be activated if they work.

Here’s what worked for me: I edited the .htaccess and fixed things by setting the following:

    RewriteEngine ON
    RewriteRule "(^|/)\." - [F]
    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]

Pay special attention to the RewriteEngine ON line; it may be case sensitive.