AEM 6.3 Sample Website We.Retail on Windows using Dispatcher on Apache HTTP Server has errors because of using colons in the URL

It has been a while since I last wrote a blog entry. A lot has changed in my life over the last few years, but this is not the time or place to rant.

Introduction

I have started playing with AEM, a.k.a. Adobe Experience Manager, which is Adobe’s platinum Web content management platform. As part of learning the product, I have installed the following components on my local Windows machine:

  • AEM Authoring Server
  • AEM Publish Server
  • AEM Dispatcher, which is a module installed within Apache HTTP Server 2.2

The following image shows the high-level flow between the various components:

  1. Using a browser, we type in a URL to visit a website hosted on AEM. In our case this is the We.Retail sample page.
  2. The Apache HTTP Server receives the request and passes it to the configured modules for processing.
  3. The AEM Dispatcher module receives the request and checks whether the page is already stored in its cache.
  4. If the page is not in the cache, the Dispatcher requests it from the AEM Publish server, which serves the page. The Dispatcher then stores the received content in its cache on the file system.
  5. The content originally comes from the Authoring server, which pushes it to the Publish server using a set of “Replication” agents.
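The cache-or-fetch behaviour in steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Dispatcher is implemented: the `serve` function, the in-memory dictionary standing in for the file-system cache, and the `fake_publish` function are all made up for this example.

```python
from typing import Callable, Dict, List


def serve(path: str,
          cache: Dict[str, bytes],
          fetch_from_publish: Callable[[str], bytes]) -> bytes:
    """Return the content for `path`, asking Publish only on a cache miss."""
    if path in cache:
        # Step 3: the page is already cached, so Publish is not contacted.
        return cache[path]
    # Step 4: cache miss, so fetch from Publish and remember the result.
    content = fetch_from_publish(path)
    cache[path] = content
    return content


# Hypothetical stand-in for the Publish server that records each call.
calls: List[str] = []


def fake_publish(path: str) -> bytes:
    calls.append(path)
    return b"<html>We.Retail</html>"


cache: Dict[str, bytes] = {}
first = serve("/content/we-retail/us/en.html", cache, fake_publish)
second = serve("/content/we-retail/us/en.html", cache, fake_publish)
print(len(calls))  # Publish was asked only for the first request
```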

How are the cached items stored

The dispatcher literally creates folders within the configured Cache directory and persists the HTML and static content received from the Publish server. In my case, I had configured the cache directory within the htdocs root folder.

My Apache server installation was under

C:/Users//AEM/Apache

The DocumentRoot folder was under

C:/Users//AEM/Apache/htdocs

Cached items are stored under htdocs like

C:/Users//AEM/Apache/htdocs/content/we-retail/us/en.html
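As a rough sketch of that mapping, the URL path is simply appended to the document root, one folder per path segment. The helper name `cache_file_for` and the `C:/Apache/htdocs` document root below are placeholders I made up for the example; the real Dispatcher has more rules than this.

```python
from pathlib import PureWindowsPath
from urllib.parse import unquote, urlsplit


def cache_file_for(url: str, docroot: str) -> PureWindowsPath:
    """Map a request URL to the file it would occupy under the cache root."""
    # Keep only the path part of the URL and decode %xx escapes.
    url_path = unquote(urlsplit(url).path)
    # Each non-empty path segment becomes a folder (or the final file name).
    segments = [segment for segment in url_path.split("/") if segment]
    return PureWindowsPath(docroot, *segments)


DOCROOT = "C:/Apache/htdocs"  # hypothetical document root, not my real path
target = cache_file_for("http://localhost/content/we-retail/us/en.html", DOCROOT)
print(target.as_posix())
```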

What exactly is the issue

When the AEM Dispatcher stores items in the file system, it cannot cache an item whose URL contains a colon or any other character that Windows does not allow in file names. However, it still tries to look up the item in the cache and throws an error when it cannot find the file.
For a sample URL like
http://localhost/content/we-retail/us/en/jcr:content/root/hero_image/default.img.jpeg
the following error appears in the error log. Note the jcr:content in the URL above.
[Fri Jan 05 17:57:26 2018] [error] [client 127.0.0.1] (20024)The given path is misformatted or contained invalid characters: Cannot map GET /content/we-retail/us/en/jcr%3acontent/root/hero_image/default.img.jpeg HTTP/1.1 to file, referer: http://localhost/content/we-retail/us/en.html

The above error essentially prevents the image from being served by the HTTP Server, which returns a 403 Forbidden error instead.
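To see which part of a URL trips over the Windows file name rules, a small check like the following can help. It is a sketch only: the function name `invalid_segments` is my own, and the character set is the list of characters Windows reserves in file names (the forward slash is left out because it is used here to split the path).

```python
from typing import List
from urllib.parse import unquote

# Characters Windows does not allow in a file or folder name.
WINDOWS_RESERVED = set('<>:"|?*\\')


def invalid_segments(url_path: str) -> List[str]:
    """Return the path segments that cannot be used as Windows file names."""
    decoded = unquote(url_path)  # turns jcr%3acontent back into jcr:content
    return [segment for segment in decoded.split("/")
            if WINDOWS_RESERVED & set(segment)]


bad = invalid_segments(
    "/content/we-retail/us/en/jcr%3acontent/root/hero_image/default.img.jpeg")
print(bad)
```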

How to solve this

Well, I was searching for some sort of Apache HTTP Server level solution when I ended up at a blog entry by someone called “bllsht”. Yes, you did read that right. 🙂
He had created a mod_rewrite rule in the Apache HTTP Server conf that rewrites jcr:content as _jcr_content, and it works brilliantly.
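The transformation itself is easy to picture. The Python snippet below only mimics what such a rewrite does to the request path; it is not the mod_rewrite rule from that blog, and the `rewrite_path` name and the regular expression are my own assumptions for illustration.

```python
import re

# Match a whole "jcr:content" path segment, not a longer name containing it.
JCR_CONTENT = re.compile(r"/jcr:content(?=/|$)")


def rewrite_path(url_path: str) -> str:
    """Replace the jcr:content segment with the colon-free _jcr_content."""
    return JCR_CONTENT.sub("/_jcr_content", url_path)


original = "/content/we-retail/us/en/jcr:content/root/hero_image/default.img.jpeg"
rewritten = rewrite_path(original)
print(rewritten)
```

The rewritten path contains no colon, so it can be stored as an ordinary folder name on Windows.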
You can see the solution here –>
Enjoy!