htaccess Rewrites SEF URL and submits to PHP

What?
A quick note on an .htaccess rewrite rule I'm liking.

What does it do?
What I type:
http://www.mywebsite.com/blog/videos.html
Sends this to server:
http://www.mywebsite.com/index.php?myFolder=blog&myFiles=videos
How?
Options -Indexes +FollowSymLinks
RewriteEngine On
RewriteBase /
# Skip requests that already target index.php (avoids a rewrite loop)
RewriteCond %{REQUEST_URI}  !^/index\.php
# %1 = folder path (no dots allowed), %2 = file name without the .html extension
RewriteCond %{REQUEST_URI} ^/([^.]+)/(\w+)\.html$  [NC]
RewriteRule .*    index.php?myFolder=%1&myFiles=%2    [L]

ErrorDocument 400 /error/?v=400
ErrorDocument 401 /error/?v=401
ErrorDocument 403 /error/?v=403
ErrorDocument 404 /error/?v=404
ErrorDocument 500 /error/?v=500
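One way to sanity-check the pattern before deploying it is to run the same expression through PHP's preg_match, since mod_rewrite and PHP both use PCRE syntax. The following is only a sketch: the function name split_sef_path is made up for this example, and it assumes the pattern has the dot before "html" escaped and an end anchor ($) added.

```php
<?php
// Hypothetical helper (not part of the .htaccess itself): mimics the
// second RewriteCond so the captures %1 and %2 can be inspected in PHP.
function split_sef_path(string $requestUri): ?array
{
    // '#' delimiters avoid escaping the slashes; 'i' plays the role of [NC].
    $pattern = '#^/([^.]+)/(\w+)\.html$#i';
    if (preg_match($pattern, $requestUri, $m) !== 1) {
        return null; // no match: the rewrite condition would fail
    }
    // $m[1] corresponds to %1 (myFolder), $m[2] to %2 (myFiles)
    return ['myFolder' => $m[1], 'myFiles' => $m[2]];
}

var_dump(split_sef_path('/blog/videos.html'));
var_dump(split_sef_path('/blog/videos/2010/january/21.html'));
var_dump(split_sef_path('/blog.html')); // NULL: there is no folder segment
```

Because the first group is greedy and excludes dots, everything up to the last slash ends up in myFolder and only the final segment ends up in myFiles.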

Additional Notes
If you do apply the above to your site, bear in mind the following is also true:
http://www.mysite.com/blog/pretty_much_anything_i_want_to_type_here.html

--yields
http://www.mysite.com/index.php?myFolder=blog&myFiles=pretty_much_anything_i_want_to_type_here
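Anything that does not match the pattern (for example, a URL not ending in ".html") is not rewritten, so unless a real file or folder exists at that path it will simply return a 404 error. I've included my error rules (they basically redirect to a branded error page).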
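For illustration, the receiving end of those ErrorDocument lines could look something like the sketch below. This is an assumption about how such a page might work, not the actual error page used on this site: the function name error_message and the message texts are invented for the example.

```php
<?php
// Hypothetical helper for /error/index.php: map the "v" parameter to a message.
function error_message(string $v): string
{
    // Only the codes listed in the ErrorDocument lines are recognised.
    $known = [
        '400' => 'Bad Request',
        '401' => 'Unauthorized',
        '403' => 'Forbidden',
        '404' => 'Not Found',
        '500' => 'Internal Server Error',
    ];
    // Anything else falls back to 404 rather than echoing user input.
    $code = isset($known[$v]) ? $v : '404';
    return $code . ' ' . $known[$code];
}

// "v" may be missing or (with ?v[]=x) an array, so check the type first.
$v = $_GET['v'] ?? '404';
echo error_message(is_string($v) ? $v : '404');
```

The point of the lookup table is that the page never prints whatever arrives in the query string, only one of a fixed set of messages.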

So I sanitize on the receiving index.php file:
  1. Check for possible code injection.
  2. Do NOT allow double-quotes or apostrophes; convert these to a numerical representation only if you need to convert them back later (e.g. 034, 039).
  3. Do NOT allow any punctuation you don't use in your site structure. Slashes and underscores /_ are good (so regexp: /[^a-zA-Z0-9_\/]/). If you allow percents (%) or asterisks (*) then you are asking for trouble.
  4. Note my redirect for errors.
  5. Split the first string "myFolder" with the slash (/) as a delimiter, controlling the syntax/format of your site URLs.
For Example
http://www.mysite.com/blog/videos/2010/january/21.html

// sends
index.php?myFolder=blog/videos/2010/january&myFiles=21
Which, hopefully, the PHP file will handle as:
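// Read the rewritten parameters (empty string if absent)
$site_structure_string = $_GET['myFolder'] ?? '';
$site_structure_item = $_GET['myFiles'] ?? '';
// Keep only letters, digits, underscores and slashes
$site_structure_string = preg_replace('/[^a-zA-Z0-9_\\/]/', '', $site_structure_string);
// The file name should never contain a slash
$site_structure_item = preg_replace('/[^a-zA-Z0-9_]/', '', $site_structure_item);
$site_structure_array = explode('/', $site_structure_string);

// yields
// $site_structure_array[0] = 'blog'
// $site_structure_array[1] = 'videos'
// $site_structure_array[2] = '2010'
// $site_structure_array[3] = 'january'
// $site_structure_item = '21'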
And don't forget to redirect the user to an error page or back to the home page if something is amiss.
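A minimal sketch of that last step, under a few assumptions: the list of known sections, the function names parse_site_path and redirect_target, and the choice of /error/?v=404 as the target are all hypothetical and only for illustration. Unlike the stripping approach above, this version rejects a request outright when it contains anything unexpected.

```php
<?php
// Hypothetical whitelist of top-level sections; replace with your own.
const KNOWN_SECTIONS = ['blog', 'downloads'];

// Returns the validated segments, or null when the request looks wrong.
function parse_site_path(string $folder, string $item): ?array
{
    // Reject (rather than silently strip) anything outside the allowed set.
    // \z anchors at the very end, so a trailing newline is not accepted.
    if (!preg_match('#^[a-zA-Z0-9_]+(?:/[a-zA-Z0-9_]+)*\z#', $folder)) {
        return null;
    }
    if (!preg_match('/^[a-zA-Z0-9_]+\z/', $item)) {
        return null;
    }
    $segments = explode('/', $folder);
    // The first segment must be a section the site actually has.
    if (!in_array($segments[0], KNOWN_SECTIONS, true)) {
        return null;
    }
    return ['segments' => $segments, 'item' => $item];
}

// Returns the URL to redirect to, or null when the request is acceptable.
function redirect_target(array $get): ?string
{
    $folder = $get['myFolder'] ?? '';
    $item = $get['myFiles'] ?? '';
    if (!is_string($folder) || !is_string($item)
        || parse_site_path($folder, $item) === null) {
        return '/error/?v=404';
    }
    return null;
}
```

In index.php this might be used by calling redirect_target($_GET) and, when the result is not null, sending header('Location: ' . $target) followed by exit before any output is produced.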

Oh and the above does NOT allow:
http://www.mysite.com/blog.html
If you want this, I think the rewrite condition (used in place of the second RewriteCond) is:
RewriteCond %{REQUEST_URI} ^/(\w+)\.html$  [NC]
With this condition only %1 is captured, so myFolder receives "blog" and myFiles (%2) would be left empty.
But, er, I like that first check (myFolder) confirming that the submitted URL matches the format of your site (it also gives a lot more opportunity to check for malicious code).


Kind Regards,

Joel Lipman
www.joellipman.com

© 2024 Joel Lipman .com. All Rights Reserved.