More adventures in the shell
Friday, August 8, 2008 - By Steve
Forced backgrounding of processes
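The following makes the command immune to hangups, redirects its terminal output to a file (nohup saves stdout/stderr in nohup.out), and removes the job from the shell's job table, so that you can close the terminal or log out while the program keeps running. disown with no argument acts on the current job, which here is the command that was just backgrounded.
nohup /path/to/some/command/that/isnt/a/daemon & disown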
This can be useful for long operations that do not require interaction (otherwise you'd just use the screen utility of course).
For similar goals where this fails, Justin has some code to solve the problem. Perhaps he will post it.
Shared terminal session, the ghetto method (readonly)
The following permits a user on terminal A to show another user on terminal B what transpires (in real time) during a terminal session. A fifo is created as a first step, and then script is run with its output directed to this fifo. The script command will appear to hang, because opening a fifo for writing blocks until something opens it for reading; once the user on terminal B runs cat on the fifo, script starts successfully. This is somewhat useful: it alerts the user on terminal A that the other user is ready.
terminalA$ mkfifo /tmp/out
terminalA$ script -f /tmp/out
terminalB$ cat /tmp/out
Now, you could use script to record a session and distribute the result afterward, but sometimes observing as things happen is useful. For a more flexible mechanism to accomplish the same, the GNU screen utility has shared session ability, including permissions with respect to reading/writing.
Freestyle Nerds
July 1, 2008 - By Justin
<djahandarie> we're here to do c-s-e on the w-e-b
<djahandarie> listen to me spit these rhymes
<djahandarie> while i program lines
<djahandarie> and commit web accessibility crimes
<djahandarie> word, son
<http402> You talk like your big on these I-Net kicks,
<http402> But your shit flows slower than a two-eighty-six.
<http402> I'm tracking down hosts and nmap scans,
<http402> While Code Igniter's got you wringing your hands.
<http402> Cut the crap rap,
<http402> Or I'll run ettercap,
<http402> Grab your AIM chat,
<http402> N' send a PC bitch-slap!
<http402> peace
<djahandarie> you're talkin bout down hosts and nmap scans
<djahandarie> while i got other plans
<djahandarie> you're at your new job, but you can't even do it right
<djahandarie> you just create a plight with your http rewrites
<djahandarie> i've been on the web since the age of three
<djahandarie> you just got on directly off the bus from mississippi
<djahandarie> respect yo' elders, bitch
<http402> You've been webbin' since three, but still ain't grown up,
<http402> Gotta update your config and send the brain a SIGHUP.
<http402> You say you're that old? No wonder you're slow!
<http402> You're knocking at the door while I run this show!
<http402> Elders my ass, you're shit's still in school,
<http402> Hunt and pecking at the keyboard like a spaghetti-damned fool,
<http402> Rim-riffing your hard drive like a tool,
<http402> Face it. I rule.
<djahandarie> i erase my harddrives with magnets (bitch)
<djahandarie> all you can do is troll on the fagnets
<djahandarie> and son, my brain's wrapped in a nohup
<djahandarie> it wont be hurt by the words you throwup
<djahandarie> dont mind me while i emerge my ownage
<djahandarie> while you're still over there apt-getting your porridge
<djahandarie> you say i'm still in school
<djahandarie> but the fact is that i know the rule
<djahandarie> cuz you need to go back to grade three
<djahandarie> and you better plea, that they take sucky graduates from c-s-e
<http402> Time to bend over and apply a patch,
<http402> Your brain's throwing static like a CD with a scratch.
<http402> Your connection got nuked and you've met your match.
<http402> You run a single process like a VAX with a batch.
<http402> I'd pass the torch to a real winner
<http402> But it'd just scorch a while-loop spinner
<http402> Caught in a loop that you cant escape,
<http402> I run clock cycles around your words and flows,
<http402> Cuz your rhyme is like a PS fan: it' blows,
<http402> Your water-cooled lyrics leak and it shows,
<http402> Take your ass back to alt.paid.for.windows.
<djahandarie> Good god, I can't even respond to that. :P
<djahandarie> You win haha
* http402 takes a bow
FTCache - Generic PHP Cache System
March 1, 2008 - By Justin
I am excited to announce the release of the FTCache library. This is a generic caching library written in PHP for PHP applications to do whatever type of caching you need. It is designed to be completely modular so you can select the algorithm (Strategy) for managing the cache along with the storage mechanism (Container) on instantiation, and switch between them with no change in functionality. For example, if you want to test out different caching algorithms, you can focus completely on the algorithm and the system can handle the storage mechanism for you. It would be perfect for algorithm comparison as well.
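To make the Strategy/Container split concrete, here is a minimal sketch of the idea. Every name in it (Container, Strategy, ArrayContainer, LruStrategy, Cache) is hypothetical and chosen for illustration only; it is not FTCache's actual API, and the LRU algorithm is just one example of a strategy.

```php
<?php
// Hypothetical names throughout; this is NOT FTCache's real API.

// A Container only knows how to store and retrieve raw values.
interface Container
{
    public function has($key);
    public function read($key);
    public function write($key, $value);
    public function delete($key);
    public function size();
}

// A Strategy only decides which key should be evicted next.
interface Strategy
{
    public function touch($key);   // record that $key was just used
    public function forget($key);  // stop tracking $key
    public function victim();      // key to evict next, or null
}

class ArrayContainer implements Container
{
    private $data = array();
    public function has($key) { return array_key_exists($key, $this->data); }
    public function read($key) { return $this->has($key) ? $this->data[$key] : null; }
    public function write($key, $value) { $this->data[$key] = $value; }
    public function delete($key) { unset($this->data[$key]); }
    public function size() { return count($this->data); }
}

// Least-recently-used eviction: the oldest touched key is the victim.
class LruStrategy implements Strategy
{
    private $order = array();
    public function touch($key)
    {
        $this->forget($key);
        $this->order[] = $key; // most recently used goes to the end
    }
    public function forget($key)
    {
        $this->order = array_values(array_diff($this->order, array($key)));
    }
    public function victim()
    {
        return count($this->order) ? $this->order[0] : null;
    }
}

// The cache itself never cares which strategy or container it was given.
class Cache
{
    private $strategy;
    private $container;
    private $limit;

    public function __construct(Strategy $strategy, Container $container, $limit)
    {
        $this->strategy = $strategy;
        $this->container = $container;
        $this->limit = $limit;
    }

    public function get($key)
    {
        if (!$this->container->has($key)) {
            return null;
        }
        $this->strategy->touch($key);
        return $this->container->read($key);
    }

    public function set($key, $value)
    {
        // Evict one entry first if a new key would exceed the limit.
        if (!$this->container->has($key) && $this->container->size() >= $this->limit) {
            $victim = $this->strategy->victim();
            if ($victim !== null) {
                $this->container->delete($victim);
                $this->strategy->forget($victim);
            }
        }
        $this->container->write($key, $value);
        $this->strategy->touch($key);
    }
}

$cache = new Cache(new LruStrategy(), new ArrayContainer(), 2);
$cache->set('a', 1);
$cache->set('b', 2);
$cache->get('a');     // 'a' becomes the most recently used key
$cache->set('c', 3);  // the cache is full, so 'b' is evicted
```

Swapping in a different eviction algorithm or a file-backed container would mean passing different objects to the constructor, with no change to the code that calls get() and set().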
The core of the system is heavily and thoroughly tested and should be rock solid, so you have a strong foundation to build upon. The library currently ships with one strategy and two containers. More will come as I need them.
For anybody looking to integrate it into Code Igniter or another framework, it is completely object oriented, so integration should be straightforward. If you have problems, questions or requests, please email me at justin@fugitivethought.com. The official home page of the project is:
Code Igniter vs. Prado
February 25, 2008 - By Justin
Intro
I have been corresponding with one of our readers who has been interested to learn about my recommendations for Prado vs. Code Igniter or other frameworks. I thought it might be interesting to the rest of you to hear my recommendations as well, as there seems to be very little material out there regarding Prado. Below are the relevant portions of my initial recommendation:
Initial Recommendation:
Code Igniter and Prado are very different approaches to frameworks, so it really depends on what you are looking for. Prado has an everything-and-the-kitchen-sink approach and uses a Code Behind pattern for separating look and logic.
Code Igniter is a very lightweight framework, but much more easily extended, and it uses the Model-View-Controller pattern to separate look and logic.
Prado has some built in internationalization support (http://www.pradosoft.com/demos/quickstart/?page=Advanced.I18N), but if you decide on Code Igniter, this may help: http://codeigniter.com/wiki/Category:Internationalization::Internationalization_Views_i18n/
As for web services, both have some built in support for the more popular services, and I'm sure that no matter which service you are looking for, someone has implemented it for Code Igniter.
Overall, I am personally a big fan of the Code Igniter framework, mostly because it is the one I have used the most. Prado (as you can see from my benchmarking post) has some performance issues and is significantly more difficult to learn. Code Igniter is also more easily expandable. However, I have had issues with Code Igniter on very large applications.
In summary, I would recommend Code Igniter only for simple or medium-difficulty applications. If you are creating a blogging system or very basic shopping cart, or even a web forum, I would say Code Igniter is the way to go to make your life easier. If you are building more complex applications that involve large amounts of form processing, more complicated calculations or if you are already familiar with some form of code behind (ASP .Net for example) then Prado would be more to your liking.
On a final note, if you are interested in something with much more complex and fine-grained permission management, I have found that CakePHP has the cleanest and most intuitive access control list implementation of any framework, and it is well integrated with the rest of the framework.
Response
At this point the reader pointed out some anomalies in my recommendation, so I had to rethink my position somewhat. The most glaring issue is that the Prado Benchmarks post I made shows major performance issues with Prado, so why would I recommend it for larger applications?
Second Recommendation
With Code Igniter, I found a lot of issues with mapping URIs to controllers for more complicated URLs. This becomes a very big problem when you need to do advanced filtering. As I noted in my Fugitive Thought post on Pretty URLs, Code Igniter does not allow you to use the nice URL mapping together with GETs. This means that if I want to filter by more than one field, I have to either write a lot of extra code to figure out which parameters are provided and which are not, and assign them to the proper variables, or else give up the pretty URLs in favor of everything being pure $_GET values and all pages going to index.php. For example, if I have an advanced search with three different fields that I want people to be able to bookmark / link to, a normal search file would have a URI like this:
http://myserver/blogs/search?a=blah&b=bloh&c=bleh
With Code Igniter, I can easily map /blogs/search to a specific controller, but I cannot use the $_GET values at the same time. What I have to do is generate something that will use this instead:
http://myserver/blogs/search/blah/bloh/bleh
This is all well and good, but what happens when each of these fields is optional? There are some obvious solutions, but they all require implementing extra code simply because the framework is saying "no". I ran into a few other aggravating issues with Code Igniter as well, in areas including session management and permission handling, that felt far too limiting for large applications that need more complicated features.
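One of those "obvious solutions" is to encode the optional fields as name/value pairs in the path, for example /blogs/search/a/blah/c/bleh. The sketch below is plain PHP with no framework calls; the function name segments_to_filters and the list of allowed field names are assumptions made for illustration.

```php
<?php
// Hypothetical helper: turn the URI segments that follow /blogs/search
// into an associative array of filters. Unknown names are ignored.
function segments_to_filters(array $segments, array $allowed)
{
    $filters = array();
    // Walk the segments two at a time: name, value, name, value, ...
    // A trailing name without a value is skipped.
    for ($i = 0; $i + 1 < count($segments); $i += 2) {
        $name = $segments[$i];
        if (in_array($name, $allowed, true)) {
            $filters[$name] = urldecode($segments[$i + 1]);
        }
    }
    return $filters;
}

// /blogs/search/a/blah/c/bleh: field "b" was simply left out.
$filters = segments_to_filters(
    array('a', 'blah', 'c', 'bleh'),
    array('a', 'b', 'c')
);
```

This is exactly the kind of extra code the framework forces on you: it works, but a plain query string would have given the same result for free.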
Don't get me wrong, Code Igniter does a LOT of things right, and it is very lightweight. If you put in the proper add-ins like the Smarty templating engine and make use of its template compilation and caching, you can do a lot of advanced stuff and keep it very fast. The reason I recommended Prado for larger applications despite the benchmarks is that a lot of the Prado stuff seems to scale fairly decently. The benchmarks I have displayed are ludicrously slow for a small application, but I don't think it gets too much worse as you grow into larger applications because that overhead is a result of loading pretty much the entire framework in the beginning. I'm not positive, but I'm reasonably sure that the overhead can be mitigated with caching. And if the application is very large, then you are going to have some loading overhead anyway, and the very large suite of features that Prado handles will make up for a lot of the overhead in a lower development time and easier to debug code.
Code Igniter is nice for adding in just about any existing library because of the way they implement the "libraries" feature; any existing class that does not rely on $_GET variables in the URL will pretty much just drop right in. This makes it wonderfully expandable, but it also adds a lot of redundancy when you need a lot of libraries, since they are all disparate, not integrated, and end up re-implementing a lot of the same features. Prado, by contrast, has a lot of these features already built in and integrated into the framework, so that all components can share features and code.
Akelos looks interesting, but you're right about it being very young. One reason that both Prado and Code Igniter are high on my recommendation list is the amount of documentation. That really is priceless in making a framework usable. If you decide to go with something like Akelos, I recommend using an IDE with IntelliSense (http://en.wikipedia.org/wiki/IntelliSense) if you don't already (Eclipse PDT is very nice for PHP intellisense! - http://www.eclipse.org/pdt/. NuSphere's PHPed is also good, but it costs).
Flow of Control Database Design
February 13, 2008 - By Justin
Introduction
I have been working on a number of projects recently that have to do with passing resources between a number of different steps. Each step has a method for determining who has permission to see the resource at that step, and who has permission to modify certain portions of the resource at that step. I am also applying for some jobs that have to do with this type of process flow control on much larger scales, so I thought that it might be a good time to formalize some of the patterns that I have found useful in implementing these kinds of systems.
This entry is going to focus largely on the database design needed to support this system. If there is interest, I can write future entries describing more of the application level design that utilizes these databases.
Problem Statement
The easiest way to describe these patterns is through a practical use case, so for this entry, we are going to design a basic ticket management system. In large enterprises, especially in customer support departments, ticket management systems are very popular. The ticket management system provides some flow of control so that when a customer first contacts the department, a ticket is created and some information is attached to it, usually a problem description. If the first person that the customer talks to can resolve the issue on their own, then the ticket will probably only pass through that one step. Often, however, the information has to be passed to some other person, who will perform some action, add a note about it to the ticket, and then pass it back to some other person who will communicate the changes to the customer and repeat the feedback process until the issue has been resolved.
For our very basic ticket management system, we will assume five general entities:
- Customer - a person who has a problem
- Call Center - people who speak directly with the customer. Can solve basic issues like answering common questions.
- IT - people who handle technical problems
- Management - people who handle human problems
- Auditors - people who can look at all tickets in the system so that they can review the processes
In our case, the resource is the ticket, we have a list of groups of people, and we have a good idea of who should be able to have access and when. The customer may be able to view the entirety of a ticket about them up to its current state. Auditors can view the entirety of any ticket. Call Center, IT and Management can view a ticket only when it is at a point where they would need to, and they can only add new entries to the ticket; they cannot modify or remove previous entries on the ticket. We will assume that figuring out which group a person belongs to is trivial.
The general database design for the ticket resource will be like this:
|---------------------|         |----------------------------|
| Tickets             |         | Notes                      |
|---------------------|         |----------------------------|
| id (int)            | <--|    | id (int)                   |
| customer (varchar)  |    |----| ticket (int)               |
| create_date (date)  |         | author (varchar)           |
| problem (text)      |         | create_date (date)         |
|---------------------|         | message (text)             |
                                |----------------------------|
The ticket is the main resource and is uniquely identified by its id (primary key). Each ticket will have zero or more notes attached to it, and each note can be uniquely identified by its own id. The note is associated with a ticket by the "ticket" field, which is a foreign key to Tickets.id. The original problem is stored in Tickets.problem, and any messages about the ticket are stored in Notes.message.
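As a rough sketch, the two tables above could be created and used like this. It assumes PHP's PDO with the pdo_sqlite extension and an in-memory database purely so the example is self-contained; the NOT NULL constraints and the sample data are my own additions, and column types would need adapting to your DBMS.

```php
<?php
// Assumes the pdo_sqlite extension; an in-memory database keeps this self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('PRAGMA foreign_keys = ON'); // SQLite only enforces foreign keys when asked

$db->exec('CREATE TABLE Tickets (
    id          INTEGER PRIMARY KEY,
    customer    VARCHAR(255) NOT NULL,
    create_date DATE NOT NULL,
    problem     TEXT NOT NULL
)');
$db->exec('CREATE TABLE Notes (
    id          INTEGER PRIMARY KEY,
    ticket      INTEGER NOT NULL REFERENCES Tickets(id),
    author      VARCHAR(255) NOT NULL,
    create_date DATE NOT NULL,
    message     TEXT NOT NULL
)');

// Create a ticket (sample data), then attach a note to it.
$db->exec("INSERT INTO Tickets (customer, create_date, problem)
           VALUES ('alice', '2008-02-13', 'Cannot log in')");
$ticketId = (int) $db->lastInsertId();

$add = $db->prepare(
    'INSERT INTO Notes (ticket, author, create_date, message) VALUES (?, ?, ?, ?)'
);
$add->execute(array($ticketId, 'bob', '2008-02-13', 'Reset the password'));

// Read back every note for the ticket, oldest first.
$notes = $db->prepare('SELECT author, message FROM Notes WHERE ticket = ? ORDER BY id');
$notes->execute(array($ticketId));
$rows = $notes->fetchAll(PDO::FETCH_ASSOC);
```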
Status Pattern
One possible design solution I will call the Status Pattern. In this approach, we have a large table of all of the tickets in the system, and each ticket will have a "status" property. Whenever the ticket is passed to a new entity, the status is changed. In this case, we would add the "status" field to the Tickets table. The status field can take any of the following values:
- OPEN - ticket was just created, no notes attached yet
- CALL CENTER - the problem has been stated and the call center is currently working to solve it.
- IT - the problem has been assigned to IT to fix.
- MANAGEMENT - management has to do something to resolve the ticket.
- CLOSED - ticket has been resolved in some way, and is no longer a concern
Advantages
This system has the advantage of being very simple. There is a single place to look at all tickets, and in order to generate the list of which tickets a person can see, we just look for all tickets with a status belonging to that person's group.
Another nice feature of the system is that it is easily expandable. If we have a new group that may need to view the tickets, all we have to do is add a new option for status values.
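A minimal sketch of the Status Pattern follows, again assuming PDO with pdo_sqlite. The helper names pass_ticket and tickets_for_group, the CHECK constraint and the sample rows are illustrative assumptions, not part of the original design.

```php
<?php
// Assumes the pdo_sqlite extension. Only the columns needed here are included.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE Tickets (
    id       INTEGER PRIMARY KEY,
    customer VARCHAR(255) NOT NULL,
    problem  TEXT NOT NULL,
    status   VARCHAR(20) NOT NULL DEFAULT 'OPEN'
             CHECK (status IN ('OPEN', 'CALL CENTER', 'IT', 'MANAGEMENT', 'CLOSED'))
)");

$insert = $db->prepare('INSERT INTO Tickets (customer, problem, status) VALUES (?, ?, ?)');
$insert->execute(array('alice', 'Cannot log in', 'IT'));
$insert->execute(array('carol', 'Billing question', 'CALL CENTER'));

// Passing a ticket to another group is a single UPDATE.
function pass_ticket(PDO $db, $ticketId, $newStatus)
{
    $stmt = $db->prepare('UPDATE Tickets SET status = ? WHERE id = ?');
    $stmt->execute(array($newStatus, $ticketId));
}

// Listing a group's tickets is a single SELECT on the status column.
function tickets_for_group(PDO $db, $group)
{
    $stmt = $db->prepare('SELECT id, customer FROM Tickets WHERE status = ? ORDER BY id');
    $stmt->execute(array($group));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

pass_ticket($db, 2, 'IT');               // the call center hands carol's ticket to IT
$itTickets = tickets_for_group($db, 'IT');
```

Adding a new group under this pattern means adding one more allowed status value; no new tables or queries are needed.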
Disadvantages
While this system is quite simple, there can be problems with it. First of all, the database will end up cluttered with CLOSED tickets over time, which will decrease the access speed for tickets that are not yet resolved. This means that the database will get slower and slower as time goes on, which is not a good thing. Archiving old tickets is a matter of going through and finding all tickets with a CLOSED status, and storing them somewhere else, but this means that we no longer have the one central location to look at all tickets both current and past.
If you are very security minded (perhaps these tickets contain incredibly sensitive information), then there is another problem with this design: we cannot easily separate permissions across ticket statuses. A lot of database security specialists will tell you that creating a different SQL account for each user, with only the permissions that user needs, is essential to keeping your information secure. This is discussed in more detail in the next section where we resolve this problem.
Queuing Pattern
This solution takes a lesson from queuing theory and is the type of approach used in areas that require more severe separation of concerns. Instead of having all of the tickets in one centralized table, we will create multiple tables. For example, we will have a table that holds all of the CALL CENTER tickets, a table that holds all of the IT tickets, a table that holds all of the MANAGEMENT tickets, and a table that holds all of the CLOSED tickets. Each of these tables can have the exact same set of fields, and will essentially be duplicates of the original Tickets table. Since the tickets and the notes go hand in hand, this actually requires duplication of both tables. As a basic example, we would have tables like this:
|---------------------|         |----------------------------|
| Call_Center_Tickets |         | Call_Center_Notes          |
|---------------------|         |----------------------------|
| id (int)            | <--|    | id (int)                   |
| customer (varchar)  |    |----| ticket (int)               |
| create_date (date)  |         | author (varchar)           |
| problem (text)      |         | create_date (date)         |
|---------------------|         | message (text)             |
                                |----------------------------|
|---------------------|         |----------------------------|
| IT_Tickets          |         | IT_Notes                   |
|---------------------|         |----------------------------|
| id (int)            | <--|    | id (int)                   |
| customer (varchar)  |    |----| ticket (int)               |
| create_date (date)  |         | author (varchar)           |
| problem (text)      |         | create_date (date)         |
|---------------------|         | message (text)             |
                                |----------------------------|
(etc.)
With this type of database schema, the operation for passing tickets from one group to another more closely resembles its real world function. When the call center wants to pass a ticket to the IT department, the data in the Call_Center_Tickets and Call_Center_Notes tables will be duplicated into the IT_Tickets and IT_Notes tables, and removed from the Call_Center tables. Essentially, we are physically moving the ticket.
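The move described above could be sketched as follows, assuming PDO with pdo_sqlite. The function name move_ticket is hypothetical, only two of the queues are created, and the sketch assumes ticket and note ids are unique across all queues (for example, assigned from one shared sequence) so that copied rows never collide.

```php
<?php
// Assumes the pdo_sqlite extension and ids that are unique across all queues.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Create identical table pairs for two of the queues.
foreach (array('Call_Center', 'IT') as $queue) {
    $db->exec("CREATE TABLE {$queue}_Tickets (
        id INTEGER PRIMARY KEY, customer VARCHAR(255), create_date DATE, problem TEXT)");
    $db->exec("CREATE TABLE {$queue}_Notes (
        id INTEGER PRIMARY KEY, ticket INTEGER, author VARCHAR(255),
        create_date DATE, message TEXT)");
}
$db->exec("INSERT INTO Call_Center_Tickets VALUES (1, 'alice', '2008-02-13', 'Cannot log in')");
$db->exec("INSERT INTO Call_Center_Notes VALUES (1, 1, 'bob', '2008-02-13', 'Needs IT')");

// Copy a ticket and its notes to the target queue, then delete the originals.
// Everything happens in one transaction so a failure cannot lose the ticket.
function move_ticket(PDO $db, $ticketId, $from, $to)
{
    // Table names cannot be bound as parameters, so check them against a fixed list.
    $queues = array('Call_Center', 'IT');
    if (!in_array($from, $queues, true) || !in_array($to, $queues, true)) {
        throw new InvalidArgumentException('Unknown queue');
    }
    $db->beginTransaction();
    try {
        foreach (array('Tickets' => 'id', 'Notes' => 'ticket') as $table => $column) {
            $copy = $db->prepare(
                "INSERT INTO {$to}_{$table} SELECT * FROM {$from}_{$table} WHERE {$column} = ?"
            );
            $copy->execute(array($ticketId));
        }
        // Remove the notes first, then the ticket they belong to.
        foreach (array('Notes' => 'ticket', 'Tickets' => 'id') as $table => $column) {
            $delete = $db->prepare("DELETE FROM {$from}_{$table} WHERE {$column} = ?");
            $delete->execute(array($ticketId));
        }
        $db->commit();
    } catch (Exception $e) {
        $db->rollBack();
        throw $e;
    }
}

move_ticket($db, 1, 'Call_Center', 'IT');
$inIt = (int) $db->query('SELECT COUNT(*) FROM IT_Tickets')->fetchColumn();
$left = (int) $db->query('SELECT COUNT(*) FROM Call_Center_Tickets')->fetchColumn();
```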
Advantages
One advantage of this pattern is that it is easy to generate the most common reports. For example, when a Call Center employee views a list of all of the tickets they need to handle, it is simply a matter of selecting everything from the Call_Center_Tickets table, joined with the appropriate Call_Center_Notes. The only exception to this rule is for an Auditor. The Auditor will have to view all of the tickets, which may require a more complex UNION across multiple sets of tables. However, since there are generally very few Audit reports generated compared to the other groups, this is normally not a problem.
Secondly, this system will remain fast over time, since only the tickets that the users care about are stored in the table that they are accessing rather than cluttering up a single table with lots of CLOSED tickets.
Third, archival of old tickets is very easy, since all of them will be stored in the Closed_Tickets and Closed_Notes tables.
A fourth and very important advantage of this pattern is security. As mentioned in the Status Pattern, a very secure system should require individual permission levels inside of the database itself. If you do not understand this concept, read this decent guide to SQL injection, and pay close attention to the segregate users section. For this database schema, we would have a tickets_call_center user, a tickets_it user, a tickets_management user, etc. Each user will have only the permissions that they need to do their job. For example, if a client logs in to look at their ticket, the system would connect to the database with the "tickets_client" user. The tickets_client user will only have read permission on tables, so that even if the user is able to find an application vulnerability in the system, they would not be able to change any of the tickets. A call center employee would be connected to the database using the tickets_call_center user, and would have INSERT and SELECT access to the call_center_tickets and call_center_notes tables so that they could create tickets and notes, and then DELETE access to the call_center_* tables and INSERT access to other tables so that they could pass the ticket on to other groups. In the worst case scenario, if the call center employee turns malicious and finds an application level vulnerability, they would only be able to affect call center tickets, and these they would only be able to delete. This helps to contain the damage they can do.
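To illustrate, the short script below prints MySQL-style GRANT statements for two of the accounts described above. The database name "tickets", the host "localhost", the Management and Closed queue names, and the helper names are all assumptions; the accounts would have to exist already, and other DBMSs use different syntax.

```php
<?php
// Hypothetical database name ("tickets") and host; privilege lists follow the text above.
$queues = array('Call_Center', 'IT', 'Management', 'Closed');

// Each queue is a pair of tables.
function tables_for($queue)
{
    return array($queue . '_Tickets', $queue . '_Notes');
}

// Build one MySQL-style GRANT statement as a plain string.
function grant($user, array $privileges, $table)
{
    return sprintf(
        "GRANT %s ON tickets.%s TO '%s'@'localhost';",
        implode(', ', $privileges),
        $table,
        $user
    );
}

$statements = array();
foreach ($queues as $queue) {
    foreach (tables_for($queue) as $table) {
        // Customers may only read.
        $statements[] = grant('tickets_client', array('SELECT'), $table);

        // Call center staff work in their own queue and may only
        // hand tickets over (INSERT) to every other queue.
        $privileges = ($queue === 'Call_Center')
            ? array('SELECT', 'INSERT', 'DELETE')
            : array('INSERT');
        $statements[] = grant('tickets_call_center', $privileges, $table);
    }
}

echo implode("\n", $statements), "\n";
```

The application would then open its database connection with whichever account matches the logged-in person's group, so the database itself enforces the limits.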
Disadvantages
However, this extra speed and security does not come without some sort of price paid in overhead. As mentioned before, the Auditor user will have to use more complicated queries to view all of the tickets together. This is not usually a major concern. Secondly, the system is significantly more complex. If it is coded properly with the correct use of transactions, there will not be any lost tickets when moving tickets between statuses, but this requires good design and a lot of testing to make sure it is correct. Also, there are many duplicated types of information here that formal models for normalization may not take kindly to.
Pointer Pattern
As with any system, there are many possible solutions, and I cannot cover all possibilities. This last pattern is another alternative to the previous two, and is in a way a hybrid of them. With this pattern, we would store all of the tickets in a centralized pair of tables called simply "Tickets" and "Notes". However, instead of having a status field to manage the permissions, we can have a set of individual tables for each type of user. One could be named "call_center", and would simply be a list of all of the Tickets.id values that correspond to tickets that are currently assigned to the call center. This pattern would be duplicated by tables such as "it", "management", etc. This allows us to emulate the real world process again because we can pass a ticket from one group to the next by removing it from one list and adding it to another. However, this pattern does not provide any advantage over the previous two, since we cannot secure permissions between entities, and we are required to do JOINs across tables for any sort of request. I am mentioning it only because I have seen it used.
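For completeness, here is a small sketch of the Pointer Pattern, assuming PDO with pdo_sqlite. Only the "call_center" and "it" pointer tables are created, and the sample ticket is made up.

```php
<?php
// Assumes the pdo_sqlite extension; "call_center" and "it" are the per-group pointer tables.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE Tickets (id INTEGER PRIMARY KEY, customer VARCHAR(255), problem TEXT)');
foreach (array('call_center', 'it') as $group) {
    // Each pointer table is nothing more than a list of Tickets.id values.
    $db->exec("CREATE TABLE {$group} (ticket INTEGER PRIMARY KEY REFERENCES Tickets(id))");
}
$db->exec("INSERT INTO Tickets VALUES (1, 'alice', 'Cannot log in')");
$db->exec('INSERT INTO call_center VALUES (1)');

// Passing a ticket: remove the pointer from one list and add it to another.
$db->beginTransaction();
$db->exec('DELETE FROM call_center WHERE ticket = 1');
$db->exec('INSERT INTO it VALUES (1)');
$db->commit();

// Every listing needs a JOIN back to the central Tickets table.
$rows = $db->query(
    'SELECT t.id, t.customer FROM Tickets t JOIN it p ON p.ticket = t.id'
)->fetchAll(PDO::FETCH_ASSOC);
```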
Conclusion
Keep in mind that these examples are simplified. In the real world, you will need to take into account other factors, such as making sure that call center employees can still view older tickets in case the person calls back a long way down the line. Your choice of the pattern to use depends a lot on personal preference, on the abilities of your DBMS, and on the application where it is being used. For example, some DBMSs may allow you to have more complex permission schemes, so that you can properly secure data in one table rather than having to separate it across multiple ones. When you take this into consideration, the Queuing pattern may not have as many advantages. Any professional application development requires some judgement on the part of the developer, and there is no silver bullet.
Pretty URLs - htaccess friendly url-to-action mapping
January 15, 2008 - By Justin
Thanks to djahandarie for pointing out a code error: PHP_SELF has been replaced with REQUEST_URI.
While I may not be a huge fan of Code Igniter, their approach to URLs is absolutely incredible. I honestly never knew before I started using it that it was even possible to append a bunch of stuff after a filename in the URL without using the question mark. For those of you who don't know what I'm talking about, these two URLs will load the same file:
http://www.fugitivethought.com/index.php
http://www.fugitivethought.com/index.php/foo/bar
If you don't believe me, try it out right now! What's so great about this? Well first of all, it (at least mildly) looks like you are browsing folders. There are no ampersands or question marks mucking up the URL. The second cool thing is that a very simple mod_rewrite rule allows you to remove the index.php from there, to create a very pretty URL that would look like this:
http://www.fugitivethought.com/foo/bar
The .htaccess file that will allow this is taken directly from the Code Igniter user guide (http://codeigniter.com/user_guide/general/urls.html):
RewriteEngine on
RewriteCond $1 !^(index\.php|images|robots\.txt)
RewriteRule ^(.*)$ /index.php/$1 [L]
If you don't get the importance of pretty, human interpretable URLs then you can read more about it here: http://www.aardvarkmedia.co.uk/about/articles/007.html
Now in order to make this approach to URLs useful we need a parser inside of index.php with a very simple map that will allow us to tell it what class and function to call based on the different parts of the URL. While looking to implement the best parts of Code Igniter for a personal project, I wrote a script file that does exactly this.
Basically, you keep a set of "Controller" classes inside of a directory named Controllers. You can change the name of this directory by editing the require line near the end of the code below. The $map variable is simply a mapping of what class and function to call based on the first part of the URL (in our above entries, this would be the "foo" portion). If there is no entry in the map to match this first part, then it will use the entry labeled "default". When an entry matches, the code pops the first entry off the stack of '/' separated portions of the URL and passes the rest of them to the function. So if we visit http://hostname/foo/bar/cat, the code will figure out which function to call for "foo", and then call it, passing an array containing "bar" and "cat". Any further redirecting of the URL can be done inside the function that is called.
// Define the path root for the application.
// Example: A site rooted at http://foo.com/bar/index.php would put bar/index.php here
define('_PAGE_ROOT_', 'bar/index.php');

// Parse the URL. The query string is dropped first so that $_GET
// parameters do not end up inside the last path segment.
$path = strtok($_SERVER['REQUEST_URI'], '?');
// Skip the leading '/', the page root and the '/' that follows it.
$uri = trim(substr($path, strlen(_PAGE_ROOT_) + 2), '/');
$pgs = explode('/', $uri);

// Describe the URI map, further mapping can be done at each controller
// inside of the map method.
$map = array();
// default - load views.php and the Views class and call the home() method
$map['default'] = array('views', 'home');
$map['year'] = array('views', 'year');
$map['month'] = array('views', 'month');
// note the use of "Map" here
$map['control'] = array('control', 'map');
$map['request'] = array('actions', 'request');
$map['csv'] = array('feeds', 'csv');

// Map the first section of the URI to a controller and action
if ( isset($map[$pgs[0]]) ) {
    $action = $map[$pgs[0]];
    array_shift($pgs);
} else {
    $action = $map['default'];
}

// Cleaner format
$controller_class = ucfirst($action[0]);
$controller_action = $action[1];
$controller_params = $pgs;

// Load the controller
require 'Controllers/'.$action[0].'.php';

// Instantiate the controller
$controller = new $controller_class;

// Call the action
$controller->$controller_action($controller_params);
or you can click here to download a local copy for perusal.
In the example code in the provided file, the map calls a function named "map" in the "control" class if the first string is "control". The map function in the control class looks at the next entry in the stack of URL nibblets and calls a particular function for that. For example, it could call the "bar" method of the control class and pass it whatever it deems necessary.
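As a rough illustration of such a second-level map, a Controllers/control.php file might look like the sketch below. The class body, the $routes array and the bar and index methods are hypothetical; only the class name and the map method name come from the $map['control'] entry above.

```php
<?php
// Hypothetical Controllers/control.php matching the $map['control'] entry.
class Control
{
    // Receives the remaining URL parts, e.g. array('bar', 'cat') for /control/bar/cat.
    public function map($params)
    {
        // Second-level map: next URL part => method name on this class.
        $routes = array('bar' => 'bar');

        $next = array_shift($params);
        if ($next !== null && isset($routes[$next])) {
            return $this->{$routes[$next]}($params);
        }
        return $this->index($params);
    }

    public function bar($params)
    {
        // Query string values remain available through $_GET as usual.
        $blah = isset($_GET['blah']) ? $_GET['blah'] : 'none';
        return 'bar(' . implode(',', $params) . ') blah=' . $blah;
    }

    public function index($params)
    {
        return 'index';
    }
}

// What the dispatcher would do for http://hostname/control/bar/cat
$controller = new Control;
$result = $controller->map(array('bar', 'cat'));
```

Keeping the second-level routes in a fixed array means a visitor cannot call arbitrary methods on the controller just by typing their names into the URL.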
One major advantage of using this over the Code Igniter code is that I am not artificially blocking using GET in the URL. If you want to have a URL like:
http://hostname/foo/bar/?blah=moo&moreblah=moremoo
then you are perfectly welcome to do so. The controller functions will always be called and can use the built-in $_GET global as any normal PHP application would. And you still get to have pretty URLs.