How to Fix CGI - majordojo

By Byrne Reese on November 6, 2007 10:11 AM

Over the many years of their coexistence, the terms CGI and Perl have become virtually synonymous. This perception that CGI and Perl are one and the same has contributed in some small degree to the perception that Perl is outdated and an inappropriate language for web programming - unlike its more modern counterparts like Ruby or PHP.

I of course know this to be a complete fallacy. First of all, the company I work for is a devout Perl shop and our products, all written in Perl, are collectively some of the largest and most scalable on the Internet. Few other companies in fact present more widely attended seminars and tutorials about scaling for the web than ours does at conferences like ETech and OSCON.

But perhaps the more significant fallacy is that CGI and Perl are synonymous. In actuality, they have very little to do with one another, and there is technically no reason one should be hampered by the other, despite all evidence to the contrary.

A little history and background

CGI stands for "Common Gateway Interface" and was invented to allow any script, written in any scripting language, to also act as a web-based application. In the early days of the Internet this was incredibly helpful to the first web programmers (a.k.a. system administrators) proficient in sh, bash, csh, tcsh and Perl, because it allowed them to quickly deploy simple web-based automation tools based on scripts and libraries they had already written. Cool.

But the inherent flexibility of language agnosticism is also CGI's greatest liability, and by association Perl's as well.

You see, CGI is based upon the principle that when the web server receives a request, it does not know what scripting language will interpret that request. Therefore, it defers processing directly to the operating system, and so must do something geeks call "forking and exec'ing" - or in other words, the web server must start up an entirely new process on your server to handle the request.

This may not sound like a big deal, but it sometimes can be, as each forked process holds the entire Perl interpreter in memory. And that is a big deal. First, it can consume a lot of your server's memory, and second, depending upon the size of your application, it can be slow to initialize. It works - and works well by virtue of working - but it is by no means ideal.

More modern languages have been designed to avoid this. Let's take PHP for instance. PHP is a language that was designed for web programming exclusively. Therefore its architects made a critical (perhaps even obvious) decision early on: if the web server is going to handle a lot of PHP scripts, why bother forking a process to determine what scripting language will handle the request (this is expensive) - why not just load the PHP interpreter into memory once and interpret the request within the web server (which is much cheaper)?

So when it comes to CGI vs. PHP, it is not really about Perl vs. PHP at all. It is really about understanding two solutions to two different problems - one operating under the assumption that every request will be processed by the same interpreter, and the other designed to execute any script via the web.

The solution as it stands today

In the Perl world, there are actually two Apache modules that attempt to do what PHP does inherently: load the Perl interpreter into memory so that you no longer have to spawn a new process each and every time your web server receives a CGI request. Those two modules are mod_perl and mod_fastcgi.

However, these two modules have a critical flaw: they are incredibly complex, because they attempt to solve a huge problem set having to do with creating a persistent and stateful execution context. The result is two modules that are not only too heavyweight for the average user but also incredibly difficult to install - even for myself.

How to actually fix the problem

The more I have thought about this problem, the more I have come to believe that there is little standing in Perl's way of having all of the benefits that PHP has gained from being a language designed exclusively for the web.

In theory, one should be able to take the source code of mod_php (the Apache module that dispatches web requests to a PHP interpreter) and swap out the component responsible for dispatching a request to PHP for one that dispatches the request to Perl. In theory it should just work (more or less).

The result would be an Apache module that would be easy to install and much more efficient at handling and processing requests online. Granted, this solution would not be stateful and persistent the way mod_perl and mod_fastcgi are, but that is not a problem this solution is engineered to solve.

Introducing mod_perlite

All of this is a really long-winded setup for what is a very quick conclusion. I shared this hypothesis with an engineer at work, Aaron Stone, who shares a passion for Perl with me, but who also has a passion for operations. He took on this challenge and devoted part of his 20% time to testing this hypothesis.

The output of his work is called mod_perlite. It is largely derivative of PHP and is capable of processing Perl scripts quickly and efficiently. The next step of the project is to make it compatible with the CGI protocol, which can be done by gutting parts of mod_cgi and dropping them into mod_perlite.

So far our results are promising, and it is possible that with a little hacking we may have just made Perl faster on the web and easier to deploy for everyone.

If you are interested in helping or participating in this project please let me know -- we could certainly use the help.
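To make the cost described above a little more concrete, here is a rough sketch in Python (chosen only for illustration; the post itself is about Perl and PHP). It compares starting a new interpreter process for every request, which is roughly what CGI does on a Unix-like system, with running the same script inside an interpreter that is already in memory, which is the idea behind mod_php and mod_perlite. The function names and the sample script are made up for this example, and the timings it prints will vary from machine to machine.

```python
import contextlib
import io
import subprocess
import sys
import time

# A stand-in for a tiny CGI script (hypothetical content).
SCRIPT = "print('hello from the script')"


def handle_cgi_style(source: str) -> str:
    """Start a brand-new interpreter process for this one request."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def handle_in_process(source: str) -> str:
    """Run the script inside the interpreter that is already running."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # A fresh globals dict per request keeps the handler stateless.
        exec(compile(source, "<request>", "exec"), {"__name__": "__main__"})
    return buffer.getvalue()


def time_requests(handler, source: str, count: int = 5):
    """Return (elapsed seconds, list of outputs) for `count` requests."""
    start = time.perf_counter()
    outputs = [handler(source) for _ in range(count)]
    return time.perf_counter() - start, outputs


if __name__ == "__main__":
    cgi_seconds, _ = time_requests(handle_cgi_style, SCRIPT)
    resident_seconds, _ = time_requests(handle_in_process, SCRIPT)
    print(f"new process per request: {cgi_seconds:.4f}s for 5 requests")
    print(f"resident interpreter:    {resident_seconds:.4f}s for 5 requests")
```

Both handlers return the same output; the only difference is whether an interpreter has to be started first. This measures process and interpreter start-up only, not the time spent loading large libraries.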
Categories: Geeky Goodness, Open Source, Programming
Tags: apache, apache modules, cgi, open source, perl, php, programming, web

15 Comments

Mark Carey said:
Sounds interesting, Byrne. Can you elaborate: what can the current version of mod_perlite do? And what can it not do (yet)? And without getting into the details, what is the difference between mod_perl and mod_perlite (is it just the persistence part?)
November 6, 2007 10:40 AM

Aaron Stone said:
Mark,
The goal of mod_perlite is to run single Perl scripts in the Apache process space, caching Perl bytecode as it goes, but flushing script memory after every request. Installation is also incredibly simple, and 100% analogous to PHP installation.
Right now, mod_perlite can be loaded into Apache and serves requests for any file ending in ".pl" with the phrase "Just another Perl hacker" (a la man perlembed ;-)
Still to do:
- Thrash at a few more bits of the PerlIO-Apache interface.
- Develop a script caching model (a la Zend Accelerator or APC).
- Add a script run-timer to kill runaway scripts (a la PHP's max_execution_time).
Fundamentally, mod_perl seeks to map nearly all of Apache's API to Perl. mod_perlite seeks merely to put the Perl interpreter into the same process space and not much else.
November 6, 2007 12:23 PM

Bud Gibson said:
Hi Byrne,
I think you are hitting part of the nail right on the head. Once you get something like MT installed (the first part of the nail), then you have all of these performance elements that can reach up to bite you. This sounds like an interesting stab at those that could prove very useful.
November 6, 2007 2:47 PM

Timothy Appnel said:
Does perlite give the user the option of sharing/stashing data across multiple requests if they want? Also, how does this module detect changes in the code? Said differently, if I write a script that loads a module like Data::ObjectDriver and later upgrade that package, how will the file running under perlite pick up the change and flush the Perl bytecode?
November 6, 2007 3:08 PM

Byrne replied to Timothy Appnel's comment:
The module would be completely stateless, just like PHP. So no - I could not stash data to be shared from one request to another. That is something mod_fastcgi and mod_perl were designed to do. But mod_perlite is designed to be more lightweight. To be simple above all else.
That being said, if I change a .cgi file on my server then mod_perlite will pick up that change immediately - no server restart required.
November 6, 2007 3:14 PM

Timothy Appnel said:
Thanks for the clarification. Simplicity is probably the best course. Session or system scope is probably more than most can deal with and sets up potential problems such as memory leaks. I thought you'd find this link interesting: http://use.perl.org/~jjohn/journal/20761 I'm of the mind that the mod_perl project blew it in a number of ways.
Not sure I was clear enough on my question of detecting updates and flushing the bytecode cache. Let me try another example. Movable Type (you may have heard of it) has tiny .cgi files that essentially call one module that then does all of the work of loading other modules and processing the request, amongst a lot more. If I don't update that .cgi, but do update one of those modules that get loaded later, how does perlite pick that up and recompile the source?
November 6, 2007 3:35 PM

Byrne replied to Timothy Appnel's comment:
Because mod_perlite should, by design, keep only the interpreter resident in memory, Perl modules should be loaded on demand, so a changed module would be picked up on the next request.
However, a great feature I can see would be a mode where some modules can be included in or excluded from being cached in some way. That way commonly used modules are only loaded once. But that too would violate the design constraint of mod_perlite.
The idea here is simply to avoid the cost of forking and exec'ing a process, not to try to gain any other efficiency, because doing so is a slippery slope that leads right into a well of complexity I would just as soon avoid. If you need persistence - use mod_perl.
November 6, 2007 8:22 PM

Aaron said:
Indeed, I intend to answer most feature requests with "Sounds like you need mod_perl" -- but of course I'm only looking forward to getting to that point and not nearly there yet ;-)
November 7, 2007 12:02 AM

Bart Lateur said:
"Granted, this solution would not be stateful and persistent the way mod_perl and mod_fastcgi are, but that is not a problem this solution is engineered to solve."
Reverse that: statefulness and persistency are a problem in mod_perl that this solution is avoiding. It is, IMO, the main reason why people use PHP over Perl for web applications, and most definitely on shared web servers: because it's too easy for independent projects on the same web server to trip over each other.
"If you are interested in helping or participating in this project please let me know -- we could certainly use the help."
Well, I'm definitely most interested in how this project is going to turn out, so I'm definitely going to keep an eye on it, but as I'm not a C programmer, nor have I ever done anything mod_anything related, I doubt you could have any use for me at this stage. So I'll keep standing at the sideline for now. If you need a hand that ordinary Perl programmers can lend, just give me a yell.
November 7, 2007 2:19 AM

tirwhan said:
Hmm, did you actually benchmark the benefits of this over pure CGI? As pointed out on PerlMonks, the time the OS takes to fork a process and load the Perl interpreter is rather minuscule in comparison to module load times. So unless I'm very much mistaken I don't see what mod_perlite gains (unless you're on Windows, possibly).
November 7, 2007 7:26 AM

Byrne replied to tirwhan's comment:
To be honest, not yet. Remember, this is just a theory at this point. I think mod_perlite needs to be relatively stateless, but it still needs to address at its core the problem of start-up time. If in the end fork and exec is not the bottleneck, then we need to shift attention to the more significant contributor to poor performance.
I think that this may very well introduce yet another use case for the need to specify a list of Perl modules to load upon start-up. That way large Perl modules can be read into memory and shared. But this must be done without opening the door to memory leaks. Is that possible?
November 7, 2007 9:51 AM

Clinton said:
Like Bart, I'm not a C programmer, but what about the model I suggest on PerlMonks? This provides environment separation for different web sites, while allowing you to preload modules (and even data), and still makes sure that each request gets served by a pristine Perl process.
November 8, 2007 5:10 AM

Byrne replied to Clinton's comment:
Clinton, I am beginning to see, just based upon the many comments already, that some facility should be given to the module to allow for some modules to be preloaded at start-up. That clearly has too many benefits and would help us to achieve our primary objective.
Where I am not too sure I agree is any idea that encourages the spawning of a separate process to manage other processes. When it comes to thread and process management, we should leave that exclusively to Apache. If our solution ventures beyond that, then my fear is the complexity that would surely follow. All of a sudden you open the door for people to legitimately need a million configuration values to control load, resource allocation, etc.
I can't stress enough - the success of this module will lie with its simplicity. If in the process its architecture and design can inspire more complex solutions or can inform the designs of existing solutions like mod_perl or mod_fastcgi, then that is a good thing.
November 8, 2007 7:57 AM

Aristotle Pagaltzis said:
This is really interesting indeed. I can’t wait to see where this goes.
As for module preloading, such a facility would be a very good thing indeed. If you omit that, users will have to fight with the same problem as CGI presents: minimising start-up time. The practical upshot of that is generally: try to use as few modules as possible in order to minimise load time. You can imagine what that means. Having start-up as a constraint seriously mangles code design decisions every step of the way.
Lack of preloading would also make it hard to use Catalyst, and impossible to use DBIx::Class with large schemata – the start-up penalty for both is such that they’re next to useless in fully non-persistent environments like CGI.
If you want to allow people to make unafraid use of CPAN, preload support is of the essence.
November 11, 2007 1:05 AM

Aaron said:
It's alive! :-)
Basic CGI functions work at this time. Check out the module from http://code.sixapart.com/svn/mod_perlite/trunk and give it a whirl! Build and install instructions are in the README file and are incredibly straightforward.
November 11, 2007 1:36 PM
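The behaviour discussed in the comments (compile a script once, flush its state after every request, and pick up an edited file without a restart) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not mod_perlite's implementation: the `ScriptCache` class and its method names are hypothetical, and comparing the file's modification time is just one simple way to notice a change.

```python
import contextlib
import io
import os
import tempfile


class ScriptCache:
    """Hypothetical cache of compiled scripts, invalidated by file mtime."""

    def __init__(self):
        self._entries = {}      # path -> (mtime in ns, compiled code object)
        self.compile_count = 0  # how many times any script was (re)compiled

    def _load(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]    # file unchanged: reuse the compiled code
        with open(path, "r", encoding="utf-8") as handle:
            code = compile(handle.read(), path, "exec")
        self._entries[path] = (mtime, code)
        self.compile_count += 1
        return code

    def run(self, path):
        """Run the script in a fresh namespace and return what it printed."""
        code = self._load(path)
        namespace = {"__name__": "__main__"}  # nothing survives the request
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
        return buffer.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "hello.py")
        with open(script, "w", encoding="utf-8") as handle:
            handle.write("print('hello')\n")
        cache = ScriptCache()
        print(cache.run(script), end="")
        print(cache.run(script), end="")
        print("compiled", cache.compile_count, "time(s)")
```

Note that this sketch only checks the script file itself. Libraries the script loads are not checked for changes, which is the case Timothy raises above with Movable Type's small .cgi files.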
This blog is licensed under a Creative Commons license.
