2009
12.02

Perl is making me mad

But its not perls fault….

its mine, I suppose. No scratch that, its PHP’s fault. Damn it too hell indeed! So let me fill you in on some back story.

Once upon a time, about a year back, I was all unemployed and that, and wanted to do some kind of career change, from nothing (previously call centre/builder/bar staff) to something that paid pretty well, so I figured web developers get a lot of money, and theres a lot of room for promotion etc. Now, already possesing all the programming skillz known to man, I figured this would be easy enough.

Anyway, to cut a long story short, not knowing much about this web dev lark, I settled on php, mostly because I thought it was the language to do this, after all perl was well and truley dead at this point [sic] (btw, I have no idea what this sic means, I just figured it makes me look clever!). Any way, I wrote some codez, was pretty damn easy tbh, I had shit up and running faster than something very fast.

A month down the line I had that all sorted and needed to put my new found skillz to good use, so I started making This Aint News – Social News for pirates

Anyway, I soon found out that php has some seriously bad programming techniques, such as jibbing everything into the index.php and letting it do all the magic. I figured this was cool, since it ran lightning fast, but I soon figured out this was pretty unmaintainable, so a few rewrites later I had something nice(ish).

So this was about 9months into me being dole scum, and I finally got a job, mostly I think coz I put the effort into making TAN however this job was perl based, and me having never used perl, I was all like, wow, this perl thing sucks.

Fast forward again a whole year, catalyst is beggining to look real nice, and I mean real nice, but slow. My current code was getting pretty hacky, so I decided once again I’d rewrite it, this time using MVC and a catalyst inspired design, needless to say it was epic, probably the best peice of php anyone anywhere had ever written – ever! Only it had a few issues, being of a catalyst inspired design, it was objecty, very very objecty. And as we all know, php has no concept of long running web apps.  However, I continued, and got right near the end of the final rewrite, only to discover there was no way this code was gonna run fast, since it had to initialise all the objects on each page request.

At this point I figured fuck it, I’ll remake the bastard in catalyst, it wont take too long, the code is pretty similar, its aware of all the cat stlye concepts (stash, flash, c (called f) etc).

Now this blog is coming to an end, I can get to the point I was trying to make, now I have to convert all this perfect php code to perl code, and I don’t want to, it just feels like I’m doing the same work once again, and theres no joy in it.

So to put it simply, I’m blaming everyone else, especially php, and perl for not having phps syntax/functions/object system

SAFE (btw, written, but not read)

p.s. template toolkit sucks, when it compiles the templates, it should compile each template into a super compiled template, not just convert the templates into perl and include the sub-templates  at run time, but thats a blog for another time.

2009
10.30

So, I was digging through an xdebug trace log for This Aint News, trying to figure out why a page took 0.9 seconds to process, first thing I discovered was the 0.4 seconds of that was includes. So, I was gonna look into this, but then I got distracted by the 0.0016 seconds for a filemtime(). So I thought, I know, I’ll do some testing.

So, I made 3 very simple tests in 3 languages, perl php and python. These tests stat the file “foo” 1,000,000 times, and I timed it with time. This is on a bang upto date Arch Linux install, kernel 2.6.31-arch, amd64, JFS, perl v5.10.1, php v 5.3.0, python 2.6.4

test0.pl

[perl]
my $t = 1000000;

for (my $i=0;$i<$t;++$i){
stat(‘foo’);
}
[/perl]

test0.php

[php]
<?php
$t = 1000000;

for ($i=0;$i<$t;++$i){
# note, im using stat here, to keep the test
# as similar as possible, but the results are the
# same with filemtime
stat(‘foo’);
}
[/php]

test0.py

[python]
import os

# note the use of xrange() rather than range()
# range is python, xrange is pure c, and thus fast
# if you use range, the test becomes void
t = xrange(1000000)

for i in t:
os.stat(‘foo’)
[/python]

along with a file called “foo”, which is nothing more than a text file with some foo inside it.

So, I ran these tests 5 times each, averaged the run time and made this table

run time (real)
perl 1.4038
php 24.3848
python 3.982

So, this is not what I was expecting, I assumed they’d all be about the same speed, since stat is a os system call. So, time to figure out why php is about 23x slower than perl at doing a system call…

So, new test cases, same as above, only these only do one stat.

test1.pl

[perl]
stat(‘foo’);
[/perl]

test1.php

[php]
<?php
stat(‘foo’);
[/php]

test1.py

[python]
import os
os.stat(‘foo’)
[/python]

So I break out the handy tool strace, i won’t post the full output here, since no one wants to read all that stuff!

strace perl test1.pl
read(3, "stat('foo');\n", 4096)         = 13
read(3, "", 4096)                       = 0
close(3)                                = 0
stat("foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
exit_group(0)

ok, as you can see from above, which is just the last bit of of the strace output, perl has opened the perl script, and then stats the file “foo”, then exits. Not bad, overall nice clean execution, no unnecessary hanging around.

strace python test1.py
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "import os\nos.stat('foo')\n", 4096) = 25
lseek(3, 25, SEEK_SET)                  = 25
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f98237c1000, 4096)            = 0
stat("foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f9822ffe7e0}, {0x7f982330ca90, [], SA_RESTORER, 0x7f9822ffe7e0}, 8) = 0
exit_group(0)                           = ?

ok, about the same as above, although it uses a different method of reading the python script, but essentailly the same, it reads the script, stats “foo” and exits. Overall theres more stuff involved in starting python, and its a little slower, but its still doing the right thing and exiting nice and quick.

So, now for the interesting one:

strace php test1.php
read(3, "<?php\nstat('foo');\n", 4096)  = 19
lseek(3, 0, SEEK_SET)                   = 0
mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc5c49dc000
rt_sigaction(SIGPROF, {0x5d47c0, [PROF], SA_RESTORER|SA_RESTART, 0x7fc5ca3dcf90}, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [PROF], NULL, 8) = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc5c49db000
lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc5c49da000
lseek(1, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
lseek(2, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
getcwd("/srv/http/tmp", 4096)           = 14
lstat("/srv/http/tmp/ph", {st_mode=S_IFREG|0644, st_size=19, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
stat("ph", {st_mode=S_IFREG|0644, st_size=19, ...}) = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff6c6206b0) = -1 ENOTTY (Inappropriate ioctl for device)
fstat(3, {st_mode=S_IFREG|0644, st_size=19, ...}) = 0
mmap(NULL, 51, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc5c49d9000
munmap(0x7fc5c49d9000, 19)              = 0
close(3)                                = 0
munmap(0x7fc5cb988000, 4096)            = 0
getcwd("/srv/http/tmp", 4096)           = 14
lstat("/srv/http/tmp/foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
lstat("/srv/http/tmp/foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
getcwd("/srv/http/tmp", 4096)           = 14
lstat("/srv/http/tmp/foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
lstat("/srv/http/tmp/foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
stat("foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
munmap(0x7fc5c49dc000, 528384)          = 0
close(2)                                = 0
close(1)                                = 0
munmap(0x7fc5c49da000, 4096)            = 0
close(0)                                = 0
munmap(0x7fc5c49db000, 4096)            = 0
open("/dev/urandom", O_RDONLY)          = 0
read(0, ":}\245\224D\332\372\257", 8)   = 8
close(0)                                = 0
open("/dev/urandom", O_RDONLY)          = 0
read(0, "\4l\225d\3102\31)", 8)         = 8
close(0)                                = 0
open("/dev/urandom", O_RDONLY)          = 0
read(0, "D\313N\344\363\271\25\37", 8)  = 8
close(0)                                = 0
setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
munmap(0x7fc5c4a5d000, 67108864)        = 0
munmap(0x7fc5c8a5d000, 2273896)         = 0
munmap(0x7fc5c8c89000, 2201824)         = 0
munmap(0x7fc5c8ea3000, 2128608)         = 0
munmap(0x7fc5c90ab000, 2107928)         = 0
munmap(0x7fc5c9ef3000, 2157480)         = 0
munmap(0x7fc5c9cae000, 2379592)         = 0
munmap(0x7fc5c9a5f000, 2419920)         = 0
munmap(0x7fc5c96d1000, 3726872)         = 0
munmap(0x7fc5ca102000, 323584)          = 0
exit_group(0)                           = ?

ok, same deal as before, the trace starts when the php script is read. First thoughts, wtf? perl/python do this in 3/10 lines respectively, but php, damn I can’t even count that many lines!!!

Right, so ignoring the massive amount of mess that happens once the script begins execution, and lets focus on the all important “stat(“foo”, {st_mode=S_IFREG|0644, st_size=7, …}) = 0″ that we’re looking for, if you look closely you’ll see its in there some where, right under a load of very suspicious looking lstat system calls!

getcwd("/srv/http/tmp", 4096)           = 14
lstat("/srv/http/tmp/ph", {st_mode=S_IFREG|0644, st_size=19, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
stat("ph", {st_mode=S_IFREG|0644, st_size=19, ...}) = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff6c6206b0) = -1 ENOTTY (Inappropriate ioctl for device)
fstat(3, {st_mode=S_IFREG|0644, st_size=19, ...}) = 0
mmap(NULL, 51, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc5c49d9000
munmap(0x7fc5c49d9000, 19)              = 0
close(3)                                = 0
munmap(0x7fc5cb988000, 4096)            = 0
getcwd("/srv/http/tmp", 4096)           = 14
lstat("/srv/http/tmp/foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
lstat("/srv/http/tmp/foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
getcwd("/srv/http/tmp", 4096)           = 14
lstat("/srv/http/tmp/foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
lstat("/srv/http/tmp/foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0
lstat("/srv/http/tmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
lstat("/srv/http", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
lstat("/srv", {st_mode=S_IFDIR|0755, st_size=16, ...}) = 0
stat("foo", {st_mode=S_IFREG|0644, st_size=7, ...}) = 0

So, this explains this then, when you ask php to stat a file, it doesn’t just stat the file that you asked it to stat, it goes all out and stats a whole load of other shite as well. This is ontop of the plethora of system calls just to get this far.

Too be honest, I’m seriously considering switching over to catalyst, the only problem I have is time, fortunatly, php and perl are very simialr in syntax, and my code is proper, not shitty n00b scripts, so it will convert to catalyst very quick.

Its a shame though, since I had big hopes for php, but after seeing how something as simple as a stat system call is messed up so bad that it becomes 23x slower than perl.

WTF???

note I also ran these tests on 3 different gentoo machines, as well as another Arch machine, and the results are all about the same, although they’re not on here because I don’t care enough, and it turns out I have a shite load of work ahead of me now, great :|

2009
10.29

This Aint News

So, at some point in the future I’m gonna write about TAN here, mostly about things I figure out whilst developing it.

Some future posts will be about phps really crap implimentation of fastcgi/mod_php, and how this is changing.

They’ll also be something about compiling pure php  templates to reduce page generation time. For example, a page on TAN includes 120 templates overall, this alone takes 0.4 seconds, which is about half of the total page load time!!!

The above blog will also lead onto xdebug, and why its *really* important that you don’t try to tune your app unless you *know exactly* why its going slow.

Then I’ll probably write some rubbish about php being better suited to simple web development, but lacking some crucial benefits of perl, and why this situation is changing.

That being said, I’m now going to slope away and probably never write anything else :p