PHP Developers Meeting Notes

03 June 2009 » In PHP » 9 Comments

Taking advantage of php|tek bringing a lot of people together, we had a PHP developers meeting over 2 days before the conference. Day 1 was dedicated to technical issues in PHP 5 and 6, and day 2 was spent discussing potential features, migration issues, current roadblocks, etc.

The notes from the meeting are available on the wiki. These notes are not necessarily “decisions”, but they do reflect the consensus of the group that was at the meeting and hopefully present a more structured and outlined list of things that we can follow for PHP 5.4/6.

May Wrap-up

02 June 2009 » In PHP, Talks, Travel » 5 Comments

May has come and gone and for me almost half of it was spent on the road. SFO-DEN-BNA, BNA-ORD, ORD-ZRH-TXL-ZRH-ORD, ORD-SFO. To decipher that for you, I first went to Nashville to visit my friend Raquel and see the land of the honky tonks. I was surprised to find an almost full-sized replica of the Parthenon there, as well as a really great food/drink scene. Some highlights include Prince’s Hot Chicken Shack (I dare you to order something more than medium hotness), a great beer place called The Flying Saucer (150+ beers on the menu), and especially The Patterson House (recommended by the awesome Steph Dub), where we spent a few hours snacking on tasty bits from the menu and drinking the awesome cocktails that the bartender mixologists prepared in front of us.

Then it was off to Chicago for the php|tek 2009 conference. The first two days were dedicated to the first real PHP developers meet-up since November 2005. On Monday we discussed technical issues with regard to PHP 5.3 and 6, and on Tuesday the topic shifted more towards potential features, aside from Unicode, that would entice people to move to 6, and how to ease this migration. Overall it was a productive meeting and the notes should be posted soon. The next day I gave the opening keynote on the present and future of PHP. I managed to throw in a few inside jokes and funny photos to lighten up the morning mood. The rest of the conference was productive as well: there were great talks on everything from utilizing HTTP status codes to multi-level caching to a talk that Cal gave on telecommuting. After the conference hours we stopped by the Map Room a couple of times for some excellent beer flights (La Folie on tap, OMG).

@tychay is not happy

After Chicago, I flew to Berlin for the International PHP Conference Spring Edition 2009. This year they accepted all 3 of my proposals, so I had my work cut out for me. Miraculously, I managed to make the German audience smile and even laugh a couple of times during my keynote. Success! The other two talks, “intl me this, intl me that” on localizing and translating your pages, and “All the Little Pieces” on using PHP with memcached, MogileFS, and Gearman, went well too. Funny enough, the RailsWay conference was going on at the same place. Didn’t they know that Terry Chay is coming to town?! This was my first visit to Berlin, so Terry and I played tourists for a bit and went to see Checkpoint Charlie, the remaining pieces of the Berlin Wall, the Brandenburg Gate, and the Reichstag. It is really amazing to consider that the Berlin Wall used to be 150 km long and embedded a piece of Western Germany in the middle of the Eastern one.

Finally, I had a long series of flights home, and despite a mishap at immigration in Chicago, arrived at my apartment safely and almost on time. It was great to see old friends and new faces and to talk to the best development community out there.

For those of you who wondered where to get the I � Unicode t-shirt that I wore during my keynote, I put the design up on Zazzle, so you can get your own for the next gathering of the Unicode-minded folks.

I Used DMCA

22 April 2009 » In Opinion, PHP, Talks » 13 Comments

Yes, it’s true.

A recent post on Twitter from @atourino pointed to my VIM for (PHP) Programmers slidedeck on scribd.com. The slidedeck has been really popular, gathering close to 50,000 views, 2,500 downloads, a few dozen favorites ratings, and a “Hot” award. Good deal, eh? Except that I didn’t upload this slidedeck—someone else did.

Scribd’s about page describes it as the place “where you publish, discover and discuss original writings and documents”. I’ve used it in the past to find all kinds of documents and there’s a lot to like about the site, but the keyword here is “original”. I really don’t mind sharing the slides (heck, I tell everyone at conferences to download them from my site), but on my Talks page I specifically ask people to obtain permission before re-publishing the slides elsewhere. It’s not a difficult thing to do. So far I’ve resisted putting a copyright notice on every slide, because I was hoping that common sense would apply, but apparently it doesn’t for everyone.

I contacted Scribd’s customer support to see how I could take ownership of the document in question. They replied that I would have to submit a DMCA copyright infringement notice and ask for the document to be taken down. I understand that this is their policy, but I think this is going overboard, especially for a case like mine. I really wanted to handle this in a polite manner and in such a way that people’s links to the document wouldn’t break, instead of doing the dickish move of demanding it be removed completely. At the same time, I feel that the person who uploaded my slides without permission was wrong. Thus, I had no choice but to send the DMCA notice along with a request for the document to be re-assigned to me.

I would encourage everyone to be more careful in handling publicly available content. Please check for any restrictions on usage and publishing, and if in doubt—ask. This will help avoid resorting to heavy-handed stuff like DMCA notices.

Speaking at Dutch PHP Conference 2009

09 April 2009 » In PHP, Talks » 6 Comments

A couple of months ago I saw a teaser post from iBuildings about the new Dutch PHP Conference in Amsterdam this year. I really like Amsterdam, so I started racking my brain to see what kind of proposal I could submit. Not a day had passed when I got an email from Cal Evans asking if I would like to give the opening keynote at the conference. Success! Of course I agreed and suggested an additional talk about distributed processing with PHP, titled “All the Little Pieces”.

The line-up for the conference looks great: Xdebug’s Derick Rethans, php|architect’s Marco Tabini, Zend Framework architect Matthew Weier O’Phinney, security guru Stefan Esser, “RESTful” Ben Ramsey, PHP core developer Scott MacVicar and many others. The early bird pricing is available until April 30, so I would encourage you to take advantage of it and come see what is bound to be a great event.

Bloom Filters Quickie

03 April 2009 » In Development, PHP » 23 Comments

It’s been a couple of months since the release of pecl/memcached and I was getting the urge to write something else. At the same time, I was reading up on Bloom filters, but couldn’t find a PHP extension that implemented them. Thus, pecl/bloomy was born. Now, you may be wondering, what the heck is a Bloom filter and why in the blooming sky would I want to use it? Well, read on.

A Bloom filter is a probabilistic data structure that can be used to answer a simple question: is the given element a member of a set? Now, this question can be answered via other means, such as hash tables or binary search trees. But the thing about Bloom filters is that they are incredibly space-efficient when the number of potential elements in the set is large. The way that they achieve this is by allowing false positives with a certain error rate. Basically, a Bloom filter will give you either “no” or “maybe” as the answer and it’s up to you to determine the false positive error rate that you can live with. The smaller the rate, the larger the size of the filter, of course, but it takes only about 9.6 bits per element for a 1% rate, and every time you add about 4.8 bits per element, the error rate becomes ten times smaller. Compare this to other structures that require storing at least the data elements themselves in the majority of implementations. However, the more elements you add to the set, the larger the probability of false positives becomes, so it is important to estimate the size of your data set properly.
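To see where those numbers come from, here is a small sketch based on the standard textbook sizing formulas for Bloom filters: bits per element = -ln(p) / (ln 2)^2, and number of hash functions = bits per element × ln 2. The function name bloom_parameters() is made up for this illustration, and I am not claiming this is exactly what pecl/bloomy computes internally.

```php
<?php
// Standard Bloom filter sizing formulas (textbook math, not taken from the
// pecl/bloomy source). bloom_parameters() is a hypothetical helper name.
// $n = expected number of elements, $p = tolerated false positive rate.
function bloom_parameters(int $n, float $p): array
{
    $bitsPerElement = -log($p) / (M_LN2 * M_LN2);       // m / n
    $bits           = (int) ceil($n * $bitsPerElement); // m, total bits
    $hashes         = (int) max(1, round($bitsPerElement * M_LN2)); // k

    return [
        'bits_per_element' => $bitsPerElement,
        'bits'             => $bits,
        'bytes'            => (int) ceil($bits / 8),
        'hashes'           => $hashes,
    ];
}

// Show how the cost per element grows as the error rate shrinks tenfold.
foreach ([0.01, 0.001, 0.0001] as $rate) {
    $params = bloom_parameters(100000, $rate);
    printf(
        "rate %.4f: %.2f bits/element, %d bytes, %d hash functions\n",
        $rate,
        $params['bits_per_element'],
        $params['bytes'],
        $params['hashes']
    );
}
```

Each tenfold drop in the error rate adds ln(10) / (ln 2)^2 bits per element, which is where the "about 4.8 bits" figure comes from.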

Another nice thing about Bloom filters is that the time to store and look up the element is constant and does not depend on how many elements are in the set. Now that’s a pretty nice property to have, isn’t it?
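Here is a rough pure-PHP sketch of why that is: adding or checking an element touches a fixed number of bit positions, no matter how many elements came before. Everything in it is an assumption for illustration only (the SketchBloomFilter class name, the crc32/fnv1a32 double-hashing scheme, 64-bit PHP 7.1 or later); it is not how pecl/bloomy is implemented.

```php
<?php
// Illustrative Bloom filter in plain PHP. The class name and the hashing
// scheme are assumptions for this sketch, not pecl/bloomy internals.
// Assumes 64-bit PHP 7.1+ so that the hash arithmetic stays in integers.
final class SketchBloomFilter
{
    private $bits;      // bit array packed into a binary string
    private $numBits;   // m: total number of bits
    private $numHashes; // k: bit positions probed per element

    public function __construct(int $numBits, int $numHashes)
    {
        if ($numBits < 1 || $numHashes < 1) {
            throw new InvalidArgumentException('numBits and numHashes must be positive');
        }
        $this->numBits   = $numBits;
        $this->numHashes = $numHashes;
        $this->bits      = str_repeat("\0", intdiv($numBits + 7, 8));
    }

    // Derive k positions from two base hashes (double hashing).
    private function positions(string $item): array
    {
        $h1 = crc32($item);
        $h2 = hexdec(hash('fnv1a32', $item)) | 1; // odd step, never zero
        $positions = [];
        for ($i = 0; $i < $this->numHashes; $i++) {
            $positions[] = ($h1 + $i * $h2) % $this->numBits;
        }
        return $positions;
    }

    // Set the k bits for this item: constant work per call.
    public function add(string $item): void
    {
        foreach ($this->positions($item) as $pos) {
            $byte = intdiv($pos, 8);
            $this->bits[$byte] = chr(ord($this->bits[$byte]) | (1 << ($pos % 8)));
        }
    }

    // "No" if any of the k bits is clear, otherwise "maybe".
    public function has(string $item): bool
    {
        foreach ($this->positions($item) as $pos) {
            if ((ord($this->bits[intdiv($pos, 8)]) & (1 << ($pos % 8))) === 0) {
                return false;
            }
        }
        return true;
    }
}

$filter = new SketchBloomFilter(1437759, 10);
$filter->add('foo');
$filter->add('bar');
var_dump($filter->has('foo')); // an added element is never reported as missing
var_dump($filter->has('zoo')); // very likely false, but a false positive is possible
```

Note that the filter never stores the elements themselves, only the bits they map to, which is where the space savings come from.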

So, what are they good for? Well, Google BigTable uses Bloom filters to reduce the disk look-ups for non-existent data; Cassandra also uses them to save IO; Digg might use them to implement checks for green tags on Digg buttons, i.e. have my friends Dugg this, etc. The possibilities are many.

The API of the extension is pretty simple. Create a BloomFilter object and specify how many elements you expect to have in the set and what false positive rate you can tolerate. The extension will determine the optimal filter size and the number of hash functions to use. Then all you do is add and check elements.

$b = new BloomFilter(100000, 0.001);
$b->add('foo');
$b->add('bar');
$answer1 = $b->has('foo');
$answer2 = $b->has('zoo');

Here we say that we expect to store 100,000 elements with a false positive rate of 0.1%. Then we add a couple of elements and check a couple too. Now, $answer1 will be “yes”, but it may be wrong 0.1% of the time, i.e. if you check 1000 elements, you may expect to get “yes” for 1 of them when in fact it doesn’t exist in the set. The $answer2 will always be “no” because “zoo” was never added to the filter.

What’s the performance like? From a simple benchmark with a 0.1% false positive rate, the time to insert 100,000 items was 0.12 seconds, and the time to check 100,000 items was 0.11 seconds (this is on my MacBook Pro). The space used by the filter? 179721 bytes.

I encourage you to download the extension, play around with Bloom filters and see what uses you can come up with.

UPDATE: As Ryan pointed out in the comments, I misspoke when I said that $answer2 would always be “no” for “zoo”. It might be “yes” (a false positive), but the filter would never answer “no” for something that is actually in it, like “foo”.

Fixing FeedBurner Fiasco, Conclusion

06 March 2009 » In Other, PHP, Rants » 4 Comments

In the previous post I mentioned that I was going to migrate to a Google account on FeedBurner using a trick to avoid spamming the subscribers with old posts. The trick seemed to work fine, so here’s the explanation. I use WordPress (2.6), but this can be generalized to other systems. In wp-includes/feed-rss2.php, find the beginning of the post loop - the while( have_posts() ) line - and add another line after it to exclude the posts dated earlier than the migration date from the feed. It should look something like this:

 <?php while( have_posts()) : the_post();
 if (get_post_time() < strtotime('2009-03-01')) continue;

The end result of this is that your feed will contain only the items published after the specified date. This may be a bit strange for new subscribers, so I made a new blog post so as not to keep my feed completely empty, but for existing ones it should be transparent.
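If you are not on WordPress, the same idea can be sketched without any WordPress functions. The posts_since() helper and the sample $posts array below are made up for this illustration; in a real system the posts would come from whatever your feed template loops over.

```php
<?php
// Stand-alone illustration of the cutoff check. posts_since() and the sample
// data are hypothetical; they stand in for a real feed template's post loop.
function posts_since(array $posts, string $cutoffDate): array
{
    $cutoff = strtotime($cutoffDate);

    // Keep only the posts dated on or after the migration date.
    return array_values(array_filter($posts, function (array $post) use ($cutoff) {
        return strtotime($post['date']) >= $cutoff;
    }));
}

$posts = [
    ['title' => 'Old post',         'date' => '2009-02-27'],
    ['title' => 'Migration notice', 'date' => '2009-03-05'],
];

foreach (posts_since($posts, '2009-03-01') as $post) {
    echo $post['title'], "\n";
}
```

Only the post dated after the cutoff makes it into the output, which is the same effect the continue statement has inside the WordPress loop.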

In general, I want to say that Google has completely mishandled this transition after their acquisition of FeedBurner. Chris Shiflett already explained what problems he saw with it, but I found one more: they broke the Awareness API. I noticed this because my feed statistics were all 0. Turns out that the old API URL (http://api.feedburner.com/) was gone, and you had to use the new one (http://feedburner.google.com/api). Breaking the legacy API URLs is a major violation of the contract you make with the users when you publish the API. At the very least, Google should have silently redirected the requests to the new API instead of doing the most egregious thing possible and simply removing the old URL. Shame on you, Google.

Fixing FeedBurner Fiasco

05 March 2009 » In Other, PHP » 3 Comments

If you use FeedBurner, you probably know that they were acquired by Google recently. They are also forcing you to migrate your account to Google on the next login. This presents a couple of issues detailed by Chris Shiflett in his post.

I’m going to attempt to avoid spamming my subscribers with a ton of recent posts by making sure that the feed contains only the posts dated after the migration. Chris and I are pretty sure that it will work, but I guess the only way is to push the button and see. If this works, I’ll post a quick explanation of how to hack WordPress to modify your feed this way.

Upcoming Talks

27 February 2009 » In PHP, Talks » 2 Comments

Here’s a breakdown of the talks I will be giving over the next couple of months.

PHP Québec Conference

VIM for (PHP) Programmers

Are you stuck choosing between Komodo, Zend Studio, PHPEdit, or Eclipse as your next IDE? Did you just come to Unix from Windows and wonder how to translate your “1337” Notepad skills to the new platform? Have you pulled out most of your hair struggling to make your current editor do something more complicated than proper indentation? Or do you feel that perhaps you use only 5% of VIM’s potential but desire to learn the true magic?

Then head over to this popular session and grab a seat, because you don’t want to be left standing when everyone else shows up to see what VIM has in store for PHP developers. Plus, it’ll help that hair grow back.

Andrei’s Regex Clinic

Regular Expressions: every developer’s best friend and worst nightmare!

Join Andrei Zmievski, PHP developer and author of the PHP Regex (PCRE) extension, on a journey that will take you from your first steps into the world of regular expressions to complete mastery of this most useful of tools.

A must for everyone who’s ever wondered what /(?\d+)bar/ means.

php|tek

Mid-conference Keynote: The Future of PHP 6

After a brief hiatus, PHP 6 has picked up momentum and is on track for a release. Join Andrei as he covers the salient changes, updates, and new features of the next generation of our favorite language and how they can be useful in your everyday development.

International PHP Conference

intl me this, intl me that

What are the problems with and best solutions to translating your web site or application into other languages? This session will cover several approaches to this problem based on PHP, focusing on utilizing the new intl extension as well as other open source tools. Warning: some live translations may be performed for the audience!

All the Little Pieces: Distributed systems with PHP

Quick, what do memcache, MogileFS, and Gearman have in common? They are scalable, distributed technologies, and they can also interface with PHP, your ubiquitous Web development language. Digg uses all 3 (and a few more) in its quest for social news domination, and this session will share much of what we’ve learned about them and how they are best utilized with PHP.

The Present and Future of PHP

This keynote will explore the innovations coming with the next generation of PHP, the roadmap to development and delivery, and what you can do to be prepared when the big day comes.

Upcoming Conferences

04 February 2009 » In PHP, Talks » 1 Comment

Part of my job description at Digg is to speak, present, and evangelize publicly on a variety of technical issues including open source technologies and their adoption within Digg, so I thought I’d do an update on the upcoming conferences that I’ll be speaking at.

At PHP UK Conference - London (February 27) I’ll be giving my “intl me this, intl me that” talk that got high marks at the last OSCON. The talk covers problems with and approaches to translating your web applications.

For PHP Québec Conference - Montréal (March 4-6) I will have two familiar and popular talks, “Andrei’s Regex Clinic” and “VIM for (PHP) Programmers”.

And finally php|tek - Chicago (May 19-22) has asked me to do the mid-conference keynote on the state and future of PHP.

I’d better kick PHP 6 into gear.


UPDATE: I will not be at the PHP UK Conference due to a visa mishap.

New memcached extension

29 January 2009 » In PHP, Work » 32 Comments

The first project that I’ve been working on since joining Digg has seen the light of day. It’s a new PHP extension for interfacing with memcache servers and it is based on the libmemcached library, which is becoming the standard client library for this task. It’s used by Python, Ruby, Perl, and now - PHP. The extension is available from PECL [1]. There is another memcache PECL extension, but this one offloads the intricacies of communicating with memcache onto libmemcached and instead concentrates on exposing a sensible API and some cool features like asynchronous requests and read-through caching callbacks.
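For those unfamiliar with the read-through idea, here is a minimal sketch of the pattern itself: on a cache miss, a callback fetches the value from the backend, and the result is stored so the next request is served from the cache. To be clear, the ReadThroughCache class, its get() method and the callback signature are all assumptions for illustration, using a plain in-memory array; they are not the extension's actual API.

```php
<?php
// Hypothetical in-memory stand-in used only to illustrate the read-through
// caching pattern. The class, method name, and callback signature are
// assumptions for this sketch; they are not the pecl/memcached API.
final class ReadThroughCache
{
    private $items = [];

    // Return the cached value, or call $onMiss to load and store it.
    public function get(string $key, callable $onMiss)
    {
        if (array_key_exists($key, $this->items)) {
            return $this->items[$key];
        }
        $value = $onMiss($key);
        $this->items[$key] = $value;
        return $value;
    }
}

$lookups = 0;

// Pretend backend lookup; counts how often it is actually invoked.
$loadUser = function (string $key) use (&$lookups) {
    $lookups++;
    return ['id' => $key, 'name' => 'example'];
};

$cache = new ReadThroughCache();
$cache->get('user:42', $loadUser); // miss: the callback runs and fills the cache
$cache->get('user:42', $loadUser); // hit: served from the cache

echo "Backend lookups: $lookups\n";
```

The calling code never has to check for a miss itself, which is the appeal of having the cache client drive the callback.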

I’m excited about this release and looking forward to putting out more stuff soon.

Now, to write the documentation...

1. http://pecl.php.net/package/memcached
