PHP parse errors with cgi and nginx

Monday, March 15. 2010

So for whatever reason, it took me a while to figure this out earlier today:

2010/03/15 15:44:16 [info] 22274#0: *148224 client closed prematurely connection, so upstream connection is closed too while sending request to upstream, client: a.a.a.a, server: localhost, request: "GET / HTTP/1.1", upstream: "fastcgi://unix:/tmp/.fastcgi.till/socket:", host: "localhost"
2010/03/15 15:44:16 [info] 22274#0: *148207 client closed prematurely connection, so upstream connection is closed too while sending request to upstream, client: a.a.a.a, server: localhost, request: "GET / HTTP/1.1", upstream: "fastcgi://unix:/tmp/.fastcgi.till/socket:", host: "localhost"

The issue was a PHP parse error which I overlooked when I added a new file. The weird thing is that nothing showed up in the PHP logs (error_reporting is E_ALL and display_errors is off, but all logs are enabled and I tailed them using multitail), and nginx only displayed a blank page. The errors above were in nginx's own log file.
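
One way to catch this kind of thing before nginx serves a blank page is to lint new files on the command line. Below is a minimal sketch, not what I actually ran that day: it walks a directory and runs php -l (syntax check only) on every .php file. The function name lint_php_files and the PHP_BIN override are made up for this example.

```shell
#!/bin/sh
# Lint every .php file below a directory with "php -l" (syntax check only).
# PHP_BIN is a hypothetical override in case the binary is not called "php".
lint_php_files() {
    dir="${1:-.}"
    php_bin="${PHP_BIN:-php}"
    failed=0
    list=$(mktemp) || return 2
    find "$dir" -type f -name '*.php' > "$list"
    # Read the list from a file so the loop runs in the current shell
    # and $failed is still set after the loop.
    while IFS= read -r file; do
        # Force display_errors on for this run so the message is printed.
        if ! out=$("$php_bin" -d display_errors=1 -l "$file" 2>&1 < /dev/null); then
            echo "$out" >&2
            failed=1
        fi
    done < "$list"
    rm -f "$list"
    return "$failed"
}
```

Calling lint_php_files /path/to/app returns non-zero as soon as one file does not parse, so it can sit in front of a deploy step.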


DB_CouchDB_Replicator

Wednesday, March 3. 2010

Update, 2010-03-04: I just rolled a 0.0.2 release. In case you had 0.0.1 installed, just use pear upgrade-all to get it automatically. This release tries to fix a random hang while reading documents from the source server.

I also opened a repository on GitHub.

---

As some may have guessed from a previous blog post, we are currently running a test setup with CouchDB Lounge. My current objective is to migrate our 200 million documents to it, and this is where I am essentially stuck this week.

No replication, no bulk docs

The lounge currently does not support replication (to it) or saving documents via bulk requests, so in essence migrating a lot of data into it is slow and tedious.

I have yet to figure out if there is a faster way (maybe parallelization?), but DB_CouchDB_Replicator is the result of my current efforts.

I think I gave up on parallelization for now because it looked like hammering the lounge with a single worker was already enough, but generally I didn't have time to experiment much with it. It could have been my network connection too. Feedback in this area is very much appreciated.

DB_CouchDB_Replicator

DB_CouchDB_Replicator is a small PHP script which takes two arguments, --source and --target. Both accept values in the style of http://username:password@localhost:port/db, and the script attempts to move all documents from source to target.

Since long-running operations on the Internet are bound to fail, I also added a --resume switch, and while it's running it outputs a progress bar, so it should be fairly easy to resume. You also get an idea of where it's currently at and how much more time it will eat up.

These switches may change, and I may add more, so keep an eye on --help. Also keep in mind that this is very alpha and I give no guarantees.
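
To give an idea of how the switches fit together, here is a rough sketch of a retry wrapper. It assumes two things the post does not confirm: that the replicator exits non-zero when it fails, and that --resume takes no value. The function name, the REPLICATOR and RETRY_DELAY variables, and the hosts in the example are made up.

```shell
#!/bin/sh
# Run the replicator and re-run it with --resume when it fails.
# Assumptions: non-zero exit status on failure, --resume takes no value.
replicate_with_retry() {
    src="$1"
    dst="$2"
    max="${3:-5}"
    replicator="${REPLICATOR:-couchdb-replicator}"
    attempt=1
    resume=""
    while [ "$attempt" -le "$max" ]; do
        # $resume is left unquoted on purpose: it is empty on the first run.
        if "$replicator" --source "$src" --target "$dst" $resume; then
            return 0
        fi
        echo "attempt $attempt failed, retrying with --resume" >&2
        resume="--resume"
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-10}"
    done
    return 1
}

# Example call with hypothetical hosts (not executed here):
# replicate_with_retry "http://user:secret@couch.example.org:5984/mydb" \
#                      "http://user:secret@lounge.example.org:5984/mydb"
```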

Installation

Installation is simple! :-)

apt-get install php-pear
pear config-set preferred_state alpha
pear channel-discover till.pearfarm.org
pear install till.pearfarm.org/DB_CouchDB_Replicator

Once installed, the replicator resides in /usr/local/bin or /usr/bin and is called couchdb-replicator.

Fin

The code is not yet on GitHub, but will eventually end up there. All feedback is welcome!


A toolchain for CouchDB Lounge

Friday, February 26. 2010

One of our biggest issues with CouchDB is currently the lack of compaction of our database, and by lack of, I don't mean that CouchDB doesn't support it, I mean that we are unable to actually run it.

Compaction in a nutshell

Compaction in a nutshell is pretty cool.

As you know, CouchDB is not very space-efficient. For one, CouchDB saves revisions of all documents, which means that whenever you update a document, a new revision is saved. You can roll back any time, or expose it as a nifty feature in your application. Regardless, those revisions are kept around until your database is compacted.

Think about it in terms of IMAP: emails are not deleted until you hit that magic "compact" button, and 99% of all people who use IMAP don't know what it's for anyway.

Another thing is that whenever new documents are written to CouchDB and bulk mode is not used, it'll save them in a way which is not very efficient either, both in terms of actual storage and indexing (so rumour has it).

Compaction woes

Since everything is simple with CouchDB, compaction is a simple process in CouchDB too. Yay!

When compaction is started, CouchDB will create a new database file where it stores the data in a very optimized way (I won't go into detail on this; go read a science book or google it if you are really interested!). When the compaction process has finished, CouchDB will exchange your old database file with the new database file.
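
For reference, here is a small sketch of how compaction can be kicked off and watched over HTTP with curl. It uses CouchDB's POST /db/_compact endpoint and the compact_running field that GET /db reports. The function names, the CURL and POLL_DELAY variables and the database URL are placeholders I made up, and the JSON is matched with a plain grep rather than a real parser.

```shell
#!/bin/sh
# Trigger compaction for one database, e.g. http://localhost:5984/mydb
start_compaction() {
    db_url="$1"
    "${CURL:-curl}" -s -X POST -H 'Content-Type: application/json' \
        "$db_url/_compact"
}

# Succeeds while the database info document reports a running compaction.
compaction_running() {
    db_url="$1"
    "${CURL:-curl}" -s "$db_url" | grep -q '"compact_running" *: *true'
}

# Poll until compaction is reported as finished.
wait_for_compaction() {
    db_url="$1"
    while compaction_running "$db_url"; do
        sleep "${POLL_DELAY:-30}"
    done
}
```

Something like start_compaction http://localhost:5984/mydb && wait_for_compaction http://localhost:5984/mydb would then block until the new file has been swapped in.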

The woes start with disk space: when you have, e.g., 700 GB of uncompacted data, you will probably need another 400 GB for compaction to finish because it will create a second database file.
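
A quick sanity check before starting could look like the sketch below. It deliberately assumes the worst case, namely that the compacted copy may grow as large as the current file and that it is written next to it on the same filesystem. The function name and the example path are hypothetical.

```shell
#!/bin/sh
# Succeeds if the filesystem holding the database file has at least as much
# free space as the file currently occupies (conservative assumption).
has_room_for_compaction() {
    db_file="$1"
    [ -f "$db_file" ] || return 2
    # Size of the database file in KB.
    used_kb=$(du -k "$db_file" | awk '{print $1}')
    # Available KB on the filesystem; -P keeps each entry on a single line.
    free_kb=$(df -Pk "$(dirname "$db_file")" | awk 'NR == 2 {print $4}')
    [ "$free_kb" -ge "$used_kb" ]
}

# Example with a hypothetical path (not executed here):
# has_room_for_compaction /var/lib/couchdb/mydb.couch || echo "not enough space"
```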

The second issue is that when you have constant writes on your database, the compaction process will actually never finish. It kind of sucks, and for those people who aim to provide close to 100% availability, this is extremely painful to learn.



The Fine Print

Friday, February 5. 2010

On the subject of "What does flat rate actually mean", here is my provider's idea of it.

At first I wanted to blog this under "rip-off", but it's actually so funny that I decided against it. Here's the footnote, i.e. the fine print, of my mobile provider (mobilcom-debitel) on the data flat rate "T@ke-away Flat". I highlighted the interesting passages.

Applies when signing up for the mobilcom-debitel data option T@ke-away Flat - Try&Buy, monthly base price €9.95 (first month's base price waived), 24-month minimum term. The contract can be cancelled within 30 days of signing (the cancellation takes effect on the 30th day after signing). If it is not cancelled within 30 days, the minimum term of the option is 24 months. The included volume applies to domestic data traffic via the WEB and WAP APNs. After reaching 300 MB (Vodafone), 250 MB (Eplus), 200 MB (o2) of data volume per month, data transfer is throttled to GPRS speed. WLAN, VPN, VoIP and instant messaging are excluded and are billed according to the underlying tariff. In the T-Mobile network, access to business software, file sharing / FTP, iTunes, multiplayer online games, Internet radio and Internet TV, and client-based e-mail use are additionally excluded and are billed at €0.09 per minute. Cannot be used with BlackBerry, iPhone, T-Mobile G1/G2. The option only supports browsing with a suitable mobile phone without a computer attached. The term of the option is automatically extended by 12 months unless it is cancelled 3 months before expiry. Changing the data option is only possible at the end of the minimum term.

… and that really puts you in the mood to sign up for this add-on option. And I've been a customer of this outfit since December 10, 1999.


Quo vadis PEAR?

Wednesday, January 27. 2010

To PEAR or not to PEAR — PEAR2 is taking a while and I sometimes think that everyone associated with PEAR is busy elsewhere. Since a little competition never hurt, I'm especially excited about these recent developments.

With the release of Pirum, I'm really excited to see two public PEAR channels that aim to make PEAR a standard to deploy and manage your applications and libraries. One is PEARhub and the other is PEAR Farm. I think I'm gonna stick with PEAR Farm for a while, so this blog entry focuses on things I noticed when I first played with it.

PEAR vs PEAR Farm

A lot of people mistake these new channels for the wrong thing. They think that this will eventually replace PEAR. I don't think it will — ever.

While I really welcome the idea that people push PEAR channels as a standard to distribute apps and libraries (PHPUnit, ezComponents, Zend Framework, Symfony, etc.), it's also very obvious that these open channels will never be the same as PEAR itself.

For an idea of what I mean — take a look at open code repositories around the web and especially when it comes to PHP, it's very obvious that while there's a lot of code, most of it is utter crap. (There I said it!)

And no one wants to rely on it when reliability is an objective.

The PEAR Coding Standards were not invented because it's so great to tell people how to write code. But a lot of people need this guidance. While they have a lot of ideas about how to implement a feature or a cool algorithm, their passion does not extend to test coverage or even a little documentation. And that's where these sometimes frowned-upon coding standards come in handy, because they ensure that the code in PEAR is maintainable. That is really just the tip of the iceberg for professional software engineers, and I'll save the rest for another blog post.

Let's get to it

Even though there are only ever a few steps, PEAR setups tend to not work for, or look complicated to, a lot of people. Here are a few tips on how to get started. We'll assume pear itself is installed (apt-get install php-pear).

PEAR Farm

Instead of what is written in the FAQ, here are my suggested steps to get started. ;-)

  • pear channel-discover pearfarm.pearfarm.org
  • pear install pearfarm.pearfarm.org/pearfarm-beta

Spec files and package.xml

The package.xml is a configuration file for the PEAR package. It's pretty long, and since XML is so verbose, it's a turn-off to many. PEAR Farm suggests creating a spec instead, so it can create a package.xml for you. This is pretty convenient, but there are also a few gotchas.

  • pearfarm init (inside the code repo)

If you're doing it for the first time, it will issue a warning about a configuration file being created — ignore it and move on.

Continue by editing the .spec file with description, summary, maintainer and whatever else is a suggested edit inside it.

Then, feel free to create a package.xml:

  • pearfarm build

This will leave you with a package.xml file in the same directory. Mission accomplished!

Gotchas & Tricks

These are a few things you should edit before you pearfarm push:

  • If you use git and happen to have a .gitignore, pearfarm will add .gitignore to the package.xml as well.

  • pearfarm adds a baseinstalldir attribute to the topmost <dir /> element. Assuming your package is Foo_Bar and you have Foo/Bar.php in your repository, it would install into /usr/share/php/Foo_Bar/Foo/Bar.php instead of Foo/Bar.php. I'd suggest you remove it and instead put the namespace into the package name right away: YourName_FooBar.

  • Does it work? Always pear package-validate before you push!

  • Does it install correctly? Feel free to pear install Package-x.y.z.tgz and check if the files installed ok (pear list-files pearfarm.pearfarm.org/Package)

  • A package.xml can have files with different roles. The standard role is php, but there's also doc (for documentation) and test (for tests). When the package.xml is generated from the .spec, all files get php by default, so I'd go through the list in <contents /> to double-check that all files have appropriate roles set.

  • Another thing to take care of (imho) would be <dependencies />. This could probably be automated, but until then, read up on it in the manual.

  • Need to rebuild the package from your updated package.xml? Use: pear package!

  • Changelog? Why yes, you can! The package.xml supports it and it would be nice if PEAR Farm eventually displays it. Here's how (after <phprelease />):

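    <changelog>
        <release>
          <version>0.0.2</version>
          <stability>alpha</stability>
          <date>2010-01-29</date>
          <notes>bugfix in makeRequest() (curl), corrected endpoint</notes>
        </release>
        <release>
          <version>0.0.1</version>
          <stability>alpha</stability>
          <date>2010-01-27</date>
          <notes>initial release</notes>
        </release>
    </changelog>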

Most of these things could be improved in pearfarm, and I'm pretty sure they will be. Eventually! :)
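
Until then, some of the checks from the list above can be scripted. The following is only a sketch: check_package_xml is a name I made up, the grep patterns assume that each <file /> element sits on one line with the full path in its name attribute, and the last step simply hands over to pear package-validate.

```shell
#!/bin/sh
# Run a few sanity checks on a package.xml before "pearfarm push".
check_package_xml() {
    xml="${1:-package.xml}"
    status=0
    [ -f "$xml" ] || { echo "missing $xml" >&2; return 2; }

    # .gitignore should not ship with the package.
    if grep -q '\.gitignore' "$xml"; then
        echo "$xml still lists .gitignore" >&2
        status=1
    fi

    # baseinstalldir on a <dir> element changes where files are installed.
    if grep -q '<dir[^>]*baseinstalldir=' "$xml"; then
        echo "$xml sets baseinstalldir on a <dir> element" >&2
        status=1
    fi

    # Heuristic: files below tests/ or docs/ that still have the php role.
    if grep -E 'name="[^"]*(tests?|docs?)/[^"]*"' "$xml" | grep -q 'role="php"'; then
        echo "$xml has test or doc files with role=\"php\"" >&2
        status=1
    fi

    # Let pear have the final word.
    pear package-validate "$xml" || status=1
    return "$status"
}
```

If check_package_xml returns zero, pear package and a test install of the resulting .tgz are the next steps before pushing.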

The End

And that's all for this time. If you want to see my packages on the PEAR Farm, follow this link.
