Five Non-Mainstream Databases for PHP Apps – Part 3

Matthew Setter is a professional technical writer and passionate web application developer. He’s also the founder of Malt Blue, the community for PHP web application development professionals and PHP Cloud Development Casts – learn Cloud Development through the lens of PHP. You can connect with him on Twitter, Facebook, LinkedIn or Google+ anytime.

Welcome to the third and final part of the series in which we look at five alternative databases, ones you might not have heard of, that you can use with PHP apps. In Part 1, we set the scene for the series by looking at Berkeley DB, a database veteran of the open source world. In Part 2, we continued by looking at Gladius DB, a flat-file, pure PHP database, and Firebird, another open source database. We looked at where they came from, their key features and strengths, and what kinds of application they’re best suited for – along with some code samples.

In this final part of the series, we’re going to finish up by looking at two outstanding, yet markedly different, databases: eXist DB and Hypertable. So let’s get started.

eXist DB
eXist DB is a bit of a different beast from the other databases we’ve looked at so far. If you remember Gladius DB (a pure PHP database) from Part 2, then you’ll like eXist DB.

eXist DB is a database written in Java that stores XML information without needing a middleware layer. This is made possible by having support for a series of interfaces including:

* XPath: XML Path language
* XQuery: XML Query language
* WebDAV: Web distributed authoring and versioning
* REST: Representational state transfer (URL encoding)
* SOAP: Simple Object Access Protocol
* XACML: eXtensible Access Control Markup Language
* XML-RPC: a remote procedure call protocol
* XQiB: XQuery in browser

eXist DB makes it really simple to develop applications that are written completely in XQuery combined with several other technologies including XSLT, XHTML, CSS and JavaScript. It also provides URL rewriting, an MVC framework and XProc support.

In addition to this, it provides a series of add-on modules that you can use at will to extend the functionality as needed. These include, but are not limited to:

* Date/Time
* Cache
* Compression
* Mail
* Scheduler

You can find the entire list in the extension modules documentation.

What’s eXist DB Good For
Created in 2000 by Wolfgang Meier, eXist is an amazingly flexible and capable database system. Since it’s built in Java and supports a wide variety of open standards, eXist can be easily deployed on any operating system with a reliable level of Java support. Like most of the databases we’ve discussed thus far, it also has a standalone and an embedded mode.

Whether you’re looking to embed a database in an existing product or connect to a server over a network, eXist DB has you covered. And since it can interact with so many interfaces, eXist DB can meet your development needs whether you’re working in PHP, Java, C++, C#, Python, Ruby, or more.

How To Use It
As we’re focused on using the database, we’re going to get it installed quickly so we can get started. First, grab the latest copy from the eXist website. Then run the following command, adjusted for the file you downloaded:

[sourcecode language="bash"]
java -jar eXist-[version]-build-XXXX.jar
[/sourcecode]

That will install it locally for you. After that, start up the server with the following:

[sourcecode language="bash"]
bin/ // or .bat for Windows
[/sourcecode]

After this, your server should be running. You can check it by opening:

[sourcecode language=”php”]

When you installed the database, you were prompted for a username and password. From the link above, go to Administration -> Admin and log in with those credentials, then click Examples Setup. From there, ensure eXist-db shipped files is checked and click Import Files.

After you’ve done this, you have the required data loaded to make the following code sample work. But to work with the code in PHP, you’ll need a copy of PheXist. So grab a copy and extract it into a directory accessible by your web server.

A Simple Example
Now that you have both the eXist-DB server and PheXist PHP library ready to go, create a new script file using your IDE of choice where you’ve deployed the PheXist files.

Have a read and we’ll go through it together.

[sourcecode language="php"]
<?php
include 'include/eXist.php';

try {
    // Username and password are left empty, as in the original listing;
    // supply your own credentials if your server requires them.
    $db = new eXist('', '', 'http://localhost:8080/exist/services/Query?wsdl');

    # Connect
    $db->connect() or die($db->getError());

    // Single quotes stop PHP from interpolating $line.
    $query = 'for $line in //SPEECH[SPEAKER = "BERNARDO"]/LINE return $line';

    print "<p><strong>XQuery:</strong></p>";
    print "<pre>" . htmlspecialchars($query) . "</pre>";

    # XQuery execution
    $result = $db->xquery($query) or die($db->getError());

    # Get results
    $hits = $result["HITS"];
    $queryTime = $result["QUERY_TIME"];
    $collections = $result["COLLECTIONS"];

    print "<p>found $hits hits in $queryTime ms.</p>";

    # Show results
    print "<p><strong>Result of the XQuery:</strong></p>";
    print "<pre>";
    if (!empty($result["XML"])) {
        foreach ($result["XML"] as $xml) {
            print htmlspecialchars($xml) . "<br />";
        }
    }
    print "</pre>";

    $db->disconnect() or die($db->getError());
} catch (Exception $e) {
    die($e->getMessage());
}
[/sourcecode]

What it does is connect to the locally running instance of eXist-DB and run the query //SPEECH[SPEAKER = "BERNARDO"]/LINE against it. If you’re not familiar with XQuery, then check out some good tutorials before continuing.

What we’re looking for is every SPEECH element whose SPEAKER is “BERNARDO”; the query then returns the LINE elements inside those speeches. Following this, we iterate over the results and output them to the browser. We finish by disconnecting at the end.
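If you’d like to see what that path expression selects without a running eXist server, here is a minimal sketch using PHP’s built-in DOM extension. The XML below is a small hand-written stand-in for the sample data, and it only assumes the same SPEECH, SPEAKER and LINE element names used in the query. It illustrates the XPath filter, not how eXist evaluates queries internally.

```php
<?php
// A tiny hand-written stand-in for the sample data (illustration only).
$xml = <<<XML
<ACT>
  <SPEECH>
    <SPEAKER>BERNARDO</SPEAKER>
    <LINE>Who's there?</LINE>
  </SPEECH>
  <SPEECH>
    <SPEAKER>FRANCISCO</SPEAKER>
    <LINE>Nay, answer me: stand, and unfold yourself.</LINE>
  </SPEECH>
  <SPEECH>
    <SPEAKER>BERNARDO</SPEAKER>
    <LINE>Long live the king!</LINE>
  </SPEECH>
</ACT>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Same path expression as in the XQuery: keep only the speeches whose
// SPEAKER is BERNARDO, then step down to their LINE children.
$lines = [];
foreach ($xpath->query('//SPEECH[SPEAKER = "BERNARDO"]/LINE') as $node) {
    $lines[] = $node->textContent;
}

print_r($lines);
```

The predicate in square brackets does the filtering, and the trailing /LINE decides what is returned, which is why FRANCISCO’s line never shows up.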

As you can see, it’s pretty simple to use – a lot like a traditional RDBMS.
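PheXist talks to eXist over SOAP, but the REST interface listed earlier can be reached with nothing more than a URL. The sketch below only builds such a URL. The helper name is made up for illustration, and both the /exist/rest base path and the /db/shakespeare collection are assumptions about a default local install, so adjust them to match your server. The _query and _howmany parameters are the REST interface’s names for the query text and the maximum number of results.

```php
<?php
// Hypothetical helper (not part of PheXist): builds a URL for eXist's
// REST interface that runs an XPath/XQuery expression on a collection.
function existRestQueryUrl(string $baseUrl, string $collection, string $query, int $howMany = 10): string
{
    $params = http_build_query(
        [
            '_query'   => $query,   // expression to execute
            '_howmany' => $howMany, // maximum number of items to return
        ],
        '',
        '&'
    );

    return rtrim($baseUrl, '/') . '/' . ltrim($collection, '/') . '?' . $params;
}

// Assumed base URL and collection path for a default local install.
$url = existRestQueryUrl(
    'http://localhost:8080/exist/rest',
    '/db/shakespeare',
    '//SPEECH[SPEAKER = "BERNARDO"]/LINE',
    5
);

echo $url, "\n";

// With a server running, the result could then be fetched with, for example:
// $xml = file_get_contents($url);
```

Letting http_build_query do the encoding avoids hand-escaping the brackets, quotes and spaces in the expression.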

Hypertable
Now we move from an XML database to one designed to scale up to the biggest of jobs: Hypertable. As you’ll see on the Hypertable website, it’s been built for one purpose:

“… for the express purpose of solving the scalability problem, a problem that is not handled well by a traditional RDBMS … Hypertable is based on a design developed by Google to meet their scalability requirements and solves the scale problem better than any of the other NoSQL solutions out there.”

It does so by delivering four key benefits:

* Scalability: Designed to scale and handle larger datasets than most other solutions envisage
* Cost Savings: Designed to require less hardware and power consumption, resulting in better output for less input
* Performance: Because less hardware is required, more can be done with less, resulting in higher throughput and responsiveness
* Clean Semantics: When data is written to Hypertable, it’s there for each request thereafter

But what are the features that I hinted at earlier? Here are but a handful of them for your viewing pleasure:

* Runs on top of distributed filesystems such as Hadoop DFS, GlusterFS and Kosmos FS
* Written nearly 100% in C++
* Comprehensive language support for Java, PHP, Python, Perl and C++
* Designed to handle high scalability scenarios
* Data physically sorted by a primary key
* Designed for high efficiency and performance
* Durable data
* Great for ‘Read Mostly’ situations
* Stores structured or unstructured data

What’s Hypertable Good For?
The key design consideration of Hypertable is to handle Big Data. If you have a lot of data that you need to process, then Hypertable is a good choice for you. Given that, and the variety and flexibility of the installation options it provides, it’s more than likely there’s an option for you.

If you want full, professional support, you can go with Hypertable Inc. If you’re looking to test it out, as we’re doing in this post, you can install it standalone along with the ThriftBroker layer on a local Linux or *BSD server. In addition, there are a variety of in-between options, such as running it in conjunction with Hadoop or MapR.

How To Use It
Once you’ve installed a standalone copy, we need to do a bit of housekeeping before we can get up and running. Specifically, we need to create a Hypertable namespace and table that we can interact with in the code sample.

I’m going to provide a small example, based on the one in the Hypertable documentation. Follow the instructions below and you’ll be ready to go.

[sourcecode language="plain"]
/opt/hypertable/current/bin/ht shell

use "/";
create namespace "Tutorial";
use Tutorial;
CREATE TABLE QueryLogByUserID ( Query, ItemRank, ClickURL );
LOAD DATA INFILE ROW_KEY_COLUMN="%09UserID"+QueryTime TIMESTAMP_COLUMN=QueryTime "query-log.tsv.gz" INTO TABLE QueryLogByUserID;
[/sourcecode]

From the last command, you should get output similar to:

[sourcecode language="plain"]
Loading 7,464,729 bytes of input data...

0% 10 20 30 40 50 60 70 80 90 100%
Load complete.

Elapsed time: 9.84 s
Avg value size: 15.42 bytes
Avg key size: 29.00 bytes
Throughput: 4478149.39 bytes/s (764375.74 bytes/s)
Total cells: 992525
Throughput: 100822.73 cells/s
Resends: 0
[/sourcecode]

Ok, now that that’s done and ready, open your editor of choice. In there, create a new, blank file and copy the code below – which I’ll explain afterwards.

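[sourcecode language="php"]
<?php
// Assumption: the Thrift PHP root is exported in the PHPTHRIFT_ROOT
// environment variable; adjust this to match your installation.
if (!isset($GLOBALS['THRIFT_ROOT'])) {
    $GLOBALS['THRIFT_ROOT'] = getenv('PHPTHRIFT_ROOT');
}

require_once $GLOBALS['THRIFT_ROOT'].'/ThriftClient.php';

$client = new Hypertable_ThriftClient("localhost", 38080);
$namespace = $client->namespace_open("Tutorial");
$tablename = "QueryLogByUserID";

// List the tables, then fetch four records with HQL
$tableQuery = $client->hql_query($namespace, "show tables");
print_r($tableQuery);
$fourRecords = $client->hql_query(
    $namespace,
    "select * from QueryLogByUserID limit=4"
);
print_r($fourRecords);

// Write one new cell through a mutator
$mutator = $client->mutator_open($namespace, $tablename, 0, 0);
$key = new Hypertable_ThriftGen_Key(array(
    'row' => '005753377 2008-11-14 05:50:29',
    'column_family' => 'Query'));
$cell = new Hypertable_ThriftGen_Cell(array(
    'key' => $key,
    'value' => ''));
$client->mutator_set_cell($mutator, $cell);
// Close the mutator so the buffered cell is written out
$client->mutator_close($mutator);

echo "scanner examples\n";
$scanner = $client->scanner_open($namespace, $tablename,
    new Hypertable_ThriftGen_ScanSpec(array('limit' => 4)));

$cells = $client->scanner_get_cells($scanner);

// Keep fetching until the scanner returns an empty batch
while (!empty($cells)) {
    print_r($cells);
    $cells = $client->scanner_get_cells($scanner);
}
[/sourcecode]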

What we’re doing in the example is connecting to Hypertable via the Thrift client on port 38080 and then opening the existing namespace, called Tutorial, that we created previously.

We’re then performing a simple query operation to display the tables, as we could do in a traditional RDBMS such as MySQL or PostgreSQL. Following this, we retrieve four records from the QueryLogByUserID table we created.

Then, we create a new record to be added to the QueryLogByUserID table, which looks like the one below:

[sourcecode language="plain"]
Key Value
2008-11-14 05:50:29
[/sourcecode]

It records a ‘Query’ request made on 2008-11-14 05:50:29. We then persist the record and, following this, retrieve more records and dump them to the screen.
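The row key in the example, '005753377 2008-11-14 05:50:29', follows the pattern set by ROW_KEY_COLUMN="%09UserID"+QueryTime in the load step: a user ID zero-padded to nine digits, followed by the query time. Here is a minimal sketch of building keys in that shape. The helper name is made up, the single space between the two parts is an assumption based on the key shown in the code, and every sample value other than that key is invented for illustration.

```php
<?php
// Hypothetical helper: builds a row key shaped like the one in the example.
function makeRowKey(int $userId, string $queryTime): string
{
    // %09d zero-pads the user ID to nine digits, mirroring "%09UserID".
    return sprintf('%09d %s', $userId, $queryTime);
}

$keys = [
    makeRowKey(5753377, '2008-11-14 05:50:29'), // the key used in the example
    makeRowKey(36, '2008-11-13 10:30:46'),      // invented sample value
    makeRowKey(5753377, '2008-11-13 21:02:10'), // invented sample value
];

// A plain string sort stands in for ordering by row key.
sort($keys, SORT_STRING);

print_r($keys);
```

Since Hypertable keeps data physically sorted by key, the padding matters: without it, user 36 would sort after user 5753377 as a string. With this key shape, all of one user’s queries sit next to each other in time order.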

Now to be fair, that’s a pretty simple example. But hopefully, you get the idea that interacting with Hypertable, using PHP and ThriftBroker, isn’t much different from doing so with a traditional RDBMS. Yes, it’s got a slightly different syntax, but you can get the hang of it quite fast.

Winding Up
In this series, we’ve covered five good databases that are free and ready to be used with your PHP (or other language) applications, whether they’re for embedded work or for applications that need all the power and scalability you can muster.

With Gladius DB and eXist-DB, we’ve looked at flat-file and XML databases. We’ve also looked at a veteran key/value store in Berkeley DB, a fully scalable database in Hypertable and a wonderful RDBMS in Firebird. I hope that one or more of these appeals to you and you’ll give them a test run.

Regardless of what anyone says, there are always options and choices. Make the decision that’s best for you and your projects.

If you have any feedback on any of the databases in this series, let me know in the comments.
