Monday, April 12, 2010

CouchDB 101 - 2: Install and Get Started


It is time to get down to work with CouchDB. As I work through this tutorial I am making a number of assumptions, chiefly about the platform being used.

Platform Specifications


  • OS: Ubuntu 9.10 (Linux 2.6.31-20-generic #58-Ubuntu SMP Fri Mar 12 05:23:09 UTC 2010 i686 GNU/Linux)
  • Apache: Apache 2 (Server version: Apache/2.2.12 (Ubuntu) | Server built:   Mar  9 2010 21:20:44)
  • Shell: bash (GNU bash, version 4.0.33(1)-release (i486-pc-linux-gnu))

Installing on Ubuntu 9.10

Update the package sources
$ sudo apt-get update
Install couchdb
$ sudo apt-get install couchdb
Install curl, a command-line tool for transferring data with URL syntax. It supports FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, LDAP, LDAPS, FILE, IMAP, SMTP, POP3 and RTSP.
$ sudo apt-get install curl
Test whether CouchDB is installed and running correctly. By default, CouchDB listens on port 5984.
$ curl http://127.0.0.1:5984/
If all is well and CouchDB is running correctly, you should get something like
{"couchdb":"Welcome","version":"0.10.0"}
If you get some message like this
curl: (7) couldn't connect to host
then CouchDB is not running. Start it manually and test again
$ sudo /etc/init.d/couchdb start
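
The check above can be wrapped in a small shell function for use in scripts. This is only a minimal sketch: the function name couch_status and the COUCH_URL variable are my own inventions, not part of CouchDB, and the default address is simply the one used in this tutorial.

```shell
# Hypothetical helper: report whether a CouchDB server answers at COUCH_URL.
COUCH_URL="${COUCH_URL:-http://127.0.0.1:5984}"

couch_status() {
  # -s silences progress output, -f makes curl fail on HTTP errors,
  # --max-time keeps the check from hanging on an unreachable host.
  if curl -s -f --max-time 5 "$COUCH_URL/" >/dev/null 2>&1; then
    echo "running"
  else
    echo "not running"
  fi
}
```

Calling couch_status prints "running" when the server answers and "not running" otherwise, so a script can decide whether to start the service.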

Getting Started

Now that we know CouchDB is installed and running correctly, let's do some basic tasks. We will use the curl tool to make it all happen.

Creating a new database

Creating a database is as easy as sending a PUT request:
$ curl -X PUT http://127.0.0.1:5984/your-database-name
where your-database-name is the name of your database. For example, here I will create a database named ostools to store the open source tools I use daily.
$ curl -X PUT http://127.0.0.1:5984/ostools
On successful execution you should get the message
{"ok":true}

To see the database you created, list all the databases on your server by running
$ curl -X GET http://127.0.0.1:5984/_all_dbs
you should get
["ostools"]
Try to create another database.
$ curl -X PUT http://127.0.0.1:5984/linuxdistro
and then list databases
$ curl -X GET http://127.0.0.1:5984/_all_dbs
you should get
["linuxdistro","ostools"]
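
Every command so far follows the same pattern: an HTTP method plus a path under the server URL. As a rough sketch, that pattern can be captured in a wrapper function. The names couch, COUCH_URL and DRY_RUN are my own and not part of CouchDB or curl; setting DRY_RUN=1 prints the curl command instead of running it, which is handy for checking what would be sent.

```shell
# Hypothetical wrapper around curl for the commands used in this tutorial.
COUCH_URL="${COUCH_URL:-http://127.0.0.1:5984}"

# Usage: couch METHOD PATH [JSON_BODY]
couch() {
  method="$1"
  path="$2"
  body="${3:-}"
  if [ -n "$body" ]; then
    set -- curl -s -X "$method" "$COUCH_URL/$path" -d "$body"
  else
    set -- curl -s -X "$method" "$COUCH_URL/$path"
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of contacting the server.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

With this in place, couch PUT ostools and couch GET _all_dbs stand in for the longer curl lines above.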

Creating, updating, and deleting database documents

Creating. CouchDB databases are schema-free, meaning that their structure is not strictly defined, so you can change documents on the fly as your needs require. If one tool has a graphical front-end, you record that in the tool's document. If another tool doesn't have one, you simply leave the field out. If a tool has several dependencies, you can set the dependency field to an array of dependency objects; there is no need to define separate tables.

Let's create a document:
$ curl -X PUT http://127.0.0.1:5984/ostools/fish -d '{}'
you should get a response similar to the following:
{"ok":true,"id":"fish","rev":"1-967a00dff5e02add41819138abb3284d"}
You’ve just created a document with the document ID of fish. The CouchDB server has automatically generated a revision number and included this in its response.

Now that your document is in the database, let’s issue a command to retrieve it from CouchDB:
$ curl -X GET http://127.0.0.1:5984/ostools/fish
You should receive a response like the following:
{"_id":"fish","_rev":"1-967a00dff5e02add41819138abb3284d"}
At this point you’re probably thinking that this tool information isn’t very useful. All it has is a unique ID and a revision number; it has no tool-related data whatsoever. So, let’s just delete this document altogether. Deleting a document in CouchDB works like deleting a database (an HTTP DELETE request to its URL), except that you must specify the latest revision number of the document you want to delete.
$ curl -X DELETE 'http://127.0.0.1:5984/ostools/fish?rev=1-967a00dff5e02add41819138abb3284d'
All going well, you should receive a response similar to the following:
{"ok":true,"id":"fish","rev":"2-eec205a9d413992850a6e32678485900"}
Passing the wrong revision number results in a "Document update conflict." error.
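
Because a delete needs the latest revision, a script usually has to read _rev from the document first. Here is a minimal sketch that pulls it out of the compact JSON CouchDB returns, using sed. The function name extract_rev is my own, and a real JSON parser would be more robust for anything beyond simple responses like the ones shown here.

```shell
# Hypothetical helper: read a document's JSON on stdin and print its _rev value.
# Assumes the compact single-line form shown above ("_rev":"..." with no spaces).
extract_rev() {
  sed -n 's/.*"_rev":"\([^"]*\)".*/\1/p'
}

# Example (needs a running server):
#   rev=$(curl -s http://127.0.0.1:5984/ostools/fish | extract_rev)
#   curl -X DELETE "http://127.0.0.1:5984/ostools/fish?rev=$rev"
```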

Now you will create a new tool document with some data in it. A document in CouchDB is simply a JSON object, and you include this JSON in your curl request using the -d flag to send it along with your HTTP request.
{
 "name":"bash",
 "type":"shell",
 "license":"GPL",
 "url":"http://www.gnu.org/software/bash/"
}

Let's create the document:
$ curl -X PUT http://127.0.0.1:5984/ostools/bash -d '{"name":"bash","type":"shell","license":"GPL","url":"http://www.gnu.org/software/bash/"}'
You should get
{"ok":true,"id":"bash","rev":"1-41406722a8966f543acbf49e06c66968"}
To get back this document from the database, use the following command:
$ curl -X GET http://127.0.0.1:5984/ostools/bash
And, as expected, your document comes back:
{"_id":"bash","_rev":"1-41406722a8966f543acbf49e06c66968","name":"bash","type":"shell","license":"GPL","url":"http://www.gnu.org/software/bash/"}
Another tool can be created using an existing tool as a template. Issue the following command:
$ curl -X COPY http://127.0.0.1:5984/ostools/bash -H "Destination: kigm"
The response will look like:
{"id":"kigm","rev":"1-41406722a8966f543acbf49e06c66968"}
Check the kigm document and you will see that it holds the same information as bash:
$ curl -X GET http://127.0.0.1:5984/ostools/kigm
{"_id":"kigm","_rev":"1-41406722a8966f543acbf49e06c66968","name":"bash","type":"shell","license":"GPL","url":"http://www.gnu.org/software/bash/"}
So let's update the document to reflect the new information for kigm:
{
 "name":"kigm",
 "type":"webapp",
 "license":"GPL",
 "url":"http://wiki.github.com/eferuzi/kiGM/",
 "requires":["apache","php","mysql"],
 "developer":"Emanuel Feruzi"
}
When doing an update, you must include the _rev field in your JSON document, set to the revision identifier that your changes are based on. This prevents users who edit the same document at the same time from silently overwriting each other's changes.
$ curl -X PUT http://127.0.0.1:5984/ostools/kigm -d '{"_rev":"1-41406722a8966f543acbf49e06c66968","name":"kigm", "type":"webapp", "license":"GPL", "url":"http://wiki.github.com/eferuzi/kiGM/", "requires":["apache","php","mysql"],"developer":"Emanuel Feruzi"}'
Response:
{"ok":true,"id":"kigm","rev":"2-f56ac37c3805fc6058c840462ae85756"}
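
The update step combines two things: the revision you last read and the new document body. As a rough sketch of doing that in a script, the helper below prepends a _rev field to a JSON object given as text. The name with_rev is hypothetical, and the helper assumes the body is a non-empty JSON object whose first character is "{".

```shell
# Hypothetical helper: prepend a "_rev" field to a JSON object given as text.
# Usage: with_rev REV JSON_OBJECT
# Assumes JSON_OBJECT is a non-empty object whose first character is "{".
with_rev() {
  rev="$1"
  json="$2"
  rest="${json#?}"   # drop the leading "{"
  printf '{"_rev":"%s",%s\n' "$rev" "$rest"
}

# Example (needs a running server; rev and new_doc are set by your script):
#   curl -X PUT http://127.0.0.1:5984/ostools/kigm -d "$(with_rev "$rev" "$new_doc")"
```
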
Let’s verify the update with a GET request:
$ curl -X GET http://127.0.0.1:5984/ostools/kigm
{"_id":"kigm","_rev":"2-f56ac37c3805fc6058c840462ae85756","name":"kigm","type":"webapp","license":"GPL","url":"http://wiki.github.com/eferuzi/kiGM/","requires":["apache","php","mysql"],"developer":"Emanuel Feruzi"}

Now you can experiment more and share your findings with us.

Next: CouchDB 101 - 3 Creating Views

Friday, April 9, 2010

CouchDB 101 - 1: What is CouchDB?

CouchDB Discovery

It is my tradition to visit www.linuxtoday.com daily to keep up with what is happening. As I was scrolling, NoSQL popped up and I was curious as to what it was. There was also another article that really caught my attention, CouchDB basics for PHP developers. So the journey began.

What is CouchDB?

Apache CouchDB is a document-oriented database that can be queried and indexed in a MapReduce fashion using JavaScript. CouchDB also offers incremental replication with bi-directional conflict detection and resolution.

CouchDB provides a RESTful JSON API that can be accessed from any environment that allows HTTP requests. There are myriad third-party client libraries that make this even easier from your programming language of choice. CouchDB’s built-in Web administration console speaks directly to the database using HTTP requests issued from your browser.

Key Characteristics

Documents

A CouchDB document is an object that consists of named fields. Field values may be strings, numbers, dates, or even ordered lists and associative maps. A CouchDB database is a flat collection of these documents. Each document is identified by a unique ID.

Example
{
  "firstName": "Emanuel",
  "lastName": "Feruzi",
  "email": [
       "emanuel.feruzi@trilabs.co.tz",
       "feruzi@gmail.com"
  ],
   "web": "http://www.joelennon.ie"
} 

In the example above, firstName, lastName and web are string values, whereas email contains a list of email addresses.

Schema-Free

Unlike SQL databases, which are designed to store and report on highly structured, interrelated data, CouchDB is designed to store and report on large amounts of semi-structured, document-oriented data. CouchDB greatly simplifies the development of document-oriented applications, which make up the bulk of collaborative web applications.

In an SQL database, as needs evolve the schema and storage of the existing data must be updated. This often causes problems as new needs arise that simply weren’t anticipated in the initial database designs, and makes distributed “upgrades” a problem for every host that needs to go through a schema update.

With CouchDB, no schema is enforced, so new document types with new meaning can be safely added alongside the old. The view engine, using JavaScript, is designed to easily handle new document types and disparate but similar documents.

Views

To address the problem of adding structure back to semi-structured data, CouchDB integrates a view model that uses JavaScript for description. Views are the method of aggregating and reporting on the documents in a database, and are built on demand to aggregate, join and report on database documents. Views are built dynamically and don’t affect the underlying documents, so you can have as many different view representations of the same data as you like.
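
As a sketch of what this looks like in practice: a view's map function is a JavaScript function stored as a string inside a design document. The shell function below only prints such a design document. The design document name tools, the view name by_type, the database name ostools, and the type and name fields are all assumptions chosen for illustration.

```shell
# Hypothetical example: print a design document holding one view.
# The map function emits one row per document that has a "type" field.
view_design_doc() {
  cat <<'EOF'
{
  "_id": "_design/tools",
  "views": {
    "by_type": {
      "map": "function(doc) { if (doc.type) { emit(doc.type, doc.name); } }"
    }
  }
}
EOF
}

# Example (needs a running server and an existing database named ostools):
#   curl -X PUT http://127.0.0.1:5984/ostools/_design/tools -d "$(view_design_doc)"
#   curl http://127.0.0.1:5984/ostools/_design/tools/_view/by_type
```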

Distributed

CouchDB is a peer-based distributed database system. Any number of CouchDB hosts (servers and offline clients) can have independent “replica copies” of the same database, where applications have full database interactivity (query, add, edit, delete). When back online or on a schedule, database changes are replicated bi-directionally.

CouchDB has built-in conflict detection and management, and the replication process is incremental and fast, copying only documents and individual fields changed since the previous replication. Most applications require no special planning to take advantage of distributed updates and replication.

Unlike cumbersome attempts to bolt distributed features on top of the same legacy models and databases, CouchDB is the result of careful ground-up design, engineering and integration. The document, view, security and replication models, the special-purpose query language, and the efficient and robust disk layout are all carefully integrated for a reliable and efficient system.
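
To give a feel for replication: it is triggered by POSTing a JSON body that names a source and a target to the server's _replicate resource. The helper below is only a sketch that builds that body. The function name replication_body is my own, and the database name and the host example.org in the usage comment are placeholders.

```shell
# Hypothetical helper: build the JSON body for a replication request.
# Usage: replication_body SOURCE TARGET
replication_body() {
  printf '{"source":"%s","target":"%s"}\n' "$1" "$2"
}

# Example (needs running servers; example.org is a placeholder host):
#   curl -X POST http://127.0.0.1:5984/_replicate \
#        -H 'Content-Type: application/json' \
#        -d "$(replication_body ostools http://example.org:5984/ostools)"
```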

Next: CouchDB 101 - 2: The First Encounter - Install and Get Started