Databases, by definition, dictate structure. So for all this unstructured and semi-structured data, where there is no model and no schema, the volumes are high, and it's noisy, it makes sense to do a lot of the pre-processing on technology that runs on commodity hardware. You can pre-process and filter the data pretty quickly without your cost going through the roof.
So there is a lot of work that has happened here in the Open Source community, and companies like IBM, of course, and others have embraced Open Source and are extending it, so that this data can be ingested quickly, processed, and filtered. Then, once those golden nuggets have been found, it makes sense to put them in a database, whether a warehouse or an operational data store, so that the rest of your downstream applications can continue to work in a seamless manner. The difference is that now they are not working only against the structured data that had been stored in the databases, but against a bigger set of data, some of which came from the web, such as the social media content that all of us are creating.
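The pipeline described above can be sketched in a few lines. This is a hypothetical, minimal illustration, not any vendor's product: noisy, semi-structured input (JSON lines, some malformed or incomplete) is filtered cheaply first, and only the clean "golden nuggets" are loaded into a structured store. SQLite stands in here for a warehouse or operational data store; the record fields (`user`, `text`) are invented for the example.

```python
import json
import sqlite3

# Raw, noisy semi-structured input: some records are unparseable
# or missing fields that downstream applications need.
RAW_RECORDS = [
    '{"user": "alice", "text": "I love the new phone", "lang": "en"}',
    'not valid json at all',                       # noise: unparseable
    '{"user": "bob", "lang": "en"}',               # noise: missing text
    '{"user": "carol", "text": "battery died fast", "lang": "en"}',
]

def extract_nuggets(lines):
    """Pre-process and filter: keep only well-formed records
    that carry the fields downstream applications rely on."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop unparseable noise early, before loading
        if "user" in record and "text" in record:
            yield record["user"], record["text"]

# Load only the filtered records into a structured table,
# where the rest of the stack can query them as usual.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mentions (user TEXT, text TEXT)")
conn.executemany("INSERT INTO mentions VALUES (?, ?)",
                 extract_nuggets(RAW_RECORDS))

rows = conn.execute("SELECT user FROM mentions ORDER BY user").fetchall()
print([r[0] for r in rows])  # only the two clean records survive
```

In practice the filtering step would run at scale on something like Hadoop or Spark rather than in-process, but the shape is the same: cheap, schema-light filtering up front, structured storage for what remains.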
I wouldn't say that the technology for storing and analysing this data is coming mainly from the database vendors. Companies like Google and Yahoo came up with the technologies to crawl the web and search this information. But when we look at enterprise usage of this data, it's still very much a play that sits adjacent to the databases. You want to extend the data platform beyond just the database. And like I said, there is Open Source technology, and there are capabilities from vendors like IBM; we have products that help in this space as well, which are built on top of Open Source.