Fujitsu Adding Column-Oriented Processing Engine to PostgreSQL
Fujitsu Laboratories last week announced that it’s developed a column-oriented data storage and processing engine that can quickly analyze large amounts of data stored on a PostgreSQL database. The technology, which utilizes vector processing, is being showcased this week at a conference in Japan.
Fujitsu has a long history developing big systems designed to handle heavy transactional loads. It was a close development partner of Sun Microsystems for 64-bit Sparc servers until Sun was acquired by Oracle. Today the $4.5 billion company, which is the third biggest provider of IT services after IBM and HP, continues to sell X64-based systems to customers all over the world.
Those customers, like everybody else, are wondering how to get ahead of the big data explosion and, in particular, how to analyze that data in something that’s close to real time. Like all big server makers, Fujitsu has done some work around Hadoop and is executing on a big data strategy. And with last week’s announcement of the column-oriented processing engine, Fujitsu is giving them another part of the answer.
As Fujitsu explains, the new engine runs on a PostgreSQL open-source database and gives PostgreSQL users a new way to quickly analyze large amounts of data.
“The engine quickly analyzes indexes, which are provided by most database systems, and can be used by developers without special consideration to whether the storage method is row-oriented or column-oriented,” the company says. “With a parallel-processing engine especially suited for processing column-oriented data, analyses run on a single CPU core are conducted 4 times faster than before, and one server equipped with 15 CPU cores can run analyses at least 50 times faster.”
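As a back-of-the-envelope check on those figures: a 4x gain on one core and a gain of at least 50x on 15 cores imply that the engine scales by at least 12.5x across the cores, or roughly 83 percent of ideal linear scaling. The sketch below only performs that arithmetic; the function name is illustrative and does not come from Fujitsu's announcement.

```python
def parallel_efficiency(single_core_speedup, multi_core_speedup, cores):
    """Return (scaling across cores, efficiency relative to ideal linear scaling)."""
    # Divide out the single-core gain to isolate the benefit of adding cores.
    scaling = multi_core_speedup / single_core_speedup
    return scaling, scaling / cores


# Figures quoted by Fujitsu: 4x on one core, at least 50x on 15 cores.
scaling, efficiency = parallel_efficiency(4, 50, 15)
print(f"scaling across cores: {scaling:.1f}x")
print(f"parallel efficiency: {efficiency:.0%}")
```

Because Fujitsu states the 50x figure as a minimum, the efficiency computed here is a lower bound.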
According to Fujitsu, the company has overcome some of the limitations that have been holding back column-oriented databases. These include the usual tendency for updates to row-oriented data not to be reflected automatically in the column-oriented data, and constraints on the amount of memory available.
Fujitsu says the technology has three key features:
- The creation of “extents” to store large-volume column-oriented data that cannot fit into memory, and which are managed using multi-version concurrency control (MVCC);
- The creation of column-oriented indexes that are automatically updated, which eliminates the need for users to worry about the data-storage method being used;
- The creation of a parallel processing engine that uses the concepts of vector processing, or applying the same process at once to multiple pieces of data.
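To make the difference between the two storage methods concrete, here is a minimal sketch in Python. It is not Fujitsu's implementation: the table, the column names, and the batch size are all hypothetical, and the fixed-size batches merely stand in for the hardware vector operations a real engine would use.

```python
from array import array

# Hypothetical sales table, stored row by row as a row-oriented store would.
rows = [
    (1, "tokyo", 120.0),
    (2, "osaka", 80.5),
    (3, "tokyo", 42.0),
    (4, "nagoya", 310.25),
]

# The same table stored column by column: each column is its own sequence.
columns = {
    "id": array("q", (r[0] for r in rows)),
    "city": [r[1] for r in rows],
    "amount": array("d", (r[2] for r in rows)),
}


def sum_amount_row_store(rows):
    """Row-oriented scan: every full row is visited to read one field."""
    total = 0.0
    for _id, _city, amount in rows:
        total += amount
    return total


def sum_amount_column_store(columns, batch_size=2):
    """Column-oriented scan: only the 'amount' column is read, in batches."""
    amounts = columns["amount"]
    total = 0.0
    for start in range(0, len(amounts), batch_size):
        batch = amounts[start:start + batch_size]
        total += sum(batch)  # the same operation applied to a whole batch
    return total


# Both layouts hold the same data, so the aggregate must agree.
assert sum_amount_row_store(rows) == sum_amount_column_store(columns)
print(sum_amount_column_store(columns))
```

The analytic query touches a single column, so the column-oriented layout reads far less data than a scan over complete rows. That is the general reason column stores suit aggregation-heavy workloads.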
Fujitsu shared details about its new database technology at the Seventh Forum on Data Engineering and Information Management (DEIM 2015), which opened Monday in Koriyama, Fukushima. The company says it’s aiming for a commercial implementation of this technology during fiscal 2015. It will be a part of Symfoware Server, Fujitsu’s database product, the company says.