The latest feature release from the Apache Kafka stream processing community includes server support for time-based search along with more than 200 bug fixes and other improvements.
Confluent Inc., founded by the team that built Apache Kafka at LinkedIn (NYSE: LNKD), said the new time-based feature supports a new searchable index for topics based on message timestamps, which were added in an earlier release. “This allows for finer-grained log retention than was possible previously using only the timestamps from the log segments,” Jason Gustafson, a software engineer at Confluent, Palo Alto, Calif., explained in a blog post.
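To make the idea of a timestamp-based index concrete, here is a minimal sketch in Python. It assumes a lookup that returns the earliest offset whose timestamp is at or after a requested time. The `TIME_INDEX` data and the `offset_for_time` helper are hypothetical names used for illustration; this models the concept only and is not Kafka's on-disk index format or its client API.

```python
from bisect import bisect_left

# Hypothetical in-memory model of a time index: (timestamp_ms, offset) pairs
# sorted by timestamp. Illustration only, not Kafka's actual index format.
TIME_INDEX = [(1000, 0), (2000, 5), (3500, 9), (5000, 14)]


def offset_for_time(index, target_ms):
    """Return the earliest indexed offset whose timestamp is >= target_ms.

    Returns None when every indexed timestamp is older than target_ms.
    """
    timestamps = [ts for ts, _ in index]
    pos = bisect_left(timestamps, target_ms)  # binary search over timestamps
    if pos == len(index):
        return None
    return index[pos][1]
```

A consumer that wants to replay everything since a given moment could seek to the offset such a lookup returns instead of scanning the log from the beginning.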
Other Kafka server upgrades include replication quotas, which set an upper limit on the bandwidth used for replication, and new configuration settings that control the time range for cleaning logs and improve log compaction and deletion. The upgrade also takes advantage of the new time-based searchable index, Gustafson noted.
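One generic way to enforce an upper limit on bandwidth is to pause a sender whenever its average rate exceeds the quota. The sketch below shows that arithmetic. It is an assumption about how such a cap could work in principle, not a description of Kafka's replication throttling, and `throttle_delay_seconds` is a hypothetical helper.

```python
def throttle_delay_seconds(bytes_sent, elapsed_seconds, quota_bytes_per_second):
    """Return how long to pause so the average rate falls back to the quota.

    Generic rate-limiting arithmetic, not Kafka's implementation.
    """
    if quota_bytes_per_second <= 0:
        raise ValueError("quota must be positive")
    # Time the transfer would have taken at exactly the quota rate.
    minimum_seconds = bytes_sent / quota_bytes_per_second
    # Pause only if the transfer finished faster than the quota allows.
    return max(0.0, minimum_seconds - elapsed_seconds)
```

For example, a sender that moved 10 MB in one second against a 5 MB-per-second quota would need to wait one more second, while a sender already under the quota would not wait at all.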
The new release, Apache Kafka 0.10.1.0, also incorporates a batch of client API upgrades focused on interactive queries, improved memory management and security features such as secure client quotas. The query feature lets users directly query the current state of a stream processing application and treat the stream-processing layer as a “lightweight embedded database,” Confluent said.
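The “embedded database” idea can be illustrated with a toy example: a processor keeps its running state in a local store, and callers read that store directly instead of waiting for results to land in an external system. The `WordCountApp` class below is a hypothetical stand-in written in plain Python; it does not use or mirror the Kafka Streams API.

```python
from collections import Counter


class WordCountApp:
    """Toy stream processor that keeps its running state in a local store."""

    def __init__(self):
        # Stands in for a local state store (hypothetical, in-memory only).
        self._store = Counter()

    def process(self, line):
        # Update the state as each record arrives.
        self._store.update(line.lower().split())

    def query(self, word):
        # Read the current state directly, with no external database involved.
        return self._store[word.lower()]
```

After processing a few records, a call such as `app.query("kafka")` returns the count accumulated so far, which is the kind of direct lookup the interactive query feature is meant to enable.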
The memory management upgrades to Kafka include the addition of “record caches” used to unify storage and caching in streams. The caches are used to compact output records as a method of reducing the number of updates for the same record sent downstream. The upgraded API “typically result[s] in reduced load on your streams application, your Kafka cluster, and/or downstream applications and systems such as external databases,” Gustafson added.
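The effect of compacting output records can be sketched as a small cache that holds only the latest value per key and forwards it downstream when flushed. This is a simplified model under assumed semantics. `RecordCache` and its `forward` callback are hypothetical names, and the real feature's sizing and flush policy are not modeled.

```python
class RecordCache:
    """Buffer updates per key and forward only the latest value on flush."""

    def __init__(self, forward):
        self._forward = forward  # callback that sends a record downstream
        self._pending = {}       # latest value seen for each key

    def put(self, key, value):
        # A newer update for the same key overwrites the buffered one.
        self._pending[key] = value

    def flush(self):
        # Emit one record per key, then empty the buffer.
        for key, value in self._pending.items():
            self._forward(key, value)
        self._pending.clear()
```

If key "a" is updated three times and key "b" once before a flush, only two records go downstream rather than four, which is the reduction in load the quote describes.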
Meanwhile, the secure client quotas are designed to give clusters better protection in secure environments.
The company also listed a series of bug fixes and performance improvements for Kafka server and client APIs. A complete list of bug fixes, new features and other improvements can be found in the version 0.10.1.0 release notes.
Jay Kreps, Confluent’s CEO and co-founder, said earlier this year that its Kafka Streams distribution built on Apache Kafka is focused more on “building core applications and micro-services that process data streams” and less on the data analytics domain.
As Apache Kafka makes inroads among enterprise adopters, Confluent announced a batch of new capabilities for its Kafka distribution during last month’s Strata + Hadoop World conference. The version 3.1 release shipping this month includes multi-datacenter replication, automated cross-cluster data balancing and a cloud-migration capability.
The latest release of Apache Kafka is available for download from the Apache Kafka project site.