Category Archives: Software

Amanuensis

I have just published Amanuensis on GitHub:

https://github.com/tristantarrant/amanuensis

Amanuensis is a clustered IndexWriter for Infinispan which leverages JGroups’ channel multiplexing to stream index changes from slave nodes to a master node.

To use Amanuensis, add the appropriate dependency to your pom.xml:

<dependencies>
	<dependency>
		<groupId>net.dataforte.infinispan</groupId>
		<artifactId>amanuensis</artifactId>
		<version>0.0.2</version>
	</dependency>
</dependencies>
<repositories>
	<repository>
		<id>dataforte</id>
		<url>http://www.dataforte.net/listing/maven/releases</url>
		<snapshots><enabled>false</enabled></snapshots>
	</repository>
</repositories>

You also need to tell Infinispan to use Amanuensis’ JGroups channel lookup, which enables multiplexed transport of messages:

<global>
	<transport clusterName="cluster">
		<properties>
			<property name="channelLookup" value="net.dataforte.infinispan.amanuensis.backend.jgroups.MuxChannelLookup" />
		</properties>
	</transport>
</global>

In your code you need to initialize an instance of AmanuensisManager and obtain an InfinispanIndexWriter for each InfinispanDirectory you want to write to as follows:

import net.dataforte.infinispan.amanuensis.AmanuensisManager;
import net.dataforte.infinispan.amanuensis.IndexerException;
import net.dataforte.infinispan.amanuensis.InfinispanIndexWriter;

AmanuensisManager amanuensisManager = new AmanuensisManager(cacheManager);
amanuensisManager.setAnalyzer(analyzer);
InfinispanIndexWriter indexWriter = amanuensisManager.getIndexWriter(directory);

You can then invoke methods on the InfinispanIndexWriter from any node: it sends the changes to the Infinispan coordinator, which applies them to the directory. Index operations can also be batched together:

indexWriter.startBatch();
indexWriter.deleteDocuments(query);
indexWriter.addDocument(doc);
indexWriter.endBatch();

InfinispanIndexWriter is thread-safe in that multiple threads can each send their own batches independently.

The project’s site (together with JavaDocs) is available at: http://www.dataforte.net/software/amanuensis/index.html


Cassandra Connection Pool 0.3.5

I have just released version 0.3.5 of my Cassandra Connection Pool.
It contains a couple of bug fixes for setting configuration properties via the generic set() method, and much-improved logging: all log messages are prefixed with the pool’s name, periodic activity is now logged at trace instead of debug, and extra logging is done on pool exhaustion.

I recommend upgrading to it.

Get it from my Maven repo.


CoheSiVe

I have just added a new project to my GitHub repo: CoheSiVe

From the silly capitalization you can already guess that it’s a CSV library. It differs from other libraries I’ve seen in that it doesn’t attempt to read the whole file in one go, but uses an event-driven architecture so that your application can decide what to do with each row as it is parsed.
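
To illustrate what event-driven means here, the following is a minimal sketch of the general technique, not CoheSiVe’s actual API: the RowHandler interface and the parse() and splitLine() methods are hypothetical names made up for this example. Each line is parsed and handed to a callback straight away, so only one row is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class EventDrivenCsvSketch {

    /** Hypothetical callback (not CoheSiVe's API), invoked once per parsed row. */
    public interface RowHandler {
        void onRow(long rowNumber, List<String> fields);
    }

    /** Reads one line at a time and hands each row to the handler; rows are never accumulated. */
    public static void parse(Reader in, RowHandler handler) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String line;
        long rowNumber = 0;
        while ((line = reader.readLine()) != null) {
            handler.onRow(++rowNumber, splitLine(line));
        }
    }

    /** Splits a single line on commas, honouring double-quoted fields and "" escapes. */
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (quoted) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        quoted = false;      // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                quoted = true;               // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) throws IOException {
        String csv = "id,name\n1,\"Smith, John\"\n2,Jane\n";
        // The application decides what to do with each row as it arrives.
        parse(new StringReader(csv), (n, fields) -> System.out.println(n + ": " + fields));
    }
}
```

The sketch deliberately ignores quoted fields that span several lines; a real parser has to carry the quoting state across line boundaries.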

Docs are here: http://www.dataforte.net/software/cohesive/index.html


Cassandra CacheStore now in Infinispan trunk

Now that I have been accepted as an Infinispan contributor, I have committed my first complete implementation of the Cassandra CacheStore to Infinispan’s trunk. This means that it will be included in Infinispan 5.0, whenever that is released.

In the meantime I have migrated all the code into my repository and have released a 0.0.2 version which can be used with the current Infinispan 4.1.x and 4.2.x. The Maven dependency (if you use my repository) is:


<dependency>
	<groupId>net.dataforte.infinispan</groupId>
	<artifactId>infinispan-cachestore-cassandra</artifactId>
	<version>0.0.2</version>
</dependency>

I would be very grateful if you could test it in your environment.

I am also working on adding multiple host addresses and automatic ring discovery to my Cassandra Connection Pool.


Catwalk Model Processor

I have just released the code for Catwalk, a Java annotation processor for automatically generating derived domain model classes.

Suppose you have a JPA entity which you want to pass to a servlet stripped of certain private or internal properties. Adding a few annotations to the getters you want to expose allows Catwalk to generate a new class with only those properties, plus convenience methods for converting between the two types of objects.

The project is still missing a few essentials before being useful, such as documentation, proper examples and being uploaded to a Maven repo.

In the following example, a TestModel class is converted to a WebTestModel class:

TestModel.java

package net.dataforte.test.model;

@Model(pattern = "Web#", classPackage = "net.dataforte.test.webmodel")
public class TestModel {
	String s;
	int i;

	@ModelAttribute
	public String getS() {
		return s;
	}

	public void setS(String s) {
		this.s = s;
	}

	public int getI() {
		return i;
	}

	public void setI(int i) {
		this.i = i;
	}
}

WebTestModel.java

package net.dataforte.test.webmodel;

public class WebTestModel {

	private java.lang.String s;

	public WebTestModel() {}

	public WebTestModel(net.dataforte.test.model.TestModel src) {
		this.fromTestModel(src);
	}

	java.lang.String getS() {
		return s;
	}

	void setS(java.lang.String s) {
		this.s = s;
	}

	public WebTestModel fromTestModel(net.dataforte.test.model.TestModel src) {
		this.s = src.getS();
		return this;
	}

	public net.dataforte.test.model.TestModel toTestModel() {
		net.dataforte.test.model.TestModel that = new net.dataforte.test.model.TestModel();
		that.setS(this.s);
		return that;
	}

}
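
One consequence of the generated conversion methods is worth spelling out: a round trip through the derived class loses every property that was not exposed. The following self-contained sketch shows this. The two nested classes are hand-written stand-ins for the listings above (the Catwalk annotations are left out so that it compiles on its own), and roundTrip() is a hypothetical helper added just for the illustration.

```java
public class CatwalkRoundTripSketch {

    /** Stand-in for TestModel above; annotations omitted so the sketch is self-contained. */
    static class TestModel {
        private String s;
        private int i;

        public String getS() { return s; }
        public void setS(String s) { this.s = s; }
        public int getI() { return i; }
        public void setI(int i) { this.i = i; }
    }

    /** Stand-in for the generated WebTestModel above: it only carries the exposed property s. */
    static class WebTestModel {
        private String s;

        public WebTestModel() {}
        public WebTestModel(TestModel src) { fromTestModel(src); }

        String getS() { return s; }
        void setS(String s) { this.s = s; }

        public WebTestModel fromTestModel(TestModel src) {
            this.s = src.getS();
            return this;
        }

        public TestModel toTestModel() {
            TestModel that = new TestModel();
            that.setS(this.s);
            return that;
        }
    }

    /** Hypothetical helper: converts to the web model and back again. */
    static TestModel roundTrip(TestModel original) {
        WebTestModel web = new WebTestModel(original);
        return web.toTestModel();
    }

    public static void main(String[] args) {
        TestModel original = new TestModel();
        original.setS("hello");
        original.setI(42);

        TestModel back = roundTrip(original);
        System.out.println(back.getS()); // the exposed property survives the round trip
        System.out.println(back.getI()); // the unexposed property falls back to the int default
    }
}
```

So if the original entity has to be updated from the web model, the exposed values should be copied onto the existing instance rather than replacing it with the result of toTestModel().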

Java and Large Memory Pages on Linux

Recently I helped configure a system for an application running under Tomcat on Linux with very large memory requirements: a minimum heap of 6GB with a maximum of 11GB. The JVM was initially configured to use the Parallel garbage collector. With this configuration, garbage collection of the “Young Generation” was fine, but each “Old Generation” collection was taking over 30 seconds (and blocking all other threads while doing so).

We looked into enabling Large Memory Pages, a feature of modern CPUs which allows memory-hungry applications to allocate memory in 2MB chunks instead of the standard 4KB. Documentation on the web on exactly how to do this is sparse and omits some of the details we ran into. Here’s the sequence of steps we had to take:

  1. configure the kernel’s maximum shared memory to span the whole address space (via the kernel.shmmax and kernel.shmall parameters)
  2. configure the kernel’s allocated large memory pages (via the vm.nr_hugepages parameter)
  3. configure the user limits to ensure that the user running Tomcat can lock the necessary memory (via the memlock parameter)
  4. ensure that PAM applies the security limits to users who “login” via su and sudo
  5. configure the JVM for Large Memory Pages

Add the following lines to /etc/sysctl.conf and run sysctl -p to load the changes into the running kernel. I nevertheless recommend rebooting the system so that the Large Memory Pages can be properly allocated (they have to be contiguous, which is harder to satisfy once memory has become fragmented).

# Maximum size of a shared memory segment (in bytes)
kernel.shmmax=17179869184
# Maximum total size of all shared memory segments (in pages of 4KB)
kernel.shmall=3145728
# Number of allocated Large Memory Pages (each one takes up 2MB)
vm.nr_hugepages=6144
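
The numbers above are plain arithmetic on the amount of memory to reserve. The following sketch reproduces them, assuming a 12GB reservation (my reading of the values: the 11GB maximum heap plus some headroom), 2MB large pages and 4KB regular pages. The class and method names are made up for the example.

```java
public class HugePageSizing {

    static final long KB = 1024L;
    static final long MB = 1024L * KB;
    static final long GB = 1024L * MB;

    /** Number of large pages needed to cover the given number of bytes, rounded up. */
    static long hugePages(long bytes, long hugePageSize) {
        return (bytes + hugePageSize - 1) / hugePageSize;
    }

    /** kernel.shmall is expressed in regular pages, not in bytes. */
    static long shmallPages(long bytes, long pageSize) {
        return (bytes + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        long reserved = 12 * GB; // assumption: 11GB maximum heap plus headroom
        long shmmax = 16 * GB;   // largest single shared memory segment, in bytes

        System.out.println("vm.nr_hugepages=" + hugePages(reserved, 2 * MB));   // 6144
        System.out.println("kernel.shmall=" + shmallPages(reserved, 4 * KB));   // 3145728
        System.out.println("kernel.shmmax=" + shmmax);                          // 17179869184
    }
}
```

The large page size is not always 2MB: check the Hugepagesize line in /proc/meminfo on the target machine before reusing these values.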

Edit /etc/security/limits.conf so that the user running the Java application can lock the required amount of memory. The memlock limit is expressed in KB, so 12GB is 12582912:

tomcat soft memlock 12582912
tomcat hard memlock 12582912

Edit /etc/pam.d/su and /etc/pam.d/sudo and ensure that they contain the following line so that the above memory limits are applied:

session required pam_limits.so

Next add the relevant options to the JVM’s command-line:

-XX:+UseLargePages -Xmx11g -Xms6g
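
To check from inside the application that the options were picked up, something along these lines can help. It is a minimal sketch that assumes a HotSpot-based JVM, since it relies on the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean; the class name LargePagesCheck and the vmOption() helper are made up for the example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class LargePagesCheck {

    /** Returns the current value of a HotSpot VM option, e.g. "true" or "false" for a boolean flag. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // Reflects what the JVM was asked to do, not whether the kernel granted the pages.
        System.out.println("UseLargePages = " + vmOption("UseLargePages"));
        // Should roughly match the -Xmx setting.
        System.out.println("Max heap (MB) = " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```

The flag only tells you that large pages were requested. If the JVM cannot reserve them it falls back to regular pages, so also compare HugePages_Total and HugePages_Free in /proc/meminfo once Tomcat is running.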
