No, I'd have to say that the overall run time doesn't bother me that much. I was just hoping to find a cool trick about object allocation. It does annoy me that such a large percentage of my run time is something I can't do anything about by writing better code.
In terms of your questions about the way the code is structured... Someday, if I ever get this stuff in a worthy state, I plan on open sourcing it. In the meantime, I'll try to describe it as clearly as I can.
The nodes (vertices) use doubles for x/y coordinates to support the level of precision needed for geographic applications, but floats are more than adequate for elevations. There's also an int used to relate the vertex back to the source lidar data. All are declared final. I've investigated the way Java lays out the object in memory and know that, right now, it's adding 4 bytes of memory padding to each instance. I may use that for something in the future.
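A rough sketch of what I mean (the field names here are illustrative, not my actual API):

```java
// Sketch of the vertex layout described above; names are illustrative.
public class Vertex {
    final double x;      // geographic coordinates need double precision
    final double y;
    final float z;       // float is adequate for elevation
    final int index;     // ties the vertex back to the source lidar record

    public Vertex(double x, double y, float z, int index) {
        this.x = x;
        this.y = y;
        this.z = z;
        this.index = index;
    }
}
// With compressed oops: a 12-byte header plus 24 bytes of fields,
// which the JVM pads out to 40 -- the 4 spare bytes mentioned above.
```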
The only information the edge class carries is connections between vertices and other edges. Clearly, you can traverse an edge in two directions: from A to B, and from B to A. Let's call edge AB the "dual" of BA. So I have an instance for each direction. Each instance carries a reference to one vertex, a reference to its dual, a reference to its next edge, and a reference to its prior edge. So when I want the two vertices for a particular edge, I take one from each half of the pair. The reference to the dual is declared final, but everything else can be adjusted as edges are connected to or disconnected from other edges and vertices. It's sort of like a 2D linked list. This configuration was described as a "Quad Edge" data structure by Guibas and Stolfi in their 1985 paper (see the Quad Edge article for useful pictures). I experimented with representing the edge as a single object. That reduced memory, but it entailed messy additional runtime logic to determine which direction an edge was being traversed at a particular phase of processing.
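The paired arrangement looks roughly like this (a sketch with hypothetical names; the stub Vertex stands in for the node class described earlier):

```java
// Sketch of the paired directed edges (hypothetical names). Each half
// stores its start vertex; the undirected edge's endpoints come one
// from each half of the pair.
class Vertex { /* coordinates etc., as described earlier */ }

class Edge {
    final Edge dual;   // opposite-direction twin; fixed for life
    Vertex start;      // origin vertex of this direction
    Edge next;         // forward link, adjusted as edges connect
    Edge prior;        // backward link

    private Edge(Edge dual) { this.dual = dual; }

    // Constructing one edge builds the whole pair, so each half can
    // hold a final reference to the other.
    Edge() { this.dual = new Edge(this); }

    Vertex getA() { return start; }
    Vertex getB() { return dual.start; }
}
```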
The TIN is effectively a container class for nodes, so I don't maintain a separate list of them. It does accept a List&lt;Vertex&gt; as input, and I have an API that can rebuild such a list by traversing the edges. That's moderately expensive, but I don't really have a need for it anyway (right now). Triangles are not physically maintained in memory either, but can be deduced by traversing the edges. I did define a Triangle class and have an API for building a list of triangles, but again I don't really need it right now.
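To give a feel for how triangles fall out of the edge links, here's a sketch. I'm assuming, purely for illustration, that "next" walks the face to an edge's left so that three hops close an interior triangle; the real linkage convention may differ.

```java
// Sketch: deduce a triangle's corners from one of its edges, assuming
// (for illustration only) that "next" walks the face to the edge's
// left, so three hops return to the starting edge.
class Vertex { }

class Edge {
    Vertex start;
    Edge next;
}

class TriangleWalk {
    // Returns the three corners, or null if the walk doesn't close in
    // three steps (e.g. the perimeter face).
    static Vertex[] cornersOf(Edge e) {
        Edge b = e.next;
        Edge c = (b == null) ? null : b.next;
        if (c == null || c.next != e) {
            return null;
        }
        return new Vertex[] { e.start, b.start, c.start };
    }
}
```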
The time complexity of the algorithm is, well, complex. In theory, I think the Location phase could be roughly O(n^1.5) for random data, but closer to O(n) when successive input vertices tend to be located close together in space; lidar samples tend toward the O(n) case. I implemented a sort based on the Hilbert Curve to encourage that locality, but I don't use it very often. The Insertion phase looks to me like it should be O(n) except for pathological cases: 1 million points in a straight line, with one more point offset by just enough to start building triangles, would run in O(n^2). I've performed a linear regression on measured run times, and it appears that the overall time complexity is close to linear, with a small quadratic term that would become relevant for samples of 100 million points or larger.
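For reference, the sort key comes from the classic iterative Hilbert-index computation (the textbook grid-walking version, not my exact code): snap each vertex to a cell on a power-of-two grid, compute the cell's distance along the curve, and sort by that value so spatially nearby points land close together in the insertion order.

```java
// Classic Hilbert-curve index: maps cell (x, y) on an n-by-n grid
// (n a power of two) to its distance along the curve.
// e.g. on a 2x2 grid: (0,0)->0, (0,1)->1, (1,1)->2, (1,0)->3.
public final class HilbertSortKey {
    static long hilbertIndex(int n, int x, int y) {
        long d = 0;
        for (int s = n / 2; s > 0; s /= 2) {
            int rx = (x & s) > 0 ? 1 : 0;
            int ry = (y & s) > 0 ? 1 : 0;
            d += (long) s * s * ((3 * rx) ^ ry);
            // Rotate the quadrant so each level sees the same pattern.
            if (ry == 0) {
                if (rx == 1) {
                    x = n - 1 - x;
                    y = n - 1 - y;
                }
                int t = x;
                x = y;
                y = t;
            }
        }
        return d;
    }
}
```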