TreeGPT: Generative Pre-trained Transformer for forestry applications with 3D point clouds

For my master’s thesis, I tackled a challenge that has captured my interest for years: leveraging LiDAR point clouds to generate automated forest inventories.
While studying forest sciences, I learned that one of the recurring problems in forest planning is the lack of accurate data about the forest area in question. How can we effectively plan the allowable cut, rotation period, tree species composition, or any silvicultural intervention without an accurate overview of what we are working with? Data in this field often comes from the national forest inventory, whose resolution is inadequate for individual forest areas. In the past, public entities systematically organized manual sample inventories on a widespread basis, but nowadays the costs of efforts on that scale have become prohibitive.
LiDAR technology rapidly and economically captures a digital 3D model of the surface, discretized into individual points collected in so-called “point clouds” (see Figure 1). Depending on the resolution, we can visually distinguish individual trees and their components in scans of forest areas, and even recognize their species. The conclusion is evident: in theory, a computer should be able to do this too.
Computer vision on 2D images is a mature and well-established field, while its application to 3D data is still in its infancy. Point clouds in particular pose challenges: they are irregular, and the spatial information they contain is both sparse and redundant, which significantly complicates computation.
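To make the irregularity concrete, here is a minimal Python sketch. The toy coordinates and function names are illustrative assumptions, not taken from the thesis. It shows that a point cloud is an unordered set: storing the same points in a different order describes the same geometry, so any model that consumes them should rely on operations that do not depend on point order, unlike the fixed pixel grid of a 2D image.

```python
import random

# Hypothetical toy "point cloud": each point is an (x, y, z) tuple in metres.
cloud = [(0.0, 0.0, 1.5), (0.2, 0.1, 7.8), (1.1, 0.4, 12.3), (0.9, 1.0, 3.2)]


def shuffled(points, seed):
    """Return a copy of the points in a (possibly) different storage order."""
    rng = random.Random(seed)
    out = list(points)
    rng.shuffle(out)
    return out


def per_axis_max(points):
    """Order-independent summary: the maximum over all points on each axis."""
    return tuple(max(p[axis] for p in points) for axis in range(3))


reordered = shuffled(cloud, seed=0)

# The set of points is unchanged, so the geometry is the same.
same_geometry = set(cloud) == set(reordered)

# A symmetric operation such as a per-axis max gives the same answer
# regardless of how the points happen to be stored.
same_summary = per_axis_max(cloud) == per_axis_max(reordered)

print(same_geometry, same_summary)
```

The per-axis maximum stands in here for the symmetric pooling operations that point cloud networks commonly use; it is only one simple example of an order-independent function.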
Self-Supervised Learning (SSL), a paradigm in which a neural network trains itself on unlabeled data, is a logical next step in this field. The technique gained traction with the Generative Pre-trained Transformer (GPT) in Natural Language Processing (NLP): applying it to practically the entire internet corpus led to ChatGPT and the AI “revolution”. PointGPT carries the GPT approach over to point clouds, so using it seemed like a stimulating and promising way to test this approach on computer vision for point clouds in the forestry field.
To limit the scope of the work, I focused on tree species recognition. The resulting confusion matrix is promising (Figure 2). Pre-training required extensive experimentation and many attempts before finally achieving a rather smooth progression (Figure 3).
I wish interested readers an enjoyable read. The complete thesis is available for download at the top of the page.

