The Software Architect’s Compass

Neptune’s Eye: First training

2025-10-21T00:00:00+00:00

Introduction

After collecting a dataset, it is time to train the object detection model. I am using the YOLO11 model from Ultralytics. The model is already pre-trained on the large COCO dataset and has already learned generic image features. Therefore, fine-tuning only requires a much smaller dataset to learn new classes.

Goal

Train a first prototype to get a feel for the performance.

- How well does boat and buoy detection work on actual images and videos taken from the sailboat?

Challenges

One challenge in the sailing domain is the varying and often suboptimal lighting:

Reflections: Sunlight reflections on the ocean (glare) can saturate parts of the image.
Low contrast: Haze, fog or cloudy days result in soft edges and low contrast.
Variable illumination: Lighting intensity varies greatly during the day and with the weather.

Another challenge is the small size of the objects:

Objects are often far away and cover only a few pixels.
Buoys are small and hard to distinguish from the background or the ocean.

Hardly visible buoy on the horizon

Training

Hardware

The YOLO11 model is trained locally on the NVIDIA Jetson Orin Nano. The GPU computing power is sufficient for a reasonable training time in the range of several hours. However, the memory is limited to 8 GB. For the YOLO11 small model, it is only possible to train with a maximum batch size of 3.

Parameters

Low learning rate = 0.001: To preserve pretrained features and prevent catastrophic forgetting
Starting with 50 epochs for first test
Increased to 500 epochs in a second run
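
The parameters above map onto an Ultralytics training call roughly as follows. This is a minimal sketch, not the exact script used for these runs: the dataset file name `neptune.yaml` and the two helper functions are assumptions, and the `ultralytics` package must be installed before `fine_tune` is called.

```python
def training_args(epochs: int) -> dict:
    """Collect the fine-tuning settings described above in one place."""
    return {
        "data": "neptune.yaml",  # hypothetical dataset config (image paths + class names)
        "epochs": epochs,        # 50 for the first test, 500 for the second run
        "batch": 3,              # largest batch that fit into 8 GB with YOLO11 small
        "lr0": 0.001,            # low initial learning rate to preserve pretrained features
    }


def fine_tune(epochs: int = 50):
    """Fine-tune the COCO-pretrained YOLO11 small model."""
    from ultralytics import YOLO  # imported lazily: only needed when training starts

    model = YOLO("yolo11s.pt")  # pretrained YOLO11 small weights
    return model.train(**training_args(epochs))


print(training_args(50))
```

Calling `fine_tune(epochs=50)` would start the first test run. One caveat worth checking against the Ultralytics documentation for your version: with the default `optimizer="auto"` the trainer may choose its own learning rate, so an explicit optimizer may be needed for `lr0` to take effect.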

Result

Test on real videos

Buoy detection exceeds expectations: Reliable detection even in difficult lighting and with small object size
Boat detection is not forgotten: No catastrophic forgetting of already trained boat detection

Stable detection of a buoy and sailboats

Metrics

This is just a quick look at the training metrics:

The metrics such as box loss and mean average precision (mAP) flatten out after about 100 epochs
The confusion matrix shows that boats and the background are often misclassified. This will need to be looked at more closely when I have better data for training.

Metrics do not improve with more epochs

Boats and background misclassified
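
To look at the boat/background confusion more closely, per-class precision and recall can be derived from the confusion matrix. The sketch below assumes that rows are true classes and columns are predicted classes (check the orientation of the plot the numbers are read from). The counts are invented for illustration and are not results from this training.

```python
def precision_recall(matrix, labels):
    """Per-class precision and recall from a confusion matrix.

    matrix[i][j] counts objects of true class i that were predicted as class j.
    """
    result = {}
    for k, label in enumerate(labels):
        true_positive = matrix[k][k]
        predicted = sum(row[k] for row in matrix)  # column sum: everything predicted as k
        actual = sum(matrix[k])                    # row sum: everything that truly is k
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        result[label] = (precision, recall)
    return result


labels = ["boat", "buoy", "background"]
# Made-up counts for illustration only.
matrix = [
    [60, 0, 40],  # true boats: 40 were missed (predicted as background)
    [0, 45, 5],   # true buoys
    [20, 5, 0],   # background: 20 false boat detections, 5 false buoy detections
]
scores = precision_recall(matrix, labels)
for label in ("boat", "buoy"):
    precision, recall = scores[label]
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

With these example counts the boat class reaches a precision of 0.75 and a recall of 0.60, which is the kind of pattern that boat/background confusion produces.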

Outlook

In the next steps I will need to collect more domain-specific data. The domain’s main characteristics:

Sailboat perspective: Objects are usually small and on the horizon with a lot of water between.
Varying and often poor lighting of the images.
Varying background: When sailing into the open ocean, the background is uniform. When sailing towards land the background is noisy.

Next steps:

Experimenting with hyperparameters such as batch size, learning rate and image augmentations.
Evaluating model performance on a test set that represents the domain well.
Setting up automated model training and performance monitoring (MLOps).
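
The hyperparameter experiments could start from a simple grid of run configurations. This is only a sketch: the values are assumptions chosen for illustration, and `hsv_v` (brightness augmentation) is picked as an example because lighting is the main challenge here. Each resulting dictionary could be passed as keyword arguments to an Ultralytics training call.

```python
from itertools import product


def experiment_grid(batch_sizes, learning_rates, augmentations):
    """Return one training configuration per combination of the given options."""
    return [
        {"batch": batch, "lr0": learning_rate, **augmentation}
        for batch, learning_rate, augmentation in product(
            batch_sizes, learning_rates, augmentations
        )
    ]


# Example values only, not tuned recommendations.
runs = experiment_grid(
    batch_sizes=[1, 3],
    learning_rates=[0.001, 0.0005],
    augmentations=[{"hsv_v": 0.4}, {"hsv_v": 0.7}],
)
for run in runs:
    print(run)
```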

Neptune’s Eye: Hoisting the sails to get data

2025-10-16T00:00:00+00:00

Introduction

Introducing Neptune’s Eye. This is the beginning of a new open source project. The goal is to develop a real-time object detection system that detects hazards such as other boats, buoys or fishing traps for safer sailing.

Step 1: Collecting Data

When building an object detection model, running the actual training is the easy part. The hard part is acquiring a dataset that fits your application. The quality of the data influences the quality of the detection significantly. Finding a dataset for cats, dogs or cars is easy. But finding data for detecting buoys, boats, fishing traps and other hazards that we run into during sailing is not trivial.

Neptune’s Eye AI generated image

Goal

Acquire good data to train the model. In the first iteration the model should detect ships, buoys and maybe fishing traps. These classes represent the main dangers at sea. In the future, different ship classes like sailboats, small boats or freighters can be trained, as well as new classes like lighthouses and wind farms.

Getting data

Public data

Roboflow is a widely used platform for vision datasets. I could not find a dataset that exactly fits my needs, so I had to create my own dataset from different sources. I did find a good dataset for buoys with about 300 images. This is not a lot but should be enough for initial testing. Buoys vary greatly in shape, so a larger dataset will probably be necessary in the future.
YOLO11 object detection was initially trained on the COCO dataset. Since the detection of boats already worked very well with this pretrained model, I decided to take boat images from the COCO dataset. I manually went through the first few hundred boat images to find those that match the application best. For Neptune’s Eye, boats will usually appear far away and be quite small. There are hopefully no close-ups of boats, since that would mean a crash.
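
A manual selection like this could be pre-filtered with a short script. The following is a sketch of an alternative, not the manual process described above. It assumes the standard COCO annotation layout (`images`, `annotations`, `categories`, boxes as `[x, y, width, height]` in pixels), and the 5 % area threshold is an arbitrary assumption.

```python
def small_boat_image_ids(coco: dict, max_area_fraction: float = 0.05) -> list:
    """Return ids of images in which every boat covers only a small part of the image."""
    boat_id = next(c["id"] for c in coco["categories"] if c["name"] == "boat")
    image_area = {img["id"]: img["width"] * img["height"] for img in coco["images"]}

    largest = {}  # image id -> largest boat box area as a fraction of the image
    for ann in coco["annotations"]:
        if ann["category_id"] != boat_id:
            continue
        _, _, width, height = ann["bbox"]
        fraction = (width * height) / image_area[ann["image_id"]]
        largest[ann["image_id"]] = max(largest.get(ann["image_id"], 0.0), fraction)

    return sorted(i for i, fraction in largest.items() if fraction <= max_area_fraction)


# Tiny hand-made example in COCO layout (not real COCO data).
example = {
    "categories": [{"id": 1, "name": "person"}, {"id": 9, "name": "boat"}],
    "images": [
        {"id": 1, "width": 640, "height": 480},
        {"id": 2, "width": 640, "height": 480},
    ],
    "annotations": [
        {"image_id": 1, "category_id": 9, "bbox": [300, 200, 20, 10]},   # distant boat
        {"image_id": 2, "category_id": 9, "bbox": [100, 100, 400, 300]}, # close-up
        {"image_id": 2, "category_id": 1, "bbox": [10, 10, 50, 100]},
    ],
}
print(small_boat_image_ids(example))
```

For real data, the dictionary would come from `json.load` on the COCO annotation file. The remaining candidates still need a manual look, since a small box does not guarantee a sailing-like perspective.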

COCO images of boats vary. Close-up on land or far away on the horizon

Hoisting the sails

The best data, however, is data taken directly from the boat. While sailing in October I took as many pictures of buoys and sailboats as I could in 4 days of sailing. This resulted in about 60 images, mostly of buoys. I used Label Studio to label these images manually. This worked fine for this small dataset. In the future I plan on using semi-automated labelling.

Buoy and sailboat image taken from sailboat
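
For training, each labelled box ends up as one line in a YOLO label file: class id, box center and box size, all normalized to the range 0 to 1. The sketch below shows that conversion for a box given in percent of the image (top-left corner, width, height), which is assumed here to be how the rectangle values appear in a Label Studio JSON export. The class id is hypothetical.

```python
def to_yolo_line(class_id: int, x: float, y: float, width: float, height: float) -> str:
    """Convert a box in percent of the image (top-left x, y, width, height)
    into a YOLO label line: class x_center y_center width height (all 0..1)."""
    x_center = (x + width / 2) / 100
    y_center = (y + height / 2) / 100
    return (
        f"{class_id} {x_center:.6f} {y_center:.6f} "
        f"{width / 100:.6f} {height / 100:.6f}"
    )


# A small buoy near the horizon, using a hypothetical class id of 1.
print(to_yolo_line(1, x=40.0, y=45.0, width=5.0, height=10.0))
# prints "1 0.425000 0.500000 0.050000 0.100000"
```

If the labelling tool offers a direct YOLO export, that is usually the simpler route. The conversion is mainly useful for understanding and checking the label files.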

Result

As a result I created a dataset with about 900 boat and 300 buoy images. I only have a handful of fishing trap photos and could not find any on the internet.
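
Image counts only tell part of the story, because one image can contain several objects. A quick way to check the balance is to count labelled instances per class in the YOLO label files. This is a minimal sketch: the class ids and the example lines are assumptions, not values from this dataset.

```python
from collections import Counter
from pathlib import Path


def count_instances(label_lines, names):
    """Count labelled objects per class from YOLO label lines."""
    counts = Counter()
    for line in label_lines:
        if line.strip():
            class_id = int(line.split()[0])  # first column is the class id
            counts[names.get(class_id, f"class {class_id}")] += 1
    return counts


def read_label_lines(label_dir):
    """Yield all lines from the .txt label files in a directory."""
    for label_file in sorted(Path(label_dir).glob("*.txt")):
        yield from label_file.read_text().splitlines()


names = {0: "boat", 1: "buoy"}  # hypothetical class ids
example_lines = [
    "0 0.50 0.50 0.10 0.10",
    "0 0.20 0.50 0.05 0.05",
    "1 0.70 0.40 0.02 0.03",
]
print(count_instances(example_lines, names))
```

On a real dataset, `count_instances(read_label_lines("labels/train"), names)` would do the same for a label folder, where the folder name is again an assumption.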

Summary

Collecting data is a challenging and time-consuming task. Since it is winter and the boat is on land, I will not be able to collect more real data until next spring. To get more images of the tricky fishing traps I will need to be more creative, for example by scraping the internet or using AI-generated images. The dataset is currently not well balanced. Let’s see how it works out in the next blog post.

Documenting Architecture the Right Way

2025-10-03T00:00:00+00:00

Introduction

Unfortunately, there is no blueprint for documenting architecture. If there were, we would all do it the same way. Too often the architecture documentation is not actually read, used or understood by anyone. This can lead to a slow erosion of the software, with more and more spaghetti finding its way into the code.

The most common problems I see in architecture documentation are:

The documentation is not up-to-date. It was done in the beginning of the project and then left to erode.
The documentation is too large and overwhelming. You cannot find what you are looking for.
The documentation is not clear:
- Boxes are connected by lines and arrows but no one knows what they mean. The relationships between the components are unclear.
- Boxes have colors but the colors are not explained.
The documentation is hard to find. It might be in some Enterprise Architect project that no one has access to.
There is no documentation. :D

Goal

Simple tips to help create architecture documentation that the development team benefits from.

Architecture for the development team

The architecture documentation is not for audits, management or future historians. It is for the people who are building and maintaining the system now. Write it for the development team. Use the documentation whenever you talk about the software.

Architecture naming consistent with code

The naming in the documentation must match the naming of files and folders in the code. This way your architecture is found in the actual product (the code). Everyone speaks the same language, which reduces confusion and expensive misunderstandings. This language should be used in meetings, in the code, and in your nice architecture pictures.
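
This rule can even be checked automatically. The sketch below compares a list of component names from the documentation with the file and folder names in the source tree. The component names and paths are hypothetical, and the matching rule (case-insensitive comparison of the name without extension) is an assumption that a real project would adapt.

```python
from pathlib import Path


def missing_components(component_names, paths):
    """Return documented component names that match no file or folder name."""
    existing = {Path(path).stem.lower() for path in paths}
    return [name for name in component_names if name.lower() not in existing]


# Hypothetical component names from the architecture documentation.
documented = ["Detector", "Camera", "Tracker"]
# Hypothetical source tree; in a real project: Path("src").rglob("*").
paths = [Path("src/detector"), Path("src/camera/camera.py")]

print(missing_components(documented, paths))  # prints ['Tracker']
```

Run in the CI pipeline, a check like this flags the moment documentation and code start to drift apart.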

Document important decisions

Document important technical decisions, why they were made and what the alternatives were. A year later some topics might come up again and you will probably have forgotten the exact reasons.

Balancing just enough

The goal is not to document everything. Document just enough for developers to understand the building blocks and how they work together. Be careful about documenting in too much detail: the more detail, the faster the documentation becomes outdated.
UML is great, since it is a standardized way to document architecture. However, this is also its weakness. It is easy to forget what the different arrows mean, if you don’t use it regularly. It is also very detailed and quickly does not match the code anymore.
Just remember, the documentation is to help your team understand the system.

Collaborative and living document

The architecture documentation must evolve with the code. It is a living part of a product, just like the code. It should be owned and updated by the team.

Visual and clear

Draw simple diagrams. Not every part of the software needs to be in the diagram.
Every line and every box should be labeled. If boxes have different colors, a legend needs to explain these colors.
Use hierarchical diagrams. Start from a top level view and then zoom into each of the components.

C4 and arc42

The C4 model offers a great visual guideline and the arc42 template a textual guideline. Use these two together and adapt them for your project.

Summary

When you treat documentation as a living, developer-first artifact, it stops being a chore and becomes a practical tool for communication, alignment, and onboarding.

YOLO11 Inference on MacBook Air M1 vs. NVIDIA Jetson Orin Nano

2025-09-26T00:00:00+00:00

Introduction

YOLO (You Only Look Once) is a real-time object detection algorithm based on convolutional neural networks. I am using it in my project to detect maritime objects such as boats, buoys, lighthouses and wind farms. The NVIDIA Jetson is an embedded computing board specialized for accelerating neural networks. It is used a lot in robotics applications for real-time inference. At 300€ it is quite cheap for its power and is well documented.

YOLO model detecting a sailboat

Goal

A quick test of which inference times are possible on different hardware using different model sizes and types.

Setup

NVIDIA Jetson Orin Nano
MacBook Air M1
Model: YOLO11
Size: Nano, Small
Model formats: Open Neural Network Exchange (.onnx), TensorRT (.engine), PyTorch (.pt)

For the Jetson Orin Nano there are many tips and tricks to improve performance. The command jetson_clocks had the most significant effect, improving the inference time by a factor of 2.
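
The model formats listed in the setup can be produced from the PyTorch weights with the Ultralytics export function. This is a sketch under assumptions: the weights file name and the helper are illustrative, `ultralytics` must be installed, and the TensorRT export has to run on the NVIDIA device itself.

```python
# Export settings for the formats compared in this test.
EXPORTS = [
    {"format": "onnx"},                  # Open Neural Network Exchange (.onnx)
    {"format": "engine", "half": True},  # TensorRT (.engine) with FP16, NVIDIA only
]


def export_all(weights: str = "yolo11n.pt") -> list:
    """Export the given PyTorch weights to every format in EXPORTS."""
    from ultralytics import YOLO  # imported lazily: only needed when exporting

    model = YOLO(weights)
    return [model.export(**options) for options in EXPORTS]


print([options["format"] for options in EXPORTS])
```

Calling `export_all("yolo11s.pt")` would do the same for the Small model.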

Results

First things first: This is not a fair fight! We are comparing a standard laptop with a dedicated platform for accelerating deep neural networks (DNNs). The following measurements show the average inference times.

YOLO11 Nano Inference Time

YOLO11 Small Inference Time

Key take-aways:

The NVIDIA Jetson Orin Nano achieves faster inference times.
PyTorch optimizes inference for the M1 GPU making it 2x faster than running on the CPU.
The inference times can be optimized on the NVIDIA Jetson by using half the floating point precision (FP16).
On the M1 you cannot use half precision to optimize the runtime.
ONNX format does not run on M1 GPU but only on the CPU.
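
Average inference times like these can be measured with a small timing loop. The sketch below is one way to do it, not necessarily how the numbers above were obtained. The `predict` argument stands for any callable that runs the model on one frame, and the warm-up and run counts are assumptions.

```python
import time
from statistics import mean


def average_inference_ms(predict, frame, warmup: int = 5, runs: int = 50) -> float:
    """Return the average duration of predict(frame) in milliseconds."""
    for _ in range(warmup):
        predict(frame)  # warm-up runs (model loading, caches) are not measured

    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(frame)
        durations.append((time.perf_counter() - start) * 1000.0)
    return mean(durations)


# Stand-in for a real model call, so the sketch runs without a model.
def fake_predict(frame):
    return sum(frame)


print(f"{average_inference_ms(fake_predict, frame=list(range(1000))):.3f} ms")
```

The warm-up matters on both platforms, since the first few inferences typically include one-time initialization and would distort the average.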

Training on Jetson Nano

It is possible to train the YOLO11 model on the Jetson Orin Nano. However, the RAM is not sufficient for large models or batch sizes. Also, with more than 20 epochs the training time grows to several hours or days. I would suggest using the Jetson for a quick check of whether the training data and model work. For training on large datasets and for many epochs, a GPU with more memory and speed is necessary. The following configurations worked reasonably well:

YOLO11 Nano, Batch size 8
YOLO11 Small, Batch size 1
Epochs <20

Summary

The NVIDIA Jetson Orin Nano has the GPU power to run real-time object detection with YOLO11. With a price of only 300€, this is the go-to hardware for our real-time detection of sailboats. However, for developing and testing the application the MacBook Air is fast enough and offers more convenience.

Outlook

In the next steps the model will be fine-tuned to detect other maritime objects such as buoys, lighthouses and wind farms. I will also take a look at how larger models perform on detecting these objects. For training larger models the NVIDIA Jetson Orin Nano does not have enough RAM. Cloud-based GPUs will be used for training instead.

Works on my Machine - Part 1

2025-09-22T00:00:00+00:00

Introduction

Programming is the easy and fun part of being a software engineer. We love writing pretty code and finding creative solutions. The hard part is maintaining and documenting the infrastructure. As projects grow and include more and more libraries, source code and tools, so does the complexity of just getting everything to run.

So how do we get a new developer to be coding on the first day?
How do we get our software running on our edge device without spending a week fixing issues?
How do we prevent solutions from only working on our machine?

Story time

Skip the long story and go straight to the best practices.
You are starting to work on a new project. You download the repository, read the ReadMe and start with step 1. You quickly realize the ReadMe is hopelessly outdated. Every colleague you ask has a slightly different setup and two days later you are still stumbling from one problem to the next.
You are still not doing what you do best: Coding!
When it finally does run, you are so happy that you forget all the steps you took along the way. Maybe you jotted down some notes. In the best case, you updated parts of the ReadMe. A month later a new employee starts and experiences the same struggle again.

Goal

This blog post introduces best practices to speed up the setup of the development environment and make it reproducible. This is based on my experience working on larger embedded software projects. These tips are not revolutionary. However, your project team will greatly benefit from them.

Best practices

Make It Foolproof: Every Click, Every Command

Document every setup step, as if you were explaining it to someone who has never used a computer. We often leave out little steps that we think are trivial. No steps are trivial! It often feels stupid to write down every single little detail, but others will often get stuck on these little details.
Pro Tip: Use GIFs. Everyone loves GIFs and they show exactly where to click and what should happen.

GIF showing how to use VS Tasks to build the project

Scripting is Documentation

Don’t document the commands you execute to set up your environment in the ReadMe. Write a script that executes them. This is the documentation. Advantages:

Way faster for the next person to run the setup. No need to copy and paste everything.
Scripts are fool-proof. Just run them and let the magic happen.
All necessary steps must be in the script. In a ReadMe we sometimes forget to document important steps.
If the scripts don’t run anymore, we fix them. We often forget to update a ReadMe.
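
A setup script following this idea could look like the sketch below. The two steps (creating a virtual environment and installing from a `requirements.txt`) are hypothetical placeholders for whatever a project actually needs.

```python
import subprocess
import sys

# Hypothetical setup steps; replace them with the commands your project needs.
SETUP_STEPS = [
    [sys.executable, "-m", "venv", ".venv"],
    [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
]


def run_steps(steps, runner=subprocess.run) -> None:
    """Run every setup step in order and stop at the first failure."""
    for step in steps:
        print("Running:", " ".join(step))
        runner(step, check=True)  # check=True raises an error if a step fails
```

The new developer then only calls `run_steps(SETUP_STEPS)`. Because a failing step stops the script with an error, a broken setup is noticed and fixed instead of silently worked around.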

Python Build Scripts

For compiling projects, use Python build scripts. Python is platform independent and easy to write and read. Again, the scripts are also the documentation on how to build the project. A new developer just needs to run the scripts and the project will compile.

Use the pathlib or os package to write platform independent scripts.
Put the scripts build_debug.py and build_release.py into your tools folder.
Use the same scripts locally and in your CI-pipeline. This way they are always automatically tested.

Your build script could look like this. In a separate Python file you implement the functions to build the targets.

# Build release target.
from build_utilities import BuildTarget, build_project

if __name__ == "__main__":
    build_project(BuildTarget.RELEASE)
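
The separate file with the build functions is not shown in this post. As a minimal sketch, a hypothetical `build_utilities.py` could look like this, assuming a CMake-based project and a `build/<target>` folder layout. Both are assumptions that a real project would adapt to its own build system.

```python
import subprocess
from enum import Enum
from pathlib import Path
from typing import Optional


class BuildTarget(Enum):
    DEBUG = "Debug"
    RELEASE = "Release"


def build_commands(target: BuildTarget, project_root: Path) -> list:
    """Return the commands that configure and build the given target."""
    # pathlib joins the path with the correct separator on every platform.
    build_dir = project_root / "build" / target.value.lower()
    return [
        ["cmake", "-S", str(project_root), "-B", str(build_dir),
         f"-DCMAKE_BUILD_TYPE={target.value}"],
        ["cmake", "--build", str(build_dir)],
    ]


def build_project(target: BuildTarget, project_root: Optional[Path] = None) -> None:
    """Configure and build the project; stops at the first failing command."""
    root = project_root or Path.cwd()
    for command in build_commands(target, root):
        subprocess.run(command, check=True)
```

Keeping the command list in its own function makes it easy to print or test the commands without starting a build.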

Outlook Part Two

Part 2 will cover containers, Poetry for Python, aliases, VS Code tasks and Git submodules.