{ "version": "https://jsonfeed.org/version/1", "title": "smp46", "home_page_url": "https://smp46.me", "feed_url": "https://raw.githubusercontent.com/smp46/smp46.github.io/nextjs/public/feeds/feed.json", "description": "Projects, attempts and other things", "author": { "name": "smp46", "url": "https://smp46.me/whoami" }, "items": [ { "id": "https://smp46.me/blog/CyberBattles", "content_html": "

\"Title\"

\n

CyberBattles: An Award Winning Educational Attack and Defence CTF Platform

\n

What is a Cyber? What are Battles? Well, in short, it's hundreds of hours' worth of work that resulted in something I don't think otherwise exists (commercially).

\n

CyberBattles is the project that five classmates and I have spent the last ~4 months banging our heads against. It was completed as part of our Computer Science capstone class at the University of Queensland: DECO3801, uninspiringly named \"Design Computing Studio 3 - Build\". The idea of the class, as much as many fellow students hate it, is to throw us in the deep end and give us a taste of collaborative (i.e. real-world) software engineering.

\n

34 Boring Briefs and Something Interesting

\n

At the beginning of the semester we were given a list of 35 project briefs to choose from. Most (read: almost all) were incredibly boring; fun examples include \"Geodatabase Tools for Load Modelling\" and \"Leveraging Digital Technologies to Influence Tourist Dispersal Behaviour\". While these might be interesting to some strange people, they did not tickle my fancy. However, I was in luck, as there was a category for \"Cyber Security\", and as an aspiring software-engineering-cyber-security-something this piqued my interest. The most interesting brief was \"Red vs Blue Team Cybersecurity Simulation\", a boring description of what would become CyberBattles.

\n

The brief was:

\n

... design and develop a two-team “Capture the Flag” cybersecurity game platform as a learning tool ... implement an interactive system where two opposing teams, the Red Team (attackers) and Blue Team (defenders), compete in a scenario-based simulation to compromise or protect digital assets.

\n

And it contained four success criteria: Team-Based Asymmetric Gameplay, Challenge-Based Capture the Flag Structure, Instructor Visibility and Game Balance, Post-Game Review and Learning Support.

\n

This really caught my attention: as an active member (and ex-executive) of UQ Cyber Squad I had a decent amount of exposure to, if not experience in, capture the flag (CTF) challenges. And I kept hearing endlessly about Attack Defence competitions, which I am far from skilled enough to participate in. So I got a group of friends together and we put in our bid for the project.

\n

Explaining Words

\n

If the words \"Cybersecurity Simulation\", \"Capture the Flag\" or \"Attack Defence\" don't mean anything to you:

\n

Cyber Security Simulation

\n

This is a fancy way of saying gamified hacking competitions. The idea is that anyone with an interest in cybersecurity can practise real-world skills in a competitive, game-like environment. While the skills learned in these kinds of simulations aren't always directly transferable to real-world use cases, they are an exercise in practising problem-solving and in learning about the ever-changing world of cybersecurity.

\n

Capture the Flag (CTF)

\n

This is the most common form of \"Cyber Security Simulation\". The idea is that a player is given a challenge, whether it be an image file, a website, or a program, that contains a \"flag\": normally a leetspeak string like cybrbtls{g00djobFORf1nd1ngm3}. The player then has to extract this flag through whatever means they can, usually via some form of \"hacking\".
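Checking whether some captured text contains a flag is usually just a pattern match. Here is a minimal Python sketch; the pattern is a guess based on the example flag above, not CyberBattles' actual validator:

```python
import re

# Hypothetical flag pattern: the cybrbtls prefix followed by a braced,
# leetspeak-style body. Square-bracket classes are used for the braces
# so no regex escaping is needed.
FLAG_RE = re.compile(r'cybrbtls[{][A-Za-z0-9_]+[}]')

def find_flags(text):
    # Return every flag-shaped string found in arbitrary text.
    return FLAG_RE.findall(text)

flags = find_flags('submission log: cybrbtls{g00djobFORf1nd1ngm3} accepted')
```

In a real scoring system each extracted flag would then typically be checked against the set of currently valid flags before points are awarded.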

\n

Attack Defence (AD)

\n

This is a type of \"Cyber Security Simulation\", normally a CTF, where instead of the challenge being a static or non-player-controlled entity like a pre-made website or program, players are given matching environments with multiple vulnerable entities (programs, websites, etc.) to exploit. The challenge then becomes not only trying to capture the flags of other teams but also defending their own versions of the programs against attack. The way a team chooses to defend is entirely up to them; often the entity is accompanied by the code used to make it, so players can read, rewrite, and patch it on the fly. An important and necessary caveat of this type of CTF challenge is that the service needs to be preserved so that a regular (usually automated) user can still use it for its intended purpose. This prevents a team from trying to be smart and just shutting down their services to avoid being hacked.

\n

Why the Brief was Interesting

\n

As I've established above, CTFs, and AD CTFs specifically, have already been done; they aren't new. However, what didn't already exist was an easy-to-use, publicly accessible platform that did the hard parts of running an AD competition for you.

\n

The Hard Parts

\n

An AD competition is inherently not simple; it requires at the very least:

\n\n

These might sound a little like the aforementioned success criteria given to us as part of the project brief, and indeed they are. At the end of the project, I can now appreciate that whoever (or whatever) wrote the brief had a good grasp of AD essentials.

\n

A Storied Development

\n

A Team Effort

\n

For most of the team this was our first real team project and our first taste of the deep end of real-world collaborative software engineering. Enthusiastically, we said we'd do everything just like real developers. We set up a GitHub organisation, a nice-looking repo (and domain), a Discord server with a bot that notified us of each other's progress, and, most importantly (until we couldn't be bothered), Jira for story boarding. We also, importantly, set a timeline for when milestones should be done by.

\n

What is story boarding? Great question; it seems to involve the words \"epics\" and \"stories\" quite often. In short, it's just an extravagant way of breaking down a project into achievable sub-sections, but with a lot of \"Silicon Valley\"-esque lingo to make it seem cool and innovative. While it initially seemed like a great idea, we found that when nobody was enforcing the use of this story boarding it quickly fell apart, as it became just more work on top of the already large task we had ahead of us.

\n

Divvying Up the Work

\n

As a group of six, we sought to distribute the work evenly amongst ourselves, giving people tasks suited to their existing skill sets. We assigned three people to website/frontend development, as this would be the main way users interact with our project, so it should look good. Then my friend Howie, someone with some real experience in playing and winning CTFs, was assigned to challenge development and to provide a template for the scenario networking. And unfortunately, another friend, Lachlan, was given the job of code review and repo management. That left me to do the fun (hard) part: making the server that would orchestrate the whole shebang.

\n

The Reality of Group Work

\n

While we set out strong, as group projects are wont to go, the timeline quickly became a time-suggestion, which had become a time-wish by the end of the semester. As my workload was relatively light for the semester, I had the majority of my side of the project done within a month. And unfortunately, as the most experienced in web development on the team, much of my semester ended up being spent on that side of the project as well.

\n

My Half of the Whole Shebang

\n

So I had the dead simple job of:

\n\n

It was, in fact, neither simple nor straightforward. I started by finding out what kinds of software solutions exist to connect all these things together. For virtualising/encapsulating the environment I picked Docker, as it is lightweight and I was already fairly experienced in using it. Fortunately, I found a very helpful Node.js module called Dockerode, which provides an easy-to-work-with programmatic interface to the Docker virtual environment. My plan of attack was to map out each function I would need for the server to operate and work through them step by step until I reached a minimum viable product.

\n

The flow of the Orchestration server looks roughly like this:

\n
    \n
  1. On the website, a user creates a new session, selecting the number of teams, maximum members per team and one of the pre-made scenarios. This is then sent to the API listening on the server.
  2. \n
  3. The server creates a WireGuard VPN container which will act as the router. WireGuard then generates VPN configs: one for every team, one for every player, and one additional per team for the admin to use. Then one container is spun up for each team, using the Docker image designated for the scenario. And finally the last container is added; this acts as the \"Scoring Bot\", pretending to be a real user to check the services work as expected and to insert the flags the participants need to steal.
  4. \n
  5. The session is now in the \"lobby\" phase, the admin is presented with their dashboard on the website and they can invite users to join the teams.
  6. \n
  7. Once the admin starts the game on the website, the server gets to work and creates a user account on every team container for each member of the team, as well as an additional account in every team for the admin. Then the \"Scoring Bot\" is instructed to start its scoring. The bot attempts to use the services as expected (e.g. inserting a flag into a notes app) and if it succeeds the flag is stored in Firebase. If it fails the attempt, a counter is incremented and that team's score is affected.
  8. \n
  9. Now the participants and the administrator can access their container via a web shell or via the VPN config provided to them. The web shell uses WebSockets to access a secure bash session directly on their team's container. Whereas the VPN gives the participant direct access to the isolated network, so they can SSH into their team container using their own preferred terminal.
  10. \n
  11. Once the game is finished, the relevant session info is saved on the frontend and the Orchestration server goes through and removes every container, deleting all generated configs and finishing back on a clean slate.
  12. \n
\n
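A single tick of the Scoring Bot from step 7 might look like the following sketch. This is a hypothetical simplification in Python (the real server is Node.js, and all the names here are illustrative, not CyberBattles' actual code):

```python
import secrets

# Hypothetical scoring-bot tick: insert a fresh flag via the team's
# service, then record success or penalise failure. check_service and
# store_flag are injected so they can be real HTTP/Firebase calls in
# production or stubs in a test.
def score_tick(team, check_service, store_flag):
    flag = 'cybrbtls{' + secrets.token_hex(8) + '}'
    if check_service(team, flag):
        store_flag(team, flag)       # service is up; flag is now stealable
        team['checks_passed'] += 1
    else:
        team['checks_failed'] += 1   # service is down; the score suffers
        team['score'] -= 10

team = {'name': 'red', 'score': 100, 'checks_passed': 0, 'checks_failed': 0}
stored = []
# Simulate a tick where the team's service is unreachable.
score_tick(team, check_service=lambda t, f: False,
           store_flag=lambda t, f: stored.append(f))
```

Injecting the service check also makes the availability rule from earlier enforceable: a team that shuts its service down simply fails every check.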

The Architecture

\n

Here is a rough diagram of how the whole platform comes together:

\n

\"Plan\"

\n

Challenges and Difficulties

\n

Not Enough Hours in a Day nor Days in a Week

\n

I would argue the hardest part of the project was the workload; as this was a group project, it was a considerable task we were given. While the finished platform always felt achievable, it was hidden behind a mountain of work, which wasn't always shared evenly, but c'est la vie.

\n

Vibing Web Dev

\n

Due to time constraints and lack of experience, the frontend had a lot of vibe coding put into it. To each their own; sometimes it would be amusing to see PRs full of lengthy code comments, emojis and, my least favourite, the AI glowing gradients on everything. But unfortunately, the end result is that while the website is complete, the code quality is sub-par and impossible to maintain. Especially with the Firebase integration, the website slurps up the user's resources even when it is just idle.

\n

An Award Winning Result

\n

While we can all complain endlessly about what should have been done differently, we did manage to produce a platform that meets, and arguably exceeds, the goals given to us. CyberBattles is now an easy-to-use and accessible Attack Defence platform. I think this is super important for lowering the bar to entry for this style of competition, providing a gamified and educational way to learn essential skills in an ever-growing industry.

\n

An unexpected result was that the project was nominated by the course coordinator for UQ Illuminate, a showcase of the best projects produced by graduates (both undergraduate and postgraduate) in 2025. Specifically, we were nominated in the category of \"Best Cyber Security and Data Privacy Project\". The event was a good chance for Lachlan and me to show off what the team had put so much work into.

\n

Here's a look at our booth for the night:

\n

\"Showcase\"

\n

To our surprise, the judges picked CyberBattles as the best project for our category. This has been hugely encouraging and a really rewarding experience. Lachlan and I plan on continuing development of the platform and getting it into the hands of educators.

\n

The Future of CyberBattles

\n

We are excited and still surprised at what we managed to produce, but it's not done yet. There is a bit of a plan to carry out before we're ready to hand out the platform; however, the source code is now available on our repo (click the GitHub icon on this page). The main issue we want to address is that, as an open-source platform, depending on Firebase as our database provider is limiting, both in hosting expense and in the liability of providing relatively anonymous internet-connected virtual environments. Because of that, CyberBattles will be transitioned to a database solution that can be hosted alongside the rest of the platform (locally). I will also be personally taking this opportunity to rewrite the website to fit the new database solution, whatever we choose, and to add a little more polish.

\n", "url": "https://smp46.me/blog/CyberBattles", "title": "CyberBattles", "summary": "CyberBattles is an award-winning educational platform that helps people gain experience in real-world cybersecurity attacks.", "date_modified": "2025-12-04T00:00:00.000Z", "date_published": "2025-12-04T00:00:00.000Z" }, { "id": "https://smp46.me/blog/Legionnaire", "content_html": "

\"Title\"

\n

Legionnaire: Caffeine, Code, and Cyber Security, a 48 Hour AI-Powered SIEM Hackathon Project

\n

Acronyms, Acronyms, Acronyms

\n

A few months ago I found myself sitting at a talk at CrikeyConX in the Brisbane\nConvention Centre. The talk was titled \"SIEM-less security; Panacea or placebo\",\nand it took approximately 10 minutes before my friend Lachlan and I turned to\neach other and admitted we had no idea what a SIEM or an EDR is. After a quick\nsearch, we were suddenly (acronym) experts and the talk began to make a little\nmore sense.

\n

Therefore, in case you don't know:

\n\n

\"Crikeycon\"

\n

The key takeaways from the talk were:
\na) SIEMs are expensive (i.e. they're worth a lot of money)
\nb) SIEMs and EDRs are tools to be used by experienced Security Professionals,\nnot one-shot solutions that you can throw money at to solve cyber security.
\nc) Modularity is important in these kinds of products.

\n

That got our cogs turning. Then the UQ Computing Society's yearly weekend Hackathon came around and we had an idea: why not build our own SIEM in less than 48 hours? Last year we made a poorly written idle game in Rust, so something a little more serious sounded appropriate. We also learnt, based on last year's winners, that some kind of AI tech (or at least machine learning) was a requirement to get any of the judges to even consider our project. So we had two requirements for our project: some kind of SIEM/EDR product, and it had to utilise AI. Easy, right?

\n

The Architecture Plan

\n

As this was the second time our team had participated in the UQCS Hackathon, we\nknew the biggest hurdle was ensuring everyone had work to do. We had to prevent\n(as much as possible) devs waiting on other devs to get work done. As a result,\na modular approach was decided upon.

\n

Given the goal is to make an enterprise-level product, the Client has to be a background-only service that provides zero feedback or notifications to the user. It comprises four modules:

\n\n

The Control Server is then responsible for:

\n\n

The Web Dashboard connects to the Control Server via a RESTful API to:

\n\n

\"Plan\"

\n

The Tech Stack

\n

Last Hackathon my team made the mistake of trying to both learn and write in a\nlanguage we had not used before. While we had hoped to use Golang for this\nproject's backend, for speed and reliability, only a couple of members had\nexperience writing it. So, Python was selected as the language of choice.

\n

The most important of the libraries we used are:

\n\n

The Web Dashboard was written using React + Vite + TailwindCSS, to provide an\neasy to develop and run front-end for the users. Firebase is used for login\nauthentication and was intended to be used for backing up logs, however that has\nnot yet been implemented.

\n

Turning Nice Ideas into Code

\n

What Went Well

\n

Honestly, there were no major hurdles or roadblocks this time around.\nDevelopment went relatively smoothly and we managed to produce an end-product\nthat fit almost exactly the scope that we set out to do. The modular approach\nwas a great idea, it allowed each developer to work relatively independently up\nuntil it came time to combine and connect the modules.

\n

Issues

\n

The only major issue encountered was with the existing dataset we hoped to use for the AI analysis: CIC-IDS2017. The dataset provides over 70 GB of data from a week of collection, labelled as benign or categorised as a specific type of attack. While a model was trained on this data, we found it very ineffective on the data we were actually producing; for whatever reason, our testing produced very different features and thus inaccurate results from the model.

\n

As a result we had to do our own data collection. I personally collected the data using the same cicflowmeter Python module, a Kali Linux virtual machine and various attack tools.

\n

What About AI?

\n

Don't say it too loudly: our machine learning dev is not a fan of using buzzwords to describe what five years ago would just be called an algorithm. But the year is 2025 and we wanted to tick that box. If you do want to understand a little more about how the classification model works, I will quote our ML dev here:

\n
\n

The model being used is the XGBClassification algorithm, which is an extremely\noptimised gradient boosting ensemble algorithm. This model was trained and\ntested on real collected data and verified using the CIC-IDS-2017 dataset.\nThroughout the hackathon, the model was trained a variety of times, attempting\nmulticlass classification and binary classification of attacks. The final\nmodel used is a logistic binary classifier trained with L1 and L2\nregularisation, also implementing methods to deal with class imbalances such\nas weight scaling. This classifier predicts the labels of data containing 79\ncolumns of network traffic to either Benign (0) or Attack (1).

\n
\n
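The weight scaling mentioned in the quote is worth a quick illustration. With far more benign flows than attacks, a common approach (XGBoost exposes it as the scale_pos_weight parameter) is to weight the positive class by the ratio of negative to positive examples. A small sketch, assuming label 0 means benign and 1 means attack:

```python
# Compute the positive-class weight for an imbalanced binary dataset:
# the ratio of negative (benign) to positive (attack) examples, the
# value conventionally passed as XGBoost's scale_pos_weight.
def scale_pos_weight(labels):
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError('no positive examples to weight')
    return negatives / positives

# 900 benign flows and 100 attacks: each attack counts 9x in the loss.
labels = [0] * 900 + [1] * 100
weight = scale_pos_weight(labels)
```

Without this kind of re-weighting, a classifier on such data can score high accuracy by simply predicting benign for everything.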

What can I reveal? That it actually worked surprisingly well. Resource usage on the Control Server was negligible, even with the model running nearly constantly, analysing multiple clients' network flows. The main reason I found the model impressive is its ability to identify malicious network activity without interception, packet inspection or decryption. Monitoring is completely passive and transparent, yet still effective.

\n

The Result: Visuals

\n

Here's a short demo of the Web-Dashboard in action:\n

\n

Per Page Module Overview

\n

Graphing and General Statistics:\n\"Dashboard

\n

Program Analysis Logs:\n\"Program

\n

Network Analysis Logs:\n\"Network

\n

System Logs Analysis Logs:\n\"System

\n

Endpoint (Client) Management:\n\"Endpoint

\n

Client Specific Logs and Actions:

\n

\"Client

\n

What I Learned

\n

The most important thing I learned is that 5-6 hours of sleep and a lot of caffeine is pretty much the equivalent of a full night's sleep (not true).

\n

What I did learn is that as easy as it is to hate on Python, it is remarkably\nflexible, very easy to work with and the module support is incredible. So who\nknows, maybe my new profile picture will be me working with a Python instead of\nfighting one.

\n

Additionally, some valuable teamwork skills were picked up. This year was a much\nmore productive and effective use of our time. It turns out task management is\nsuper important when trying to work with multiple developers, especially when\nparallelisation of work is a must.

\n

A Future Startup?

\n

Haha no.

\n

In reality this is very much still a hackathon-level project in terms of polish and code quality. I intend to work with the team to fix some of the minor issues and act on some feedback we received. But most importantly, regardless of the future, I am proud of the team and myself, and I think this marks a great final project for our UQCS Hackathon career before we all graduate.

\n

A little look at our showcase at the UQCS Hackathon 2025:\n\"Showcase\"

\n", "url": "https://smp46.me/blog/Legionnaire", "title": "Legionnaire", "summary": "Legionnaire is an AI-powered SIEM (Security Information and Event Management) platform designed for comprehensive, automated threat detection and response. It operates as a modular and unobtrusive security solution, consisting of a client, a control server and a web dashboard. Built by a team of 6 undergrads in <48 hours for the UQCS Hackathon 2025.", "date_modified": "2025-08-21T00:00:00.000Z", "date_published": "2025-08-21T00:00:00.000Z" }, { "id": "https://smp46.me/blog/PandorasBox", "content_html": "

\"Title\"

\n

Pandora's Box: Developing an LLM-Powered Web Honeypot in 96 Hours

\n

The Problem

\n

The idea of a Honeypot is to detect and collect information on attackers by\npretending to be an open server. The downside is that Honeypots normally return\na static, generic or no response, which can tip off attackers and prevent\ndefenders from gaining valuable insights.

\n

The Solution

\n

In the age of large language models, why settle for a basic response? Given that\na web request is all readable text, it is relatively simple to feed it to and\nget a response from a Large Language Model. So that is what my team and I, the\naptly named Honeypotters, set out to do.

\n

Does it Already Exist?

\n

Yes, in a way. Through my research I found an existing solution called\nGalah. It largely achieves what we aimed for: using\nan LLM to produce realistic, relevant responses to web requests. However, Galah\nhas a drawback. It depends on external LLMs via online APIs, which simplifies\nthings but introduces a significant issue: latency. Most web servers should\nrespond in under 100 milliseconds, depending on your network connection. Online\nLLMs like ChatGPT and Gemini are large, and their response times can be slow,\nparticularly when using inexpensive or free APIs. More importantly, these times\ncan vary significantly. Neither of these factors is ideal when trying to imitate\na real web server.

\n

What Makes Pandora's Box Different

\n

My team thought, given how good modern LLM tech is and how fast computers are,\nwhy not create a specialised purpose-built one? Which is exactly what we (by\nwhich I mean Brandon, our machine learning major) did. As a base model, we chose\na distilled low-parameter version of GPT2,\ndistilbert/distilgpt2, mostly\nbecause it is relatively fast to train, and very fast to run (with GPU\nacceleration).

\n

One of the largest challenges of training a model is finding relevant existing\ndata and collecting our own. Ideally, you want a large amount of data to train\nwith; however, given our time frame, we could not collect and prepare enough\ndata to be useful. Instead, a large amount of the data was\nsynthesised;\nthis approach gave us lots of data in the exact format needed for training.
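As a rough illustration of the idea (not the team's actual generator), synthesising data here means templating plausible request/response pairs in exactly the format the model trains on:

```python
import random

# Hypothetical synthesiser: pair common probe paths with the response
# a real web server would plausibly give, producing (request, response)
# training examples. The paths and statuses here are illustrative.
PATHS = ['/index.html', '/admin', '/wp-login.php', '/.env']
STATUSES = {'/index.html': '200 OK', '/admin': '401 Unauthorized',
            '/wp-login.php': '404 Not Found', '/.env': '403 Forbidden'}

def synthesise(n, seed=0):
    rng = random.Random(seed)  # seeded so a dataset can be regenerated
    examples = []
    for _ in range(n):
        path = rng.choice(PATHS)
        examples.append(('GET ' + path + ' HTTP/1.1',
                         'HTTP/1.1 ' + STATUSES[path]))
    return examples

pairs = synthesise(20000)
```

The appeal is exactly what the paragraph above describes: for the cost of a few templates you get as many correctly formatted examples as the training run needs.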

\n

The final model we used can be found on\nHuggingface. It was\ntrained in about 15 minutes using 20,000 examples and can produce a result in\nunder a second with an RTX 5070 Ti.

\n

Putting it Together

\n

While the LLM was the honeypot's primary component, we still needed software for\nthe honeypotting operations. We chose Golang for its speed, excellent built-in\nHTTP server support, and straightforward multithreading. Lachlan and I developed\nthis component. It consists of an HTTP server that listens for requests, sends\nthem to the LLM running behind a Python Flask API, receives the response, and\nconverts and sends it back to the client. We also included extensive statistics\ncollection to provide a good user interface.

\n

My biggest contribution was the dashboard and statistics collection. This is a\nminimal Next.js website that displays the request count, uptime, average\nresponse time, an overview of the 10 most requested items and their responses,\nand charts categorising each request. For categorisation, I used Gemini Flash 2\nbecause it's free and the task doesn't demand rapid responses.

\n

\"dashboard_1\"\n\"dashboard_2\"

\n

The Result

\n

What Could Be Better

\n

Our model's biggest limitation was the synthesised data. Ideally, we would have\npreferred to use much more real and unique data. This would have led to a far\nmore varied and creative model. Handling HTTPS would have also been beneficial,\nthough I'm still unsure how to manage certificates for that.

\n

Something Functional

\n

To our surprise, we successfully built what we set out to achieve: a working\nAI-powered honeypot. Sending a request to the server yields a unique response\neach time, as intended. For example, if you visit the server IP in a browser,\nyou'll see one of several variations of different webpages. None of these pages\nare actually stored on the honeypot; they are all generated on the fly by our\nmodel.

\n

If you're interested in giving Pandora's Box a try, you can view the GitHub\nthrough the link on this page. There you will also find our DevPost submission\nstory and video.

\n", "url": "https://smp46.me/blog/PandorasBox", "title": "Pandora's Box", "summary": "Pandora's Box is an AI-powered web honeypot that utilises a fine-tuned version of distilgpt2. Through the use of an LLM, relevant, specific and personalised responses can be made to every incoming request. The project was built with Python, Golang and TypeScript.\n🏆 5th Place Winner at LaunchHacks IV Hackathon 🏆", "date_modified": "2025-07-15T00:00:00.000Z", "date_published": "2025-07-15T00:00:00.000Z" }, { "id": "https://smp46.me/blog/FileFerry", "content_html": "

\"Flow

\n

FileFerry: Building a Secure, Peer-to-Peer File Sharing App from Scratch

\n

Motivation

\n

I'm a big fan and user of the file-sharing utility\nmagic-wormhole. An\neasy-to-use utility that allows you to transfer a folder or file between any two\ndevices running a magic-wormhole client, using a phrase to connect. However, the\nclient is, in my opinion, the limiting factor. It is usually a command-line\nutility, requiring both a command line and a computer to run it on. Although it\ncan be run through Termux on Android, that's not quite the user experience I'm\nafter. So what if I could bring magic-wormhole to the browser?

\n

Initial Idea: Literally Bring Magic-Wormhole to the Browser

\n

Being the genius I am (sarcasm) I thought I could literally just bring the\nwormhole-client to the browser. My preferred client is\nwormhole-william, an\nimplementation of magic-wormhole written in Golang. A cool feature of Golang is\nthat anything can be compiled to WASM, WebAssembly. So I thought I could just\nmake a web interface for wormhole-william, compile it to WASM and boom,\nbrowser-based file sharing!

\n

No, that's not how it works :(

\n

Peer-to-Peer in the Browser and its Limitations

\n

While WASM is super cool tech, a browser is still a browser. And that means\nlimitations.\nFor today, the important limitation is \"You cannot access the network in an\nunpermissioned way.\" This means the traditional and established method of TCP\nhole-punching to establish direct network connections between two otherwise\nunconnected peers doesn't work. I guess this is understandable, but it did throw\na spanner in the works. Magic-wormhole works exclusively via TCP\nhole-punching, a fact I discovered only after building a basic prototype in the\nbrowser.

\n

So what can you do in the browser?

\n

WebSockets\nand WebRTC are\nwhat you can do in the browser. WebSockets are our equivalent of a basic TCP\nstream in the browser. The WebSockets API \"makes it possible to open a two-way\ninteractive communication session between the user's browser and a server.\"\nWhich sounds pretty neat, I'm going to need to make some connections beyond HTTP\nrequests. And WebRTC \"enables Web applications and sites to ... exchange\narbitrary data between browsers without requiring an intermediary.\" Sounds like\nexactly what I would need for a browser-based file sharing application, how\neasy. With WebSockets for creating streams and WebRTC as our transfer protocol,\nall it needs is some magic to get the direct connection.

\n

The (Imperfect) Magic: libp2p

\n
\n

libp2p is an open source networking library used by the world's most important\ndistributed systems such as Ethereum, IPFS, Filecoin, Optimism and countless\nothers. There are native implementations in Go, Rust, Javascript, C++,\nNim, Java/Kotlin, Python, .Net, Swift and Zig. It is the simplest solution for\nglobal scale peer-to-peer networking and includes support for pub-sub message\npassing, distributed hash tables, NAT hole punching and browser-to-browser\ndirect communication.

\n
\n

Libp2p is what I used to build FileFerry, and it is awesome. As a whole, libp2p\nis a specification for bringing together a lot of cool networking technologies\ninto a single framework. And look right there in the blurb it says it supports\nJavascript, hole punching and direct browser-to-browser communication.

\n

Okay, so the scope of the project has increased a little... but it seems I have\nthe tools to make my browser-based alternative to magic-wormhole.

\n

Building FileFerry with js-libp2p

\n

This has been a long journey, and let's just say I'm glad Neovim doesn't keep\ntrack of usage by number of hours.

\n

Wrangling js-libp2p

\n

While js-libp2p does handle the magic, it isn't exactly simple or straightforward. I started with this webrtc browser-to-browser example and went from there. Unfortunately, while libp2p has some cool built-in protocols like gossip-sub for chat apps, it doesn't offer a file transfer protocol, so that was my job to implement. In theory, if I can get a stream, I should be able to just push some data through it, save it on the other end and boom, file-sharing done. Well, in a perfect world maybe, but I found WebSockets and WebRTC aren't exactly tailored to shoving large amounts of data through a stream as fast as possible. Connection stability was a gigantic headache: connections will drop, and handling that is a pain.

\n

Complete Transfers over Incomplete Connections

\n

The general idea seemed easy: if I just track, at the application level, how far through a file transfer the app is, then when a connection drops it can reconnect and keep on going. And that's how I started. But there are issues:

\n\n
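The progress-tracking idea can be modelled with a toy sketch (not FileFerry's actual classes): the receiver knows how many bytes it has committed, and after a reconnect the sender resumes from that offset instead of starting over:

```python
# Toy model of resumable transfer state: the receiver tracks the byte
# offset it has safely written, and the sender resumes from there
# after a reconnect instead of restarting from zero.
class ReceiverState:
    def __init__(self):
        self.buffer = bytearray()

    def write(self, data):
        self.buffer.extend(data)

    @property
    def offset(self):
        return len(self.buffer)

def send_from(file_bytes, receiver, start, chunk_size=4):
    # Stream the file to the receiver in small chunks from a given offset.
    for i in range(start, len(file_bytes), chunk_size):
        receiver.write(file_bytes[i:i + chunk_size])

payload = b'0123456789abcdef'
rx = ReceiverState()
send_from(payload[:8], rx, 0)      # connection drops after 8 bytes
send_from(payload, rx, rx.offset)  # reconnect: resume from offset 8
```

The hard part in practice is everything this toy hides: detecting the drop, re-establishing the stream, and agreeing on the offset, which is what the issues below are about.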

To address the first issue, I implemented a Connection Management class that keeps track of, handles and directs connections. I also made it the Sender's job to reconnect upon connection loss. It sounds simple now, but working out the specific implementation required a lot of reading of the libp2p spec, reading the source code, and trial and error.

\n

The second issue was much easier, and dare I say fun: hashing to produce a checksum. Most hashing I can think of works by taking a complete file and processing it all at once. But only the Sender has a complete file, at least until the transfer is done. Instead of having the Receiver process the whole file again and hash it after receiving, I decided I could do it during the transfer; this way it would also be less of an issue if the connection dropped. So I picked an algorithm I had actually used in the Algorithms and Data Structures class I took at uni, FNV-1a, because it is very fast and relatively secure. So now the Sender makes the initial checksum part of the file header and the Receiver can compare its final result against it. Another issue down.
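Incremental hashing works because FNV-1a folds in one byte at a time, so the running value can simply be carried across chunks as they arrive. A sketch of the 64-bit variant:

```python
# 64-bit FNV-1a, computed incrementally: each chunk updates the
# running hash, so the checksum can be built during the transfer.
FNV_OFFSET = 0xcbf29ce484222325
FNV_PRIME = 0x100000001b3
MASK64 = (1 << 64) - 1

def fnv1a_update(h, chunk):
    for byte in chunk:
        h = ((h ^ byte) * FNV_PRIME) & MASK64
    return h

def fnv1a(data):
    return fnv1a_update(FNV_OFFSET, data)

# Hashing chunk by chunk matches hashing everything at once.
h = FNV_OFFSET
for chunk in (b'hello ', b'world'):
    h = fnv1a_update(h, chunk)
```

Note that FNV-1a is not a cryptographic hash; it catches transfer corruption rather than deliberate tampering.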

\n

The final issue I also solved thanks to some networking basics I was taught at uni. The stream behaves like UDP: you can write and read data, but who is to say whether that data did or didn't arrive. So I thought, what if I took a page from TCP and implemented an ACKnowledgement system? Basically, every 200 chunks the Sender stops sending and waits for the Receiver to acknowledge that it has received the last batch. This helped especially when connection drop-outs occurred; previously the sender would often reconnect and keep blasting data while the receiver was still trying to catch up.
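The sender side of that scheme can be sketched in a few lines. Again, the real implementation is TypeScript over a libp2p stream; only the window size of 200 comes from the text, the stream interface here is a made-up stand-in:

```python
ACK_INTERVAL = 200  # chunks between acknowledgements, as described above


def send_chunks(chunks, stream):
    """Write chunks to a stream, blocking for an ACK after every batch."""
    sent = 0
    for chunk in chunks:
        stream.write(chunk)
        sent += 1
        if sent % ACK_INTERVAL == 0:
            stream.wait_for_ack(sent)   # pause until receiver catches up
    if sent % ACK_INTERVAL != 0:
        stream.wait_for_ack(sent)       # ACK the final partial batch too


class InMemoryStream:
    """Toy stand-in for a libp2p stream, just for demonstration."""
    def __init__(self):
        self.writes = 0
        self.acks = []

    def write(self, chunk):
        self.writes += 1

    def wait_for_ack(self, n):
        # A real receiver would reply over the wire; here we just record it.
        self.acks.append(n)
```

Pausing at each window boundary keeps the sender from racing ahead of the receiver after a reconnect, at the cost of a round-trip every 200 chunks.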

\n

A (Poorly Made) Overview

\n

\"Flow

\n

The UI

\n

The nautical theme was picked mostly because I was looking for something interesting. As I'm not a UI designer, it felt easier to make something a little different. The site uses purely HTML/TypeScript/TailwindCSS. And I'm not ashamed to admit Claude Opus was definitely the lead CSS designer; I thought the stuff it can come up with purely in CSS was pretty incredible. Zero pre-rendered assets (images) are used; it's all CSS, SVGs and text.

\n

The Backend

\n

To bring it all together, I self-host two of the three required back-end\nservers:

\n\n

The Result: A Demo

\n

\"Demo\"

\n

Visit fileferry.xyz to try it yourself!

\n

To Conclude

\n

This turned into a really fun and challenging project, and has definitely\ninspired me to work further with the libp2p framework in the future. Due to the\ncomplexity of the project I spent a long time getting into the weeds, reading\nand trying to understand the source code of js-libp2p. I ran into many problems\nthat neither Google nor ChatGPT could help me with, which made it a very\nrewarding project to complete.

\n

But for now, I am finished with FileFerry and will enjoy my new easy way to\nshare files in the browser.

\n", "url": "https://smp46.me/blog/FileFerry", "title": "FileFerry", "summary": "FileFerry is a browser-based, peer-to-peer file sharing application that allows for secure, direct file transfers between any two browsers. Inspired by the command-line utility magic-wormhole, it uses WebRTC and the libp2p library. The project implement a custom protocol for reliable transfers over unstable connections, including checksum validation and an acknowledgement system, built with TypeScript and TailwindCSS.", "date_modified": "2025-06-23T00:00:00.000Z", "date_published": "2025-06-23T00:00:00.000Z" }, { "id": "https://smp46.me/blog/SmartGarage", "content_html": "

SmartGarage: A DIY Wireless Garage Door Control System with a Side of Machine Learning

\n

Introduction

\n

One day my friend Howie was over and saw me open my garage door. Naturally, he got out the Flipper Zero he always carries in his bag and asked me to use my garage door fob again; he captured and resent the signal easily. I thought that was odd: shouldn't there be, I don't know, maybe at least rolling codes on any modern garage door opener? But I was inspired, and thought if it's that easy, surely I could automate it with some lower-cost hardware.

\n

So I got thinking and came up with the project you're reading about right now: a system to remotely open and close my garage door without physically modifying the opener itself (I live in a rental, so unfortunately this was a requirement). But hey, that doesn't sound very ambitious, and the year is 2024, so I have to add aRtIFicIAlL inTELiGeNce in here somewhere. In all seriousness, I often leave the house and five minutes later start wondering if I did close my garage door. So what if, instead of wondering, I could just check my homepage dashboard, or even get an email notification if the garage door has been open too long? And how can I check if the door is open or not without wiring anything in? Easy, I'll just train an image recognition model that can tell me just that.

\n

This project ended up combining hardware hacking, machine learning, and a web\ninterface to create a practical solution using nothing but off the shelf (or the\ninternet) parts and a bit of coding elbow grease.

\n

\"Final

\n

It's hard to show a teaser of the final result because it has so many parts, but here is the hardest part of the solution (I swear it's not a bomb).

\n

Initial Attempts and Challenges

\n

I started with the hard part: how do I clone the garage door fob signal and resend it on command? The fob uses the 433MHz band for transmission. I happened to have a Raspberry Pi Zero W sitting in my drawer, so I went online and found the Texas Instruments CC1101 Sub-GHz transceiver (the same chip that is used by the Flipper Zero). This should let me capture the signal and send it right on back. More searching showed there are plenty of drivers and other projects for this transceiver, so it can't be that hard to use, right? I ordered one, a breadboard and a wiring kit to put it together.

\n

As soon as I got it, I started trying to cobble something together: I tried a CC1101 driver library, I even found a python interface for it. But I didn't have much luck.

\n

I got to the stage where I could receive some kind of signal; however, to be honest, I know nothing about radio and I'm only a CompSci student, not an electrical engineer. Even though it is the same chip used by the Flipper Zero, there seems to be a fair bit of special sauce that goes into pulling a signal out of the air cleanly, then resending it. Documentation for the drivers didn't make much sense to me, if there was any at all. So after weeks of trying I decided to pivot to another approach (definitely not a skill issue... okay maybe a little).

\n

So I had a crack at the other side of the project: training an image recognition model to tell me if the door is open or not. I started by getting the cheapest wireless security camera I could find off of Chinese marketplace number 508 (banggood.com), configuring my firewall to never let it phone home (blocking all its internet access), and began collecting data.

\n

To do this I write a\nBash script for fetching security camera snapshots,\nand make it into a systemd service on my Debian home server. A few weeks later\nand I have hundreds of thousands of photos that can be categorised as open or\nclosed.

\n

I then spent a while doing some research on how exactly this whole machine learning stuff works. I decided on making something with PyTorch. A little later, and after some long discussions with Professor GPT, I had two Python scripts. The first lets me fine-tune the MobileNetV2 model (a lightweight neural network designed for mobile devices) on my own training data and configure it to provide a binary output; the second loads my custom model, takes a picture as input and outputs "Opened" or "Closed". Neat! However, knowing my garage door was actually left open doesn't really help me if I can't remotely close it.

\n

A little over a year goes by, life goes on, and my garage remains dumb :(

\n

However, recently while procrastinating some other programming assignments I\nremembered this project. And I thought, if a fob can open and close the door,\nmaybe I can just automate pressing the button on the fob. Sometimes the simplest\nsolutions are the ones staring you right in the face all along...

\n

The Hardware Hack

\n

So I ordered a couple of generic garage door fobs off eBay that were compatible\nwith my opener. After adding them to the garage door (following the actual\nprocess in the manual), I started thinking about how I could simulate a button\npress with my Raspberry Pi Zero W.

\n

Now, I'm not an electrical engineer by any stretch, but I figured: how complicated could a fob be? I carefully cracked open one of the generic fobs and examined the PCB. After some poking around with a multimeter, I discovered the button on the fob just bridges two contacts on the PCB. If I could find a way to bridge those contacts on command from the Pi, I'd be cooking.

\n

\"fob_contacts\"

\n

Here is the naked fob and the contacts I needed to bridge.

\n

After some research, I figured out I needed:

\n\n

Here's the circuit I ended up with:\n\"circuit_diagram\"

\n

I soldered two thin wires to the contact points either side of the fob's button,\nran them to the relay, and connected everything according to the diagram above.\nThe idea is simple: when GPIO pin 17 goes high, it activates the relay, which\nbridges the contacts on the fob, which sends the signal to open/close the garage\ndoor.

\n

To test, I put together\na basic python script\nthat makes GPIO17 high for half a second. And shockingly, the door opens.\nYippee!

\n

You might be wondering what sleek, professional way I put this all together:\n\"final_circuit\"\nI'm honestly not sure what the correct way to package something like this is; I'm definitely open to feedback if anyone has any better ideas.

\n

Teaching My Computer to See

\n

After getting the hardware working, I needed to tackle the \"smart\" part of my\nSmartGarage: teaching a computer to recognize whether my garage door was open or\nclosed from camera images. This meant dabbling in machine learning—specifically,\ncomputer vision.

\n

Data Collection: The Boring Bit

\n

Honestly, this was the hardest and most tedious part of making the image\nrecognition model. In order to train an accurate model, I needed a lot of data\nand I needed it categorised.

\n

So I used that bash script that saves a picture every minute, or every second during \"peak times\", i.e. times when the door is most likely to be open, to collect a lot of data.

\n

The result:

\n
$ ls ~/garage/training_imgs | wc -l\n158332\n
\n

Now that might look nice - more data is more better, right? Not quite. When training a model I discovered that a good dataset is a balanced dataset. And balance is difficult when, most of the time, the garage door is not open. To remedy this I made that bash script collect more often when the door might be open, and used a python script that creates a lot of permutations of the same pictures.
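The linked script balances the dataset by generating image permutations; the underlying idea, stripped of the image processing, is just oversampling the minority class until the classes match. A pure-Python sketch (names and structure are mine, not the script's):

```python
import random


def oversample(items_by_class: dict, seed: int = 0) -> list:
    """Duplicate samples from minority classes until every class
    matches the size of the largest one; returns (label, item) pairs."""
    rng = random.Random(seed)
    target = max(len(items) for items in items_by_class.values())
    balanced = []
    for label, items in items_by_class.items():
        copies = list(items)
        while len(copies) < target:
            copies.append(rng.choice(items))  # pad with random repeats
        balanced.extend((label, item) for item in copies)
    rng.shuffle(balanced)
    return balanced
```

In practice each "repeat" would be a transformed variant of the photo (crop, flip, brightness shift) rather than an exact copy, which is what the permutation script does.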

\n

But how do you categorise all those pictures? Slowly and manually...

\n

The specific software I use is XnView MP, which is just a more efficient image library manager with support for batch renaming. That, and moving the data set to a RAMdisk while I'm working with it, helped speed things up. As it turns out, handling over 150 thousand ~20KB files isn't super easy. Here's a snapshot of the exciting action:

\n

\"sorting_gif\"

\n

The Training Process

\n

During my first attempt at this project, I put together\na set of Python scripts\nusing the MobileNetV2 model with PyTorch. For this revival, I upgraded to\nMobileNetV3, which is meant to be better overall without needing additional\ncompute, and made some adjustments to optimize the images for training.

\n
Why the MobileNet Model?
\n

I picked MobileNet predominantly because it's designed for mobile\ndevices—meaning it's not very computationally expensive. This is important as\nthe model needs to run against an image every 10 seconds, 24/7.

\n

Also, I only have access to my Radeon 6950XT for training, which is a nice\ngaming GPU but in the world of AI, it's not particularly powerful. I tried\ntraining with the ConvNeXtV2 model (a much newer and heavier model), and the\ntraining was estimated to take 125 hours to complete. MobileNetV3, by\ncomparison, takes less than 45 minutes even with ~140,000 images in the dataset.

\n
How do Those Scripts Work?
\n

Click the GitHub icon on this page to view the project itself and all the code.

\n

But the workflow looks a little like this: I clone the repo, mount it alongside the training images in the PyTorch docker container, and run:

\n
root@docker:/train# python3 binaryTrainer.py train\nEnter the path to the training images: /train/training_imgs_sorted/\nWhat is the object you are trying to classify? Garage Door\nEnter the classification names separated by a comma: open,closed\nEnter the model name to save as: may10_bigdata_10_epochs\nEnter the number of epochs: 10\n
\n

The output of that run will be a file called may10_bigdata_10_epochs.pth and a config.ini, which contains the additional data needed for predictions and the configuration for the other script, justPredict.py. That second script lets me just pass it a file:

\n
root@docker:/train# python3 justPredict.py testing_imgs/garage5.jpg\nopen\n
\n
Can I Give it a Go?
\n

Please! I tried making the script fairly user-friendly, mostly so I don't have\nto remember the intricacies when I want to update / train a new model.\nCurrently, it is limited to a binary output i.e. a True or False classification.\nBut it does have some nice features like a progress bar and stopping training\nwhen it detects accuracy loss.
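That "stop when accuracy degrades" behaviour is a form of early stopping. The actual criterion in the script may differ; a minimal sketch of the idea, with the patience value being my own illustrative choice:

```python
def run_with_early_stopping(epoch_accuracies, patience: int = 2):
    """Walk through per-epoch validation accuracies, stopping once
    accuracy has failed to improve for `patience` epochs in a row.
    Returns (epochs_run, best_accuracy)."""
    best, bad_epochs, epochs_run = 0.0, 0, 0
    for acc in epoch_accuracies:
        epochs_run += 1
        if acc > best:
            best, bad_epochs = acc, 0   # new best: reset the counter
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # plateaued or degrading: stop
                break
    return epochs_run, best
```

In a real training loop the accuracies arrive one epoch at a time, and you would also checkpoint the model whenever `best` improves so the saved weights match the best epoch, not the last one.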

\n

Was all this ML stuff necessary? Probably not. Was there a simpler way to\nachieve this? Absolutely. But where's the fun in that? Plus, I learned a fair\nbit about machine learning in the process, which was kind of the point.

\n

Putting It All Together

\n

With the hardware and machine learning components working, I needed a way to tie\neverything together into a cohesive system. Let me illustrate the architecture\nand then I can explain why this is a perfectly sane project (and not at all an\novercomplicated solution to a problem that probably has a $20 commercial\nalternative):

\n
Architecture
\n

The system follows a microservices approach, with each component handling a specific responsibility and communicating via HTTP APIs. The Rust HTTP server runs on the Pi Zero W; the rest runs on my homeserver, both in and out of Docker containers.

\n

\"software_architecture\"

\n
The Components
\n
Rust HTTP API Server
\n

Initially this was just another Python FastAPI app, but I switched to Rust, using Axum, because it needs to be running 24/7 and I don't want my poor little Pi Zero W running too hard. The server listens on port 3000 for a POST request to the /toggle endpoint with the correct authorisation token; when received, it triggers the button on the fob and the door opens or closes.

\n
Fetch Video Snapshot Bash Script
\n

Every second a script retrieves a snapshot of the garage security camera feed\nand saves this to a RAM-disk. A RAM-disk is used to prevent excessive wear and\ntear from constant writes to the system drive. A custom systemd service is used\nto trigger this every second, as crontab is limited to once per minute.

\n
Image Recognition Script
\n

This is a variant of the justPredict.py script mentioned before, except it reads a file from a specified path and sends its results to the Garage Door Status API. Again, I use a custom systemd service to keep this script running and restart it on boot.

\n
Garage Door Status API
\n

A super simple Python HTTP API server, using FastAPI, that receives and stores the garage door status from the Image Recognition Script and updates its internal last_opened state when the status changes from closed to open. It then responds with this data in a JSON response when a POST request is sent to http://garage-api:5000/status.
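The state that API keeps can be sketched as a small class (FastAPI routing omitted; class and method names here are illustrative, not the actual service's):

```python
from datetime import datetime, timezone


class GarageStatus:
    """Tracks the latest door status and when it last flipped to open."""

    def __init__(self):
        self.status = "closed"
        self.last_opened = None

    def update(self, new_status: str) -> None:
        # Only a closed -> open transition refreshes last_opened,
        # so it records when the door was opened, not the latest report.
        if self.status == "closed" and new_status == "open":
            self.last_opened = datetime.now(timezone.utc)
        self.status = new_status

    def as_dict(self) -> dict:
        """Shape of the JSON body returned from the /status endpoint."""
        return {
            "status": self.status,
            "last_opened": self.last_opened.isoformat()
            if self.last_opened else None,
        }
```

A FastAPI app would simply hold one instance of this and expose `update` and `as_dict` behind its routes.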

\n
Homepage Custom API Widget
\n

This widget lives on my homepage and provides the snapshot from the camera, the status of the door as reported by the Garage Door Status API, and the time it was last opened. The preview uses an iframe that just displays the snapshot image; the iframe HTML is mounted into the homepage docker container.

\n

\"widget_preview\"

\n
Email Notification Service
\n

This is a bash script run every minute with crontab. It checks the Status API for the current status and the last-opened time; if the door is currently open and was last opened more than 10 minutes ago, it sends a friendly email with a link to the website to close it.
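The decision that script makes each minute boils down to one check. The 10-minute threshold comes from the text; the function itself is an illustrative Python translation of the bash logic, not the script itself:

```python
from datetime import datetime, timedelta, timezone

OPEN_TOO_LONG = timedelta(minutes=10)


def should_notify(status: str, last_opened, now=None) -> bool:
    """True when the door is open and has been for over ten minutes."""
    if status != "open" or last_opened is None:
        return False
    now = now or datetime.now(timezone.utc)
    return now - last_opened > OPEN_TOO_LONG
```

Running it from cron means at worst the email arrives a minute late, which is fine for a "you left the garage open" nudge.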

\n
SmartGarage Control Website
\n

This website provides the snapshot of the security camera and has a button that sends a POST request through an nginx proxy to the Rust HTTP API Server on the Pi. The website is hosted via nginx through a Cloudflare Tunnel and, using Google SSO, is protected against unwanted visitors.

\n

\"website\"

\n

The Result

\n
But at What Cost?
\n

Not too much actually, if we ignore how many hours I put into this, and there are a few things left over that will be used when I do something similar again.

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
| Component | Cost (AUD) |
| --- | --- |
| Raspberry Pi Zero W | $25.00 |
| Pi GPIO Pins and Case | $8.00 |
| Generic Garage Door Fob | $8.00 |
| 5V Relay | $8.00 |
| 220 Ohm 0.5 Watt Resistors | $0.85 |
| Soldering Iron kit | $45.00 |
| Total | $94.85 |
\n
To Conclude
\n

After several months of development, testing, and refinement, I'm happy to\nreport that my SmartGarage system has been running reliably for over a month\nnow. The system successfully:

\n\n

Here's a demo of it in action:\n\"demo\"

\n", "url": "https://smp46.me/blog/SmartGarage", "title": "SmartGarage", "summary": "SmartGarage is a custom-built IoT solution that enables remote garage door control without modifying the original opener. Using a Raspberry Pi, relay circuit, and machine learning image recognition, it provides door status monitoring, remote operation, and automated notifications—all built with off-the-shelf components.", "date_modified": "2025-05-14T00:00:00.000Z", "date_published": "2025-05-11T00:00:00.000Z" } ] }